Histogram & 9 steps to implement a histogram


Histogram definition: What is a histogram?

A histogram is a statistical tool used mainly to analyze process behavior. It looks similar to a bar chart, but the two are different: a histogram shows continuous data, grouped into class intervals, on the X-axis, whereas a bar graph shows separate categories and does not need continuous data. In this blog, we will learn about the types of histogram and the 9 steps to implement a histogram.

[Figure: Example of Histogram]

Bar Graph

If the data is divided into categories, a bar graph is the best option. A bar graph does not require class intervals. The following bar graph shows the rent of a 1000 square-foot home in metro cities. Each column stands for a separate category, so no class interval is shown between two columns.

[Figure: Bar Chart – Rent of 1000 sq. ft. home in metro cities]

Histogram Graph Vs. Bar Graph

Histogram | Bar Graph
The histogram indicates the frequency of occurrences. | The bar graph indicates different data categories.
A class interval is required in the histogram. | No class interval is required in the bar graph.
In a histogram, columns/blocks cannot be re-arranged. | In a bar graph, columns/blocks can be re-arranged in ascending or descending order (lowest to highest or highest to lowest).
In a histogram, column width may vary. | The column width in the bar graph always remains the same.
The histogram is useful in calculating process capability. | The bar graph is useful for comparing different data categories in one graph.

When to use a histogram (Application of Histogram)

  1. To understand process behavior: is the process normally distributed or not?
  2. To analyze whether the process meets output/customer requirements.
  3. To analyze whether any data points fall beyond the control limits.
  4. To check whether the process is subject to drift/change over time due to environmental or other input parameters.
  5. To analyze special causes in the process.
  6. To establish verification and validation of the process.
  7. To help determine the key root causes of process failures/defects, together with problem-solving tools such as the fishbone (Ishikawa) diagram, 5-Why analysis, or CAPA.

Types of Histograms (histogram shapes)

1. Bar Graph
2. Column Graph
3. Normal Histogram
4. Bimodal Histogram (Polymodal)
5. Negatively Skewed Distribution
6. Truncated Histogram

[Figure: Bar Graph]
[Figure: Column Graph]

What is the meaning of ‘normally distributed process’?

[Figure: Normally Distributed Process]
  1. In a normally distributed process, the data points form a bell-shaped curve.
  2. The highest point of the curve lies on the center-line, which marks the average.
  3. The center-line divides the bell-shaped curve into two symmetrical sections.
  4. Most of the data points appear near the average.
  5. Values near the maximum and minimum appear at a lower frequency.
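The properties listed above can be checked numerically. The sketch below is only an illustration: it simulates a process with Python's standard `random.gauss`, and the mean of 30.0, standard deviation of 0.5 and sample size of 10,000 are assumed values, not measurements from this blog.

```python
import random
from statistics import mean, median, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated readings: mean 30.0 and standard deviation 0.5 are assumptions.
sample = [random.gauss(30.0, 0.5) for _ in range(10_000)]

mu = mean(sample)
med = median(sample)
sd = stdev(sample)

# Share of points close to the average and far out in the tails.
near = sum(1 for x in sample if abs(x - mu) <= sd) / len(sample)
far = sum(1 for x in sample if abs(x - mu) > 2 * sd) / len(sample)

print(f"mean={mu:.2f} median={med:.2f}")            # symmetric: mean is close to median
print(f"within 1 std dev: {near:.0%}")              # most points sit near the average
print(f"beyond 2 std dev: {far:.0%}")               # extreme values are rare
```

For a normal distribution, roughly 68% of the points are expected within one standard deviation of the mean and about 5% beyond two standard deviations.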

How to make a histogram (Steps for drafting a histogram)

Step 1 Data collection: Collect data from the process that you are planning to analyze. For better results, collect at least 50 data points; 50 to 150 is a typical range.

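Row | 1 | 2 | 3 | 4 | 5 | 6
A | 30.2 | 30.1 | 28.5 | 29.5 | 29.4 | 30.3
B | 29.8 | 30.3 | 29.8 | 30.1 | 29.1 | 28.7
C | 30.1 | 29.4 | 29.5 | 30.3 | 29.9 | 30.3
D | 28.7 | 30.2 | 29.8 | 30.1 | 28.5 | 28.8
E | 29.4 | 29.9 | 29.9 | 29.4 | 28.8 | 30.3
F | 30.1 | 30.1 | 30.3 | 28.9 | 29.1 | 29.8
G | 30.1 | 30.1 | 29.8 | 29.8 | 28.9 | 29.2
H | 30.3 | 28.8 | 30.1 | 29.0 | 28.5 | 30.3
I | 29.0 | 28.5 | 29.1 | 29.6 | 29.3 | 29.8
J | 30.0 | 29.6 | 30.4 | 29.7 | 29.9 | 30.0
K | 30.4 | 29.4 | 30.0 | 29.1 | 29.1 | 29.9
Histogram – Data Collection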

Step 2 Calculate the number of data points ‘P’ as follows:

No. of Rows = 11 (A to K)

No. of Columns = 6 (1 to 6)

Number of Data Points ‘P’ = No. of Rows (11) x No. of Columns (6) = 66
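Step 2 can be sketched in a few lines of Python. The dictionary below holds the Step 1 table as transcribed in this blog; the variable names (`data`, `n_rows`, `n_cols`, `P`) are only illustrative.

```python
# Step 1 table as transcribed in this blog: rows A to K, columns 1 to 6.
data = {
    "A": [30.2, 30.1, 28.5, 29.5, 29.4, 30.3],
    "B": [29.8, 30.3, 29.8, 30.1, 29.1, 28.7],
    "C": [30.1, 29.4, 29.5, 30.3, 29.9, 30.3],
    "D": [28.7, 30.2, 29.8, 30.1, 28.5, 28.8],
    "E": [29.4, 29.9, 29.9, 29.4, 28.8, 30.3],
    "F": [30.1, 30.1, 30.3, 28.9, 29.1, 29.8],
    "G": [30.1, 30.1, 29.8, 29.8, 28.9, 29.2],
    "H": [30.3, 28.8, 30.1, 29.0, 28.5, 30.3],
    "I": [29.0, 28.5, 29.1, 29.6, 29.3, 29.8],
    "J": [30.0, 29.6, 30.4, 29.7, 29.9, 30.0],
    "K": [30.4, 29.4, 30.0, 29.1, 29.1, 29.9],
}

n_rows = len(data)                           # rows A to K
n_cols = len(data["A"])                      # columns 1 to 6
P = sum(len(row) for row in data.values())   # counts every reading

print(n_rows, n_cols, P)  # 11 6 66
```

Counting the readings directly, instead of multiplying rows by columns, also works when a row has a missing value.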

Step 3 Calculate the total range ‘R’ as follows:

Range = Highest Value – Lowest Value

Range = 30.5 – 28.5 = 2
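The range formula from Step 3 is easy to express as a small helper. This is a minimal sketch; the function name `total_range` is an assumption, and the example uses only the highest and lowest values quoted in Step 3.

```python
def total_range(values):
    """Return highest value minus lowest value."""
    if not values:
        raise ValueError("at least one data point is required")
    return max(values) - min(values)

# Highest and lowest values as stated in Step 3.
print(total_range([28.5, 30.5]))  # 2.0
```

In practice the whole list of readings would be passed in, and the function picks out the extremes itself.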

Step 4 Choose the number of bins/columns: The shape of the histogram depends on the number of columns. As a rule of thumb, the number of columns/bins should be neither too large nor too small. It is a good practice to choose a number of columns approximately equal to the square root of the number of data points (P).

Number of bins/columns = Square root of data points ‘P’ (66) ≈ 8.12

*Consider 8 columns for the histogram.
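The square-root rule of thumb from Step 4 can be sketched as follows. Rounding to the nearest whole number is an assumption here; the blog simply picks 8.

```python
import math

P = 66                        # number of data points from Step 2
raw_bins = math.sqrt(P)       # square-root rule of thumb
n_bins = round(raw_bins)      # a histogram needs a whole number of columns

print(round(raw_bins, 2), n_bins)  # 8.12 8
```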

Step 5 Calculate the column/bin width as per the following formula:

Column Width = Range (2) / Number of columns (8)

Column Width = 0.25
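As a quick check of Step 5, the division can be wrapped in a helper. The name `column_width` is illustrative only.

```python
def column_width(total_range, n_columns):
    """Width of each column: range divided by the number of columns."""
    if n_columns <= 0:
        raise ValueError("number of columns must be positive")
    return total_range / n_columns

print(column_width(2, 8))  # 0.25
```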

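Step 6 Calculate the column/bin intervals as per the following formula:

Upper limit of the first column = Smallest Value + Column Width

E.g. 28.5 – 28.75 (28.5 + 0.25 = 28.75). The remaining intervals are obtained by adding the column width again each time.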
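Step 6 can be sketched by generating all the column limits in one go. This sketch assumes that every column has the same width and that the columns sit directly next to each other; the helper name `column_edges` is hypothetical.

```python
def column_edges(smallest, width, n_columns):
    """Return the n_columns + 1 limits that separate the columns."""
    return [smallest + i * width for i in range(n_columns + 1)]

edges = column_edges(28.5, 0.25, 8)

# Pair each lower limit with the next upper limit.
intervals = list(zip(edges[:-1], edges[1:]))

print(intervals[0])   # (28.5, 28.75)
print(intervals[-1])  # (30.25, 30.5)
```

Multiplying the width by the column index, rather than adding it repeatedly, avoids rounding errors piling up across columns.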

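Step 7 Draft a histogram count sheet as shown in the following table. Each column is 0.25 wide. One common convention is to count a value that falls exactly on a boundary in the higher column, with 30.5 counted in column 8.

Column | Column interval | Tally / Count
1 | 28.50 – 28.75 |
2 | 28.75 – 29.00 |
3 | 29.00 – 29.25 |
4 | 29.25 – 29.50 |
5 | 29.50 – 29.75 |
6 | 29.75 – 30.00 |
7 | 30.00 – 30.25 |
8 | 30.25 – 30.50 |
Histogram Count Sheet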
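Filling in the count sheet by hand is error-prone, so here is a minimal sketch of the tally. It assumes eight equal columns of width 0.25 starting at 28.5, and it assumes that a value lying exactly on a boundary goes into the higher column (the top limit goes into the last column). The readings are the Step 1 table as transcribed in this blog, and the function name `tally` is illustrative.

```python
def tally(values, smallest, width, n_columns):
    """Count how many values fall into each equal-width column."""
    top = smallest + width * n_columns
    counts = [0] * n_columns
    for x in values:
        if x < smallest or x > top:
            raise ValueError(f"{x} lies outside the histogram range")
        idx = int((x - smallest) / width)
        idx = min(idx, n_columns - 1)  # the top limit belongs to the last column
        counts[idx] += 1
    return counts

# Step 1 readings, row by row (A to K).
readings = [
    30.2, 30.1, 28.5, 29.5, 29.4, 30.3,
    29.8, 30.3, 29.8, 30.1, 29.1, 28.7,
    30.1, 29.4, 29.5, 30.3, 29.9, 30.3,
    28.7, 30.2, 29.8, 30.1, 28.5, 28.8,
    29.4, 29.9, 29.9, 29.4, 28.8, 30.3,
    30.1, 30.1, 30.3, 28.9, 29.1, 29.8,
    30.1, 30.1, 29.8, 29.8, 28.9, 29.2,
    30.3, 28.8, 30.1, 29.0, 28.5, 30.3,
    29.0, 28.5, 29.1, 29.6, 29.3, 29.8,
    30.0, 29.6, 30.4, 29.7, 29.9, 30.0,
    30.4, 29.4, 30.0, 29.1, 29.1, 29.9,
]

counts = tally(readings, smallest=28.5, width=0.25, n_columns=8)
for column, count in enumerate(counts, start=1):
    print(column, count)
```

The counts should add up to the number of data points P, which is a handy check against a missed or double-counted reading.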

Step 8 Draw and label the X and Y axes: Put the measured characteristic on the X-axis and the frequency on the Y-axis. Draw the histogram based on the histogram count table/data.

Step 9 Connect the midpoints of the column/bin tops with a curve. Determine whether the histogram generates a bell-shaped curve or not.

Histogram shapes and their analysis: So far we have covered the first phase, drafting a histogram. Now let’s understand the different types of histogram and their analysis.
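Steps 8 and 9 can be roughed out in plain text before drawing a proper chart. In the sketch below, the counts are hypothetical values made up for illustration, not the tally of the Step 1 data. The function names `render` and `is_single_peaked` are also assumptions, and the single-peak test is only a crude stand-in for judging a bell shape by eye.

```python
def render(edges, counts):
    """Draw one text bar per column, labelled with its interval and midpoint."""
    lines = []
    for lo, hi, c in zip(edges[:-1], edges[1:], counts):
        mid = (lo + hi) / 2  # the point the Step 9 curve passes through
        lines.append(f"{lo:5.2f}-{hi:5.2f} (mid {mid:6.3f}) | {'#' * c} {c}")
    return "\n".join(lines)

def is_single_peaked(counts):
    """True if the counts rise to one maximum and then fall."""
    peak = counts.index(max(counts))
    rising = all(a <= b for a, b in zip(counts[:peak], counts[1:peak + 1]))
    falling = all(a >= b for a, b in zip(counts[peak:], counts[peak + 1:]))
    return rising and falling

edges = [28.5 + 0.25 * i for i in range(9)]  # column limits from Step 6
counts = [2, 5, 9, 14, 15, 10, 7, 4]         # hypothetical counts, illustration only

print(render(edges, counts))
print("single peak:", is_single_peaked(counts))
```

Real process data is noisier than this example, so a strict rise-then-fall test can fail even when the overall shape still looks like a bell.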

Skewed Distribution:

There are two types of skewed distribution: right-skewed and left-skewed. If the tail extends towards the left side of the histogram, it is a negatively skewed distribution. If the tail extends towards the right side of the histogram, it is a positively skewed distribution. A skewed distribution indicates uneven quality/output of the process. If skewness exists in the process, the process capability must be verified. Skewness to the left or right indicates that the process may go out of control, and without corrective action the process output/products may have defects.

[Figure: Skewed Distribution – positive skewness, negative skewness]
  • Left Side / Negative Skew – Mean is less than the median
  • Right Side / Positive Skew – Mean is greater than the median
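The mean-versus-median comparison above can be turned into a quick check. This is a rule-of-thumb sketch, not a formal skewness test; the function name `skew_direction` and both sample lists are made up for illustration.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.0):
    """Compare mean and median to guess which side the tail is on."""
    m, med = mean(values), median(values)
    if m - med > tolerance:
        return "positive (right) skew"
    if med - m > tolerance:
        return "negative (left) skew"
    return "roughly symmetric"

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]  # mean 3.5, median 3
left_tailed = [1, 7, 8, 8, 8, 9, 9, 10]   # mean 7.5, median 8

print(skew_direction(right_tailed))  # positive (right) skew
print(skew_direction(left_tailed))   # negative (left) skew
```

The `tolerance` parameter lets small differences between mean and median count as symmetric, which is usually sensible for real measurements.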

Histogram with special causes:  

If one or two columns in the histogram show a shift/spike in frequency, it is known as a histogram with special causes. To obtain a normally distributed process, the special causes in the process must be eliminated.

[Figure: Histogram with special causes]

Bimodal Histogram:

This histogram has a unique feature: two peaks with a valley between them. A bimodal histogram is useful for comparing machine performance/output across different shifts or under slight variations in input parameters.

[Figure: Polymodal / Bimodal Histogram]

We hope you have gained knowledge from this blog. If you liked this content or would like to suggest an improvement, please comment below. We are happy to improve this blog with your valuable suggestions.
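A count sheet can be scanned for peaks with a few lines of code. The sketch below counts columns that are strictly higher than both neighbours; the function name `count_peaks` and the example counts are hypothetical, and flat-topped peaks or noisy data would need a more careful approach.

```python
def count_peaks(counts):
    """Count columns that are strictly higher than both neighbours."""
    peaks = 0
    last = len(counts) - 1
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1      # nothing to the left of column 1
        right = counts[i + 1] if i < last else -1  # nothing to the right of the last column
        if c > left and c > right:
            peaks += 1
    return peaks

two_shift_counts = [2, 6, 12, 7, 3, 8, 13, 5]  # made-up counts: two peaks, one valley
single_peak_counts = [1, 4, 9, 12, 8, 3]       # made-up counts: one peak

print(count_peaks(two_shift_counts))    # 2
print(count_peaks(single_peak_counts))  # 1
```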
