Introduction to Statistics for Data Science

2. Grouping and Displaying Data to Convey Meaning (Tables and Graphs)

2.3. Arranging and Constructing a Frequency Distribution


A data array is one of the simplest ways to present data: it arranges the values in ascending or descending order. See the tables below for an example of data arrays.
Now, data arrays offer several advantages over raw data:
  1. One can quickly notice the lowest and highest values in the data
  2. The data can be divided into sections easily
  3. Values appearing more than once can be identified
  4. The distance between successive values in the data can be measured
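
As a small illustration of these four points, the sketch below builds a data array in Python. The sample values and variable names are made up for this example and do not come from the text.

```python
from collections import Counter

# Hypothetical raw observations (illustrative values only)
raw_data = [23, 17, 31, 17, 28, 45, 23, 19, 38, 23]

# A data array: the same values arranged in ascending order
data_array = sorted(raw_data)

# 1. The lowest and highest values sit at the two ends of the array
lowest, highest = data_array[0], data_array[-1]

# 2. The array is easy to divide into sections, e.g. a lower and an upper half
mid = len(data_array) // 2
lower_half, upper_half = data_array[:mid], data_array[mid:]

# 3. Values that appear more than once, with how often they appear
counts = Counter(data_array)
repeated = {value: n for value, n in counts.items() if n > 1}

# 4. Distance between each value and the one that follows it
gaps = [b - a for a, b in zip(data_array, data_array[1:])]

print("Data array:", data_array)
print("Lowest:", lowest, "Highest:", highest)
print("Repeated values:", repeated)
print("Gaps between successive values:", gaps)
```

Each of these steps relies on the values already being in order, which is why the same questions are harder to answer from the raw data.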
Despite these advantages, a data array is sometimes not helpful. Because it lists every observation, it is cumbersome to display when the data set is large. To draw real meaning from the data, we need to compress the information while still being able to use it for interpretation and decision making. This can be done using a frequency distribution.

Do We Have a Better Way to Arrange Data? A Frequency Distribution

One way to compress the data is to use a frequency distribution, also called a frequency table. Arranging data in an array is different from arranging it in a frequency table. Let's take an example to understand this.

A frequency distribution is a table that organizes data into classes, i.e., into groups of values describing one characteristic of the data.
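
To make this definition concrete, here is a minimal Python sketch that counts how many observations fall into each class. The sample values, the class boundaries, and the function name `frequency_distribution` are assumptions chosen for illustration; each class is treated as including its lower bound and excluding its upper bound.

```python
def frequency_distribution(values, classes):
    """Count how many values fall in each class [lower, upper)."""
    table = {}
    for lower, upper in classes:
        # A value belongs to a class if lower <= value < upper
        table[(lower, upper)] = sum(1 for v in values if lower <= v < upper)
    return table


# Hypothetical observations and hand-picked classes (illustrative only)
values = [17, 17, 19, 23, 23, 23, 28, 31, 38, 45]
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]

table = frequency_distribution(values, classes)

# Print the frequency table: one row per class
for (lower, upper), frequency in table.items():
    print(f"{lower} to under {upper}: {frequency}")
```

Ten individual observations are compressed into four rows, at the cost of no longer seeing the exact values inside each class.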

Because the class intervals must be of equal size, the number of classes determines the width of each class. To find the width of the intervals, we can use the equation below: