CHAPTER 5 - Measures of Central Tendency
Introduction:
In statistics, we often deal with large sets of data, which can be hard to understand in their raw form. To summarize this data and get a clear idea of its general trend, we use measures of central tendency. These measures give us a single value that represents the entire dataset. The three most commonly used measures of central tendency are the Arithmetic Mean, Median, and Mode.
Arithmetic Mean:
The Arithmetic Mean (or simply "mean") is the most common way to find the average of a dataset.
Definition: The mean is calculated by adding up all the values in a dataset and dividing by the number of values.
Formula:
$$\text{Mean} = \frac{\sum X}{N}$$
where $\sum X$ is the sum of all values and $N$ is the number of values.
1.1 Example of Arithmetic Mean:
If the monthly incomes of 6 families are ₹1600, ₹1500, ₹1400, ₹1525, ₹1625, and ₹1630, the mean income is:
$$\text{Mean} = \frac{1600 + 1500 + 1400 + 1525 + 1625 + 1630}{6} = \frac{9280}{6} \approx \text{₹}1546.67$$
This means that, on average, each family earns about ₹1547 per month.
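The calculation above can be checked with a few lines of Python. This is a minimal sketch, not a prescribed method; the function name `arithmetic_mean` is chosen here purely for illustration.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Monthly incomes of the 6 families from the example (in rupees).
incomes = [1600, 1500, 1400, 1525, 1625, 1630]

total = sum(incomes)                     # sum of all values
mean_income = arithmetic_mean(incomes)   # total divided by N = 6

print(total, round(mean_income, 2))      # 9280 1546.67
```

Rounded to the nearest rupee, the result is ₹1547.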
1.2 Advantages:
The mean is easy to calculate and understand.
It takes all values into account, making it a reliable measure when there are no extreme values (outliers).
1.3 Disadvantages:
The mean is sensitive to extremely high or low values: a single outlier can pull it far from the values that are typical of the dataset.
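To see how strongly one extreme value can move the mean, the sketch below reuses the six monthly incomes from the example in 1.1 and adds a seventh family earning ₹50000. That seventh value is invented here for illustration only and does not come from the original data.

```python
# Monthly incomes of the 6 families from the example in 1.1 (in rupees).
incomes = [1600, 1500, 1400, 1525, 1625, 1630]
original_mean = sum(incomes) / len(incomes)

# Hypothetical outlier: a seventh family with a very high income.
with_outlier = incomes + [50000]
new_mean = sum(with_outlier) / len(with_outlier)

print(round(original_mean, 2), round(new_mean, 2))  # 1546.67 8468.57
```

With the outlier included, the mean is higher than six of the seven incomes, so it no longer describes a typical family.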
Median:
The Median is the middle value of a dataset when the values are arranged in order. If the dataset has an odd number of values, the median is the middle one. If the dataset has an even number of values, the median is the average of the two middle values.
2.1 Example of Median:
If the weekly incomes of 10 families are ₹850, ₹700, ₹100, ₹750, ₹5000, ₹80, ₹420, ₹2500, ₹400, and ₹360, arranging the values in order gives:
₹80, ₹100, ₹360, ₹400, ₹420, ₹700, ₹750, ₹850, ₹2500, ₹5000
Since there are 10 values, the median is the average of the 5th and 6th values:
$$\text{Median} = \frac{420 + 700}{2} = \text{₹}560$$
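The same steps (sort the values, then take the middle one or the average of the two middle ones) can be sketched in Python. The function name `median` is an illustrative choice; Python's standard library offers an equivalent `statistics.median`.

```python
def median(values):
    """Return the middle value of the sorted data, or the average of
    the two middle values when the number of values is even."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two


# Weekly incomes of the 10 families from the example (in rupees).
weekly_incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]

print(median(weekly_incomes))  # 560.0
```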
2.2 Advantages:
The median is resistant to extreme values: it depends only on the middle of the ordered list, so it is a better choice for data that includes outliers.
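A short check of this property, using the weekly incomes from the example in 2.1: the largest income is replaced by ₹50000, a value invented here purely for illustration, and the median is recomputed with the standard library's `statistics.median`.

```python
import statistics

# Weekly incomes from the example in 2.1, already in ascending order.
weekly_incomes = [80, 100, 360, 400, 420, 700, 750, 850, 2500, 5000]

# Hypothetical change: make the largest income ten times larger.
inflated = weekly_incomes[:-1] + [50000]

median_before = statistics.median(weekly_incomes)
median_after = statistics.median(inflated)

print(median_before, median_after)  # 560.0 560.0
```

The two middle values (₹420 and ₹700) are the same in both lists, so the median does not move.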
2.3 Disadvantages:
The median does not take into account the actual values of all the data points, only their positions in the ordered list.
Mode:
The Mode is the value that appears most frequently in a dataset. It is useful when you want to know the most common value.
3.1 Example of Mode:
If a dataset contains the numbers 1, 2, 3, 4, 4, 5, the mode is 4, as it occurs twice, more than any other number.
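Counting occurrences by hand becomes tedious for larger datasets. One possible way to do the counting in Python is `collections.Counter` from the standard library, sketched here on the same numbers.

```python
from collections import Counter

data = [1, 2, 3, 4, 4, 5]

# Count how many times each value occurs.
counts = Counter(data)

# most_common(1) returns a list holding the single (value, frequency)
# pair with the highest frequency.
value, frequency = counts.most_common(1)[0]

print(value, frequency)  # 4 2
```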
3.2 Advantages:
The mode is easy to find and understand.
It is particularly useful for qualitative (categorical) data such as the most common shirt color, where a mean cannot be calculated, and for discrete data such as the most popular shoe size.
3.3 Disadvantages:
A dataset can have more than one mode (bimodal or multimodal), or no mode at all if no value repeats.
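The three cases (one mode, several modes, no mode) can be told apart with a small helper. This is a sketch under the convention used in this chapter, where a dataset in which no value repeats has no mode; the function name `modes` and the sample lists other than the first are made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency.

    Follows the convention that a dataset has no mode (empty list)
    when no value occurs more than once.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 3, 4, 4, 5]))  # [4]     one mode
print(modes([1, 2, 2, 3, 3, 4]))  # [2, 3]  bimodal
print(modes([1, 2, 3, 4, 5]))     # []      no mode
```

Note that some tools use a different convention: Python's `statistics.multimode`, for example, returns all values when none repeats.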
Comparison of the Three Measures:
Arithmetic Mean: Best for datasets without extreme values (outliers), because every value contributes equally to the result.
Median: Ideal for skewed datasets or when there are outliers, as it is not affected by extreme values.
Mode: Useful when the most frequent value is of interest, especially for non-numerical data.
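The comparison can be made concrete with Python's standard `statistics` module. The sketch below applies the mean and median to the weekly incomes from the example in 2.1, and the mode to a list of shirt colors; the color list is hypothetical and exists only to show the mode working on non-numerical data.

```python
import statistics

# Weekly incomes from the example in 2.1: skewed by two large values.
weekly_incomes = [850, 700, 100, 750, 5000, 80, 420, 2500, 400, 360]

mean_income = statistics.mean(weekly_incomes)      # 1116
median_income = statistics.median(weekly_incomes)  # 560

# How many families earn less than the mean?
below_mean = sum(1 for income in weekly_incomes if income < mean_income)  # 8

# Hypothetical non-numerical data: only the mode applies here.
shirt_colors = ["blue", "white", "blue", "black", "blue", "white"]
most_common_color = statistics.mode(shirt_colors)  # "blue"
```

Eight of the ten families earn less than the mean of ₹1116, while the median of ₹560 splits the families into two equal halves, which is why the median describes this skewed dataset better.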
Conclusion:
In summary, measures of central tendency help us understand large datasets by giving us a single value that represents the data. The mean is the most commonly used average but is sensitive to extreme values. The median is useful when you want to avoid the influence of outliers, while the mode tells us the most common value in the dataset. Choosing the right measure depends on the nature of the data and the question you want to answer.
Recap:
Arithmetic Mean: The sum of all values divided by the number of values.
Median: The middle value of an ordered dataset (or the average of the two middle values when the number of values is even).
Mode: The value that appears most frequently.
Each measure has its advantages and disadvantages, and the choice of which to use depends on the characteristics of the dataset.