Measures of Central Tendency: Mean

Definition of Central Tendency

Central tendency is a statistical measure that identifies a single value representing the center or typical value of a data set. It aims to provide a concise summary of the entire distribution by locating the point around which most values cluster. The three main measures of central tendency are the mean, median, and mode.

Of these three concepts, this article covers the mean. The other two are explored in separate articles dedicated to each.

The Mean

The mean is the most commonly used measure of central tendency. The term “mean” typically refers to the arithmetic mean; the other two classical types are the geometric mean and the harmonic mean. You have probably used the phrase “on average” many times. Whenever you use it, you are summarizing a set of observations by its average, or mean.

Let us explore the concept of mean by working out some examples, but before that, we will see all the formulas used for calculating the mean from a given dataset.

Formulas:
  1. Population Mean ( \mu )
    Used when you have data for every member of a group (the entire population).
     \mu = \frac{\sum_{i=1}^N {x_i}}{N}
    Where,
    μ (mu) = population mean
    N = total number of values in the population
    xᵢ = each individual value
    Σ (sigma) = sum of all values
  2. Sample Mean (x̄)
    Used when you have data from only a subset (sample) of a larger population. This is the most common formula in practice.
     \bar{x} = \frac{\sum_{i=1}^n {x_i}}{n}
    Where,
    x̄ (x-bar) = sample mean
    n = total number of values in the sample
    xᵢ = each individual value
  3. Weighted Mean
    Used when some values in the data set carry more importance (weight) than others.
     \bar{x_w} = \frac{\sum_{i=1}^n {x_i}\times{w_i}}{\sum_{i=1}^n {w_i}}
    Where,
    wᵢ = weight assigned to each value xᵢ
    xᵢ = each value
    n = number of values
    For a continuous distribution, the sums become integrals, as shown later in this article.
  4. Combined Mean (Pooled Mean)
    Used to find the overall mean when you have the means of two or more separate groups.
     \overline{x_{combined}} = \frac{({n_1}\cdot \bar{x_1})+({n_2}\cdot \bar{x_2})+\cdots+({n_k}\cdot \bar{x_k})}{{n_1}+{n_2}+\cdots+{n_k}}
    Where,
    n₁, n₂, … = sizes of each group
    x̄₁, x̄₂, … = means of each group
  5. Mean for Grouped Data (Frequency Distribution)
    Used when data is organized into class intervals (e.g., 0–10, 10–20) rather than raw values.
     \bar{x} = \frac{\sum_{i=1}^k {m_i}\times{f_i}}{\sum_{i=1}^k {f_i}}
    Where,
    k = number of class intervals
    fᵢ = frequency of each class interval
    mᵢ = midpoint of each class interval, calculated as (lower limit + upper limit) / 2

Now, let’s work through an example for each formula.

Practice Numericals:
  1. Suppose in a classroom only 5 students are present and their heights are as follows (in cm):
    170, 160, 150, 165, 175. Find the mean height of the students.
    Solution:- For a class, this is the entire population data. So, we will calculate the population mean height for these values:
    $$ \mu = \frac{170+160+150+165+175}{5} \\[1.5 cm] \newline \mu = \frac{820}{5} = 164 \space cm $$
  2. Take the first three data points from the above list. Find the mean height of these three students.
    Solution:- As per instructions, 170, 160, and 150 are the data points, which form a sample.
    $$ \bar{x} = \frac{170+160+150}{3} \\[1.5 cm] \newline \bar{x} = \frac{480}{3} = 160 \space cm $$
  3. A student’s final grade is calculated as follows:

    | Component  | Score (x) | Weight (w) |
    |------------|-----------|------------|
    | Homework   | 85        | 30% (0.3)  |
    | Quizzes    | 90        | 20% (0.2)  |
    | Midterm    | 80        | 20% (0.2)  |
    | Final Exam | 92        | 30% (0.3)  |

    Find the student’s weighted average score.
    Solution:- $$ \bar{x_w} = \frac{(85\times{0.3})+(90\times{0.2})+(80\times{0.2})+(92\times{0.3})} {0.3+0.2+0.2+0.3} \\[1.5 cm] \newline \bar{x_w} = \frac{25.5+18+16+27.6}{1.0} = \frac{87.1}{1.0} = 87.1\% $$

  4. A teacher has two sections of the same course. She wants the overall mean score for all students combined.
| Class | Number of students | Mean score ( \bar{x} ) |
|-------|--------------------|------------------------|
| A     | 30                 | 78                     |
| B     | 20                 | 85                     |

Solution:- $$ \overline{x_{combined}} = \frac{(30\times78)+(20\times85)} {30+20} = \frac{2340+1700}{50} = \frac{4040}{50} = 80.8 $$
The pooled mean is just a weighted mean where the weights are the group sizes.
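
The first four problems can be checked with a few lines of Python. This is a minimal sketch rather than a definitive implementation: `weighted_mean` is a helper name introduced here for illustration, and `mean` comes from Python's standard `statistics` module.

```python
from statistics import mean


def weighted_mean(values, weights):
    """Return sum(x_i * w_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


heights = [170, 160, 150, 165, 175]

# Problem 1: population mean of all five heights
population_mean = mean(heights)

# Problem 2: sample mean of the first three heights
sample_mean = mean(heights[:3])

# Problem 3: weighted mean of the component scores
final_grade = weighted_mean([85, 90, 80, 92], [0.3, 0.2, 0.2, 0.3])

# Problem 4: pooled mean, i.e. a weighted mean with group sizes as weights
combined_mean = weighted_mean([78, 85], [30, 20])

print(population_mean, sample_mean, final_grade, combined_mean)
```

Reusing `weighted_mean` for problem 4 mirrors the remark above: the group means play the role of the values and the group sizes play the role of the weights.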

5. A survey asked 50 students how many minutes they spent on social media per day. The results are grouped into class intervals.

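| Time on social media (minutes) | Frequency (no. of students) |
|--------------------------------|-----------------------------|
| 0 to 9                         | 5                           |
| 10 to 19                       | 12                          |
| 20 to 29                       | 18                          |
| 30 to 39                       | 10                          |
| 40 to 49                       | 5                           |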

Find the mean time spent on social media.
Solution:- First, find the midpoint of each class interval:
$$ m_1 = \frac{0+9}{2} = 4.5 $$
$$ m_2 = \frac{10+19}{2} = 14.5 $$
$$ m_3 = \frac{20+29}{2} = 24.5 $$
$$ m_4 = \frac{30+39}{2} = 34.5 $$
$$ m_5 = \frac{40+49}{2} = 44.5 $$
Now let’s find the mean value:
$$ \overline{x} = \frac{(4.5\times5)+(14.5\times{12})+(24.5\times{18})+(34.5\times10)+(44.5\times5)}{5+12+18+10+5} \\[1.5 cm] \newline \overline{x} = \frac{22.5 + 174 + 441 + 345 + 222.5}{50} = \frac{1205}{50} = 24.10 \space min $$
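
The grouped-data calculation can be sketched in Python as follows. The function name `grouped_mean` and the representation of class intervals as `(lower, upper)` tuples are assumptions made for this illustration.

```python
def grouped_mean(intervals, frequencies):
    """Estimate the mean of grouped data from class midpoints and frequencies."""
    if len(intervals) != len(frequencies):
        raise ValueError("intervals and frequencies must have the same length")
    # Midpoint of each class: (lower limit + upper limit) / 2
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    total_frequency = sum(frequencies)
    return sum(m * f for m, f in zip(midpoints, frequencies)) / total_frequency


intervals = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [5, 12, 18, 10, 5]

mean_minutes = grouped_mean(intervals, frequencies)
print(mean_minutes)
```

Because every value in a class is replaced by the class midpoint, the result is an estimate of the mean of the raw data, not its exact value.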

6. For continuous data, summation is replaced by integration. The mean value of a function f(x) over the interval [a, b] is:
$$ \bar{f} = \frac{1}{b-a}\int_{a}^{b} f(x)\,dx $$
For example, the mean value of exp(x) over the range 2 to 10 is calculated as follows:
$$ \frac{1}{10-2}\int_{2}^{10} {e^x}\,dx = \frac{e^{10} - e^2}{8} \approx 2752.38 $$
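
When no closed-form integral is available, the same mean value can be approximated numerically. The sketch below uses composite Simpson's rule, which is a technique chosen for this illustration and not something the article prescribes; `mean_value` and its parameter `n` (the number of subintervals) are hypothetical names.

```python
import math


def mean_value(f, a, b, n=1000):
    """Approximate (1 / (b - a)) * integral of f over [a, b] with Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(a + i * h)
    integral = total * h / 3
    return integral / (b - a)


numeric = mean_value(math.exp, 2, 10)
exact = (math.exp(10) - math.exp(2)) / 8  # closed form for exp(x) on [2, 10]

print(round(numeric, 2), round(exact, 2))
```

For exp(x) on the range 2 to 10, the numerical estimate and the closed-form value should agree to well beyond two decimal places.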

7. Geometric Mean
The geometric mean of n positive numbers is the n-th root of their product:
$$ G.M. = \left(\prod_{i=1}^{n} {x_i}\right)^{\frac{1}{n}} = \sqrt[n]{{x_1}\times{x_2}\times{x_3}\times\cdots\times{x_n}} $$
Reason: The formula uses multiplication and roots (instead of addition and division like the arithmetic mean) because the geometric mean is designed for quantities that are multiplied together or grow exponentially, not added together. So, the geometric mean is a “multiplicative average” for processes that are multiplicative (like growth, scaling, or compounding).
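
As a small illustration, the geometric mean can be computed directly from the formula or with `statistics.geometric_mean` from the Python standard library (available from Python 3.8). The growth factors below are hypothetical values invented for this sketch, and `geometric_mean_direct` is a helper name introduced here.

```python
import math
from statistics import geometric_mean


def geometric_mean_direct(values):
    """Return the n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.prod(values) ** (1 / len(values))


# Hypothetical yearly growth factors: +10%, +20%, -10%
growth_factors = [1.10, 1.20, 0.90]

gm_direct = geometric_mean_direct(growth_factors)
gm_stdlib = geometric_mean(growth_factors)

print(gm_direct, gm_stdlib)
```

The result is the single constant growth factor that, applied once per year, would give the same overall growth as the three individual factors multiplied together.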

8. Harmonic Mean
We calculate the harmonic mean using the following formulae for various cases:
a. Grouped data:
$$ H = \frac{\sum{f_i}}{\sum{\frac{f_i}{x_i}}} $$
b. Ungrouped data:
$$ H = \frac{n}{\sum{\frac{1}{x_i}}} $$
c. Continuous data:
$$ H = \frac{b-a}{\int_{a}^{b} \frac{1}{f(x)} dx} $$
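
The ungrouped and grouped cases can be sketched in Python as below. `statistics.harmonic_mean` is part of the standard library; `grouped_harmonic_mean` is a helper written for this illustration, and the speed values are hypothetical.

```python
from statistics import harmonic_mean


def grouped_harmonic_mean(values, frequencies):
    """Return sum(f_i) / sum(f_i / x_i) for positive values x_i."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    if any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs positive values")
    return sum(frequencies) / sum(f / x for x, f in zip(values, frequencies))


# Hypothetical speeds (km/h) over two legs of equal distance
speeds = [40, 60]

# Ungrouped case: n / sum(1 / x_i)
h_ungrouped = harmonic_mean(speeds)  # 2 / (1/40 + 1/60) = 48

# Grouped case: the value 40 occurs twice and 60 occurs once
h_grouped = grouped_harmonic_mean([40, 60], [2, 1])  # 3 / (2/40 + 1/60) = 45

print(h_ungrouped, h_grouped)
```

The harmonic mean suits averages of rates such as speed over equal distances, where the arithmetic mean of 40 and 60 (which is 50) would overstate the average speed for the trip.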

These are the most common methods for finding the mean. Many other types of mean exist as well; the main families are:
Basic: Arithmetic, Weighted, Pooled, Grouped, Continuous
Pythagorean means: Arithmetic, geometric, and Harmonic
Robust means: Trimmed, Winsorized
Generalized means: Power, Kolmogorov, Lehmer, Chisini
Specialized means: Midrange, Contra-harmonic, Heronian, Stolarsky

For everyday statistics, the arithmetic, geometric, harmonic, weighted, and trimmed means are the most practical. The others are primarily used in advanced mathematics, engineering, or image processing.

I will post about them with worked-out examples in the future. Until then, practice and learn these methods by working out the problems on paper and applying the same logic in an Excel sheet. Stay curious, and keep learning.

