Class 11 Mathematics Notes Chapter 15 (Statistics) – Mathematics Book
These are detailed notes, with MCQs, on Chapter 15: Statistics from your NCERT Class 11 Mathematics textbook. This chapter matters not just for your Class 11 exams; it also forms the basis for many quantitative sections in government exams. We'll focus primarily on Measures of Dispersion.
Why Study Dispersion?
Recall measures of central tendency (Mean, Median, Mode) from your earlier classes. They give us a single value that represents the center of the data. However, they don't tell us anything about how spread out the data is.
- Example: Consider two batsmen, A and B, with the following scores in 5 matches:
- Batsman A: 30, 91, 0, 64, 42 (Mean = 45.4)
- Batsman B: 53, 46, 48, 50, 53 (Mean = 50)
While Batsman B has a slightly higher average, Batsman A's scores are much more spread out (from 0 to 91), whereas Batsman B's scores are very consistent (clustered around 50). Measures of dispersion help us quantify this spread or variability.
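As a quick check of the batsmen example, here is a minimal Python sketch that computes the mean, range, and standard deviation of both sets of scores. The helper name `value_range` is an illustrative choice, and `pstdev` from the standard library is used on the assumption that the five scores are treated as the whole population (dividing by n).

```python
from statistics import mean, pstdev

# Scores from the example above.
scores_a = [30, 91, 0, 64, 42]
scores_b = [53, 46, 48, 50, 53]


def value_range(values):
    """Range = maximum observation - minimum observation."""
    return max(values) - min(values)


for name, scores in (("A", scores_a), ("B", scores_b)):
    # pstdev is the population standard deviation (divides by n).
    print(f"Batsman {name}: mean={mean(scores)}, "
          f"range={value_range(scores)}, sd={pstdev(scores):.2f}")
```

Batsman A's range is 91 against 7 for Batsman B, which puts a number on the difference in consistency that the means alone hide.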
Measures of Dispersion
Dispersion tells us the extent to which the values in a distribution differ from the average of the distribution. Key measures we study are:
- Range:
- Definition: The simplest measure of dispersion. It's the difference between the maximum and minimum observations in the data.
- Formula: Range = Maximum Value (L) - Minimum Value (S)
- Pros: Easy to calculate and understand.
- Cons: Highly affected by extreme values (outliers). It only uses two data points and ignores the distribution of other points.
- Mean Deviation (MD):
- Definition: The arithmetic mean of the absolute deviations of the observations from a measure of central tendency (either mean or median). It tells us, on average, how far the data points are from the center.
- Why absolute deviations? If we just took the deviations about the mean (xᵢ - x̄), their sum would always be zero, because the positive and negative deviations cancel. Taking the absolute value (|xᵢ - x̄|) ensures we measure the magnitude of the deviation, regardless of direction.
- Formulas:
- Mean Deviation about the Mean (MD(x̄)):
- Ungrouped Data: MD(x̄) = Σ |xᵢ - x̄| / n
- Grouped Data (Discrete/Continuous): MD(x̄) = Σ fᵢ|xᵢ - x̄| / N (where N = Σfᵢ)
- Mean Deviation about the Median (MD(M)):
- Ungrouped Data: MD(M) = Σ |xᵢ - M| / n
- Grouped Data (Discrete/Continuous): MD(M) = Σ fᵢ|xᵢ - M| / N (where N = Σfᵢ)
- Steps for Calculation (e.g., MD about Mean for Grouped Data):
- Calculate the mean (x̄) of the distribution.
- Find the deviation of each observation (xᵢ) from the mean (xᵢ - x̄). For continuous data, use mid-points of classes as xᵢ.
- Find the absolute value of these deviations: |xᵢ - x̄|.
- Multiply each absolute deviation by its corresponding frequency: fᵢ|xᵢ - x̄|.
- Sum these products: Σ fᵢ|xᵢ - x̄|.
- Divide the sum by the total frequency (N): MD(x̄) = Σ fᵢ|xᵢ - x̄| / N.
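The steps above can be sketched in Python as follows. The function name `mean_deviation_about_mean` and the small frequency table are illustrative assumptions, not part of the notes; for continuous data you would pass class mid-points as `values`.

```python
def mean_deviation_about_mean(values, freqs):
    """Mean deviation about the mean for a frequency distribution."""
    total = sum(freqs)                                  # N = Σfᵢ
    # Step 1: mean of the distribution, x̄ = Σfᵢxᵢ / N.
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    # Steps 2-5: absolute deviations, weighted by frequency, then summed.
    weighted_abs_dev = sum(f * abs(x - mean) for x, f in zip(values, freqs))
    # Step 6: divide by the total frequency.
    return weighted_abs_dev / total


# Illustrative discrete frequency distribution (assumed data).
x = [2, 5, 6, 8, 10, 12]
f = [2, 8, 10, 7, 8, 5]

md = mean_deviation_about_mean(x, f)
print(f"MD about mean = {md:.2f}")
```

For this table N = 40, x̄ = 300/40 = 7.5, and Σ fᵢ|xᵢ - x̄| = 92, so the mean deviation is 92/40 = 2.3.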
- Pros: Considers all observations. Provides a good measure of average spread from the center.
- Cons: Ignoring the signs makes it mathematically less convenient for further analysis than standard deviation. Calculation can be lengthy.
- Property: MD is minimum when calculated about the median.
- Variance (σ²) and Standard Deviation (σ):
- Definition: These are the most important and widely used measures of dispersion. Instead of taking absolute values to handle the signs of deviations, we square the deviations.
- Variance (σ²): The mean of the squared deviations from the arithmetic mean.
- Standard Deviation (σ): The positive square root of the variance. It's preferred over variance because it has the same units as the original data.
- Why square deviations? Squaring eliminates negative signs and gives more weight to larger deviations, making the measure sensitive to points far from the mean. It also has desirable mathematical properties.
- Formulas:
- Ungrouped Data:
- Variance (σ²) = Σ (xᵢ - x̄)² / n
- Standard Deviation (σ) = √[ Σ (xᵢ - x̄)² / n ]
- Grouped Data (Discrete/Continuous Frequency Distribution):
- Variance (σ²) = Σ fᵢ(xᵢ - x̄)² / N (where N = Σfᵢ)
- Standard Deviation (σ) = √[ Σ fᵢ(xᵢ - x̄)² / N ]
- Shortcut Method / Step-Deviation Method (Often faster for calculations, especially with large numbers or continuous data):
- Variance (σ²) = h² [ Σ fᵢdᵢ² / N - (Σ fᵢdᵢ / N)² ] (Using step-deviation dᵢ = (xᵢ - A)/h)
- Variance (σ²) = [ Σ fᵢxᵢ² / N - (Σ fᵢxᵢ / N)² ] = [ Σ fᵢxᵢ² / N - (x̄)² ] (Direct shortcut)
- Standard Deviation (σ) is the square root of the respective variance formula.
- Note: 'A' is the assumed mean, 'h' is the class width (for continuous data, usually). For discrete data, you can use the direct shortcut without 'h'.
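To see that the three grouped-data formulas agree, here is a minimal Python sketch that evaluates each one on the same table. The function names, the class mid-points, the frequencies, and the choices A = 65 and h = 10 are all illustrative assumptions.

```python
from math import isclose, sqrt


def variance_direct(x, f):
    """σ² = Σ fᵢ(xᵢ - x̄)² / N."""
    n = sum(f)
    mean = sum(fi * xi for xi, fi in zip(x, f)) / n
    return sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f)) / n


def variance_shortcut(x, f):
    """σ² = Σ fᵢxᵢ² / N - (x̄)²."""
    n = sum(f)
    mean = sum(fi * xi for xi, fi in zip(x, f)) / n
    return sum(fi * xi ** 2 for xi, fi in zip(x, f)) / n - mean ** 2


def variance_step_deviation(x, f, a, h):
    """σ² = h² [Σ fᵢdᵢ² / N - (Σ fᵢdᵢ / N)²], with dᵢ = (xᵢ - A) / h."""
    n = sum(f)
    d = [(xi - a) / h for xi in x]
    mean_d = sum(fi * di for di, fi in zip(d, f)) / n
    mean_d2 = sum(fi * di ** 2 for di, fi in zip(d, f)) / n
    return h ** 2 * (mean_d2 - mean_d ** 2)


# Assumed continuous distribution: mid-points of classes 30-40, ..., 90-100.
midpoints = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]

v_direct = variance_direct(midpoints, freqs)
v_short = variance_shortcut(midpoints, freqs)
v_step = variance_step_deviation(midpoints, freqs, a=65, h=10)

print(v_direct, v_short, v_step)
print("SD =", sqrt(v_direct))
```

For this table N = 50, x̄ = 62, and Σ fᵢ(xᵢ - x̄)² = 10050, so all three routes give a variance of 201, up to floating-point rounding. The step-deviation route keeps the intermediate numbers small, which is why it is usually faster by hand.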
- Pros: Mathematically tractable, widely used in statistical inference, considers all data points, reflects the spread around the mean effectively.
- Cons: Sensitive to outliers (due to squaring). Variance is in squared units, making interpretation less direct than SD.
- Coefficient of Variation (CV):
- Need: Measures like Range, MD, and SD are absolute measures of dispersion. They depend on the units of measurement and the magnitude of the data. We cannot directly compare the variability of two datasets with different units (e.g., heights in cm vs. weights in kg) or vastly different means (e.g., salaries of clerks vs. salaries of managers) using absolute measures.
- Definition: CV is a relative measure of dispersion. It expresses the standard deviation as a percentage of the mean.
- Formula: CV = (σ / x̄) * 100 (where x̄ ≠ 0)
- Interpretation:
- A lower CV indicates greater consistency or less variability in the data.
- A higher CV indicates less consistency or greater variability in the data.
- Use: Used to compare the variability, consistency, or stability of two or more series. The series with the lower CV is considered more consistent.
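A small Python sketch of the comparison, assuming two made-up series measured in different units (heights in cm and weights in kg). The function name `coefficient_of_variation` and all the figures are hypothetical.

```python
def coefficient_of_variation(sd, mean):
    """CV = (σ / x̄) * 100, defined only when the mean is non-zero."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


# Hypothetical summary statistics for two series in different units.
cv_heights = coefficient_of_variation(sd=8.5, mean=170)   # cm
cv_weights = coefficient_of_variation(sd=6, mean=60)      # kg

print(f"CV of heights: {cv_heights:.1f}%")
print(f"CV of weights: {cv_weights:.1f}%")

# The series with the lower CV is the more consistent one.
more_consistent = "heights" if cv_heights < cv_weights else "weights"
print("More consistent series:", more_consistent)
```

The standard deviations (8.5 cm and 6 kg) cannot be compared directly because the units differ, but the CVs (about 5% and 10%) can, so the heights are the more consistent series here.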
Key Formulas Summary:
Measure | Ungrouped Data Formula | Grouped Data Formula | Notes |
---|---|---|---|
Range | Max - Min | Max - Min | Simplest, affected by outliers. |
MD about Mean (MD(x̄)) | Σ \|xᵢ - x̄\| / n | Σ fᵢ\|xᵢ - x̄\| / N | Average absolute deviation from mean. |
MD about Median (MD(M)) | Σ \|xᵢ - M\| / n | Σ fᵢ\|xᵢ - M\| / N | Average absolute deviation from median. |
Variance (σ²) | Σ (xᵢ - x̄)² / n | Σ fᵢ(xᵢ - x̄)² / N OR [Σ fᵢxᵢ²/N] - (x̄)² | Average squared deviation from mean. |
Standard Deviation (σ) | √[ Σ (xᵢ - x̄)² / n ] | √[ Σ fᵢ(xᵢ - x̄)² / N ] OR √[ [Σ fᵢxᵢ²/N] - (x̄)² ] | Root mean square deviation. Same units. |
Coefficient of Var (CV) | (σ / x̄) * 100 | (σ / x̄) * 100 | Relative measure, for comparison. Unitless. |
Important Points for Government Exams:
- Understand the concept behind each measure, not just the formula.
- Be comfortable calculating Mean, Median first, as they are often needed for MD, Variance, SD.
- Practice shortcut methods for Variance/SD calculation, especially for grouped data, as they save time.
- Know when to use CV (comparison of variability/consistency).
- Remember: Lower CV means more consistency.
- Be aware of the properties: SD cannot be negative. Variance is the square of SD.
- Effect of change of origin and scale:
- Range, MD, SD, Variance are independent of change of origin (adding/subtracting a constant to all values doesn't change them).
- Range, MD, SD change by scale (multiplying/dividing all values by a constant 'b' multiplies/divides these measures by |b|). Variance changes by b².
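The effect of a change of origin and scale can be checked numerically. This is a minimal sketch on an assumed five-value dataset, with an arbitrary shift a = 7 and a negative scale factor b = -3 to show why the absolute value |b| matters. The helper names are illustrative.

```python
from math import isclose
from statistics import mean, pstdev, pvariance


def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


def mean_deviation(values):
    """Mean deviation about the mean for ungrouped data."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


data = [3, 7, 8, 10, 12]          # assumed sample data
a, b = 7, -3                      # arbitrary shift and scale factor

shifted = [x + a for x in data]   # change of origin
scaled = [b * x for x in data]    # change of scale

for label, measure in (("Range", value_range), ("MD", mean_deviation),
                       ("SD", pstdev), ("Variance", pvariance)):
    print(f"{label}: original={measure(data):.3f}, "
          f"shifted={measure(shifted):.3f}, scaled={measure(scaled):.3f}")
```

The shifted column matches the original for every measure. The scaled column is |b| = 3 times the original for Range, MD, and SD, and b² = 9 times the original for Variance.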
Now, let's test your understanding with some multiple-choice questions.
Multiple Choice Questions (MCQs)
- Which measure of dispersion is the quickest to compute but is affected most by extreme values?
(A) Standard Deviation
(B) Mean Deviation
(C) Range
(D) Coefficient of Variation
- The standard deviation of the data set {2, 4, 6, 8, 10} is:
(A) 2
(B) √8
(C) 8
(D) √10
- If the variance of a dataset is 64, what is the standard deviation?
(A) 4
(B) 8
(C) 16
(D) 4096
- The mean deviation is minimum when deviations are taken from the:
(A) Mean
(B) Median
(C) Mode
(D) Range
- Which measure is used to compare the consistency of two different datasets?
(A) Variance
(B) Standard Deviation
(C) Mean Deviation
(D) Coefficient of Variation
- If each observation in a dataset is multiplied by 5, the standard deviation of the new dataset will be:
(A) The same as the original standard deviation
(B) 5 times the original standard deviation
(C) 25 times the original standard deviation
(D) Increased by 5
- For a frequency distribution, the mean is 50 and the standard deviation is 10. What is the Coefficient of Variation?
(A) 10%
(B) 20%
(C) 5%
(D) 50%
- The sum of squares of deviations of 10 observations from their mean 50 is 250. What is the variance?
(A) 5
(B) 25
(C) 50
(D) 2.5
- Consider the scores of two cricketers A and B:
  - Cricketer A: Mean = 60, SD = 12
  - Cricketer B: Mean = 50, SD = 8
Who is the more consistent player?
(A) Cricketer A
(B) Cricketer B
(C) Both are equally consistent
(D) Cannot be determined
- What is the variance of the first 5 natural numbers (1, 2, 3, 4, 5)?
(A) √2
(B) 2
(C) 3
(D) 1.5
Answer Key:
- (C) Range
- (B) √8 (Mean=6. Deviations: -4, -2, 0, 2, 4. Squared deviations: 16, 4, 0, 4, 16. Sum=40. Variance=40/5=8. SD=√8)
- (B) 8 (SD = √Variance = √64 = 8)
- (B) Median
- (D) Coefficient of Variation
- (B) 5 times the original standard deviation (SD changes by scale |b|)
- (B) 20% (CV = (σ / x̄) * 100 = (10 / 50) * 100 = 20%)
- (B) 25 (Variance = Σ (xᵢ - x̄)² / n = 250 / 10 = 25)
- (B) Cricketer B (CV(A) = (12/60)*100 = 20%. CV(B) = (8/50)*100 = 16%. Lower CV means more consistent, so B is more consistent.)
- (B) 2 (Mean = (1+2+3+4+5)/5 = 3. Deviations: -2, -1, 0, 1, 2. Squared deviations: 4, 1, 0, 1, 4. Sum=10. Variance = 10/5 = 2)
Make sure you revise these concepts thoroughly. Practice calculations for different types of data (ungrouped, discrete grouped, continuous grouped). Understanding the why and when for each measure is crucial for competitive exams. Good luck!