Class 11 Statistics Notes Chapter 7 (Correlation) – Statistics For Economics Book

These are detailed notes, with MCQs, on Chapter 7: Correlation from your Statistics for Economics textbook. Correlation is a fundamental concept, not just for your exams but for understanding relationships between economic variables in the real world. Pay close attention, as questions from this chapter frequently appear in government exams requiring statistical knowledge.

Chapter 7: Correlation - Detailed Notes

1. Introduction and Meaning

  • What is Correlation? Correlation analysis is a statistical tool used to measure the relationship or association between two or more variables. In simpler terms, it tells us if and how two variables move together.
  • Focus in this Chapter: We primarily focus on bivariate correlation, meaning the relationship between two variables (like price and demand, or income and expenditure).
  • Key Question: Does a change in one variable lead to a change in the other variable? If yes, in what direction and how strongly?

2. Types of Correlation

Correlation can be classified based on direction and linearity:

  • (a) Based on Direction:

    • Positive Correlation: When two variables move in the same direction. If one increases, the other tends to increase; if one decreases, the other tends to decrease.
      • Example: Generally, income and consumption expenditure (higher income often leads to higher spending). Rainfall and crop yield (up to a point).
    • Negative Correlation: When two variables move in opposite directions. If one increases, the other tends to decrease, and vice versa.
      • Example: Price of a commodity and its quantity demanded (higher price usually leads to lower demand). Temperature and sale of woolen clothes.
    • Zero Correlation (or No Correlation): When there is no discernible relationship between the two variables. A change in one variable does not seem to influence the other.
      • Example: Shoe size and intelligence level. Height of students and marks obtained in economics.
  • (b) Based on Linearity (Though less emphasized in basic calculations, good to know):

    • Linear Correlation: When the change in one variable tends to be proportional to the change in the other, and plotting them on a graph results roughly in a straight line.
    • Non-linear (or Curvilinear) Correlation: When the change in one variable is not proportional to the change in the other, and plotting them results in a curve.

3. Degree of Correlation

Correlation doesn't just tell us the direction, but also the strength of the relationship. This is measured by the Correlation Coefficient (usually denoted by 'r').

  • Range: The value of the correlation coefficient 'r' always lies between -1 and +1 (inclusive).
    • r = +1: Perfect Positive Correlation (all points lie exactly on an upward-sloping straight line).
    • r = -1: Perfect Negative Correlation (all points lie exactly on a downward-sloping straight line).
    • r = 0: No Correlation (No linear relationship).
  • Interpreting Values between -1 and +1 (the cut-offs below are rules of thumb and vary between textbooks):
    • High Degree: Values close to +1 or -1 (e.g., ±0.75 to ±1). Indicates a strong relationship.
    • Moderate Degree: Values somewhat away from 0 but not very close to ±1 (e.g., ±0.25 to ±0.75). Indicates a noticeable but not overwhelming relationship.
    • Low Degree: Values close to 0 (e.g., 0 to ±0.25). Indicates a weak relationship.
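The interpretation bands above can be sketched as a small helper. This is only an illustration: the function name `degree_of_correlation` is hypothetical, and because the ranges listed above share their end points, the sketch assumes that exactly ±0.25 counts as moderate and exactly ±0.75 counts as high.

```python
def degree_of_correlation(r):
    """Describe r using the rule-of-thumb bands from the notes.

    Assumption: boundary values go to the stronger band
    (|r| = 0.25 -> moderate, |r| = 0.75 -> high).
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear correlation"

    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        degree = "perfect"
    elif size >= 0.75:
        degree = "high"
    elif size >= 0.25:
        degree = "moderate"
    else:
        degree = "low"
    return f"{degree} {direction}"


for value in (0.95, -0.5, 0.1, 0, -1):
    print(value, "->", degree_of_correlation(value))
```

Other textbooks draw the bands differently, so the two cut-offs are best treated as adjustable.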

4. Methods of Studying Correlation

There are several ways to study and measure correlation:

  • (a) Scatter Diagram:

    • What it is: A graphical method. Each pair of (X, Y) values is plotted as a point on a graph.
    • Interpretation: The pattern of the plotted points reveals the type and approximate strength of correlation.
      • Points clustering upwards from left to right: Positive Correlation.
      • Points clustering downwards from left to right: Negative Correlation.
      • Points widely scattered with no clear pattern: Zero Correlation.
      • Points tightly clustered around a line (imaginary): High Degree of Correlation.
      • Points loosely scattered: Low Degree of Correlation.
    • Advantage: Simple, visual, gives a quick idea of the relationship, helps identify non-linear relationships or outliers.
    • Disadvantage: Does not give a precise numerical value for the correlation strength.
  • (b) Karl Pearson's Coefficient of Correlation (Product-Moment Correlation Coefficient):

    • What it is: The most widely used quantitative method for measuring linear correlation. Denoted by 'r'.
    • Assumption: The relationship between the variables is linear. The coefficient is also sensitive to outliers.
    • Formula (Conceptual): It measures the ratio of the covariance between the two variables to the product of their standard deviations.
      • While multiple calculation formulas exist (Direct Method, Assumed Mean Method, Step-Deviation Method), understand the concept: it quantifies the linear association.
    • Properties of 'r':
      • Lies between -1 and +1.
      • It is a pure number, independent of units of measurement (e.g., correlation between height in cm and weight in kg is the same as between height in inches and weight in pounds).
      • The sign (+ or -) indicates the direction of the relationship.
      • The magnitude indicates the strength.
      • Correlation between X and Y is the same as between Y and X (rxy = ryx).
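      • It is unaffected by change of origin (adding/subtracting a constant from all values of a variable) and change of scale (multiplying/dividing all values by a positive constant). If only one of the two variables is multiplied by a negative constant, the magnitude of 'r' stays the same but its sign reverses.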
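The conceptual formula for Pearson's 'r' (covariance divided by the product of the standard deviations) can be sketched in a few lines of Python. The function name `pearson_r` and the height/weight figures are made up for illustration. The sketch uses deviations from the actual means, which corresponds to the direct method. It then checks three of the properties listed above: symmetry, independence from units, and independence from origin.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means.

    r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)); the 1/n factors of the
    covariance and the standard deviations cancel out.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical data, for illustration only.
heights_cm = [150, 160, 165, 170, 180]
weights_kg = [50, 56, 61, 66, 75]

r_metric = pearson_r(heights_cm, weights_kg)

# Change of scale: convert to inches and pounds.
heights_in = [h / 2.54 for h in heights_cm]
weights_lb = [w * 2.20462 for w in weights_kg]
r_imperial = pearson_r(heights_in, weights_lb)

# Change of origin: subtract an assumed mean from each series.
r_shifted = pearson_r([h - 165 for h in heights_cm],
                      [w - 60 for w in weights_kg])

# Multiplying one variable by a negative constant flips only the sign.
r_flipped = pearson_r([-h for h in heights_cm], weights_kg)

print(round(r_metric, 4), round(r_imperial, 4),
      round(r_shifted, 4), round(r_flipped, 4))
```

The first three printed values agree up to rounding error, and the fourth has the same magnitude with the opposite sign.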
  • (c) Spearman's Rank Correlation Coefficient:

    • What it is: A method used when data is qualitative (e.g., beauty, honesty, intelligence) and can only be ranked, or when the assumptions of Pearson's method are not met (e.g., the relationship is not linear but still consistently rising or falling, or outliers are present). It measures the correlation between the ranks assigned to the observations. Denoted by 'R' or 'ρ' (rho).
    • Procedure:
      1. Assign ranks to the values of each variable (X and Y) separately. Either convention works (lowest value = rank 1, or highest value = rank 1), as long as the same one is used for both variables.
      2. Calculate the difference between the ranks (d = Rx - Ry) for each pair.
      3. Square the differences (d²).
      4. Sum the squared differences (Σd²).
      5. Apply the formula: R = 1 - [ (6 * Σd²) / (n * (n² - 1)) ] where 'n' is the number of pairs of observations.
    • Handling Tied Ranks: If two or more items have the same value, they are assigned the average of the ranks they would have occupied. A correction factor is sometimes applied to the formula if ties are numerous, but for basic understanding, the main formula is key.
    • Interpretation: Similar to Pearson's 'r', the value lies between -1 and +1, with the sign indicating direction and magnitude indicating strength of association between the ranks.
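The five-step procedure for Spearman's rank correlation can be sketched as follows. The helper names `average_ranks` and `spearman_rank` and the judges' scores are hypothetical. The sketch ranks the highest value as 1, gives tied values the average of the ranks they would have occupied, and applies the main formula without any tie correction factor, so with ties its result is only approximate.

```python
def average_ranks(values):
    """Rank values with highest = rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman_rank(xs, ys):
    """R = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d = Rx - Ry."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    rank_x = average_ranks(xs)                                # step 1
    rank_y = average_ranks(ys)
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]           # step 2
    sum_d2 = sum(diff ** 2 for diff in d)                     # steps 3 and 4
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))              # step 5


# Hypothetical scores given by two judges to five contestants.
judge_a = [8, 6, 9, 5, 7]
judge_b = [7, 5, 9, 6, 8]

print(average_ranks(judge_a))   # [2.0, 4.0, 1.0, 5.0, 3.0]
print(average_ranks(judge_b))   # [3.0, 5.0, 1.0, 4.0, 2.0]
print(spearman_rank(judge_a, judge_b))
```

For these scores the rank differences are -1, -1, 0, 1, 1, so Σd² = 4 and R = 1 - 24/120 = 0.8, a high degree of agreement between the two judges.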

5. Significance of Correlation

  • Helps understand the degree and direction of relationship between variables crucial for economic theory (e.g., demand theory, investment theory).
  • Provides a basis for prediction (though correlation alone doesn't guarantee accurate prediction). If two variables are highly correlated, we can estimate the value of one based on the value of the other.
  • Useful for policymakers and businesses in decision-making.

6. Correlation vs. Causation - A Crucial Distinction!

  • Correlation simply indicates that two variables tend to move together. It DOES NOT necessarily mean that one variable causes the change in the other.
  • Causation implies a cause-and-effect relationship.
  • Spurious Correlation: A high correlation might exist between two variables simply by chance, or because both are influenced by a third, unobserved variable.
    • Classic Example: Ice cream sales and crime rates might be positively correlated. Does eating ice cream cause crime? No. Both are likely influenced by a third variable: hot weather.
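  • Remember: Correlation by itself is not sufficient to establish causation. It is not strictly necessary either: 'r' measures only linear association, so a cause-and-effect relationship that is non-linear (for example, Y = X² with X values spread evenly around zero) can give an 'r' close to 0.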
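The ice cream and crime example can be imitated with made-up numbers. In the sketch below, all figures are invented for illustration and are not real data. Both series are generated from a temperature series plus small fixed disturbances, and neither is computed from the other. Even so, the correlation between them comes out very high.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Invented figures, for illustration only.
temperature_c = [20, 22, 25, 27, 30, 32, 35, 38]
ice_noise = [3, -2, 1, -4, 2, -1, 4, -3]
crime_noise = [-1, 2, -2, 1, 1, -2, 2, -1]

# Each series depends on temperature, never on the other series.
ice_cream_sales = [10 * t + e for t, e in zip(temperature_c, ice_noise)]
crime_reports = [2 * t + e for t, e in zip(temperature_c, crime_noise)]

r_sales_crime = pearson_r(ice_cream_sales, crime_reports)
print(round(r_sales_crime, 3))
```

A high value here reflects only the shared dependence on temperature, which is why a correlation coefficient on its own cannot settle a question of cause and effect.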

Multiple Choice Questions (MCQs)

Here are 10 MCQs to test your understanding. Choose the best answer.

  1. Correlation analysis primarily aims to study:
    a) The difference between two variables
    b) The average of two variables
    c) The relationship or association between two variables
    d) The cause-and-effect link between two variables

  2. If the points on a scatter diagram tend to cluster in a band falling from the top-left corner to the bottom-right corner, the correlation is said to be:
    a) Positive
    b) Negative
    c) Zero
    d) Perfect Positive

  3. The value of Karl Pearson's coefficient of correlation ('r') always lies between:
    a) 0 and +1
    b) -1 and 0
    c) -1 and +1
    d) -∞ and +∞

  4. If the correlation coefficient 'r' between two variables X and Y is +0.95, it indicates:
    a) A very weak positive relationship
    b) A very strong positive relationship
    c) A very strong negative relationship
    d) No relationship

  5. Spearman's Rank Correlation coefficient is particularly useful when:
    a) The data is quantitative and normally distributed
    b) The relationship is perfectly linear
    c) The data is qualitative or has extreme outliers
    d) The number of observations is very large

  6. If the correlation coefficient between variable X (height in cm) and variable Y (weight in kg) is 0.7, what would be the correlation if height was measured in inches and weight in pounds?
    a) Lower than 0.7
    b) Higher than 0.7
    c) Exactly 0.7
    d) Cannot be determined

  7. A correlation coefficient of r = 0 between two variables indicates:
    a) A perfect relationship
    b) A strong inverse relationship
    c) Absence of a linear relationship
    d) That one variable causes the other

  8. Which method provides a visual representation of the relationship between two variables but not a precise numerical measure of correlation?
    a) Karl Pearson's Method
    b) Spearman's Rank Method
    c) Scatter Diagram Method
    d) Regression Analysis

  9. The statement "Correlation does not imply causation" means:
    a) If two variables are correlated, one must cause the other.
    b) A statistical relationship between two variables doesn't necessarily mean one is the direct cause of the other.
    c) Causation can only exist if correlation is perfect (+1 or -1).
    d) Correlation and causation are essentially the same concepts.

  10. If an increase in the price of petrol leads to a decrease in the demand for cars, the correlation between petrol price and car demand is likely:
    a) Positive
    b) Negative
    c) Zero
    d) Perfect Positive


Answer Key for MCQs:

  1. c
  2. b
  3. c
  4. b
  5. c
  6. c (Correlation coefficient is independent of change of scale)
  7. c
  8. c
  9. b
  10. b

Study these notes thoroughly. Understand the concepts, the differences between the methods, and especially the interpretation of the correlation coefficient and the distinction between correlation and causation. Good luck with your preparation!
