
Standard deviation and confidence intervals


Spread, also called variation or dispersion, describes how far the values in a data set are distributed around a central value. Measures of spread include the following:

  1. Range: The range of a set of data is the difference between its largest (maximum) value and its smallest (minimum) value.
  2. Quartiles: Data is grouped into four equal parts or quartiles. Each quartile includes 25% of the data. The cut-off for the second quartile is the 50th percentile, which is the median.
  3. Interquartile range: It represents the central portion of the distribution, from the 25th percentile to the 75th percentile, i.e. it includes the second and third quartiles.
  4. Standard deviation (SD, or sigma): It measures the dispersion of a data set relative to its mean. A lower SD means less variability in the data, and a higher SD means more. It is most meaningful as a summary when the data are normally or symmetrically distributed.
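
As a rough illustration of the first three measures, the sketch below computes the range, the quartile cut-offs, and the interquartile range for a small made-up data set. The function name `spread_summary` and the data are assumptions for illustration only. Quartile conventions differ between textbooks and software; this sketch simply uses the default method of Python's `statistics.quantiles`.

```python
import statistics

def spread_summary(values):
    """Return the range, quartile cut-offs, and interquartile range."""
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]  # maximum minus minimum
    # With n=4, statistics.quantiles returns the three quartile cut-offs.
    # Its default ("exclusive") method is one of several conventions.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return {
        "range": data_range,
        "q1": q1,
        "median": q2,      # the 50th percentile
        "q3": q3,
        "iqr": q3 - q1,    # 25th to 75th percentile
    }

# Made-up illustrative data, not taken from the text.
summary = spread_summary([1, 2, 3, 4, 5, 6, 7])
print(summary)
```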

Steps to calculate the SD:

Step 1. Calculate the arithmetic mean.

Step 2. Subtract the mean from each observation.

Step 3. Square each difference.

Step 4. Sum the squared differences.

Step 5. Divide the sum of the squared differences by n − 1.

Step 6. Take the square root of the value obtained in Step 5. The result is the standard deviation.
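
The six steps above can be sketched in Python as follows. The function name `sample_sd` and the data are assumptions for illustration; the standard library's `statistics.stdev` uses the same n − 1 denominator and can serve as a cross-check.

```python
import math
import statistics

def sample_sd(observations):
    """Sample standard deviation, following the six steps in order."""
    n = len(observations)
    mean = sum(observations) / n                    # Step 1: arithmetic mean
    differences = [x - mean for x in observations]  # Step 2: subtract the mean
    squared = [d ** 2 for d in differences]         # Step 3: square each difference
    total = sum(squared)                            # Step 4: sum the squares
    variance = total / (n - 1)                      # Step 5: divide by n - 1
    return math.sqrt(variance)                      # Step 6: square root

# Made-up illustrative data, not taken from the text.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sample_sd(data)
print(round(sd, 3), round(statistics.stdev(data), 3))
```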

Figure: Bell curve and distributions

A bell-shaped curve with the standard deviations evenly spaced along the x-axis. 68.3% of the data fall within ±1 standard deviation of the mean, 95.5% within ±2 standard deviations, and 99.7% within ±3 standard deviations.

Areas included in normal distribution

  • ±1 SD includes 68.3%
  • ±1.96 SD includes 95.0%
  • ±2 SD includes 95.5%
  • ±3 SD includes 99.7%
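
These percentages can be checked numerically. For a normal distribution, the fraction of values within ±k SD of the mean equals erf(k / √2); the helper name `area_within` below is an assumption for illustration.

```python
import math

def area_within(k):
    """Fraction of a normal distribution lying within ±k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"±{k} SD: {area_within(k):.4f}")
```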

**Standard error of the mean (SEM)**: It measures how far the sample mean is likely to be from the true population mean. For any sample with more than one observation, the SEM is smaller than the SD.

SEM = SD / √n, where n is the number of observations.

SEM is used in calculating confidence intervals around the arithmetic mean.
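
A minimal sketch of the SEM calculation, assuming the sample SD (n − 1 denominator) is the SD intended; the function name `standard_error` and the data are illustrative only.

```python
import math
import statistics

def standard_error(observations):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    sd = statistics.stdev(observations)  # sample SD, n - 1 denominator
    return sd / math.sqrt(len(observations))

# Made-up illustrative data, not taken from the text.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sem = standard_error(data)
print(round(sem, 3), round(statistics.stdev(data), 3))
```

For this data set the SEM comes out smaller than the SD, as expected whenever n is greater than 1.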

Confidence intervals or confidence limits (CI): A confidence interval is a range of values that is likely to contain a population parameter such as the mean. It is stated with a confidence level expressed as a percentage, such as 95% or 99%.

  • The percentage expresses how confident one can be that the interval calculated from a study contains the true population mean.
  • For example, a 95% confidence interval is a range of values that you can be 95% confident contains the true mean of the population. More precisely, about 95% of intervals constructed this way from repeated samples would contain the true mean.
  • As sample size increases, the interval narrows and precision increases.
  • A narrow confidence interval indicates high precision whereas a wide confidence interval indicates low precision.
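
The effect of sample size on precision can be illustrated with a small sketch. It assumes an interval of the form mean ± Z × SEM with SEM = SD / √n and Z = 1.96 for a 95% interval; the function name `ci_width`, the SD of 10, and the sample sizes are hypothetical.

```python
import math

def ci_width(sd, n, z=1.96):
    """Full width of the interval mean ± z * SD / sqrt(n)."""
    return 2 * z * sd / math.sqrt(n)

# Hypothetical SD of 10; only the sample size changes.
widths = {n: ci_width(sd=10, n=n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(f"n = {n}: width = {width:.2f}")
```

Under these assumptions, quadrupling the sample size halves the width of the interval.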

CI is given by the formula:

CI = Mean ± Z × SEM

The CI is reported as a range with a lower and an upper limit. For a 95% CI, Z is 1.96; for a 99% CI, Z is 2.58.
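
The Z values come from the standard normal distribution: Z is the point that leaves half of the excluded probability in each tail. A minimal sketch using Python's `statistics.NormalDist` follows; the helper name `z_for_confidence` is an assumption for illustration.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value Z for a confidence level such as 0.95."""
    tail = (1 - level) / 2                 # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)  # standard normal quantile

z95 = z_for_confidence(0.95)
z99 = z_for_confidence(0.99)
print(round(z95, 2), round(z99, 2))
```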

Problem 1: Find the 95% confidence interval for a mean total cholesterol level of 206 with a standard error of the mean of 3.

Using the formula above with Z = 1.96:

Upper limit = 206 + 1.96 × 3 = 211.88

Lower limit = 206 − 1.96 × 3 = 200.12

The 95% CI is 200.12 to 211.88.

In other words, the best estimate of the true population mean total cholesterol from the given data is 206, but the true mean could plausibly lie anywhere between 200.12 and 211.88.
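
The arithmetic in Problem 1 can be reproduced with a short sketch. The function name `confidence_interval` is an assumption for illustration; the inputs are the mean and SEM given in the problem.

```python
def confidence_interval(mean, sem, z=1.96):
    """Return (lower, upper) limits of mean ± z * SEM."""
    margin = z * sem
    return mean - margin, mean + margin

lower, upper = confidence_interval(mean=206, sem=3)
print(f"{lower:.2f} to {upper:.2f}")  # prints "200.12 to 211.88"
```

Passing `z=2.58` instead would give the corresponding 99% interval, which is wider.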

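If the CIs of two groups do not overlap, a statistically significant difference exists between them. If the CIs of two groups overlap, the difference is generally treated as not statistically significant, although overlapping intervals alone do not rule out a significant difference and a formal test is needed to confirm.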
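
The overlap rule of thumb above can be sketched as a simple comparison of interval limits. The function name `intervals_overlap` and the second and third intervals are hypothetical; the first interval is the result from Problem 1.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) confidence intervals share any values."""
    low_a, high_a = ci_a
    low_b, high_b = ci_b
    return low_a <= high_b and low_b <= high_a

group_a = (200.12, 211.88)  # 95% CI from Problem 1
group_b = (215.00, 225.00)  # hypothetical group, no overlap with group_a
group_c = (210.00, 220.00)  # hypothetical group, overlaps group_a

print(intervals_overlap(group_a, group_b))  # False
print(intervals_overlap(group_a, group_c))  # True
```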


All rights reserved ©2016 - 2025 Achievable, Inc.