2.1 Understanding central tendencies

Achievable Praxis Core: Math (5733)

2. Data analysis, statistics, and probability

Our Praxis Core: Math course is currently in development and is a work-in-progress.

Measures of central tendency describe a typical value of a data set, while measures of spread describe how much the data varies. These summaries help you compare distributions and understand data at a glance. Here are the key statistical measures you need to know for data analysis.

Definitions

Measures of center

Mean (arithmetic average)

Formula: $\bar{x} = \frac{\sum_{i = 1}^{n} x_{i}}{n}$ . The balance point of the data. Sensitive to extreme values.

Median (middle value)

Order data from smallest to largest.

If $n$ is odd, the median is the $\frac{n + 1}{2}$ th value.
If $n$ is even, the median is the average of the $\frac{n}{2}$ th and $(\frac{n}{2} + 1)$ th values. Resistant to extreme values.

Mode (most frequent value)

If every value occurs exactly once, the data set has no mode. Otherwise, the mode is the value (or values, if tied) that occurs most often. A data set can be unimodal, bimodal, or multimodal.
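The three definitions above can be sketched in Python. This is a minimal illustration, not part of the Praxis material: `mean` and `median` come from the standard `statistics` module, while `modes` is a hypothetical helper written here to follow this section's convention that a data set with no repeated values has no mode. The sample data are made up, and the helper assumes a non-empty list.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return the most frequent value(s), or [] if every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value is unique, so there is no mode
    # Keep every value tied for the highest count (handles bimodal/multimodal data).
    return sorted(v for v, c in counts.items() if c == top)


data = [2, 3, 3, 6, 8, 8]  # made-up example data, already ordered

center_mean = mean(data)      # (2 + 3 + 3 + 6 + 8 + 8) / 6
center_median = median(data)  # n is even: average of the 3rd and 4th values
center_modes = modes(data)    # 3 and 8 each occur twice

print(center_mean, center_median, center_modes)
```

Here the mean is 5, the median is 4.5, and the data set is bimodal with modes 3 and 8.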

Measures of spread

Range: The difference between the highest and lowest values in a data set. It gives a quick sense of how spread out the data are.
Interquartile range (IQR): The spread of the middle 50% of the data, found by subtracting the first quartile ( $Q_{1}$ ) from the third quartile ( $Q_{3}$ ): $IQR = Q_{3} - Q_{1}$ . The IQR is useful for identifying outliers. A common rule: a value is an outlier if it falls below $Q_{1} - 1.5 \times IQR$ or above $Q_{3} + 1.5 \times IQR$ .
Variance: The average squared deviation from the mean; mainly an intermediate step toward computing standard deviation on the Praxis.
Standard deviation: The square root of the variance: $σ = \sqrt{σ^{2}}$ . A measure of the average distance of each data value from the mean. A smaller standard deviation means the values are closer to the mean, while a larger one means the data are more spread out.
Extreme values (outliers): Extremely large or extremely small numbers that are far away from the majority of the data.
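The $1.5 \times IQR$ outlier rule from the IQR definition can be written as a short check. This is a sketch under stated assumptions: the function names `outlier_fences` and `is_outlier` are hypothetical, and the quartile values $Q_{1} = 5$ and $Q_{3} = 9$ are supplied by hand for illustration.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs given by the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x falls below the lower fence or above the upper fence."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high


# Illustrative quartiles: Q1 = 5 and Q3 = 9, so IQR = 4.
low, high = outlier_fences(5, 9)
print(low, high)              # fences at 5 - 6 and 9 + 6
print(is_outlier(100, 5, 9))  # far above the upper fence
print(is_outlier(10, 5, 9))   # inside the fences
```

With these quartiles the fences are $-1$ and $15$, so 100 is flagged as an outlier and 10 is not.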

The two examples below compare a data set without an extreme value to one with an extreme value, showing how each measure responds.

Example: Effect of an extreme value on center

Dataset A: $4, 6, 7, 8, 10$

Ordered: $4, 6, 7, 8, 10$ .

Mean: $\bar{x} = \frac{4 + 6 + 7 + 8 + 10}{5} = \frac{35}{5} = 7$ .

Median: the 3rd value in the ordered list $= 7$ .

Mode: all values are unique, so there is no mode.

Dataset B: $4, 6, 7, 8, 100$ (replace 10 with an extreme value)

Mean: $\frac{4 + 6 + 7 + 8 + 100}{5} = \frac{125}{5} = 25$ .

Median: the 3rd value is still $7$ .

Mode: no mode.

Answer: Dataset A - Mean $= 7$ , Median $= 7$ , no mode. Dataset B - Mean $= 25$ , Median $= 7$ , no mode.

Sidenote

Not all measures of central tendency are affected equally

Notice that the mean shifts from 7 to 25 because of the extreme value, while the median stays at 7. This shows the median’s resistance to extreme values (outliers).
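The comparison of Dataset A and Dataset B can be reproduced with Python's standard `statistics` module. This is only a quick check of the arithmetic in the example; the variable names are chosen here for illustration.

```python
from statistics import mean, median

dataset_a = [4, 6, 7, 8, 10]
dataset_b = [4, 6, 7, 8, 100]  # same data with 10 replaced by an extreme value

for name, data in (("A", dataset_a), ("B", dataset_b)):
    # The mean uses every value, so the extreme value pulls it upward;
    # the median depends only on the middle position, so it does not move.
    print(name, "mean =", mean(data), "median =", median(data))
```

The mean moves from 7 to 25, while the median is 7 for both data sets.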

Measures of spread

Measures of spread describe how much the values in a data set vary around the center. For the Praxis, focus on range, interquartile range, and standard deviation. Introductory statistics courses go deeper into variance and its role.

Sidenote

How to find quartiles: the exclusive method

To find $Q_{1}$ and $Q_{3}$ , we use the exclusive method: split the ordered data into a lower half and an upper half, then take the median of each half.

Odd $n$ : exclude the median value itself before splitting.
Even $n$ : the lower half is the first $\frac{n}{2}$ values and the upper half is the last $\frac{n}{2}$ values. Example: $1, 3, 3, 5, 7, 8, 9, 10$ ( $n = 8$ , median = $\frac{5 + 7}{2} = 6$ )
- Lower half: $1, 3, 3, 5$ → $Q_{1} = \frac{3 + 3}{2} = 3$
- Upper half: $7, 8, 9, 10$ → $Q_{3} = \frac{8 + 9}{2} = 8.5$
- $IQR = 8.5 - 3 = 5.5$

This split works the same way regardless of whether the two middle values happen to be equal.
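The exclusive method described above can be sketched as a small function. The name `quartiles_exclusive` is hypothetical, and the sketch assumes the data set has at least two values. It implements the split-into-halves rule from this section, which can give different results from the quartile conventions built into some software.

```python
from statistics import median


def quartiles_exclusive(values):
    """Return (Q1, Q3) by taking the median of each half of the ordered data.

    For odd n, integer division leaves the middle value out of both halves.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # first n // 2 values
    upper = ordered[-half:]  # last n // 2 values
    return median(lower), median(upper)


q1, q3 = quartiles_exclusive([1, 3, 3, 5, 7, 8, 9, 10])
print(q1, q3, q3 - q1)  # Q1, Q3, and the IQR for the n = 8 example
```

For the eight-value example this gives $Q_{1} = 3$ , $Q_{3} = 8.5$ , and $IQR = 5.5$ , matching the hand calculation.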

Example: Computing range and IQR

$4, 6, 7, 8, 10$

Range: subtract the lowest value from the highest value: $10 - 4 = 6$ .

To find the interquartile range (IQR), start by ordering the data and identifying the median, which is 7.

Because $n = 5$ is odd, the exclusive method excludes the median (7) before splitting the data into a lower half and an upper half.

The lower half is $4, 6$ , so the first quartile $Q_{1}$ is the average of those two values: $\frac{4 + 6}{2} = \frac{10}{2} = 5$ .

The upper half is $8, 10$ , so the third quartile $Q_{3}$ is $\frac{8 + 10}{2} = \frac{18}{2} = 9$ .

The IQR is then calculated as $Q_{3} - Q_{1} = 9 - 5 = 4$ .

Answer: Range $= 6$ , IQR $= 4$

Example: Computing standard deviation

$4, 6, 7, 8, 10$

Mean $= 7$ . Sum of squared deviations: $(- 3)^{2} + (- 1)^{2} + 0^{2} + 1^{2} + 3^{2} = 9 + 1 + 0 + 1 + 9 = 20$ .

Variance: $σ^{2} = \frac{20}{5} = 4$ .

Standard deviation: $σ = \sqrt{4} = 2$ .

Answer: $2$
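The same steps can be followed in code. This sketch assumes the convention used in the worked example, where the sum of squared deviations is divided by $n$ (the population formula) rather than by $n - 1$ . The function name `population_std` is hypothetical.

```python
from math import sqrt


def population_std(values):
    """Return (variance, standard deviation), dividing by n as in the example."""
    n = len(values)
    mean = sum(values) / n
    # Squared distance of each value from the mean.
    squared_deviations = [(x - mean) ** 2 for x in values]
    variance = sum(squared_deviations) / n
    return variance, sqrt(variance)


variance, std = population_std([4, 6, 7, 8, 10])
print(variance, std)
```

For this data set the variance is 4 and the standard deviation is 2, as computed by hand above.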

Choosing the right measure

Now that you know how to compute both center and spread, here’s a guide for deciding which measure to use in a given situation.

Situation	Best measure of center	Best measure of spread	Why
Categorical data (e.g., favorite color)	Mode	Not applicable	Mean and median cannot be calculated; only the mode makes sense.
Numerical data, no extreme values, symmetric	Mean	Standard deviation	Both use all values and are accurate for well-behaved data.
Numerical data, no extreme values, skewed	Median	IQR	The median resists skew; the IQR ignores extremes.
Numerical data with extreme values (outliers)	Median	IQR	Both are resistant to outliers; the mean and standard deviation would be distorted.
Small data set	Median	Range or IQR	Range is quick to compute; IQR is still appropriate. Because range is sensitive to outliers, use IQR if any extreme values are present.

Effects of transformations

When a constant $c$ is added to each value in a data set, the mean, median, and mode all increase by $c$ . Measures of spread such as the range, interquartile range (IQR), and standard deviation stay the same, because the distances between values don’t change.

When each data value is multiplied by a positive constant $k$ , the mean, median, and mode are all multiplied by $k$ . The range, IQR, and standard deviation are also multiplied by $k$ , because all distances are scaled by the same factor.

Watch out: adding a constant does NOT change spread

A common Praxis trap is assuming that range and IQR increase when you add a constant to every value. They don’t. Adding 10 to every value shifts the whole data set up by 10, but the distances between values stay exactly the same - so range, IQR, and standard deviation are unchanged. Only the measures of center (mean, median, mode) shift.

For example, if a data set has range $= 6$ and IQR $= 4$ , and you add 100 to every value, the range is still 6 and the IQR is still 4.
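Both transformation rules can be checked numerically. The sketch below reuses the data set $4, 6, 7, 8, 10$ from the earlier examples; the constants 100 and 3 and the helper name `summary` are choices made here for illustration. The IQR is left out to keep the block short, but it behaves the same way as the range.

```python
from statistics import mean, median, pstdev


def summary(values):
    """Collect a few measures of center and spread for comparison."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "std": pstdev(values),  # population standard deviation (divides by n)
    }


data = [4, 6, 7, 8, 10]
shifted = [x + 100 for x in data]  # add a constant c = 100 to every value
scaled = [3 * x for x in data]     # multiply every value by k = 3

print(summary(data))
print(summary(shifted))  # center moves up by 100; spread is unchanged
print(summary(scaled))   # center and spread are both multiplied by 3
```

Adding 100 moves the mean and median from 7 to 107 but leaves the range at 6 and the standard deviation at 2. Multiplying by 3 gives a mean and median of 21, a range of 18, and a standard deviation of 6.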

Always order the data before computing the median or quartiles.
Use the mean for a balanced average, but be cautious when extreme values are present.
Use the median when the distribution is skewed or contains extreme values.
The mode is useful for categorical or discrete data.
The range gives a quick sense of total spread but is sensitive to extremes.
The IQR measures the spread of the middle 50% of the data and ignores extremes.
The standard deviation quantifies the typical deviation from the mean.
Remember how transformations affect each measure: adding a constant shifts the center only, while multiplying by a positive constant scales both center and spread.
