
Percentiles are an essential part of the GRE. You’ll even be interpreting percentiles to determine how well you did on the exam! For example, if you scored in the 81st percentile, it means you scored above 81 percent of the test takers. Let’s say you score precisely in the middle of the pack. What would your percentile be? If you guessed the 50th percentile, you would be correct. Scoring at the 50th percentile implies you beat 50% of the test takers. This also implies that 50% of the test takers beat you.
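As a rough sketch of that idea, the snippet below counts the percent of scores that fall below a given score. The score list and the `percent_below` helper are made up for illustration, and it assumes "percentile" means the percent of scores strictly below yours, as described above.

```python
def percent_below(scores, your_score):
    """Percent of scores strictly below your_score."""
    below = sum(1 for s in scores if s < your_score)
    return 100 * below / len(scores)

# Hypothetical scores, invented for this illustration only.
scores = [150, 152, 155, 158, 160, 161, 163, 165, 168, 170]

# 7 of the 10 scores are lower than 165.
print(percent_below(scores, 165))
```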

Percentiles can be a tricky topic, especially when applied to standard deviation problems. For the moment, just remember that a larger percentile means a better score.

A box and whisker plot illustrates the distribution of the measurements in a set, and it relates very closely to percentiles. Here’s an example of a labeled box and whisker plot:

Let’s imagine the box and whisker plot above represents a set of 100 observations about the number of times people use public transit in a week. The limits of the whiskers represent the extreme measurements in the set of 100 observations. The minimum observation in this chart is 0, and the maximum observation is 8.

The *second quartile* Q2 is the next easiest point to explain. Quartiles split the sorted data into four parts of 25% each, so the second quartile marks the point with 50% of the data below it. This is the median of the dataset and always sits at the 50th percentile. In our example distribution, the people who rode public transport 4 times a week are precisely in the center of the distribution.

Can you guess what percentile Q1 represents?

(spoiler)

If you guessed the 25th percentile, nice job!

The people who rode 2 times a week took public transport more often than 25% of the others, but less often than the other 75% of people.

Similarly, Q3 represents the 75th percentile.

The **range** of this set is the maximum value minus the minimum value, which is $8−0=8$.

The **interquartile range (IQR)** is different from the full range. The interquartile range of a set is the difference between Q3 and Q1. In this case, the interquartile range is $6−2=4$.
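The two calculations can be sketched side by side. The five values below are the ones read off the example plot in the text; the variable names are just for illustration.

```python
# Five-number summary from the public transit example in the text.
minimum, q1, q2, q3, maximum = 0, 2, 4, 6, 8

full_range = maximum - minimum  # spread of the entire set
iqr = q3 - q1                   # spread of the middle 50% of the set

print(full_range, iqr)
```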

Calculating a quartile is easier than it seems, once you understand the trick. Let’s walk through an example.

List $L$ consists of $11$ consecutive positive integers, the largest being $33$. What is the IQR of $L$?

We can find the IQR by finding the difference between the 1st and 3rd quartiles. But how do we find those?

For a sorted list, the 1st quartile is in the middle of the range from 0% to 50%, and the 3rd quartile is in the middle of the range from 50% to 100%.

So let’s start by finding the 2nd quartile, a.k.a. the 50th percentile, that sits right at the median of the distribution. Here’s the list written out, and we’ll grab the middle value.

$L=[23,24,25,26,27,28,29,30,31,32,33]$

The median of the list is $Q_{2}=28$.

Now we split the original list into two parts, and find the median of each.

$L_{1}=[23,24,25,26,27]$ and $L_{3}=[29,30,31,32,33]$

So that means $Q_{1}=25$ and $Q_{3}=31$.

And now we’ve got all the information we need to wrap it up.

$IQR=Q_{3}−Q_{1}=31−25=6$

Not as hard as it seemed at first, right?
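If it helps to see the same steps as code, here is a minimal sketch of the walkthrough. It assumes the list has no repeated values, so everything below the median belongs to the lower half and everything above it to the upper half.

```python
from statistics import median

L = list(range(23, 34))  # 11 consecutive integers, the largest being 33

q2 = median(L)                    # middle value of the sorted list
lower = [x for x in L if x < q2]  # the half below the median
upper = [x for x in L if x > q2]  # the half above the median

q1, q3 = median(lower), median(upper)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```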

Now it’s your turn - try a similar question yourself:

List $M$ consists of $11$ consecutive positive integers, the largest being $25$. What is the IQR of $M$?

Try solving it, and then check your answer below.

(spoiler)

Answer: 6

$M=[15,16,17,18,19,20,21,22,23,24,25]$

The median of the list is $Q_{2}=20$.

Now we split the original list into two parts, and find the median of each.

$M_{1}=[15,16,17,18,19]$ and $M_{3}=[21,22,23,24,25]$

So that means $Q_{1}=17$ and $Q_{3}=23$.

$IQR=Q_{3}−Q_{1}=23−17=6$

Notice how the IQR for $M$ is the same as for $L$ in the walkthrough above? That’s no coincidence. The lists have the same length and the same spacing between each number. They’re essentially the same list, just that $L$ ends at $33$, and $M$ is shifted over so that it ends at $25$. The distribution of the elements didn’t change, so neither did the IQR.
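One way to see why, using $c$ as a hypothetical name for the shift amount (here $c=8$, the gap between $33$ and $25$): subtracting the same $c$ from every element moves both quartiles by $c$, so the shift cancels out in the difference.

$(Q_{3}−c)−(Q_{1}−c)=Q_{3}−Q_{1}$

$(31−8)−(25−8)=23−17=6$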

Let’s try one more to drill the core concept.

Quantity A: The IQR of a list of integers from 1 to 10

Quantity B: The IQR of a list of integers from 1 to 9

Try solving it, and then check your answer!

(spoiler)

The two quantities are equal.

The IQR for Quantity A $[1,2,3,4,5,6,7,8,9,10]$:

- The list has an even number of elements, so we split it in half to calculate Q1 and Q3
- Q1 is the median of $[1,2,3,4,5]$, which is $3$
- Q3 is the median of $[6,7,8,9,10]$, which is $8$
- So the IQR is $8−3=5$

The IQR for Quantity B $[1,2,3,4,5,6,7,8,9]$:

- The list has an odd number of elements, so we drop the middle number, and then split it in half to calculate Q1 and Q3
- Q1 is the median of $[1,2,3,4]$, which is $2.5$
- Q3 is the median of $[6,7,8,9]$, which is $7.5$
- So the IQR is $7.5−2.5=5$
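The even and odd cases can be folded into one small helper. This is only a sketch of the split-in-half method used here (`iqr_by_halves` is a name made up for this example); statistics software often uses other quartile conventions that can give slightly different values. It assumes the list has at least two values.

```python
from statistics import median

def iqr_by_halves(values):
    """IQR via the median-of-halves method; needs at least 2 values."""
    data = sorted(values)
    half = len(data) // 2               # size of each half
    lower = data[:half]                 # first half of the sorted list
    upper = data[len(data) - half:]     # last half; skips the middle value when the length is odd
    return median(upper) - median(lower)

quantity_a = iqr_by_halves(range(1, 11))  # integers 1 to 10
quantity_b = iqr_by_halves(range(1, 10))  # integers 1 to 9

print(quantity_a, quantity_b, quantity_a == quantity_b)
```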

There are two other “special” percentiles you should note:

- 84th percentile
- 16th percentile

These are special because, in a normal distribution, they sit one *standard deviation* away from the center (the 50th percentile): about 34% of the values fall between the mean and one standard deviation on either side.

- 84th percentile ($50+34=84$)
- 16th percentile ($50−34=16$)

Sometimes questions will give you an 84th percentile measurement and the standard deviation. From this information, you can calculate the mean!

For example, given a normal distribution with an 84th percentile measurement of 26 and a standard deviation of 3, the mean can be calculated as $26−3=23$.
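Here is a small sketch of that calculation, with a sanity check using Python's standard-library `statistics.NormalDist`. The check is approximate, since the 84th percentile sits very close to, but not exactly at, one standard deviation above the mean.

```python
from statistics import NormalDist

p84_value = 26  # measurement at the 84th percentile (from the example)
std_dev = 3     # standard deviation (from the example)

# The 84th percentile is about 1 standard deviation above the mean.
mean = p84_value - 1 * std_dev

# Sanity check: where does the 84th percentile land for this mean and SD?
check = NormalDist(mu=mean, sigma=std_dev).inv_cdf(0.84)

print(mean, round(check, 1))
```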

The normal distribution is a well-known data distribution pattern most commonly known as the *bell curve*. Any time you see a problem that uses the term “*normal distribution*”, you should immediately think of this figure.

It’s essential to recognize that a normal distribution is perfectly mirrored; both the left and right sides of the distribution are identical, but flipped.

At the center is 0, since 0 simply means 0 deviations from the mean. The mean is the center of a distribution, and no deviation from the center is still the center.

The 34% represents the percent of the total distribution between 0 and 1 standard deviation from the mean. A standard deviation is essentially the “average distance from the mean” (although the technical definition is a bit different). You’ll never be asked to solve for a standard deviation in the GRE, so this simplification is good enough for our purposes.

Notice how the percentages on either side of the 0 standard deviation line add up to fifty: $2+14+34=50$. This is because 50% of the values lie above the mean, and 50% lie below it. Some questions might only ask for the percentage of the total that lies more than 1 standard deviation above the mean (not just above the mean). That percentage is the sum of the parts past the +1 line, i.e. $14+2=16$.
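The 34, 14, and 2 figures are rounded. As a quick check, this sketch computes the underlying values for a standard normal curve with `statistics.NormalDist`; the variable names are just for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

band_0_to_1 = 100 * (z.cdf(1) - z.cdf(0))  # percent between the mean and +1 SD
band_1_to_2 = 100 * (z.cdf(2) - z.cdf(1))  # percent between +1 SD and +2 SD
beyond_2 = 100 * (1 - z.cdf(2))            # percent above +2 SD

# Rounded to whole percents, these match the 34 / 14 / 2 bands.
print(round(band_0_to_1), round(band_1_to_2), round(beyond_2))

# Percent more than 1 SD above the mean (the 14 + 2 case).
print(round(band_1_to_2 + beyond_2))
```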

Many problems will give you a mean and a standard deviation. With this information, you can label the values at the bottom of the distribution. For example, if the mean is 82 and the standard deviation is 6, the distribution would look like the figure below. For this distribution, one standard deviation (6) below the mean (82) is 76.
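Labeling the axis is just repeated addition and subtraction. A minimal sketch, assuming the chart is marked from 2 standard deviations below the mean to 2 above:

```python
mean, std_dev = 82, 6  # values from the example

# Map each whole number of standard deviations to its axis label.
labels = {k: mean + k * std_dev for k in range(-2, 3)}

print(labels)
```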

Lastly, circling back to our initial percentiles discussion, we can see how a normal distribution relates closely to percentiles. For instance, if you scored 2 standard deviations above the mean, your percentile would be 98. This is because only 2% of the population is more than two standard deviations above the mean, and 98% is below that point.
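The link between the bands and percentiles is a running total from the left side of the curve. The sketch below uses the rounded bands from this section; `approx_percentile` is a helper name made up for this example, and it only handles whole standard deviations from -2 to 2.

```python
# Approximate percent of values in each region, left to right:
# below -2 SD, -2 to -1, -1 to 0, 0 to +1, +1 to +2, above +2 SD.
BANDS = [2, 14, 34, 34, 14, 2]
BOUNDARIES = [-2, -1, 0, 1, 2]  # whole standard deviations from the mean

def approx_percentile(sd_from_mean):
    """Approximate percentile at a whole number of SDs from the mean (-2 to 2)."""
    regions_below = BOUNDARIES.index(sd_from_mean) + 1
    return sum(BANDS[:regions_below])

for k in BOUNDARIES:
    print(k, approx_percentile(k))
```

Running it reproduces the special percentiles from earlier: 16 at one standard deviation below the mean, 50 at the mean, and 84 at one standard deviation above.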

Let’s try an example question using a normal distribution.

A car tire manufacturer is testing the average number of miles their tires can be used before the treads are fully stripped. The experiment involves running 3,600 tires on a treadmill that simulates the conditions a tire may experience on the road. The experiment found that the mean miles driven before the treads were stripped was 60,000 miles. The distances were normally distributed with a standard deviation of 750 miles. How many tires survived beyond 58,500 miles in the experiment?

Using what you’ve learned so far, you should be able to come up with the exact number!

(spoiler)

Answer: 3,528

When solving normal distribution / standard deviation questions, sketching out the chart with the data you’re given is a good first step.

We’re told in the question that the mean distance was 60,000 miles and the standard deviation was 750 miles, so the middle of the chart is at 60,000, and each interval is 750 apart. The value of 58,500 is exactly two standard deviations below the mean, since $60000−2(750)=58500$. That’s no coincidence - they gave us that information specifically because we’re meant to use it in conjunction with the normal distribution percentages. If only 2% of values fall more than two standard deviations below the mean, then the other 98% of values ($14+34+34+14+2=98$) must be above that value. The answer is 98% of the 3,600 tires total.

$0.98 \times 3600=3528$

Of the 3,600 tires total, 3,528 survived beyond 58,500 miles.
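The whole solution can be sketched in a few lines. The 98% figure is the rounded value used throughout this section; the last line shows the unrounded share from `statistics.NormalDist` for comparison, which is slightly lower (about 97.7%), so the "exact" answer here depends on using the rounded percentages.

```python
from statistics import NormalDist

total_tires = 3600
mean_miles = 60_000
std_dev = 750
threshold = 58_500

# How many standard deviations from the mean is the threshold?
z = (threshold - mean_miles) / std_dev

# Rounded bands: 2% of values lie more than 2 SD below the mean, so 98% lie above.
share_above = 0.98
survivors = round(share_above * total_tires)

# Unrounded share above the threshold, for comparison only.
unrounded_share = 1 - NormalDist(mean_miles, std_dev).cdf(threshold)

print(z, survivors, round(unrounded_share, 3))
```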

This is a complex topic. Percentiles and normal distributions definitely take a few practice problems to get the hang of, so don’t feel discouraged if it takes a few tries before you feel comfortable!


All rights reserved ©2016 - 2024 Achievable, Inc.