m₁ × m₂ = −1
Reference: Normal Distribution.
NORMAL DISTRIBUTION
A normal distribution is part of the study of probability and statistics. It can be best understood using an example, but at this level the example serves only as an introduction to the topic.
William attends a high school and there are 320 students in his year group. The dean of these students weighs them all and enters the data in a frequency distribution table. In this type of assessment, which has a large amount of data, it is usual to place the weights into class intervals, of, say, 5 kilograms. Figure a shows the frequency distribution table and a histogram and frequency polygon of the data.
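Figure a: frequency distribution table (columns: weight in kg; frequency, the number of students), with a histogram and frequency polygon (horizontal axis: weight in kg).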
The weights of the students in William’s year group show quite a wide variation, but they do follow a pattern. Most of the weights are clustered together in the middle of the graph and very few are at the ends of the graph. Also, the graph is roughly symmetrical about a vertical axis.
Some data, like the weights of people, are collected by measuring, and are called continuous data. Other examples of continuous data are the heights of people, the lengths of leaves on a tree, the life of a light bulb, the quantity of paint in a 10-liter can, and examination marks measured in percentages. Examination marks are not really continuous data, but are included here; see the note at the end of this entry. When we measure continuous quantities, which occur naturally in everyday life, and
graph the data using frequency polygons, the results usually resemble the graph drawn in figure b. The characteristics of this graph are:
♦ The shape of the graph resembles a bell, so we say that this kind of curve is bell-shaped.
♦ If we draw in the axis of symmetry, it passes through the mean of the distribution.
♦ The mean weight in the example is μ = 57.5 kg. The method of calculating this mean is explained under the entry Mean.
♦ For a normal curve the mean, mode, and median for the distribution all have approximately the same value, which is at the axis of symmetry of the curve.
Since this type of bell-shaped curve is usually obtained when we collect and graph continuous data, we say it is a normal curve, and the distribution is called a normal distribution. But a bell-shaped curve will only occur when a large quantity of continuous data are graphed. In our example regarding William’s year group, 320 students were weighed, which is probably large enough to obtain only an approximately normal curve. If 1000 students had been in the year group, the curve would have been closer to a normal bell-shaped curve. We can say that “several hundred” is usually a sufficiently large figure.
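One way to explore the effect of group size is a small simulation. The following Python sketch assumes the weights are drawn from an ideal normal distribution with mean 57.5 kg and standard deviation 7.3 kg (the values this entry quotes for William's year group); the function name `fraction_within_one_sd` and the fixed seed are illustrative choices, not part of the original text. It reports the fraction of simulated weights lying within one standard deviation of the mean, which tends to settle near the theoretical value of about 0.68 as the group grows.

```python
import random

def fraction_within_one_sd(n, mean=57.5, sd=7.3, seed=0):
    """Simulate n weights and return the fraction within one sd of the mean.

    Assumption: weights follow an ideal normal distribution, which real
    measurements only approximate.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    weights = [rng.gauss(mean, sd) for _ in range(n)]
    inside = sum(1 for w in weights if abs(w - mean) <= sd)
    return inside / n

# Compare a year group of 320, one of 1000, and a much larger group.
for n in (320, 1000, 100_000):
    print(n, round(fraction_within_one_sd(n), 3))
```

Small groups can land some distance from 0.68 in either direction; only the very large group is reliably close.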
The mean and the standard deviation of the weights of students in William’s year group can be calculated using a scientific calculator. Check the calculator handbook for the method appropriate to your calculator. The mean weight is calculated to be μ = 57.5 kg and the standard deviation is σ = 7.3 kg. There is a relationship between the standard deviation σ = 7.3 kg, the mean μ = 57.5 kg, and the number of students in William’s year group, which is true for all normal curves. This relationship is described below.
Suppose you draw the axis of symmetry on a bell-shaped curve; this axis is at the mean value μ. Measure one standard deviation σ to the right of the mean along the horizontal axis and shade in the area under the curve, as shown in figure c. The shaded area will represent 34% of all the students in William’s year group. Similarly, one standard deviation measured to the left of the mean will also represent 34% of the students. From the symmetry of the bell-shaped curve, we can say that 68% of the data (the students) lie within one standard deviation of the mean for a normal distribution (see figure d). Write
μ + σ = 57.5 + 7.3 = 64.8 (using the data from William’s group)
μ − σ = 57.5 − 7.3 = 50.2
This means that 68% of the students have weights between 50.2 and 64.8 kg. There are 320 students in William’s year group, so we can calculate how many lie within one standard deviation of the mean:
68% of 320 = 0.68 × 320 = 217.6
We can expect that about 218 students weigh between 50.2 and 64.8 kg.
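The arithmetic for the one-standard-deviation band can be checked with a few lines of Python. This is only a sketch: the helper name `within_k_sd` and its parameters are illustrative, and the 68% share is the rounded figure used in this entry.

```python
def within_k_sd(mean, sd, k, share, total):
    """Return (low, high, expected_count) for the band mean ± k·sd.

    Illustrative helper; the name and parameters are not from the text.
    """
    low = mean - k * sd        # lower edge of the band
    high = mean + k * sd       # upper edge of the band
    expected = share * total   # expected number of data values in the band
    return low, high, expected

# William's year group: mean 57.5 kg, sd 7.3 kg, 320 students, 68% share.
low, high, expected = within_k_sd(mean=57.5, sd=7.3, k=1, share=0.68, total=320)
print(f"{low:.1f} to {high:.1f} kg, about {round(expected)} students")
# prints: 50.2 to 64.8 kg, about 218 students
```

Calling the same helper with k=2 and share=0.95, or k=3 and share=0.99, gives the wider bands.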
This can be extended further to assert that 95% of the data will lie within two standard deviations of the mean (see figure e):
μ + 2σ = 57.5 + 2 × 7.3 = 72.1
μ − 2σ = 57.5 − 2 × 7.3 = 42.9
This means that 95% of 320, or 304, students have weights between 42.9 and 72.1 kg. Further extending this concept, we assert that 99% of the data will lie within three standard deviations of the mean. This means that 99% of 320, or about 317, students weigh between 35.6 and 79.4 kg.
All the theory outlined above is what we expect to be true for a theoretical normal curve, but everyday life examples do not exactly obey this model. This is certainly true in the example of William’s year group, where all the students lie in the range 40-75 kg. To obtain data that will more closely fit the theory of 68%, 95%, and 99%, we would have to consider a much larger group of students of that age. A normal curve containing the standard deviations and relevant percentages is shown in figure f.
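The percentages 68%, 95%, and 99% are rounded. As a hedged illustration, the sketch below uses `NormalDist` from Python's standard `statistics` module to compute the theoretical share of a normal distribution within one, two, and three standard deviations of the mean, using the mean and standard deviation quoted for William's year group. The function name `share_within` is an illustrative choice, not from the text. The computed shares come out at roughly 68.3%, 95.4%, and 99.7%, so the expected head counts differ slightly from the rounded figures above.

```python
from statistics import NormalDist

MEAN, SD, TOTAL = 57.5, 7.3, 320  # values quoted for William's year group

def share_within(k, mean=MEAN, sd=SD):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    share = share_within(k)
    low, high = MEAN - k * SD, MEAN + k * SD
    print(f"{k} sd: {low:.1f} to {high:.1f} kg, "
          f"{share:.1%} of the data, about {share * TOTAL:.0f} students")
```

The shares do not depend on the particular mean or standard deviation, which is why the same percentages apply to every normal curve.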