O.02: Section 1 Part 1

Section 1: Compound models—adding two model formulas to model combined effects

Compound-model application 1: Modeling mixed populations

In an earlier topic we discussed normal-distribution models, whose graphs are “bell-shaped” curves with a single peak. This model is appropriate in many situations where some characteristic of a group of generally similar objects is distributed around an average value. But when two groups with different averages are mixed, the resulting distribution may have two partially overlapping peaks, as illustrated below by a dataset on the heights of a combined population of male and female students. A single normal distribution will not do a good job of modeling this data.

Example 1:

Height data from a co-educational population

The data that produced the above graph is shown in the accompanying table. A plausible model for data of this kind is the sum of two normal distributions. This entails six parameters: a total, an average, and a width for each normal distribution. The formula to be placed into C3 will thus be “=$G$3*NORMDIST(A3,$G$4,$G$5,FALSE)+$G$6*NORMDIST(A3,$G$7,$G$8,FALSE)”; as always, it should be spread down column C beside all the data rows.

As usual, we will set the initial values for the parameters to reasonable approximations before using Solver. In this case, we can do that by setting each “total” parameter to 5,000 (half the overall total), setting the “average” parameters to approximately the x positions of the two peaks (64 and 70 are close enough), and setting both “width” parameters to a value, such as 2 or 3, that gives a model reasonably similar to the data.

Then use Solver to minimize the sum of squared deviations, entering “G3:G8” in the “By changing cells” field so that all six parameters will be used. As shown below, a good fit is found.

*Parameters (two-normal sum)*
5529.128	Total 1
63.72487	Average 1
2.685645	Width 1
4569.312	Total 2
70.38119	Average 2
2.728443	Width 2
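
For readers working outside a spreadsheet, the same kind of fit can be sketched in Python. This is only an illustration, assuming NumPy and SciPy are installed: `curve_fit` stands in for Solver (it also minimizes the sum of squared deviations, though with a different algorithm), the function names `normal_pdf` and `two_normal_sum` are made up for this sketch, and the starting width of 2.5 is one arbitrary choice between the suggested 2 and 3.

```python
import numpy as np
from scipy.optimize import curve_fit

# Heights 48..84 inches and the student counts from the data table.
heights = np.arange(48, 85, dtype=float)
students = np.array([
    0, 0, 0, 0, 0, 0, 0, 2, 12, 29, 79, 180, 326, 481, 673, 813, 864,
    825, 751, 717, 677, 700, 735, 658, 568, 424, 280, 167, 75, 25, 13,
    7, 0, 1, 1, 0, 0], dtype=float)

def normal_pdf(x, mean, width):
    # Same quantity as the spreadsheet's NORMDIST(x, mean, width, FALSE).
    return np.exp(-0.5 * ((x - mean) / width) ** 2) / (width * np.sqrt(2 * np.pi))

def two_normal_sum(x, total1, avg1, width1, total2, avg2, width2):
    # Mirrors the C3 formula: total1*NORMDIST(...) + total2*NORMDIST(...).
    return total1 * normal_pdf(x, avg1, width1) + total2 * normal_pdf(x, avg2, width2)

# Initial guesses as described in the text (width 2.5 is an assumption).
initial = [5000, 64, 2.5, 5000, 70, 2.5]

# Least-squares fit over all six parameters at once.
best, _ = curve_fit(two_normal_sum, heights, students, p0=initial)

sse = float(np.sum((students - two_normal_sum(heights, *best)) ** 2))
for name, value in zip(
        ["Total 1", "Average 1", "Width 1", "Total 2", "Average 2", "Width 2"], best):
    print(f"{name}: {value:.4f}")
print(f"Sum of squared deviations: {sse:.3f}")
```

If the optimizer converges to the same minimum as Solver, the printed values should be close to the parameters listed above; small differences in the last digits would not be surprising, since the two tools use different stopping rules.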

Goodness of fit of this model
1710.994	Sum of squared deviations
7.429222	Standard deviation
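
The two goodness-of-fit numbers can be rechecked directly from the reported parameters. The sketch below uses only the Python standard library. It assumes the standard deviation is computed as the square root of the sum of squared deviations divided by the number of data rows minus the number of parameters (37 - 6 = 31), which is consistent with the reported pair of values, since the square root of 1710.994/31 is about 7.429. The function name `normal_pdf` is illustrative.

```python
import math

# Heights 48..84 inches and the student counts from the data table.
heights = list(range(48, 85))
students = [0, 0, 0, 0, 0, 0, 0, 2, 12, 29, 79, 180, 326, 481, 673, 813, 864,
            825, 751, 717, 677, 700, 735, 658, 568, 424, 280, 167, 75, 25, 13,
            7, 0, 1, 1, 0, 0]

# Best-fit parameters as reported for the two-normal sum.
total1, avg1, width1 = 5529.128, 63.72487, 2.685645
total2, avg2, width2 = 4569.312, 70.38119, 2.728443

def normal_pdf(x, mean, width):
    # Same quantity as NORMDIST(x, mean, width, FALSE).
    z = (x - mean) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2 * math.pi))

# Model value beside each data row, as in column C of the spreadsheet.
model = [total1 * normal_pdf(h, avg1, width1) + total2 * normal_pdf(h, avg2, width2)
         for h in heights]

sse = sum((s - m) ** 2 for s, m in zip(students, model))

# Assumed formula: divide by (data rows - fitted parameters) before the root.
n_params = 6
std_dev = math.sqrt(sse / (len(heights) - n_params))

print(f"Sum of squared deviations: {sse:.1f}")
print(f"Standard deviation: {std_dev:.2f}")
```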

This best-fit model does more than give a formula for describing the height distribution of the combined male-female population. Because it fits the data so well, we can confidently deduce the characteristics of the male and female distributions from the parameters of the two normal distributions, even though we do not have any data that identifies gender, and even though both males and females contribute to each height total. Since the first normal has the lower average, it evidently describes the height distribution of females, and the second normal describes the height distribution of males. Thus we can use these results to see that about 45% (Total 2 divided by the sum of the two totals, 4569.312/10098.44) is the answer to the question “How much of this data came from males?”, and that 63.7 inches is the answer to the question “What was the average height of the female part of this sample?”.
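
The arithmetic behind these two answers is short enough to spell out. The sketch below is an illustration with made-up variable names. It computes the male share as Total 2 divided by the sum of the two fitted totals, and also shows the alternative of dividing by a rounded overall total of 10,000, which gives a slightly higher figure.

```python
# Fitted parameters reported for the two-normal sum.
total_female = 5529.128    # Total 1 (lower average, taken to be females)
total_male = 4569.312      # Total 2 (higher average, taken to be males)
average_female = 63.72487  # Average 1, in inches

# Share of the fitted population attributed to males.
male_share = total_male / (total_female + total_male)

# Alternative: divide by the rounded overall total of 10,000 used
# when choosing the initial "total" guesses.
male_share_rounded_total = total_male / 10000

print(f"Male share of fitted total: {male_share:.1%}")                  # 45.2%
print(f"Male share of rounded total: {male_share_rounded_total:.1%}")   # 45.7%
print(f"Average female height: {average_female:.1f} inches")            # 63.7 inches
```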

Height	Students
48	0
49	0
50	0
51	0
52	0
53	0
54	0
55	2
56	12
57	29
58	79
59	180
60	326
61	481
62	673
63	813
64	864
65	825
66	751
67	717
68	677
69	700
70	735
71	658
72	568
73	424
74	280
75	167
76	75
77	25
78	13
79	7
80	0
81	1
82	1
83	0
84	0

If we graph the two normal distributions found here individually, along with their sum, we can see the components that combine to form the data distribution:

This same sum-of-two-normal-distributions approach can be used in a variety of situations. Sometimes the component distributions have about the same average but greatly different widths, so that the resulting distribution has “fat tails”. Other times the distributions have substantially different totals but nearby averages, so that the result looks like the larger distribution with a bump added on one side. It is often the case that an investigator is interested in only one of the component distributions, with the other values being ignored after the fitting process. An example of each of these kinds of two-normal situations is given below, with the components shown (thin lines) as well as the sum (dots) that reflects the data that would be observed.


Examples of the sum of two differing-parameter normal-distribution functions
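
The “fat tails” case can also be checked numerically. The sketch below is a hypothetical illustration, not taken from the figure: it mixes two normals with the same average, equal totals, and widths of 1 and 3 (all made-up values), then compares how much of the mixture lies more than three overall standard deviations from the average with the same fraction for a single normal.

```python
import math

def two_sided_tail(k):
    # Fraction of a normal distribution lying more than k widths from its average.
    return math.erfc(k / math.sqrt(2))

# Hypothetical components: same average, very different widths, equal totals.
narrow_width, wide_width = 1.0, 3.0
share_narrow = 0.5

# With a common average, the overall variance is the share-weighted
# average of the component variances.
overall_sd = math.sqrt(share_narrow * narrow_width ** 2
                       + (1 - share_narrow) * wide_width ** 2)

# Fraction of the mixture beyond 3 overall standard deviations.
cutoff = 3 * overall_sd
mixture_tail = (share_narrow * two_sided_tail(cutoff / narrow_width)
                + (1 - share_narrow) * two_sided_tail(cutoff / wide_width))

# Same fraction for a single normal distribution.
single_tail = two_sided_tail(3.0)

print(f"Mixture beyond 3 SD: {mixture_tail:.4%}")
print(f"Single normal beyond 3 SD: {single_tail:.4%}")
```

With these values the mixture puts several times more of its total in the far tails than a single normal of the same overall spread, which is what “fat tails” refers to.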

Licenses & Attributions

CC licensed content, Shared previously

Mathematics for Modeling. Authored by: Mary Parker and Hunter Ellinger. License: CC BY: Attribution.
