A FINAL STATISTICAL ANALYSIS OF THE
The Rules of the Game
On Friday of each week beginning on May 29, 1998 and ending on May 11, 2001, a machine in Austin, Texas was used to randomly select four balls from a collection of 100 balls numbered from 0 to 99. Lottery players attempted to pre-select the winning numbers to win various amounts of money. Each Texas Million playslip had three games, and each game contained an array of the numbers from zero through ninety-nine. Four numbers could be selected on any or all of the games. Players could enter these numbers into more than one drawing by marking a multi-draw number from two to ten. According to the playslip, players could win in the following ways:
When a player chose four numbers and bought a ticket, the ticket machine automatically chose two more sets of four numbers. If either of them matched all four of the numbers drawn, the player won $25,000. The odds of this were 1 to 1,960,613. The ticket machine also chose four more sets of four numbers. If any of them matched all four of the numbers drawn, the player won $10,000. The odds of this were 1 to 980,306. If any of these seven sets of four numbers (the player's own set plus the six chosen by the machine) matched three of those drawn, the player won $300. If two were matched, the player won $10. The overall odds of winning for each game played were 1 in 20.
Probability of Winning or Losing
The probabilities of the preceding events occurring are calculated as follows. The probability of selecting all four numbers correctly is 1/C(100,4) times [C(4,4) times C(96,0)] which is 1/3,921,225 which is approximately .000000255. This probability times two is the probability of winning $25,000 by matching all four numbers drawn by one of the two sets of four numbers selected by the ticket machine. This probability times four is the probability of winning $10,000 by matching all four numbers drawn by one of the four sets of four numbers selected by the ticket machine.
The probability of selecting three correctly is 1/C(100,4) times [C(4,3) times C(96,1)] times 7, since there are seven sets of four numbers. This is approximately 1/1,459, or about .0006854. Finally, the probability of selecting two correctly is 1/C(100,4) times [C(4,2) times C(96,2)] times 7, which is approximately 1/21, or about .04762. The sum of these probabilities is .048307, which is approximately 1/20, the probability of winning anything.
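The hypergeometric calculations above can be checked with a short Python sketch. The names `match_probability`, `SETS`, and `p_any` are illustrative and do not come from the original analysis. The multiplication by seven follows the text's shortcut rather than an exact treatment of overlapping sets, and the way the final sum is assembled is one reading of the text. Because the text quotes rounded fractions such as 1/21, the exact values printed here can differ from them slightly.

```python
from math import comb

TOTAL_BALLS = 100  # balls numbered 0 to 99
DRAWN = 4          # balls drawn in each drawing
SETS = 7           # sets of four numbers on a ticket, per the text


def match_probability(k: int) -> float:
    """Probability that a single set of four numbers matches exactly k drawn numbers."""
    ways = comb(DRAWN, k) * comb(TOTAL_BALLS - DRAWN, DRAWN - k)
    return ways / comb(TOTAL_BALLS, DRAWN)


p_match4 = match_probability(4)         # one set matches all four numbers
p_match3 = SETS * match_probability(3)  # the text's multiply-by-seven shortcut
p_match2 = SETS * match_probability(2)

# One reading of the text's sum: the two-set and four-set prizes
# plus the match-three and match-two prizes.
p_any = 2 * p_match4 + 4 * p_match4 + p_match3 + p_match2

print(f"C(100,4)            = {comb(TOTAL_BALLS, DRAWN):,}")
print(f"match 4 (one set)   = {p_match4:.9f}")
print(f"match 3 (any of 7)  = {p_match3:.7f}")
print(f"match 2 (any of 7)  = {p_match2:.5f}")
print(f"sum                 = {p_any:.6f}")
```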
Odds Versus Probability
The playslips state the "odds" of winning. However, as we have seen above, the numbers actually printed on the playslips are the probabilities of winning. Odds and probabilities are usually quite different. If p is the probability of winning an event, then 1 - p is the probability of losing that event. The odds of winning that event are the probability of winning divided by the probability of losing, or p/(1 - p). Suppose the probability of winning an event were 1/3. Then the probability of losing the event would be 2/3, so the odds of winning that event would be (1/3)/(2/3) = 1/2, which is quite different from the probability of winning. Fortunately, when the probability of winning an event is very small, the probability and the odds are very nearly the same number. In the case of the Texas Million Lottery, we have the following:
ODDS VERSUS PROBABILITIES
| Event | Odds | Probability | Difference |
| Matching 4 | 0.000000255022408 | 0.000000255022343 | 0.000000000000065 |
| Matching 3 | 0.000685871066509 | 0.000685400969815 | 0.000000470096694 |
| Matching 2 | 0.050000000000000 | 0.047619050000000 | 0.002380950000000 |
| Matching 4 (two-set) | 0.000000510044832 | 0.000000510044572 | 0.000000000000260 |
| Matching 4 (four-set) | 0.000001020090639 | 0.000001020089599 | 0.000000000001041 |
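The conversion from probability to odds described above, p/(1 - p), can be sketched in a few lines of Python. The function name `odds_from_probability` is hypothetical, and the probabilities are the rounded fractions quoted in the text, so the printed digits are only as accurate as those fractions.

```python
def odds_from_probability(p: float) -> float:
    """Return the odds of winning, p / (1 - p), for a win probability p."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Probabilities as quoted in the text (rounded fractions).
quoted = {
    "Matching 4": 1 / 3921225,
    "Matching 3": 1 / 1459,
    "Matching 2": 1 / 21,
    "Matching 4 (two-set)": 1 / 1960613,
    "Matching 4 (four-set)": 1 / 980306,
}

for event, p in quoted.items():
    odds = odds_from_probability(p)
    print(f"{event:<22} odds={odds:.15f} prob={p:.15f} diff={odds - p:.15f}")
```

The gap between odds and probability is p²/(1 - p), which is why it only becomes visible for the most likely event, matching two.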
The difference column would seem to indicate that there would be no problem using the terms odds and probabilities interchangeably when discussing the Texas Million Lottery, with the possible exception of matching two of the four.
Randomness of the Lottery
The most important property of any lottery is that the numbers be chosen randomly. In order to test the Lotto numbers, the following measures were used: frequency of the numbers chosen, the mean, standard deviation and the Chi square test.
Frequency of Numbers Chosen
Theoretically, the probability P(x) that any given number x will be one of the four drawn is:

$$P(x) = \frac{C(1,1)\,C(99,3)}{C(100,4)} = \frac{4}{100} = .04$$

which is the hypergeometric probability formula. So the number of times we expect x to occur in n drawings is n times .04. Since there were 155 drawings, each number x should have occurred 155 multiplied by .04, or 6.2, times. Compare this theoretical frequency with the actual frequencies given in the following table:
FREQUENCY OF OCCURRENCE OF THE
| NUMBER | TIMES | NUMBER | TIMES | NUMBER | TIMES |
| 0 | 9 | 33 | 4 | 66 | 5 |
| 1 | 4 | 34 | 9 | 67 | 6 |
| 2 | 3 | 35 | 6 | 68 | 10 |
| 3 | 9 | 36 | 6 | 69 | 7 |
| 4 | 8 | 37 | 3 | 70 | 8 |
| 5 | 6 | 38 | 6 | 71 | 3 |
| 6 | 3 | 39 | 9 | 72 | 9 |
| 7 | 4 | 40 | 7 | 73 | 4 |
| 8 | 7 | 41 | 7 | 74 | 7 |
| 9 | 5 | 42 | 7 | 75 | 5 |
| 10 | 6 | 43 | 5 | 76 | 8 |
| 11 | 4 | 44 | 9 | 77 | 8 |
| 12 | 9 | 45 | 4 | 78 | 5 |
| 13 | 5 | 46 | 6 | 79 | 9 |
| 14 | 6 | 47 | 8 | 80 | 6 |
| 15 | 9 | 48 | 13 | 81 | 3 |
| 16 | 7 | 49 | 5 | 82 | 6 |
| 17 | 5 | 50 | 7 | 83 | 6 |
| 18 | 13 | 51 | 2 | 84 | 5 |
| 19 | 5 | 52 | 7 | 85 | 5 |
| 20 | 6 | 53 | 9 | 86 | 11 |
| 21 | 1 | 54 | 3 | 87 | 4 |
| 22 | 7 | 55 | 9 | 88 | 4 |
| 23 | 7 | 56 | 6 | 89 | 7 |
| 24 | 3 | 57 | 6 | 90 | 3 |
| 25 | 8 | 58 | 4 | 91 | 2 |
| 26 | 12 | 59 | 2 | 92 | 4 |
| 27 | 4 | 60 | 7 | 93 | 7 |
| 28 | 7 | 61 | 6 | 94 | 5 |
| 29 | 7 | 62 | 12 | 95 | 7 |
| 30 | 8 | 63 | 3 | 96 | 6 |
| 31 | 7 | 64 | 4 | 97 | 8 |
| 32 | 5 | 65 | 5 | 98 | 4 |
| | | | | 99 | 8 |
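As a rough check on the theoretical frequency, the sketch below computes the hypergeometric probability that a given number is drawn and the expected count over the 155 drawings. It then compares that expectation with a handful of counts copied from the table above. The variable names are illustrative, and the sample of four numbers is an arbitrary choice of the most and least frequent entries, not a statistical test.

```python
from math import comb

DRAWINGS = 155  # number of drawings reported in the text

# Hypergeometric probability that a given ball x is among the four drawn:
# x itself (1 way) together with any 3 of the other 99 balls.
p_x = comb(1, 1) * comb(99, 3) / comb(100, 4)
expected_count = DRAWINGS * p_x

# A few observed counts copied from the table above (number -> times drawn).
observed_sample = {18: 13, 48: 13, 21: 1, 59: 2}

print(f"P(x) = {p_x:.2f}, expected count = {expected_count:.1f}")
for number, count in observed_sample.items():
    deviation = count - expected_count
    print(f"number {number:2d}: observed {count:2d}, deviation {deviation:+.1f}")
```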
Mean, Standard Deviation and Distribution of Numbers Chosen
If the machine chose the numbers randomly, the average number chosen from the numbers 0 to 99 should have been 49.5 and the standard deviation should have been about 29 (more precisely, 28.87). The actual average number chosen by the Texas Million Lottery machine over the 155 drawings was 48.76 and the actual standard deviation was 28.52.
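The theoretical mean and standard deviation can be reproduced with Python's standard `statistics` module, assuming each drawn number is equally likely to be any of 0 through 99. The observed values are the ones reported in the text. The raw list of drawn numbers is not reproduced here, so they cannot be recomputed from this sketch.

```python
from statistics import mean, pstdev

numbers = range(100)  # the 100 equally likely values, 0 through 99

theoretical_mean = mean(numbers)  # average of 0..99
theoretical_sd = pstdev(numbers)  # population standard deviation of 0..99

# Observed values as reported in the text for the 155 drawings.
observed_mean = 48.76
observed_sd = 28.52

print(f"mean: theoretical {theoretical_mean:.2f}, observed {observed_mean:.2f}")
print(f"sd:   theoretical {theoretical_sd:.2f}, observed {observed_sd:.2f}")
```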
For the Chi-square test, the 100 Texas Million numbers were grouped into 10 intervals containing ten numbers each as follows: 0, 1, . . . , 9 and 10, 11, . . . , 19 and so on up to 90, 91, . . . , 99. The following table shows the total number of times the ten numbers occurred in their respective intervals:
TEXAS MILLION LOTTERY
| INTERVAL | NUMBER OF NUMBERS |
| 0 to 9 | 58 |
| 10 to 19 | 69 |
| 20 to 29 | 62 |
| 30 to 39 | 63 |
| 40 to 49 | 71 |
| 50 to 59 | 55 |
| 60 to 69 | 65 |
| 70 to 79 | 66 |
| 80 to 89 | 57 |
| 90 to 99 | 54 |
Because 620 numbers were chosen and there are ten intervals, the average number of numbers in each interval is 620/10 = 62. Note that this number is also ten times the expected occurrence of each number found earlier, (4/100) times 155. The Chi-square test can now be run on the data in the intervals for the 155 drawings as follows:
$$
\begin{aligned}
\chi^2 &= \frac{(58-62)^2}{62} + \frac{(69-62)^2}{62} + \frac{(62-62)^2}{62} + \frac{(63-62)^2}{62} \\
&\quad + \frac{(71-62)^2}{62} + \frac{(55-62)^2}{62} + \frac{(65-62)^2}{62} + \frac{(66-62)^2}{62} \\
&\quad + \frac{(57-62)^2}{62} + \frac{(54-62)^2}{62} \\
&= 5
\end{aligned}
$$
According to a table of critical values of Chi square [1], the Chi-square value for the nine degrees of freedom given by ten intervals needs to be at least 14.7 to indicate non-randomness with a probability of at least .9, so it cannot be concluded that the number selections were non-random with an error of 10% or less for the 155 drawings of the Texas Million Lottery.
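The Chi-square statistic can be recomputed directly from the interval counts with a minimal Python sketch. The variable names are illustrative. The critical value 14.7 is the one quoted in the text, and using nine degrees of freedom (ten intervals minus one) is an assumption of this sketch, since the text does not state it.

```python
# Interval counts from the table above, for 0-9, 10-19, ..., 90-99.
observed = [58, 69, 62, 63, 71, 55, 65, 66, 57, 54]

total = sum(observed)             # total numbers drawn over all drawings
expected = total / len(observed)  # equal expected count per interval

# Pearson's Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((count - expected) ** 2 / expected for count in observed)

CRITICAL_VALUE = 14.7  # quoted in the text; assumed 9 degrees of freedom, 10% level

print(f"total = {total}, expected per interval = {expected:.0f}")
print(f"chi-square = {chi_square:.2f}")
if chi_square < CRITICAL_VALUE:
    print("No evidence of non-randomness at the 10% level.")
else:
    print("Evidence of non-randomness at the 10% level.")
```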
Mathematical Expectation
The Texas Million prizes are awarded as follows:
Matching all four numbers drawn by the numbers chosen by the player - $1,000,000.
Matching all four numbers drawn by either set of four numbers chosen by the ticket machine in the two-set - $25,000.
Matching all four numbers drawn by any of the sets of four numbers chosen by the ticket machine in the four-set - $10,000.
Matching three of the numbers drawn in any of the seven sets of four numbers - $300.
Matching two of the numbers drawn in any of the seven sets of four numbers - $10.
The mathematical expectation for an event is the product of the probability of that event and the value of the prize for winning the event. For the Texas Million Lottery mathematical expectation, we have the following table:
MATHEMATICAL EXPECTATION
| Probability | Prize | Product |
| 1/3921225 | 1000000 | .255 |
| 1/1960613 | 25000 | .01275 |
| 1/980306 | 10000 | .0102 |
| 1/1459 | 300 | .2056 |
| 1/21 | 10 | .4762 |
The sum of these expectations is .95975. Thus, a player can expect to win about 96 cents for every two dollars spent on a ticket.
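Recomputing the table from the stated probabilities and prizes is a simple check on the hand arithmetic. The sketch below does this in Python. The labels and variable names are illustrative, and the two-dollar ticket price is taken from the sentence above rather than from any official source.

```python
# (label, probability, prize in dollars), taken from the table above.
PRIZE_TABLE = [
    ("player's own four numbers", 1 / 3921225, 1_000_000),
    ("two-set match of four", 1 / 1960613, 25_000),
    ("four-set match of four", 1 / 980306, 10_000),
    ("match three", 1 / 1459, 300),
    ("match two", 1 / 21, 10),
]

TICKET_PRICE = 2.00  # dollars; from "every two dollars spent on a ticket"

# Expectation of each prize is probability times prize value.
products = [(label, p * prize) for label, p, prize in PRIZE_TABLE]
expectation = sum(product for _, product in products)
return_per_dollar = expectation / TICKET_PRICE

for label, product in products:
    print(f"{label:<26} {product:.5f}")
print(f"{'total expectation':<26} {expectation:.5f}")
print(f"{'return per dollar':<26} {return_per_dollar:.2f}")
```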
Conclusions:
It was interesting to keep track of the behavior of the numbers for the Texas Million Lottery over its lifetime of 155 drawings. There was no indication of non-randomness in the number selection process, and players could expect to win about 48 cents on the dollar.
1. Mendenhall, William and Beaver, Robert J. Introduction to Probability and Statistics. PWS-Kent Publishing Co., Boston, 1991, pp 670-671.
2. Lamb, Jr., John F., Huffstutler, Ron, Brock, Archie and Aslan, Farhad (Bill). "A Statistical Analysis of the Texas Lottery," Texas Mathematics Teacher, Vol. XLI (1) January, 1994.