Friday, February 01, 2008

Does the distribution of Touchdowns scored fit the Poisson Distribution?

For Super Bowl props, I have been using the Poisson distribution to help describe the distribution of touchdowns. For more on applying the Poisson distribution to sports betting, get Sharp Sports Betting by Stanford Wong. Note that he is also my publisher, but believe me, I wouldn't recommend his book unless I thought it was good.

Here is an example of how I used the Poisson distribution. In Las Vegas, one of the sportsbooks had a contest prop with multiple possibilities. It was on the exact number of TD passes that Tom Brady would throw in the Super Bowl. They also had the same prop (with different prices) for Eli Manning.

For both players, I found positive EV in betting that they would throw exactly zero TD passes. I bet Brady to throw zero TD passes at 25 to 1 and Manning to throw zero TD passes at 4 to 1. I used two methods to value these props. The first was a simulation method which simulated the results of drives for the game for each team. I ran the simulation 10,000 times. The second method was using the Poisson distribution. Both methods needed an accurate expected number of TD passes for the two QBs as the main input. Assuming I was accurate on that mean (if I wasn't, all results would be off), I found both 25 to 1 on Brady and 4 to 1 on Manning to be positive-EV bets.
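As a rough sketch of the Poisson side of that valuation: under a Poisson model, the chance of exactly zero TD passes is exp(-mean), so each price implies a break-even mean. The function names below are my own, and the sketch assumes "25 to 1" means winning 25 units per 1 unit staked. It does not use my actual mean estimates, which are not given in this post.

```python
import math


def break_even_probability(odds_to_one: float) -> float:
    """Win probability at which a bet paying odds_to_one to 1 has zero EV."""
    return 1.0 / (odds_to_one + 1.0)


def poisson_zero_probability(mean: float) -> float:
    """P(exactly 0 events) under a Poisson distribution with the given mean."""
    return math.exp(-mean)


def break_even_mean(odds_to_one: float) -> float:
    """Largest Poisson mean at which betting 'exactly zero' is not negative-EV.

    Solves exp(-mean) = 1 / (odds_to_one + 1) for mean.
    """
    return math.log(odds_to_one + 1.0)


def expected_value(odds_to_one: float, win_probability: float) -> float:
    """EV per 1 unit staked: win odds_to_one with probability p, else lose 1."""
    return win_probability * odds_to_one - (1.0 - win_probability)


if __name__ == "__main__":
    for name, odds in (("Brady", 25), ("Manning", 4)):
        print(
            f"{name}: zero TD passes at {odds} to 1 is positive-EV "
            f"only if the expected TD passes are below {break_even_mean(odds):.2f}"
        )
```

In other words, under these assumptions the Brady bet needs an expected number of TD passes below roughly 3.26, and the Manning bet needs one below roughly 1.61. Whether the real means are under those thresholds is exactly the estimate that everything depends on.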

Today I started thinking: does the Poisson distribution really describe touchdowns in the NFL well? Or, probably better phrased: is the distribution of touchdowns in NFL games similar to a Poisson distribution? (Remember, I am not a statistician, just a gambler who tries to use techniques to get better. I apologize if this is not the correct technical way to say it.)

That's a tough question to answer. First I would need to know the true mean of TD passes for the QBs. But with such small sample sizes (just 16 regular season games in a year) and other factors (the quality of the opposing defenses, for example), it is really difficult to pin it down precisely. So instead, I decided to cast a wide net over the NFL, look at all games, and see if the distribution of the number of TDs matched the Poisson distribution. I think it does. Here are the results:

I took all games from 1989 to the end of the regular season of 2007 (including all playoff games except this year's playoff games, as I had not entered them yet). I have the number of rushing TDs, passing TDs and defensive/special teams TDs in all games during that span. I lumped defensive TDs with special teams TDs in one category.

Here are the averages per game for both teams combined. I did not separate them out by individual team.

Rushing TD: 1.61
Passing TD: 2.59
Def/ST TD: 0.44

Next, I added up the number of games with exactly 0 rushing TDs, exactly 1 rushing TD, exactly 2 rushing TDs, and so on, and repeated it for the other two ways to score TDs.

Next, I plugged in the mean for each way to score a TD and had the Poisson distribution spit out the expected percentage of games for each exact number of TDs.
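For anyone who wants to repeat this step, here is a minimal sketch of the Poisson calculation using the three means listed above. The names are my own choices for illustration. Because the means shown in this post are rounded to two decimals, the output can differ from my figures by about a tenth of a percentage point.

```python
import math

# Combined per-game means for both teams, as listed above (rounded).
MEANS = {"Rushing TD": 1.61, "Passing TD": 2.59, "Def/ST TD": 0.44}


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


if __name__ == "__main__":
    for label, mean in MEANS.items():
        print(f"{label} (mean {mean})")
        for k in range(10):
            # Expected percentage of games with exactly k TDs of this type.
            print(f"  {k} {100 * poisson_pmf(k, mean):.1f}%")
```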

These two sets of numbers (the actual share of real games with each exact number of TDs, and the share expected under the Poisson distribution) were very similar. Here are the results. The first number is the exact number of TDs, the second is the percentage of games that actually had that exact number of TDs, and the third is the Poisson distribution's prediction of the percentage of games with that exact number of TDs.

Rushing TDs
TDs Actual Poisson
0 19.6% 20.1%
1 32.6% 32.2%
2 25.8% 25.9%
3 14.6% 13.9%
4 4.8% 5.6%
5 1.8% 1.8%
6 0.5% 0.5%
7 0.1% 0.1%

Passing TDs
TDs Actual Poisson
0 7.9% 7.5%
1 20.3% 19.4%
2 24.0% 25.1%
3 20.4% 21.7%
4 14.7% 14.1%
5 7.7% 7.3%
6 3.2% 3.2%
7 1.2% 1.2%
8 0.3% 0.4%
9 0.2% 0.1%

Defensive and Special Teams TDs
TDs Actual Poisson
0 64.8% 64.3%
1 27.6% 28.4%
2 6.3% 6.3%
3 1.2% 0.9%
4 0.1% 0.1%

As you can see, the percentage numbers match up very closely. As a first pass, my opinion is that the Poisson distribution describes the distribution of TDs quite well, but one needs to be correct with the expected mean numbers. For example, it would be incorrect to assume that Brady's distribution of TD passes is the same as Manning's. Brady has a much higher average. So the crucial step is still estimating the average number of TD passes they will throw.
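To put a number on "very closely", here is a small sketch that finds the largest gap between the actual and Poisson percentages in each of the three tables above. The data is copied straight from those tables, and the function name is my own. This is only a quick eyeball check, not a formal goodness-of-fit test, which would need the raw game counts.

```python
# (actual %, Poisson %) per exact number of TDs, copied from the tables above.
TABLES = {
    "Rushing TDs": (
        [19.6, 32.6, 25.8, 14.6, 4.8, 1.8, 0.5, 0.1],
        [20.1, 32.2, 25.9, 13.9, 5.6, 1.8, 0.5, 0.1],
    ),
    "Passing TDs": (
        [7.9, 20.3, 24.0, 20.4, 14.7, 7.7, 3.2, 1.2, 0.3, 0.2],
        [7.5, 19.4, 25.1, 21.7, 14.1, 7.3, 3.2, 1.2, 0.4, 0.1],
    ),
    "Defensive and Special Teams TDs": (
        [64.8, 27.6, 6.3, 1.2, 0.1],
        [64.3, 28.4, 6.3, 0.9, 0.1],
    ),
}


def largest_gap(actual, predicted):
    """Return (number of TDs, gap in percentage points) for the biggest miss."""
    gaps = [abs(a - p) for a, p in zip(actual, predicted)]
    worst = max(range(len(gaps)), key=gaps.__getitem__)
    return worst, gaps[worst]


if __name__ == "__main__":
    for label, (actual, predicted) in TABLES.items():
        tds, gap = largest_gap(actual, predicted)
        print(f"{label}: largest gap is {gap:.1f} points at exactly {tds} TDs")
```

The biggest miss in any of the three tables is 1.3 percentage points, for games with exactly 3 passing TDs.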