By: Ethan Allavarpu and Kyle Boal
Introduction
A coin toss is just a flip of the coin, but in recent years, especially during the NFL playoffs, it has seemingly gained heightened importance. A prime example of this is the coin toss of the Super Bowl: in the early years, only the select few captains and a single referee met at midfield in a very quick exchange. Today, however, things are drastically different, as a small village of camera crews and important individuals accompany the captains to film the result of the coin toss. Moreover, mantras about which option to choose ("tails never fails") have emerged, the coin is specially engraved for the occasion, and Las Vegas sportsbooks create a prop bet on whether the result of the coin toss will be heads or tails, indicating the grandiosity of what should be an insignificant event.
These developments, in turn, have led to more talk about "coin toss strategy," if such a thing even exists. We wanted to see if there is any validity to this talk, or if people were just blowing smoke. The coin toss doesn't seem like the type of event to make or break a game, so we wondered why people pay so much attention to its outcome and their team's decision. For the purpose of this article, we have limited the games we investigate to playoff games from the 2002 through 2019 NFL seasons to impose a somewhat level playing field (i.e. there are no 0-15 teams facing a 13-2 team).
Part of the reason we chose to compare the 2002-2006 era to 2015-2019 is that coin toss data was not easy to find for earlier seasons. In addition, deferring seems to have become the prevalent choice in recent years, but was this the case 18 years ago?
NFL Playoff Coin Tosses
As shown by the graph above, when comparing the results of the coin toss decision (defer vs. receive) between two different eras (2002-2006 and 2015-2019) for the NFL Playoffs, the percent of decisions resulting in a deferral drastically increased, from around 5% to around 85%. In fact, when performing a test for difference of proportions with a null hypothesis of no difference and an alternative that there is a difference (either positive or negative), we obtained a p-value small enough to reject the null hypothesis. The percentage of coin tosses that resulted in a deferral decision increased by a statistically significant margin from the 2002-2006 era to the 2015-2019 era, as we expected.
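For readers who want to try a test like this themselves, here is a minimal sketch of a two-sided, pooled two-proportion z-test using only the Python standard library. The function name and the counts in the usage lines are our own placeholders: the counts were chosen only to land near the roughly 5% and 85% deferral rates quoted above and are not the actual tallies behind the graph.

```python
from math import erfc, sqrt


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Two-sided pooled z-test for a difference between two proportions.

    Returns (z statistic, p-value). Relies on the normal approximation,
    so it is less trustworthy when either group is very small.
    """
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("pooled rate is 0 or 1; the z-test is undefined")
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Placeholder counts (assumed, not the article's data): deferrals out of tosses.
z, p = two_proportion_z_test(3, 55, 47, 55)
print(f"z = {z:.2f}, p-value = {p:.3g}")
```

A negative z here simply means the first era's deferral rate is lower than the second's; the two-sided p-value ignores the sign.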
Going back to the 2002 playoff games, we calculated the overall deferral percentage in the playoffs for each year and compared it to the win rate of the teams that deferred to understand if there was a correlation between deferring after winning the coin toss and winning the game. On the x-axis we plotted the year, up to the 2019 season, and on the y-axis the rate as a proportion.
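The per-year calculation described above can be sketched in a few lines of Python. The record layout (season, decision, whether the toss winner won the game) and the sample rows are assumptions made for illustration; they are not the actual data set.

```python
from collections import defaultdict


def summarize_by_season(games):
    """Return {season: (deferral_rate, deferrer_win_rate)}.

    `games` is an iterable of (season, decision, toss_winner_won) tuples,
    where decision is "defer" or "receive". The win rate is None for a
    season in which no toss winner deferred.
    """
    tallies = defaultdict(lambda: {"games": 0, "defers": 0, "defer_wins": 0})
    for season, decision, toss_winner_won in games:
        tally = tallies[season]
        tally["games"] += 1
        if decision == "defer":
            tally["defers"] += 1
            tally["defer_wins"] += int(toss_winner_won)

    summary = {}
    for season in sorted(tallies):
        tally = tallies[season]
        deferral_rate = tally["defers"] / tally["games"]
        win_rate = tally["defer_wins"] / tally["defers"] if tally["defers"] else None
        summary[season] = (deferral_rate, win_rate)
    return summary


# Made-up rows purely to show the input shape (not real game results).
sample_games = [
    (2002, "receive", True),
    (2002, "defer", False),
    (2019, "defer", True),
    (2019, "defer", False),
]
for season, (deferral_rate, win_rate) in summarize_by_season(sample_games).items():
    print(season, deferral_rate, win_rate)
```

Swapping "defer" for "receive" in the filter gives the matching receive-side numbers.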
In terms of deferral percentage, the data plotted clearly suggest an upward trend in recent years. In fact, to our surprise, whereas up until 2006 no team deferred more than three times, in recent years no team has received more than three times. Perhaps misleadingly, the win percentage for those early years is either at 1.0 or 0.0, as there were too few deferrals to draw from, thus polarizing the figures. Interestingly, as the deferral percentage among teams rises, the win percentage caps at roughly 0.5. In the early years of the deferral revolution (2010-2012), winning percentages see a significant increase, perhaps suggesting that deferring contributed to a team's overall success in the game. However, as the trend takes off from 2015 onward, nearly every team defers if they win the toss. Consequently, the win percentage tumbles before leveling off around 0.5.
This is because teams today no longer have an advantage when every single team is doing the same thing.
To check this belief, we plotted the opposite trend: going back to the 2002 playoff games, we calculated the overall receive percentage in the playoffs for each year and compared it to the win rate of the teams that received. While the receive percentage is simply the complement of the deferral percentage plot (blue), the receive win percentage draws on a whole new sample.
Assuming the previous hypothesis is correct, we would expect that from 2002-2006, while the receive percentage is at its highest, the win percentage should sit around 0.5. From 2010-2012, the percentage should be lower as teams transition into a new meta of deferring. Finally, from 2015 onward, it should jump between the two extremes of 0.0 and 1.0, given the small sample of teams receiving.
Supporting the hypothesis, the 2002-2006 stretch corroborates the idea that when all teams are doing the same thing, in this case receiving, the win percentage is roughly even. The year 2010 seemingly marks a major turning point for the league. Recall that in 2010, 60% of teams that won the toss deferred, and those teams won 40% of their games. However, the teams that received (the remaining 40%) won 0% of their games, the only time that happened in the 18-year window. Additionally, skipping to 2015, when nearly all teams begin to defer, only a small group goes against the grain by receiving. Just like the defer group from 2002-2006, the teams that receive from 2015 onward are few, and they have an average win rate of 0.8.
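To get a feel for how little a win rate from a handful of games pins down, one option is a Wilson score interval. This is not something we computed for the plots; the sketch below is an illustration, and the 4-of-5 record in the usage lines is a hypothetical stand-in for a small group with a 0.8 win rate.

```python
from math import sqrt


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win proportion.

    z=1.96 corresponds to roughly 95% coverage under the normal approximation.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p_hat = wins / games
    z_sq = z * z
    denom = 1 + z_sq / games
    center = (p_hat + z_sq / (2 * games)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / games + z_sq / (4 * games * games)) / denom
    return center - half_width, center + half_width


# Hypothetical record: 4 wins in 5 games (same 0.8 rate, tiny sample).
low, high = wilson_interval(4, 5)
print(f"4 of 5:   ({low:.2f}, {high:.2f})")

# Same rate with ten times the games, for comparison.
low, high = wilson_interval(40, 50)
print(f"40 of 50: ({low:.2f}, {high:.2f})")
```

The same 0.8 rate yields a much wider interval from five games than from fifty, which is why the yearly win rates for the unpopular choice swing so wildly.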
By this point we've shown that from 2002-2006, NFL teams were likely to receive should they win the toss. Moreover, from 2015 onward, the reverse is true: NFL teams are more likely to defer than to receive. We decided to plot the deferral rate as a proportion over time by round. We expected that as the games got "bigger," the teams would become more likely to make whichever choice, receive or defer, was in style at the time. In other words, from 2015 onward, we assumed that the deferral rate should increase from the Wild Card round to the Super Bowl, not only because the games are more important but also because the sample size shrinks with each round. On the x-axis we plotted the year and on the y-axis the deferral proportion.
As expected, in those two major time periods, the trend holds: for all ten of those years, in the conference championships and the Super Bowl, teams picked the popular choice at the time (30 out of 30 times). Additionally, for the last decade, the team that won the toss at the Super Bowl has chosen to defer.
Ultimately, we wanted to see why teams switched from receiving the football to deferring to the second half upon winning the coin toss. Was it because that strategy was "better"? Bill Belichick is renowned for deferring to the second half, as it provides his team with the opportunity to "double up" just before and just after halftime: if his team can score at the end of the first half, then, by receiving the ball to start the second half, they have the chance to score again. This gives the team a chance at a big momentum swing of up to two possessions (i.e. sixteen points). Given this strategy, it would appear that deferring is the better decision and that the teams which defer would be expected to have a higher win percentage than the teams which receive after winning the coin toss.
The above bar graph displays the win percentages for teams that won the coin toss, separated by their decisions to (1) defer and (2) receive, for all years in the data set (2002-2019). As depicted, there was no significant difference in win percentage between teams that deferred and teams that received on the whole, indicating that the decision to defer or receive does not, by itself, help a team in the playoffs. However, as some of the graphics displayed earlier in the article convey, this percentage could vary depending on the year and whether deferring was considered "popular". Looking further at the data, it is interesting to note that the win percentages hover around 0.45 for teams that won the coin toss regardless of decision, leading us to wonder whether teams that won the coin toss in the playoffs had a statistically lower win percentage than teams that lost the coin toss.
When comparing the win percentages for teams that won the coin toss against teams that lost it, we performed a hypothesis test for difference of proportions to see whether the observed difference was statistically significant. Since the p-value for this two-sided test was 0.0562, at the 0.05 significance level we fail to reject the null hypothesis that the percentages are the same. What this means is that, if there really were no difference in the population, the likelihood of observing a difference in win percentage at least this large in a sample would be 5.62%; since our cutoff is 5%, we cannot reject the statement that the win percentage is the same for these two categories. However, this p-value is still pretty low (almost significant), so we may want to investigate this issue further. While winning or losing the coin toss will not directly determine whether or not a team wins the game, there could be confounding variables inherent to the data and to playoff games that could impact the relationship between the coin toss outcome and win percentage. One such example would be overthinking: the team which won the coin toss could overcomplicate the decision and ultimately make a poor one. Another potential confounding variable would be the peril of herd mentality: as more and more teams opt to defer, they may be doing so for the sole reason that other teams are making the same choice.
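The idea of "a result at least this extreme if the toss truly does not matter" can also be computed exactly rather than approximated. The sketch below is a different technique from the difference-of-proportions test we ran: it treats each game as a fair 50/50 trial for the toss winner and sums the binomial probabilities of every outcome at least as lopsided as the observed one. The function name and the 40-of-100 record are hypothetical and chosen only for illustration.

```python
from math import comb


def exact_two_sided_p(toss_winner_wins, games):
    """Exact two-sided binomial p-value against a fair 50/50 null.

    Counts every possible win total that sits at least as far from an
    even split as the observed total, then divides by all 2**games
    equally likely outcomes.
    """
    if not 0 <= toss_winner_wins <= games:
        raise ValueError("wins must be between 0 and games")
    # Integer distance from an even split avoids floating-point comparisons.
    observed_gap = abs(2 * toss_winner_wins - games)
    extreme_outcomes = sum(
        comb(games, k)
        for k in range(games + 1)
        if abs(2 * k - games) >= observed_gap
    )
    return extreme_outcomes / 2 ** games


# Hypothetical record: toss winners win 40 of 100 games (not the real tally).
p_value = exact_two_sided_p(40, 100)
print(f"p-value = {p_value:.4f}")
print("reject the null" if p_value < 0.05 else "fail to reject the null")
```

Because every game has exactly one toss winner and one toss loser, the two win percentages always sum to 1, so testing whether toss winners win half their games asks the same question as testing whether the two percentages differ.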
Conclusion
Teams have increasingly analyzed the coin toss and its effect on the game, in turn flipping the script from almost every team receiving the ball to almost every team choosing to get the ball in the second half. This change has become so prevalent that many football fans are outraged when their team doesn't defer after winning the coin toss. However, blindly choosing to defer has its own pitfalls: teams are not thinking about their decision at all but merely going with the flow, which could have negative consequences if they are not prepared. Just like football, making a decision from the coin toss is dynamic and adaptive: each opponent is different, each team is different, and each game is different. Making the decision just because everyone else does it has perils of its own, and when looking at the data, the teams that generally saw an increase in their success were the ones that began this deferral trend and deferred when all anyone considered was receiving the ball and having the first possession. Maybe with an increasing number of teams choosing to defer, the right decision is to go against the grain and choose to receive.
But then again, it's all just the flip of a coin.
Sources: ESPN, Pro Football Reference