Sometimes when I watch hockey on television, the broadcast will display a stat that makes me cringe. One of my (least) favourites is a stat like the one displayed just under the score in the screenshot below:
Most of us have noticed these stats on broadcasts before. I imagine they are common because they match the game state (i.e. the Leafs are leading after the first period), so broadcasters probably believe we find them insightful. However, we are all smart enough to understand that teams should theoretically have a better record in games that saw them outscore their opponents in the first period. In this case, whatever amount of insight the broadcasters believe they are providing us with is merely an illusion. Perhaps they also saw value in the fact that the Leafs were undefeated in those 13 games, but that is not what I want to focus on today.
More generally, my primary objective for this post is to shed light on the context behind this type of stat, mostly because broadcasts rarely provide it for us. Ultimately, I will examine 11 seasons' worth of data to understand how the outcome of a specific period affects the number of standings points a team should expect to earn in that game. Yes, this means there will be binning*. And yes, I acknowledge that binning is almost always an inappropriate approach in any meaningful statistical analysis. The catch here is that broadcasters continue to display these binned stats without any context, and I believe it is important to understand the context of a stat we see on television many times each season.
* Binning is essentially dividing a continuous variable into subgroups of arbitrary size called “bins.” In this case, we are dividing a 60-minute hockey game into three 20-minute periods.
A particular team wins a period by scoring more goals than their opponent. I looked at which teams won, lost, or tied each period by running some Python code through a data set provided by moneypuck.com. The data includes 13057 regular season games between the 2007-2008 and 2017-2018 seasons, inclusive. (Full disclosure: I’m pretty sure four games are missing here. My attempts to figure out why were unsuccessful, but I went ahead with this article because the rest of my code is correct, and 4 games out of over 13K is virtually insignificant anyways). The table below displays our sample sizes over those eleven seasons:
Remember that when the home team loses, the away team wins, so the table with our results will be twice as large as the table above. I split the data into home and away teams because of home-ice advantage; home teams win more games than the visitors, which suggests that home teams win specific periods more often too. We can see this is true in the table shown above. In period 1, for example, the home team won 4585 times and lost only 3822 times. The remaining 4650 games saw first periods that ended in ties.
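To make the counting step concrete, here is a minimal sketch of how period outcomes could be classified and tallied. It is not the code used for this article: the `(home_goals, away_goals)` tuples and the function names are hypothetical, and the sample scores are made up.

```python
from collections import Counter

def period_outcome(goals_for: int, goals_against: int) -> str:
    """Classify a single period from one team's point of view."""
    if goals_for > goals_against:
        return "win"
    if goals_for < goals_against:
        return "loss"
    return "tie"

def count_home_outcomes(period_scores):
    """Tally home-team outcomes from (home_goals, away_goals) pairs."""
    return Counter(period_outcome(h, a) for h, a in period_scores)

# Made-up first-period scores for four games: (home goals, away goals).
sample = [(2, 0), (1, 1), (0, 1), (3, 2)]
home_counts = count_home_outcomes(sample)

# The away tally is the mirror image: home wins become away losses.
away_counts = Counter({
    "win": home_counts["loss"],
    "loss": home_counts["win"],
    "tie": home_counts["tie"],
})

# Sanity check on the period 1 counts quoted above.
assert 4585 + 3822 + 4650 == 13057
```

The last line simply confirms that the three period 1 counts add up to the full sample of games.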
We want to know the average number of standings points the home team earned in games after winning, tying, or losing period 1. This will give us three values: one average for each outcome of the first period. We also want to find the same information for the away team, giving us a total of six different values for period 1. (This step is not redundant because of the “Pity Point” system, which awards one point to the losing team if they lost in overtime or the shootout. The implication is that some games result in two standings points but others end in three, so knowing which team won the game still does not tell us exactly how many points the losing team earned). Repeating this process for periods 2 and 3 brings our total to 18 different values. The results are shown below:
The first entry in the table (i.e. the top left cell) tells us that when home teams win period 1, they end up earning an average of 1.65 points in the standings. We saw earlier that the home team has won the first period 4585 times, and now we know that they typically earn 1.65 points in the standings from those specific games. But if we ignore the outcome of each period, and focus instead on the outcomes of all 13057 games in our sample, we find that the average team earns 1.21 points in the standings when playing at home. (This number is from the sentence below the table; the two values there suggest the average NHL team finishes an 82-game season with around 91.43 points, which makes sense). So, we know that home teams earn an average of 1.21 points in general, but if they win the first period they typically earn 1.65 points. In other words, they jumped from an expected points percentage of 60.5% to 82.5%. That is a significant increase.
However, in those 4585 games, the away team lost the first period because they were outscored by the home team. It is safe to say that the away team experienced a similar change, but in the opposite direction. Indeed, their expected gain decreased from 1.02 points (a general away game) to 0.54 points (the condition of losing period 1 on the road). Every time your favourite team is playing a road game and loses period 1, they are on track to earn 0.48 fewer standings points than when the game started; that is equivalent to dropping from a points percentage of 51% to 27%. Losing period 1 on the road is quite damaging, indeed.
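The arithmetic behind these comparisons is easy to check. The sketch below uses only the averages quoted above; the 41 home and 41 away games per season is my assumption about how an 82-game schedule splits, and the variable names are purely illustrative.

```python
MAX_POINTS_PER_GAME = 2

def points_percentage(avg_points: float) -> float:
    """Share of the available standings points a team earns on average."""
    return avg_points / MAX_POINTS_PER_GAME

# Averages quoted in the text.
home_baseline, home_after_p1_win = 1.21, 1.65
away_baseline, away_after_p1_loss = 1.02, 0.54

# Change in expected points once the first period is known.
home_shift = home_after_p1_win - home_baseline    # positive: good news at home
away_shift = away_after_p1_loss - away_baseline   # negative: bad news on the road

# Assumed schedule: 41 home games and 41 away games in an 82-game season.
season_points = 41 * home_baseline + 41 * away_baseline

print(f"home: {points_percentage(home_baseline):.1%} -> {points_percentage(home_after_p1_win):.1%}")
print(f"away: {points_percentage(away_baseline):.1%} -> {points_percentage(away_after_p1_loss):.1%}")
print(f"average season: {season_points:.2f} points")
```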
Another point of interest in these results, albeit an unsurprising one, is the presence of home-ice advantage in all scenarios. Regardless of how a specific period unfolds, the home team is always better off than the away team would be in the same situation.
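For anyone who wants to reproduce a results table like this one, the averaging step could look roughly like the sketch below. This is an illustration under assumptions, not my actual code: each game is a hypothetical tuple of first-period goals plus two flags, and the four sample games are invented. The important detail is the "Pity Point", which is why the loser's points have to be computed separately rather than inferred from the winner.

```python
from collections import defaultdict

def standings_points(won: bool, went_past_regulation: bool) -> int:
    """2 for any win, 1 for an overtime or shootout loss, 0 for a regulation loss."""
    if won:
        return 2
    return 1 if went_past_regulation else 0

def average_points_by_outcome(games):
    """Mean standings points keyed by (venue, period 1 outcome).

    Each game is a hypothetical tuple:
    (home_p1_goals, away_p1_goals, home_won, went_past_regulation)
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [points sum, game count]
    for home_p1, away_p1, home_won, extra in games:
        if home_p1 > away_p1:
            home_result, away_result = "win", "loss"
        elif home_p1 < away_p1:
            home_result, away_result = "loss", "win"
        else:
            home_result = away_result = "tie"
        sides = (("home", home_result, home_won), ("away", away_result, not home_won))
        for venue, result, won in sides:
            totals[(venue, result)][0] += standings_points(won, extra)
            totals[(venue, result)][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Invented games, for illustration only.
games = [
    (2, 0, True, False),   # home wins period 1, wins in regulation
    (1, 0, False, True),   # home wins period 1, loses in overtime
    (0, 1, False, False),  # home loses period 1, loses in regulation
    (1, 1, True, True),    # period 1 tied, home wins in a shootout
]
averages = average_points_by_outcome(games)
```

Running the same function on period 2 and period 3 scores would fill in the remaining twelve cells.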
I also illustrated these results in Tableau for those of you who are visual learners. The data is exactly the same as in the results table, but now it’s illustrated relative to the appropriate benchmark (1.21 points for home teams and 1.02 points for away teams).
Now, let’s reconsider the original stat for a moment. We know that when the Leafs won the first period, they won all 13 of those games. Clearly, they earned 26 points in the standings from those games alone. How many points would the average team have earned under the same conditions? While the broadcast did not specify which games were home or away, let’s assume just for fun that 7 of them were at home, and 6 were on the road. So, if the average team played 7 home games and 6 away games, and happened to win the first period every time, they would have: 7(1.65) + 6(1.53) = 20.73 standings points. Considering that the Leafs earned 26, we can see they are about 5 points ahead of the average team in this regard. Alternatively, we can be nice and allow our theoretical “average team” to have home-ice advantage in all 13 games. This would bump them up to 13(1.65) = 21.45 points, which is still a fair amount below the Leafs’ 26 points.
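The same comparison can be run for any home/away split. Here is a small sketch using the two averages from the results table for winning period 1 (1.65 at home, 1.53 on the road); the function name and the 7/6 split are just the illustrative choices made above.

```python
def expected_points(home_games: int, away_games: int,
                    home_avg: float = 1.65, away_avg: float = 1.53) -> float:
    """Expected standings points for an average team that wins period 1 every game."""
    return home_games * home_avg + away_games * away_avg

leafs_points = 13 * 2                  # 13 wins, 2 points each

mixed_split = expected_points(7, 6)    # the "just for fun" 7 home / 6 away split
all_home = expected_points(13, 0)      # the generous all-home scenario

print(f"7 home / 6 away: {mixed_split:.2f} (Leafs ahead by {leafs_points - mixed_split:.2f})")
print(f"13 home / 0 away: {all_home:.2f} (Leafs ahead by {leafs_points - all_home:.2f})")
```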
One issue with this approach is that weighted averages like the ones I found do not effectively illustrate the distribution of possible outcomes. All of us know it is impossible to earn precisely 1.65 points in the standings; the outcome is either 0, 1, or 2. An alternative approach involves measuring the likelihood of a team coming away with 2 points, 13 times in a row, given that all 13 games were played at home and that they won the first period every time. We know the average is 13(1.65) = 21.45 standings points, but how likely is a perfect 26? It took a little extra work, but I calculated that the average team would have only a 3.86% chance to earn all 26 points available in those games. (I did this by finding the conditional probability of winning a specific game after winning the first period at home, and then raising that number to the 13th power, which treats the 13 games as independent). Although the probability for the Leafs is a touch lower than this, since there is a good chance a bunch of those 13 games were not played at home, you should not allow such a low probability to shock you; 13 games is a small sample, especially for measuring goals. There is definitely lots of luck mixed in there.
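The streak calculation can be sketched in a couple of lines, assuming the 13 games are independent of one another. The per-game win probability is not quoted in this article, so the sketch works backwards from the 3.86% figure instead of plugging one in; both function names are hypothetical.

```python
def streak_probability(p_win: float, games: int) -> float:
    """Chance of winning `games` in a row, assuming independent games."""
    return p_win ** games

def implied_single_game_probability(streak_prob: float, games: int) -> float:
    """Per-game win probability that would produce the given streak probability."""
    return streak_prob ** (1 / games)

# Back out the per-game probability implied by a 3.86% chance of going 13-for-13.
p_implied = implied_single_game_probability(0.0386, 13)

# Round trip: raising it back to the 13th power recovers the streak probability.
round_trip = streak_probability(p_implied, 13)

print(f"implied per-game win probability: {p_implied:.3f}")
```

The implied value works out to roughly 78% per game, which is a derived figure rather than one taken directly from the data set.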
This brings us back to my original anecdote about cringing whenever I encounter this type of stat. Even if we set aside its fundamental flaw (scoring goals leads to wins, no matter when those goals occur in a game), the stat is virtually meaningless in a small sample. Goals are simply too rare to provide us with much insight in a sample of 13 games. Nevertheless, broadcasters will continue displaying these numbers without context. This article will not change that. So, the next time it happens, you can now compare that team to league average over the past eleven seasons. Even if the stat is not shown on television, all you need to know is the outcome of a specific period to find out how the average team has historically performed under the same condition. At the very least, we have a piece of context that we did not have before.