Playoffs

The Stanley Cup Formula: An Investigation Through Machine Learning by Scott Schiffner

By: Owen Kewell

NHL seasons follow a formulaic plotline.

Entering training camp, teams share a common goal: win the Stanley Cup. The gruelling 82-game regular season separates those with legitimate title hopes from those whose rosters are insufficient, leaving only the sixteen most eligible teams. The attrition of playoff hockey gradually whittles down this number until a single champion emerges victorious, battle-tested from the path they took to win hockey’s top prize. Two months off, then we do it all again.

Teams that have won the Stanley Cup share certain traits. Anecdotally, it’s been helpful to have a dominant 1st line centre akin to Sidney Crosby, Jonathan Toews or Anze Kopitar. Elite puck-moving defensemen don’t hurt either, nor does a hot goalie. Delving deeper, though, what do championship teams have in common?

I decided to answer this question systematically with the help of some machine learning.

Some Background on Classification

Classification is a popular branch of supervised machine learning where one attempts to create a model capable of making predictions on new data points. We do this by building up, or ‘training’, the model using historical data, explicitly telling the model whether each past data point achieved the target class that we’re trying to predict. In the context of hockey, this data point could be some number of team statistics produced by the 2015 Chicago Blackhawks. The target here would be whether they won the Stanley Cup, which they did.

Sufficiently robust classification models can identify a number of statistical trends that underpin the phenomenon that they’re observing. The models can then learn from these trends to make reasonably intelligent predictions on the outcome of future data points by comparing them to the data that the classifier has already seen.

Building a Hockey Classifier

We can apply these techniques to hockey. We have the tools to train a model to learn which team statistics are most predictive of playoff success. To do this, we must first decide which stats to include in our dataset. To create the most intelligent classifier, we decided to include as many meaningful team statistics as possible. Here’s what we came up with:

team stats.jpg

It’s worth noting that we engineered the ‘Div Avg Point’ feature by calculating the average number of points contained by all teams in a given team’s division. The remaining statistics were sourced from Corsica and Natural Stat Trick. An explanation of each of these stats can be found on the glossaries for the two websites.
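As a rough illustration of how a feature like ‘Div Avg Point’ could be engineered, here is a minimal sketch. It assumes each team-season is a dictionary with hypothetical `season`, `division`, and `points` keys, and that a team's own points count toward its division average. The rows are toy values, not real standings.

```python
from collections import defaultdict

def add_division_average(rows):
    """Attach a 'div_avg_points' value to each team-season row.

    The average covers every team sharing the row's season and division,
    including the team itself (an assumption for this sketch).
    """
    totals = defaultdict(lambda: [0, 0])  # (season, division) -> [points sum, team count]
    for row in rows:
        key = (row["season"], row["division"])
        totals[key][0] += row["points"]
        totals[key][1] += 1

    enriched = []
    for row in rows:
        points_sum, team_count = totals[(row["season"], row["division"])]
        enriched.append({**row, "div_avg_points": points_sum / team_count})
    return enriched

# Hypothetical toy rows, for illustration only.
rows = [
    {"team": "A", "season": "2016-17", "division": "X", "points": 100},
    {"team": "B", "season": "2016-17", "division": "X", "points": 90},
    {"team": "C", "season": "2016-17", "division": "Y", "points": 80},
]
enriched = add_division_average(rows)
print(enriched[0]["div_avg_points"])  # 95.0
```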

Our dataset included 210 data points: 30 teams per season over the 7 seasons between 2010-11 and 2016-17. Each data point included team name, the above 53 team stats, and a binary variable to indicate whether the team in question won the Cup. Using this data, we trained nine different models to recognize the statistical commonalities between the 7 teams whose seasons ended with a Stanley Cup championship. The best-performing model was a Logistic Regression model trained on even-strength data, and so all further analysis was conducted using this model.
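The arithmetic behind the dataset is worth spelling out, because it shows how rare the target class is. The short sketch below uses only the counts stated above. The "always predict no Cup" baseline is our own illustrative comparison, not one of the nine models.

```python
teams_per_season = 30
seasons = 7
champions = seasons  # one Cup winner per season

data_points = teams_per_season * seasons
positive_rate = champions / data_points
baseline_accuracy = 1 - positive_rate  # accuracy of always predicting "no Cup"

print(data_points)                  # 210
print(round(positive_rate, 4))      # 0.0333
print(round(baseline_accuracy, 4))  # 0.9667
```

With so few champions, raw accuracy says little about a model, which is one reason to look at ranked probabilities instead.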

Results: Team Stats that Matter Most

To evaluate which team stats were most strongly linked to winning a Cup, we created a z-score standardized version of our team data. We then calculated the estimated coefficients that our logistic regression model assigned to each team stat. Because every stat sits on the same standardized scale, the size of these coefficients indicates the relative importance of different team stats in predicting Stanley Cup champions. The 5 highest-ranking team stats can be seen below:

top 5 team stats.jpg
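To make the ranking procedure concrete, here is a minimal sketch of the same idea in plain Python: standardize each stat, fit a logistic regression, and sort stats by the absolute size of their coefficients. The gradient-descent fit, the stat names `stat_a` and `stat_b`, and the toy numbers are all assumptions for illustration. This is not the code or data behind the results above.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    params = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, params)] for row in rows]

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit an unregularized logistic regression with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            score = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1 / (1 + math.exp(-score)) - yi  # predicted probability minus label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical toy data: stat_a tracks the label, stat_b is close to noise.
names = ["stat_a", "stat_b"]
rows = [[1, 1], [2, -1], [3, -1], [4, 1], [5, 1], [6, -1], [7, -1], [8, 1]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

Z = zscore_columns(rows)
w, b = fit_logistic(Z, y)

# Rank stats by coefficient magnitude, largest first.
ranked = sorted(zip(names, w), key=lambda pair: abs(pair[1]), reverse=True)
print([name for name, _ in ranked])
```

Standardizing first matters: without it, a stat measured in large units would get a small coefficient regardless of how predictive it is.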

Of all team statistics, ‘Goals For Per 60 Minutes’, or GF/60, is most predictive of winning a Stanley Cup. Of the 7 champions in the dataset, 4 ranked within the top 5 league-wide in GF/60 in their respective seasons, with 2016-17 Pittsburgh most notably leading the league in the statistic. Strong results in ‘High Danger Chances For’ and ‘Team Wins’ both correlate strongly with playoff success, while ‘Scoring Chance For Percentage’ and ‘Shots on Goal For Percentage’ round out the top 5.

What Does It Mean?

Generating a list of commonalities among past champions allows us to comment on what factors impact a team’s likelihood of going all the way. Most apparent is the importance of offense. It is more important to generate goals and high-danger chances than it is to prevent them: GA/60 and HDCA rank 36th and 13th among all statistics, respectively, while their offensive counterparts rank 1st and 2nd. In the playoffs, the best team offense tends to trump the best team defense, which we saw anecdotally in last year’s Pittsburgh v Nashville Final. If you want to win a Stanley Cup, the best defense is a good offense.

offense vs defense.jpg

We can see that a team’s ability to generate scoring chances, both high-danger and otherwise, is more predictive of playoff success than their ability to generate shots. Although hockey analytics pioneers championed the use of shot metrics as a proxy for puck possession, recent industry sentiment has shifted towards the belief that shot quality matters more than shot volume. The thinking here, which is supported by the above results, is that not all shots have an equal chance of beating a goalie, and so it is more important to generate a shot with a high chance of going in than it is to generate a shot of any kind. Between a team who can consistently out-chance opponents and a team who can consistently out-shoot opponents, the former is more likely to win a hockey game, and therefore playoff series.  

Application: The 2017-18 Season

A predictive model isn’t very helpful unless it can make predictions. So let’s make some predictions.

By feeding our model the team stats produced by the recently-completed 2017-18 regular season, we can output predictions of each team’s likelihood of winning the 2018 Stanley Cup. Since this is the fun part, let’s get right to the probability estimates for all 31 NHL teams:

probability estimates.jpg

The rankings above essentially indicate how similar each team’s season was to the regular season of teams that went on to win it all. In doing so, they identify the teams most likely to replicate this success. The model favours the Boston Bruins to win the 2018 Stanley Cup, predicting a victory over the Nashville Predators in the Final.
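For readers curious how a fitted logistic regression turns a team's stats into a ranking like this, here is a minimal sketch. The weights, bias, team names, and standardized stats are hypothetical placeholders, not the fitted model or the 2017-18 data.

```python
import math

def win_probability(z_stats, weights, bias):
    """Logistic-regression probability for one team's standardized stats."""
    score = bias + sum(w * x for w, x in zip(weights, z_stats))
    return 1 / (1 + math.exp(-score))

# Hypothetical coefficients and standardized stats, for illustration only.
weights, bias = [0.9, 0.6], -3.0
teams = {"Team A": [1.5, 1.0], "Team B": [0.2, -0.3], "Team C": [-1.0, 0.4]}

probs = {name: win_probability(stats, weights, bias) for name, stats in teams.items()}
ranking = sorted(probs, key=probs.get, reverse=True)
print(ranking)  # ['Team A', 'Team B', 'Team C']
```

Each probability is computed independently per team, so the values in a ranking like this need not sum to 1.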

The above data highlights a few curiosities. Notably, we can see that some non-playoff teams had 5-on-5 numbers that were relatively comparable to past Cup champions. Specifically, the Blues, Stars, and Flames played 5-on-5 hockey well enough this season to qualify for the playoffs. The Blues and Flames can attribute their disappointingly long off-seasons to the 30th and 29th-ranked power plays, respectively. The Stars’ implosion is more of a statistical anomaly, and while conducting an autopsy would be interesting, it is better left as a subject for another article.

The lowest-ranked teams to have made the playoffs in the real world are the New Jersey Devils and the Washington Capitals. While their offensive star power might have been enough to get these squads to the dance, the model predicts a quick exit for them both.

A Computer-Generated Bracket:

2018bracket.jpg

For fun, I’ve filled out the above bracket using the class probability rankings generated by our model. Of the 8 teams who have won or are winning their first-round playoff series, the model picked 7 of them as the winner, with Philadelphia being the exception. While it’s far too early to comment on the model’s accuracy, as only a single playoff series has been completed, it’s an encouraging start.
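Filling out a bracket from a probability ranking can be sketched in a few lines: in every matchup, advance whichever team the model rates higher, then pair adjacent winners for the next round. The team labels and probabilities below are hypothetical, and the sketch assumes the number of opening matchups is a power of two.

```python
def fill_bracket(matchups, prob):
    """Advance the higher-rated team in each matchup, round by round.

    Returns a list of rounds, each a list of that round's winners.
    """
    rounds = []
    current = list(matchups)
    while True:
        winners = [a if prob[a] >= prob[b] else b for a, b in current]
        rounds.append(winners)
        if len(winners) == 1:
            return rounds
        # Adjacent winners meet in the next round.
        current = [(winners[i], winners[i + 1]) for i in range(0, len(winners), 2)]

# Hypothetical four-team bracket, for illustration only.
prob = {"A": 0.10, "B": 0.04, "C": 0.07, "D": 0.12}
rounds = fill_bracket([("A", "B"), ("C", "D")], prob)
print(rounds)  # [['A', 'D'], ['D']]
```

Note that this compares each team's overall Cup probability rather than estimating head-to-head odds, which is a simplification.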

Limitations of the Analysis

The above results must be considered in the appropriate context. The model was trained and tested using only 5-on-5 data, which would explain the lack of love for teams with strong special teams like Pittsburgh and Toronto. The model is also blind to the NHL’s playoff format. Due to the NHL’s decision to have teams play against their divisional foes during the first two playoff rounds, teams in strong divisions have a much harder road to winning a Cup. Consider that Minnesota’s path to the conference final would likely involve Winnipeg and Nashville in the first two rounds, who finished 2nd and 1st in NHL standings in the regular season. Divisional difficulty is not reflected in the probabilities listed above, though incorporating divisional difficulty either probabilistically or through a strength of schedule modifier could be areas of further analysis.

A final limitation of the model is that it is trained using only 7 champions. In an ideal world, we would have access to dozens or hundreds of Stanley Cup positive instances, but due to the nature of the game there can only be one champion per year. We considered extending the dataset backwards past 2011 but ultimately decided against doing so. The NHL is different today than it was in the past. Training a model on a champion from 2000 tells us little about what it takes to have success in 2018. Using 2010-11 onwards represented a happy medium in the trade-off between data relevance and quantity.

What next?

Winning a Stanley Cup remains an inexact science. While it’s valuable to identify trends among past winners, there is no guarantee that what’s worked before will work again. It’s a game of educated guesses.

I believe that the most legitimate way to build a Stanley Cup winner is a combination of the past and the future. Analyzing historical data to identify team traits that are predictive of a championship is half the battle. The rest is anticipating what the future of the NHL will look like. The champions of the next few years will be led by managers who are best able to identify what it’ll take to win in the modern NHL. While the above framework approaches the first half in a systematic way, the latter remains much harder to crystallize.

In the meantime, let’s turn to what’s in front of our eyes. The playoffs have been tremendously entertaining thus far, and that’ll only pick up as teams are threatened by elimination. Let’s enjoy some playoff hockey. Let’s see which playing styles, tactics, and matchups seem to work. Let’s learn.

Even if your team gets eliminated, just remember that this season’s playoffs are just a couple months away from being data points to train next season’s model.

Then we do it all again.

Playoff Preview: Toronto Maple Leafs vs. Boston Bruins by Anthony Turgelis

By: Kurt Schulthies

Monday May 13, 2013:

The city of Toronto was electric. Competing in the Stanley Cup Playoffs for the first time in 12 seasons, the Toronto Maple Leafs had inched their way to game 7 against the heavily favoured Boston Bruins, continuing an improbable run led by Phil Kessel, Nazem Kadri, James van Riemsdyk, Cody Franson, Dion Phaneuf, and James Reimer.

I was with a dozen of my closest friends, sitting at the head of the table in a Shoeless Joe’s party room. Every detail of that night is vivid in my mind -- for what was about to come can only be described as demoralizing. The Leafs held a 3-goal lead with less than 11 minutes to go in regulation time.

The lead evaporated. The Bruins’ eventual overtime winner became an inevitability.

Without a word, I immediately got up from my seat and stormed out of the bar. I glanced over at the patrons -- and to this day, I have never seen so many people simultaneously unsure how to react.

Present Day

Toronto is a dramatically different team. Now led by their sophomore phenom Auston Matthews, the Leafs look for revenge against the team that crushed the hopes of an entire fanbase five years ago.  

Taking an analytics-focused view, let’s see how Toronto and Boston compare now.

Offensive Matchup

Screen Shot 2018-04-12 at 5.23.24 PM.png
Screen Shot 2018-04-12 at 4.40.22 PM.png
All data used is courtesy of Corsica and NaturalStatTrick

The Leafs are superior to the Bruins in every major offensive category. Toronto is one of the highest paced teams in the league, relying on their high-end offensive talent to best opponents. Boston had a similarly strong offensive season, but failed to generate a significant amount of high danger scoring chances per 60 minutes of play. This can likely be attributed to the Bruins' slower paced style of play.

                               Toronto                                                                       Boston

Screen Shot 2018-04-12 at 4.31.41 PM.png
Screen Shot 2018-04-12 at 4.31.56 PM.png
Screen Shot 2018-04-12 at 5.14.40 PM.png

 

The visuals above show the league rank of each forward in 5v5 primary points per 60 minutes. This metric is highly repeatable year over year, and gives a somewhat accurate depiction of a player’s offensive prowess. However, numbers are somewhat skewed by factors such as the quality of their linemates and the quality of competition faced.
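For reference, primary points per 60 is a simple rate stat: goals plus primary (first) assists, scaled to 60 minutes of ice time. The sketch below shows the calculation. The function name and sample inputs are hypothetical and do not come from any player in the charts.

```python
def primary_points_per_60(goals, primary_assists, toi_minutes):
    """Primary points (goals + first assists) per 60 minutes of ice time."""
    return 60 * (goals + primary_assists) / toi_minutes

# Hypothetical 5v5 totals, for illustration only.
rate = primary_points_per_60(goals=20, primary_assists=10, toi_minutes=1000)
print(round(rate, 2))  # 1.8
```

Scaling by ice time is what lets a sheltered third-liner be compared against a top-line player who logs far more minutes.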

The first thing that stands out about the Leafs’ chart is Auston Matthews, who ranks first league-wide in 5v5 P1/60. Fans can expect him to be a constant threat and the biggest ‘X-factor’ player in the series. Boston is led by what is likely the league’s most dominant first line, one of the few lines capable of outplaying the overpowering combination of Auston Matthews and William Nylander.

Heat maps created and available on HockeyViz.com


Toronto is incredible at generating high danger scoring chances, a metric that is much more predictive of goal scoring than stats such as ‘shots’. In contrast, Boston is far below league average at generating scoring chances right in front of the net, but remains a threat in the high slot. Toronto outperforms metrics such as Corsi For and scoring chances thanks to its admirable scoring talent and high number of odd-man rushes per game. Boston has slightly above average shot quality, meaning the Bruins likely score near their expected results according to Corsi and scoring chances.

Defensive Matchup

Screen Shot 2018-04-12 at 5.23.35 PM.png
Screen Shot 2018-04-12 at 4.40.33 PM.png

                 Boston

Zdeno Chara - Charlie McAvoy

Torey Krug - Kevan Miller

Matt Grzelcyk - Adam McQuaid

               Toronto

Morgan Rielly - Ron Hainsey

Jake Gardiner - Nikita Zaitsev

Travis Dermott - Roman Polak

Boston has been an excellent defensive team this season, beating Toronto in every major defensive category. The Bruins are one of the best shot suppression teams in the NHL, forcing teams to shoot from unfavourable scoring positions. In contrast, the Leafs allow a high concentration of dangerous scoring chances from the slot, leading to a much worse defensive performance. Shots against location heat maps for each team can be seen below:

Heat maps created and available on HockeyViz.com


Toronto gives up a lot of high danger chances, leading to a higher expected goals against per game. It also means the team underperforms metrics such as Corsi and scoring chances. Boston, in contrast, is excellent at shot suppression. As a result, the Bruins outperform metrics such as Corsi and scoring chances, and post a very low expected goals against per game.

Goaltending Matchup

Both the Leafs and Bruins boast top tier goaltenders with Frederik Andersen and Tuukka Rask. Using a goalie comparison tool created by Tyler Kelley (@DocKelley41), we are able to compare each goalie by key metrics:

Compare other goalies at: https://public.tableau.com/profile/tyler7457#!/vizhome/GoalieTool/2017-18ComparisonTool


For more on what each metric means, read here. The values on the x-axis of the graph are the percentile ranks of each goalie’s stats. Frederik Andersen is near the top of the charts in Goals Saved Above Average. This is unsurprising considering the aforementioned shaky Leafs defense and the great play of Andersen so far this year. The stat highlights that if an average goalie were placed in the Leafs net in place of Andersen, they would be expected to concede a lot more goals. By this metric among others, it appears Andersen has a small edge over Tuukka Rask this season.
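Goals Saved Above Average is commonly computed by asking how many goals a league-average goalie would have allowed on the same number of shots, then subtracting the goals actually allowed. The sketch below follows that common definition. The comparison tool's exact calculation is not described here, and the inputs are hypothetical.

```python
def goals_saved_above_average(shots_against, goals_against, league_sv_pct):
    """Goals an average goalie would allow on these shots, minus goals actually allowed."""
    expected_goals_against = shots_against * (1 - league_sv_pct)
    return expected_goals_against - goals_against

# Hypothetical season line, for illustration only.
gsaa = goals_saved_above_average(shots_against=2000, goals_against=160, league_sv_pct=0.910)
print(round(gsaa, 2))  # 20.0
```

A positive value means the goalie allowed fewer goals than an average goalie facing the same workload, which is why heavy shot volume behind a shaky defense can inflate the stat for a goalie who is playing well.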

Prediction

The team statistics would suggest the Boston Bruins are the favourites in this series. However, in head-to-head matchups the Toronto Maple Leafs have been the better team, with a 7-1-0 record in 8 games over the past 2 seasons. This series should be a war, and one of the most likely first round matchups to go to 7 games. With that being said, my final prediction is Leafs in 7 games.


Keep up to date with the Queen's Sports Analytics Organization. Like us on Facebook. Follow us on Twitter. For any questions or if you want to get in contact with us, email qsao@clubs.queensu.ca, or send us a message on Facebook.

Playoff Preview: Winnipeg Jets vs. Minnesota Wild by Scott Schiffner

By: Owen Kewell and Scott Schiffner

The calm before the storm.

The brackets have been set up, the matchup strategies developed, and the razors hidden away. For the first time since June, playoff hockey is here. We are mere hours from the puck drop that’ll kick off the 2017-18 Stanley Cup Playoffs, the starting pistol for a two-month-long marathon where only one team can cross the finish line. In anticipation of this, we at the Queen’s Sports Analytics Organization decided to tee up the matchups featuring Canadian teams. We start with the Winnipeg Jets, who will play host to the Minnesota Wild on Wednesday night. The first round playoff series between the Central division rival Winnipeg Jets (2nd, 52-20-10) and the Minnesota Wild (3rd, 45-26-11) is an exciting matchup that is sure to feature a high level of speed, talent, and physicality from both sides. Both squads have enjoyed productive seasons, with the Jets posting the best record of any Canadian team, finishing with 114 points.
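The point totals follow directly from the records quoted above: in the NHL standings a win is worth 2 points and an overtime or shootout loss is worth 1. A quick sketch, with a hypothetical helper name, confirms the Jets' 114 points and gives the Wild's total as well.

```python
def standings_points(record):
    """Convert a 'W-L-OTL' record string into NHL standings points."""
    wins, losses, overtime_losses = (int(n) for n in record.split("-"))
    return 2 * wins + overtime_losses

print(standings_points("52-20-10"))  # 114 (Winnipeg)
print(standings_points("45-26-11"))  # 101 (Minnesota)
```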

Offensive Matchup

Winnipeg enters the series with the reputation of having one of the most lethal forward groups in the league. Led by a rejuvenated Blake Wheeler (91 points) and 44 goals from sophomore winger Patrik Laine, the Jets possess high-end offensive firepower that has torched the league for the better part of the season. Minnesota, meanwhile, enjoyed strong seasons from Eric Staal (76 points), Mikael Granlund (67 points) and Jason Zucker (64 points). Let’s take a quick look at some summary statistics from the regular season.

Stats from Corsica.hockey


The Jets scored 23 more goals than the Wild over the season, though much of this can be explained by their superior power play. Jets skaters had a higher shooting percentage, though the difference is too small to reasonably infer superior shooting ability. The Jets outperformed the Wild at generating shot attempts and scoring chances, though the Wild were able to create more high-danger scoring chances. While individual point totals suggest Winnipeg has more high-end forwards, we can examine depth charts to clarify the picture.

depth chart.jpg

The graphic above shows the current depth charts (courtesy of Daily Faceoff) and each player’s rank among NHL forwards in even-strength primary points per 60 minutes. Here we confirm our belief that Winnipeg’s forward group is much deeper than Minnesota’s, as we can see that six Jets produced at a top-line rate compared to just three Wild players. To understand how the above results were achieved, we turn to heat maps.

Heat maps created and available on HockeyViz.com


The red areas indicate locations where a team shoots more frequently than league average, while blue is the inverse. In these maps we can see two teams who have a very different approach to generating offence. The Jets set up a triangle of attack, which results in a high volume of shots coming from the points and the mid-high slot. Being able to attack the slot with such regularity doubtlessly contributed to the success that the Jets experienced this season. The Wild, meanwhile, seem to play more on the perimeter with the goal of funneling pucks towards the crease. This explains why Minnesota produced more high-danger chances than the Jets despite generating fewer total scoring chances.

The offence matchup clearly favours Winnipeg. The Jets have the top-end firepower and the depth to roll scoring threats on every line. Throw in a dangerous power play, and the Jets are dangerous enough to make life miserable for anyone attempting to contain them.

Defensive Matchup

Winnipeg Jets:

Josh Morrissey – Jacob Trouba

Joe Morrow – Dustin Byfuglien

Ben Chiarot – Tyler Myers

Minnesota Wild:

Jonas Brodin – Matthew Dumba

Carson Soucy – Jared Spurgeon

Nick Seeler – Nate Prosser

The Winnipeg Jets allowed 216 goals in 2017/18, with 144 coming at even strength, while Minnesota allowed 229 goals (144 at 5v5). Winnipeg gave up an average of 31.9 shots per game, while Minnesota surrendered 31.3 on average. In terms of possession metrics, Winnipeg controlled 51.42% of shot attempts over the course of the 2017/18 season, good for 10th in the league, while Minnesota sits 29th with only 47.17% of shot attempts.
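The possession figure quoted here is a share of shot attempts: a team's attempts divided by all attempts taken while it was on the ice. The sketch below shows the calculation. The attempt counts are hypothetical, chosen only so the result matches the 51.42% reported for Winnipeg.

```python
def shot_attempt_share(attempts_for, attempts_against):
    """Percentage of all shot attempts that belonged to the team."""
    return 100 * attempts_for / (attempts_for + attempts_against)

# Hypothetical attempt counts, for illustration only.
share = shot_attempt_share(attempts_for=5142, attempts_against=4858)
print(round(share, 2))  # 51.42
```

Anything above 50% means a team out-attempted its opponents over the sample.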

Comparing the top pairing defencemen for both teams using HERO charts:

http://ownthepuck.blogspot.ca/2017/05/hero-charts-player-evaluation-tool.html


The Minnesota Wild’s defence corps has taken a significant blow going into the postseason with the loss of number 1 defenceman Ryan Suter, who logged an average of 26:46 of ice time per game before suffering a season-ending ankle injury on March 31. Veteran defender Jared Spurgeon remains a game-time decision due to an injured hamstring. The burden to cover these minutes will fall squarely on the shoulders of young defencemen Jonas Brodin and Matt Dumba, who will be counted on in key defensive situations. The Winnipeg Jets boast a tough lineup of physical defencemen, including Dustin Byfuglien and Tyler Myers, who will look to shut down the Wild’s top offensive lines. The Winnipeg Jets have the edge when it comes to top-tier defencemen, as well as much stronger depth on the blueline overall.

Finally, let’s compare the heat maps for both Winnipeg and Minnesota in their own defensive zones.

Heat maps created and available on HockeyViz.com


Taking a look at these maps, both teams are effectively limiting the number of scoring chances from high-danger scoring areas around the net (<25 feet) and in the slot. Minnesota’s heat map clearly indicates that the majority of chances are coming from the point (>40 feet out from the net) and down the right side, a potential weakness that Winnipeg’s quick wingers will look to exploit. Winnipeg’s defence is managing to limit almost all chances from high-scoring areas directly in front of their net, keeping the majority of shot attempts to the outside perimeter of the rink.

Goaltending Matchup:

We close our positional matchups by considering goaltending. Winnipeg will rely on Connor Hellebuyck, who broke out this year to post the winningest season ever by an American goalie. The young upstart will go toe to toe with Devan Dubnyk, the waiver-wire reclamation project that Minnesota has turned into a competent starter. Dubnyk has the qualitative advantage of playoff experience, but let’s see how the numbers stack up.

goalies.jpg

Unless otherwise specified, the above percentages reflect even-strength play. We see that Hellebuyck and Dubnyk performed similarly at even strength, as their save percentages for low, medium and high danger shots are all within a single percentage point. Where we see a difference, however, is on the special teams. While these stats are influenced by the quality of special team units, we see that Hellebuyck has significantly outperformed Dubnyk on both power plays and penalty kills. We also see that Hellebuyck saved about 2 goals more than expected given the quality of the shots being faced, whereas Dubnyk was over 7 goals in the hole on this metric.
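The "goals saved more than expected" figure compares the goals a goalie was expected to allow, given shot quality, with the goals actually allowed. A minimal sketch of that subtraction follows. The expected-goals inputs are hypothetical and only mirror the rough sizes described above, not either goalie's actual totals.

```python
def goals_saved_above_expected(expected_goals_against, goals_against):
    """Positive values mean fewer goals allowed than shot quality predicted."""
    return expected_goals_against - goals_against

# Hypothetical inputs, for illustration only.
goalie_a = goals_saved_above_expected(expected_goals_against=150.0, goals_against=148)
goalie_b = goals_saved_above_expected(expected_goals_against=140.0, goals_against=147)
print(goalie_a, goalie_b)  # 2.0 -7.0
```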

If there had to be a choice between the two to start a Game 7, Connor Hellebuyck would be a safe choice. Despite his inexperience, his exceptional season played a huge role in Winnipeg’s ascension to 2nd place in the NHL’s overall standings. He’s shown to be better than Dubnyk at stopping the puck, and for that reason, he gives his team a better chance to win.

In summary, the numbers indicate that Winnipeg has the advantage in terms of offense, defense, and goaltending. The Jets enter the playoffs on an absolute tear, having won 11 of their last 12 games. They are 3-1-0 vs. the Wild in their season series. We are predicting that the Winnipeg Jets will be victorious in their first-round series against the Minnesota Wild, likely in 5 or 6 games.