A Data Based Analysis on Salaries and Marginal Revenue Products of National Hockey League Players.

I had the opportunity to write my final thesis for my Bachelor's degree in Business Administration and Statistics at the University of Hamburg in the field of sports data analysis. The paper examines whether players in the National Hockey League (NHL) are paid fairly in the Salary Cap Era based on their Marginal Revenue Product (MRP). The concept of the analysis is derived from existing work in baseball, for which Scully (1974) laid the foundation. In my thesis, these theories were transferred to and expanded for an ice hockey context. The analysis shows that players at all positions are underpaid in relation to their MRP. The degree of underpayment differs between positions and also between the individual contract groups of players.

In the following, I will highlight the main points of the analysis. Do not hesitate to contact me with questions or for access to my full paper to get all the facts explained in more detail.


Introduction

Since the introduction of the so-called “hard” salary cap as a result of the lockout in the 2004/2005 season, fans and researchers of the NHL have repeatedly asked whether players are paid fairly. Compared to other major sports leagues in the US in particular, salaries for NHL players are far below those in Major League Baseball (MLB) or the National Basketball Association (NBA).

The aim of the analysis is to use the MRP as an economic measure to identify possible salary imbalances in the NHL and their possible causes. The MRP is often referred to as the additional revenue that results from hiring one additional worker. In principle, a firm would be inclined to offer a worker a wage commensurate with their MRP, so in a competitive labor market it is reasonable to expect wages to align with MRP. The marginal revenue generated by one unit of input is the product of two factors: the change in physical output, called the Marginal Product (MP), and the Marginal Revenue (MR) earned per unit of physical output. The MRP is therefore calculated with the following formula:

MRP = MP * MR

The data set used for the analysis includes relevant salary and player data from the 2011/2012 to the 2021/2022 season. The analysis follows a three-step method. In the first step, a multiple linear regression estimates the relationship between team winning percentage and team statistics. In the second step, another multiple linear regression relates team revenue to team winning percentage and other explanatory variables. Based on the estimates of these two models, the third step calculates the MRP of players by first predicting how player statistics affect the team's winning percentage and then relating the effect of winning percentage to team revenue. The resulting MRPs are then used to calculate the MRP-Salary-Ratio (MSR) of each player, which measures the degree of under- or overpayment.

An MSR for player i of 0 signifies a complete alignment between MRP and actual salary, indicating that a player is receiving fair compensation, as would be expected in a perfectly competitive labor market. As the MSR increases, disparities between MRP and salary grow, indicating potential underpayment. Conversely, a negative MSR suggests that a player is being paid more than their contributions warrant. It is important to note that, by definition, MSR is capped at an upper limit of 1.
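The MSR formula itself is not spelled out in this summary. A minimal sketch, assuming MSR = (MRP - salary) / MRP, which reproduces the three properties described above (0 at fair pay, negative values for overpayment, and an upper limit of 1 as salary approaches 0). The function name and the dollar figures are illustrative only and are not taken from the thesis:

```python
def mrp_salary_ratio(mrp: float, salary: float) -> float:
    """Assumed definition: share of a player's MRP that is not paid out as salary."""
    if mrp <= 0:
        raise ValueError("MRP must be positive for the ratio to be defined")
    return (mrp - salary) / mrp


# Hypothetical players, each with an MRP of $4,000,000
fair = mrp_salary_ratio(4_000_000, 4_000_000)       # salary equals MRP
underpaid = mrp_salary_ratio(4_000_000, 1_000_000)  # salary well below MRP
overpaid = mrp_salary_ratio(4_000_000, 5_000_000)   # salary above MRP

print(fair, underpaid, overpaid)
```

Under this assumed definition, an MSR of 0.75 would mean that three quarters of the player's MRP stays with the club.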

The Model in Theory

As already mentioned, the structure of the analysis can be divided into three steps. In the first step, a multiple linear regression is used to relate team winning percentage to the playing performance of the team and other explanatory variables.

For each team t and season j, WIN% is the winning percentage, calculated as total wins divided by total games played in the season. The playing performance of a team is split by positional group, because forwards, for example, are evaluated with different statistical metrics than defenders. The following equation shows the regression model used to calculate the estimates for the first step of the analysis.
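The equation itself is not reproduced in this summary. Judging from the variables named in the text and in the results (G/60, A/60 and Ff for forwards, P/60 and xGA/60 for defensemen, aGAA for goalkeepers, plus CONT and OUT), it presumably has the form below. The coefficient symbols, the superscripts F, D and G for the position groups, and the error term are notation assumed here, not taken from the thesis:

```latex
\begin{aligned}
\mathrm{WIN\%}_{tj} ={}& \beta_0
  + \beta_1\,\mathrm{G/60}^{F}_{tj}
  + \beta_2\,\mathrm{A/60}^{F}_{tj}
  + \beta_3\,\mathrm{Ff}^{F}_{tj} \\
 &+ \beta_4\,\mathrm{P/60}^{D}_{tj}
  + \beta_5\,\mathrm{xGA/60}^{D}_{tj}
  + \beta_6\,\mathrm{aGAA}^{G}_{tj} \\
 &+ \beta_7\,\mathrm{CONT}_{tj}
  + \beta_8\,\mathrm{OUT}_{tj}
  + \varepsilon_{tj}
\end{aligned}
```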

I will not go into too much detail in this version of my work and will therefore not explain all the variables of this and the following models, as most of them are commonly used in the ice hockey world and can easily be looked up online. When the models were transferred from the baseball to the ice hockey context, all variables were adjusted: baseball metrics were replaced with comparable ice hockey metrics, and metrics that could not be transferred were omitted and supplemented with hockey-specific metrics. For example, the playing performance of a team’s forwards is measured by separating their offensive output into goals per 60 minutes (G/60) and assists per 60 minutes (A/60), both of which can be seen as equivalent to the RUN metric in baseball.

The two variables in the WIN% model that deserve a mention are CONT and OUT. These are dummy variables intended to quantify team morale. CONT tries to measure the effect of “hustle” and other factors that matter when contending for a playoff spot. It rests on the assumption that teams in contention for a playoff spot have a higher winning percentage than teams that are no longer in contention. CONT is 1 if a team finishes 10 or fewer points out of a playoff spot and 0 otherwise. The OUT variable measures the other side: the demoralization of teams at the bottom of the table. These teams often give younger players ice time or trade their star players to other teams at the trade deadline. OUT is 1 if a team finishes within 10 points of the last-placed team and 0 otherwise.
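The two morale dummies can be sketched as a small function. The function and parameter names are hypothetical, and the text does not state how teams that actually reached the playoffs are coded, so the sketch simply takes the two point gaps as non-negative inputs and leaves that choice to the caller:

```python
def morale_dummies(points_behind_playoff_spot: int,
                   points_ahead_of_last: int) -> tuple[int, int]:
    """Return (CONT, OUT) using the 10-point thresholds described in the text."""
    # CONT: team finished 10 or fewer points out of a playoff spot
    cont = 1 if points_behind_playoff_spot <= 10 else 0
    # OUT: team finished within 10 points of the last-placed team
    out = 1 if points_ahead_of_last <= 10 else 0
    return cont, out


# Hypothetical teams: one narrowly missing the playoffs, one near the bottom
print(morale_dummies(points_behind_playoff_spot=4, points_ahead_of_last=25))
print(morale_dummies(points_behind_playoff_spot=22, points_ahead_of_last=3))
```

With these thresholds the two dummies are not mutually exclusive: in a tightly packed table a team could satisfy both conditions.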

The second step of the analysis estimates the influence of Team Performance as well as other relevant factors on Team Revenue. The revenue equation models the variation in Team Revenue by treating it as a linear function of both Win Percentage (Win%) and a set of team-specific characteristics.
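The equation is not reproduced in this summary. Judging from the variables named in the text (WIN%, ATT, POP, TREND and LOCK), it presumably has the form below. The coefficient symbols and the error term are assumed notation:

```latex
\mathrm{REV}_{tj} = \gamma_0
  + \gamma_1\,\mathrm{WIN\%}_{tj}
  + \gamma_2\,\mathrm{ATT}_{tj}
  + \gamma_3\,\mathrm{POP}_{tj}
  + \gamma_4\,\mathrm{TREND}_{j}
  + \gamma_5\,\mathrm{LOCK}_{j}
  + \varepsilon_{tj}
```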

To measure effects on revenue with performance and urban-based variables, this model uses the winning percentage as a performance indicator, together with two metropolitan-area variables and two season-specific variables as controls. To control for region-dependent influences on revenue, the attendance (ATT) and population (POP) variables are used. The population variable is intended to capture the size of the market in which a team can sell its product, on the assumption that the majority of the inhabitants of a metropolitan area support the local team. The attendance variable is intended to quantify interest in the team, on the assumption that higher attendance numbers represent generally greater interest. A linear time trend (TREND) is defined as 1 in the 2011/12 season, 2 in the 2012/13 season, and so on up to 11 in the 2021/22 season. Its coefficient is intended to measure the increase in team revenues that is independent of WIN% and common to all teams. A dummy variable (LOCK) for the 2012/13 lockout controls for the shortened season and the resulting loss of team revenue due to fewer games played that year.
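The two season-specific variables can be built directly from the season labels. A minimal sketch, in which the label format and the variable names are assumptions made for illustration:

```python
# Season labels from 2011/12 through 2021/22 (11 seasons)
seasons = [f"{year}/{str(year + 1)[-2:]}" for year in range(2011, 2022)]

# TREND: 1 in 2011/12, 2 in 2012/13, ... up to 11 in 2021/22
trend = {season: index for index, season in enumerate(seasons, start=1)}

# LOCK: 1 only for the lockout-shortened 2012/13 season, 0 otherwise
lock = {season: int(season == "2012/13") for season in seasons}

for season in seasons:
    print(season, trend[season], lock[season])
```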

Determining whether NHL players are paid their MRP requires an independent calculation of individual MRPs derived from the previous two regression models. The players' MRPs are calculated differently according to their positions. In the following, I will show the equation used to calculate the MRP of a defenseman as an example. The idea and composition of the equations are quite similar for all position groups and differ only in the variables used to evaluate a player. The MRP equation for defensemen is formed as follows:

The equation can be divided into two components, which relate the offensive play and the defensive play of a defender to the MR. The first part calculates the offensive output of a defenseman as the product of his goal contributions as MP, measured in points per 60 minutes (AP/60), and the MR. For this, the coefficient for AP/60 from the first model is multiplied by the MR coefficient, and this product is then multiplied by the actual AP/60 of the respective defender. The second part measures the defensive output of a defenseman using xGA/60 as the metric. This component is based on MacDonald and Reynolds' (1994) way of calculating an MLB pitcher's MRP with the help of his ERA. Since an increase in the given MP, in our case the xGA/60, decreases the total MRP, the equation is rearranged using the intercept of the WIN% model so that for a theoretical value of xGA/60 = 0, the intercept serves as the MP and is then multiplied by the MR. The higher the xGA/60 of a player, the lower his MRP will be. However, a team's xGA/60 is not simply the sum of the defenders' individual xGA/60. It is a weighted average based on the percentage of minutes played (%Min) by each defender. The %Min is calculated by dividing the ATOI of a player by 60 minutes. Just as with a forward's MRP, the sum of the two components forms the defender's MRP.
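The parts of this calculation that are fully described in the text, the offensive component and the %Min weight, can be sketched as follows. The coefficient values are the ones reported in the results (0.072 for defensemen points per 60 minutes and 35,613,119 for winning percentage). The sketch assumes the revenue coefficient applies per full unit of winning percentage, and the function names and player figures are hypothetical. The defensive component is left out because its exact form cannot be recovered from the description alone:

```python
BETA_AP60 = 0.072    # WIN% coefficient on defensemen points per 60 minutes
MR_WIN = 35_613_119  # revenue coefficient on WIN% (assumed: per 1.0 of WIN%)


def pct_min(atoi_minutes: float) -> float:
    """%Min: average time on ice divided by 60 minutes."""
    return atoi_minutes / 60.0


def offensive_mrp(ap60: float) -> float:
    """Offensive component: MP coefficient x MR coefficient x the player's AP/60."""
    return BETA_AP60 * MR_WIN * ap60


# Hypothetical defenseman: 1.0 points per 60 minutes, 21 minutes of ice time per game
print(round(offensive_mrp(1.0)))
print(pct_min(21))
```

Under these assumptions, one point per 60 minutes corresponds to an offensive MRP of roughly $2.56 million.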

The final step of the entire MRP model is to compare the calculated MRPs of the individual players with their actual salaries. For this, the MSR introduced earlier is calculated for every player to assess the severity of under- or overpayment.

Results

The results of the regression models suggest that all models make an explanatory contribution according to their F-statistics. Most of the explanatory variables are statistically significant and all R² values are above 0.5, so the goodness of fit of all models can be classified as good.

The model below shows the results of the Win% regression model:

The mean MSR for all goalies was 0.06, meaning that around 6% of all MRP generated by goalies is expropriated by NHL teams. According to Scully (1974), positive MSR values indicate underpayment of players. In this case, only a small amount of underpayment is recognizable when looking at all goalies. However, the mean is heavily skewed by the relative abundance of free agent contracts with a negative MSR, as the median of the sample is 0.30. The mean MSR for entry-level goalkeepers was 0.33, meaning that around 33% of the MRP generated by goalkeepers with entry-level status was expropriated by teams. This shows a higher degree of underpayment for entry-level players than for the group as a whole. The gap is even larger for goalkeepers with RFA status: their mean MSR was 0.60, meaning that around 60% of the MRP they generated was expropriated by teams. Only goalkeepers with UFA status are, on average, not underpaid under the model but very slightly overpaid. Their mean MSR of -0.06 means that they were paid around 6% more than the MRP they generated.

The table below shows the same statistics for defensemen:

The mean MSR for forwards was 0.65, which is strongly positive and considerably higher than the mean MSR of defensemen and goalkeepers. It means that around 65% of all MRP generated by forwards is expropriated by NHL clubs and that forwards are heavily underpaid according to this model. The figure is even more drastic for entry-level and RFA status forwards. Entry-level forwards have a mean MSR of 0.81, meaning that around 81% of the MRP generated by entry-level forwards is expropriated by NHL clubs. RFA status forwards have an MSR of 0.70, which means that 70% of the MRP generated by RFA forwards is expropriated by NHL clubs. Unlike at the other two positions, even UFA status players have a high MSR in the forwards group. A mean MSR of 0.65 for UFA status forwards means that around 65% of the MRP generated by UFA forwards is expropriated by NHL clubs.

The current data and the results obtained through this analysis show that players are underpaid regardless of their position in the NHL salary cap era. Dividing players into contract groups makes it clear that there are also systematic reasons for the underpayment. As in previous baseball studies, it is above all the younger players with relatively inflexible contract statuses whose salaries do not mirror their real value in terms of playing performance. Players with RFA contract status are particularly affected by disproportionately low salaries. This can be attributed to the collective bargaining agreement (CBA). Theoretically, players can collect top salaries as RFAs, because the regulations do not prohibit it. In practice, however, this is usually not the case because of the position of power and control that the regulations give to the clubs. As a result, it is hardly possible for players to earn appropriate salaries, or at least not to the full extent. In addition, players usually reach their peak form while they are still contractually bound to a team as an RFA, which reinforces the effect calculated by the model. Similar effects can be observed for entry-level status players. The analysis shows that, according to the MSR, entry-level status players are on average not far behind RFA players in terms of underpayment, regardless of position. However, entry-level players are usually rookies or players who have just come into the league and have to get used to the speed and quality at the NHL level.

Limitations of the analysis

While the findings regarding the equitable compensation of NHL players are intriguing and generally align with theoretical underpinnings drawn primarily from the baseball literature, certain concerns arise regarding the methodology used to estimate MRP. Unlike in baseball, there is no established pattern or framework for calculating MRP in ice hockey. Although the calculations in this study were based on comparable works from baseball or ice hockey, the variables selected for the regression models are more or less freely chosen and there are few benchmarks to draw on. This can lead to biased or one-sided results, and a different result with other variables for evaluating player performance or team revenue would be conceivable. Furthermore, evaluating the validity of the MRP estimates is difficult because there is no standardized criterion for assessment. In a hypothetical scenario with a perfectly competitive player market, one could judge the estimates by how close MRP is to actual salary. However, the reality of limited player mobility means that this expectation does not hold.

The G/60 variable of forwards has a coefficient of 0.143, indicating that for every increase of 1.0 in G/60, the win percentage is expected to increase by 0.143, assuming all other variables are held constant. The coefficient of 0.110 on the A/60 variable of forwards means that, holding all other variables constant, the win percentage is expected to increase by 0.110 with every 1.0 increase in A/60. Both coefficients are statistically significant at the 99% level. Ff is the third and last variable for forwards and has a coefficient of 0.00002, meaning that an FF or FA increase of one would increase or decrease the win percentage by 0.00002, assuming all other variables are held constant. Ff is statistically significant at the 95% level. In our equation, P/60 and xGA/60 are the variables used to evaluate defenders. The coefficient of P/60 is 0.072, which means that a change of 1.0 in P/60 increases the win percentage by 0.072 when all other variables are held constant. The xGA/60 variable has a negative coefficient of -0.047, meaning that a 1.0 increase in xGA/60 decreases the team's win percentage by 0.047 when all other variables are held constant. Both coefficients are statistically significant at the 99% level. Goalkeeper performance is evaluated with aGAA, which has a negative coefficient of -0.073, meaning that a 1.0 increase in aGAA decreases the team's win percentage by 0.073. The coefficient is statistically significant at the 99% level. The two dummy variables CONT and OUT measure team morale in our model and are statistically significant at the 99% level. CONT has a coefficient of 0.068, which means that teams in the playoff race have a win percentage that is 0.068 higher than if they were not in the playoff race.
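Because the model is linear, the coefficients above can be combined to estimate the change in win percentage for a given change in team statistics, holding everything else constant. A minimal sketch using the reported coefficients; the dictionary keys, the function name and the example scenario are hypothetical:

```python
# Coefficients of the WIN% model as reported above
WIN_COEFFS = {
    "F_G60": 0.143,     # forwards, goals per 60 minutes
    "F_A60": 0.110,     # forwards, assists per 60 minutes
    "F_Ff": 0.00002,    # forwards, Ff
    "D_P60": 0.072,     # defenders, points per 60 minutes
    "D_xGA60": -0.047,  # defenders, expected goals against per 60 minutes
    "G_aGAA": -0.073,   # goalkeepers, aGAA
    "CONT": 0.068,      # playoff contention dummy
}


def delta_win_pct(changes: dict[str, float]) -> float:
    """Predicted change in WIN% for the given changes in explanatory variables."""
    return sum(WIN_COEFFS[name] * change for name, change in changes.items())


# Hypothetical scenario: forwards score 0.2 more goals per 60 minutes,
# while defenders allow 0.5 more expected goals against per 60 minutes
print(delta_win_pct({"F_G60": 0.2, "D_xGA60": 0.5}))
```

In this scenario the two effects almost cancel out, leaving a predicted gain of about half a percentage point in win percentage.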

The parameter estimates of the team revenue function (REV) are shown below in the output model:

The coefficient of 35,613,119 on team winning percentage means that, holding all other variables constant, revenue is expected to increase by $35,613,119 for every 1.0 increase in winning percentage, or by about $3.56 million for every 0.1 increase. Win percentage is statistically significant at the 95% level, and all other variables in the model are statistically significant at the 99% level. Higher attendance and a larger metropolitan area population both increase revenue. The lockout in the 2012/13 season led to revenue losses for the teams, and the time trend shows that team revenue grew by $7,219,095 per season independently of winning percentage.

Now that all coefficients have been estimated, they can be plugged into the respective MRP equations to calculate the individual players' MRPs. The graph below shows the differences between the mean salaries and the mean MRP based on the position groups.

It can be seen that, on average, MRP is higher than real salary at every position. In order to evaluate players based on their contract status as well, the MSR is calculated for each player. The table below summarizes the goalkeepers' MSR statistics for each contract type.

The mean MSR for defenders was 0.36, which means that around 36% of all MRP generated by defensemen is expropriated by NHL clubs. This ratio is much higher than for goalkeepers and means that the general level of underpayment is greater for defenders than for goalkeepers. If defenders are divided according to their contract types, it becomes clear that, as with the goalkeepers, the MSRs of entry-level and RFA players are on average far above the overall mean. Entry-level status players have a mean MSR of 0.50, meaning that around 50% of the MRP generated by entry-level defensemen is expropriated by NHL clubs. RFA status defensemen have a higher mean MSR of 0.58, meaning that around 58% of the MRP generated by RFA defenders is expropriated by NHL clubs. Unlike goalkeepers, UFA status defensemen are also underpaid on average under the present model. The mean MSR of 0.30 for UFA status defensemen means that around 30% of the MRP generated by these players is expropriated by NHL clubs.

The table below shows the MSR statistics for forwards: