Predicting the 2024-2025 Most Improved Player

The NBA's Most Improved Player (MIP) award is, at its core, a narrative-driven award. Using machine learning models to guide our predictions, we set out to investigate both the narrative and the improvement sides of this award. It is important to note that Victor Wembanyama is far and away the betting favorite. For the sake of argument, we assume that Wembanyama is the most likely winner and have therefore left him out of our analysis.

Identifying MIP Candidates

We first wanted to identify a shortlist of players who, provided they have a good 2024-2025 season, are likely to be considered for the award. Over the last 5 years, players who finished top 5 in MIP voting had average per-game stats that looked like this in the previous season:

For reference, here are Jonathan Kuminga and Austin Reaves’ stats from last season:

These are the caliber of players who typically find themselves in MIP conversations.

To begin assembling this shortlist, we compared the advanced stats of the 25 players who finished top 5 in MIP voting over the last five years with those of the rest of the league. To balance out the data, we removed players older than 30 and players who averaged more than 25 points per game, since such players have little room for improvement and have never won MIP. We built a logistic regression model, a k-nearest neighbors (KNN) model, and a support vector machine (SVM) model to predict which players have the highest probability of finishing top 5 in MIP voting next season. After manually eliminating players the models liked but who are implausible candidates, such as Tyrese Maxey, who won the award last season, and players who are already bona fide stars, here is what our models yielded:

Using these results, we established a shortlist of:

Immediately, we noticed that no player who finished top 5 in voting had negative offensive win shares the previous season. From our shortlist, Jeremy Sochan did. We also noticed that none had both a negative offensive plus/minus and a negative overall plus/minus. From our shortlist, Christian Braun and Andrew Nembhard did. This left us with 15 players we felt were eligible to win the MIP award, provided they have a big upcoming season.
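The screens described so far (the age and scoring cutoffs on the candidate pool, plus the two precedent checks) could be expressed roughly as follows. This is a minimal sketch rather than the code used for the analysis: the `PlayerSeason` fields, the function names, and the sample rows are all hypothetical, and the plus/minus rule is read here as "both offensive and overall plus/minus were negative."

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    """One player's previous-season line (field names are assumptions)."""
    name: str
    age: int
    ppg: float   # points per game
    ows: float   # offensive win shares
    opm: float   # offensive plus/minus
    pm: float    # overall plus/minus


def in_candidate_pool(p: PlayerSeason) -> bool:
    # Players older than 30 or averaging more than 25 ppg are removed.
    return p.age <= 30 and p.ppg <= 25.0


def passes_precedent_screens(p: PlayerSeason) -> bool:
    # No past top-5 finisher had negative offensive win shares.
    if p.ows < 0:
        return False
    # Assumed reading: eliminate only if BOTH plus/minus figures are negative.
    if p.opm < 0 and p.pm < 0:
        return False
    return True


def screen(players: list[PlayerSeason]) -> list[str]:
    """Return the names of players who survive every screen."""
    return [
        p.name
        for p in players
        if in_candidate_pool(p) and passes_precedent_screens(p)
    ]


# Invented rows for illustration only; these are not real player stats.
SAMPLE = [
    PlayerSeason("Player A", 22, 15.2, 2.1, 0.8, 1.1),
    PlayerSeason("Player B", 32, 14.0, 3.0, 1.0, 1.5),    # too old
    PlayerSeason("Player C", 25, 27.3, 6.0, 4.0, 5.0),    # scores too much
    PlayerSeason("Player D", 21, 11.6, -0.3, -1.5, 0.2),  # negative OWS
    PlayerSeason("Player E", 23, 10.5, 0.9, -1.2, -0.8),  # both +/- negative
    PlayerSeason("Player F", 24, 12.0, 1.4, 0.3, -0.4),   # only overall negative
]

if __name__ == "__main__":
    print(screen(SAMPLE))
```

Keeping each rule in its own function makes it easy to see which screen removed a given player.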

Predicting Improvement

The final step was to sort these 15 players into tiers, ranked by how likely we felt each was to have an explosive season and win the award. We took three steps to do this:

1. We looked at the stats in which players improve most in the season they receive MIP votes.

For comparison with the previous-season averages above, here are the average stats of players who finished top 5 in MIP voting over the last 5 years, in the season they received those votes:

The biggest jumps come in minutes played, field goals attempted, and assists. These 5 players had the best upward trends in those stats throughout last season:
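The article does not say how an "upward trend" was measured. One plausible way, shown here purely as an assumption, is to fit a least-squares line through a player's game-by-game (or month-by-month) values for each stat and compare slopes. The function names and the input layout are hypothetical.

```python
def trend_slope(values: list[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...).

    A positive slope means the stat rose over the season.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def trend_profile(log: dict[str, list[float]]) -> dict[str, float]:
    """Slope for each tracked stat, e.g. keys 'MP', 'FGA', 'AST'."""
    return {stat: trend_slope(series) for stat, series in log.items()}


if __name__ == "__main__":
    # Invented monthly averages for illustration only.
    example_log = {
        "MP": [20.0, 24.0, 28.0],
        "FGA": [8.0, 9.0, 10.0],
        "AST": [3.0, 3.0, 3.0],
    }
    print(trend_profile(example_log))
```

Because the three stats sit on different scales, slopes are best compared across players within a single stat rather than added together.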

2. We considered the context of these players’ roles last season versus this season.

Players on this list whose roles are set to expand have a better chance of winning the award.

  • Players like Podziemski and Kuminga will now have larger roles with the departure of Klay Thompson.

  • RJ Barrett began posting career-best shooting numbers after his trade to Toronto, where he is a main piece of the offense.

  • Chet Holmgren and Jalen Williams already took great leaps last season, and neither is the main option on a team with title aspirations. That works against their chances of improving much further next season.

3. Lastly, we created an improvement model, based on win shares, to quantify each player’s predicted improvement.

Zhengfeng Liu analyzed data from 1978 to 2015 to create a model that predicts player improvement from one season to the next. He did this by modeling an “improvement” number as a function of several key statistics, where improvement is defined as next season's win shares minus the current season's. For example, if a player had 3.2 win shares in 2010 and 2.1 win shares in 2011, their improvement score for 2010 would be -1.1. We reconstructed his dataset for the years 2010-2023 by scraping Basketball Reference, and retrained his models so they could be applied to last season's data. We tried a linear model, a support vector regression (SVR) model, and a random forest model, and the SVR model gave us the best mean squared error (3.08) on our test data.
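To make the target variable concrete, here is a small sketch of how the improvement label and the mean squared error could be computed. The function names and the dictionary layout are assumptions for illustration, not the code used to build the dataset. With 3.2 win shares in 2010 and 2.1 in 2011, the label for 2010 is 2.1 - 3.2 = -1.1.

```python
def improvement_scores(win_shares: dict[int, float]) -> dict[int, float]:
    """Map each season to (next season's win shares - this season's).

    Seasons with no following season in the data get no label.
    """
    scores = {}
    for season, ws in win_shares.items():
        next_ws = win_shares.get(season + 1)
        if next_ws is not None:
            scores[season] = next_ws - ws
    return scores


def mean_squared_error(actual: list[float], predicted: list[float]) -> float:
    """Average squared gap between true and predicted improvement scores."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    labels = improvement_scores({2010: 3.2, 2011: 2.1})
    print(round(labels[2010], 2))

    # Invented predictions, only to show how the metric is evaluated.
    print(mean_squared_error([1.0, -2.0], [0.0, 0.0]))
```

Since the error is squared, an MSE of 3.08 corresponds to a typical miss of roughly 1.75 win shares (the square root of 3.08).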

Applying this model to our 15 players gave these improvement scores:

Final Predictions
