Predicting the NBA MVP with Machine Learning
Thesis
Every season, basketball fans debate who deserves the MVP award, which a panel of voters decides at the end of each season. We built three machine learning models that attempt to predict the winner using box score statistics.
Methodology
Each model is trained on every NBA season from 1974 to 2017. For each player season, it looks at nine statistics:
- Points, assists, blocks, defensive rebounds, and field goals per game: the core production numbers
- Win Shares (WS): an estimate of how many wins a player contributed to their team
- Value Over Replacement Player (VORP): how much more a player contributes than a replacement-level player, meaning a freely available bench-caliber player rather than a league-average one
- Box Plus/Minus (BPM): a player's net impact per 100 possessions
- Usage Rate (USG%): what share of team plays run through that player
From those nine numbers, each model learns what a typical MVP season looks like versus a non-MVP season, then applies that knowledge to current players. Each model outputs an independent probability that a given player wins MVP, not a share of a single pool, so the values do not sum to 1. Think of it as each player's individual odds.
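To make the difference between independent probabilities and shares of a pool concrete, here is a minimal sketch. The player names and values are made up for illustration and are not output from any of the three models.

```python
# Hypothetical per-player MVP probabilities (illustrative values, not real model output).
independent_probs = {
    "Player A": 0.80,
    "Player B": 0.65,
    "Player C": 0.10,
}

# Each value is a separate yes/no estimate, so the total is not constrained to 1.
total = sum(independent_probs.values())

# Optional: rescale into shares of a single pool. This is an extra step
# that the models described here do not perform.
shares = {name: p / total for name, p in independent_probs.items()}

# The ranking is the same whether you sort by probability or by share.
ranking = sorted(independent_probs, key=independent_probs.get, reverse=True)

print(f"sum of independent probabilities: {total:.2f}")
print(f"sum of rescaled shares: {sum(shares.values()):.2f}")
print("ranking:", ranking)
```

Since only the ordering matters for picking a frontrunner, the rankings in the tables further down do not depend on whether the values are rescaled.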
Three Models, One Question
Rather than relying on a single approach, the system runs three different models and lets you compare:
Logistic Regression
The simplest of the three. It fits a linear decision boundary: each statistic gets a weight, and a player's score is the weighted sum of their stats, which is then squashed into a probability between 0 and 1. It's easy to interpret (a larger absolute coefficient means that stat matters more, provided the stats are on a comparable scale).
Win Shares (WS) is by far the most influential feature, with an absolute coefficient of ~1.85, nearly double the next most important feature. Box Plus/Minus (BPM) ranks second at ~1.0, followed by Defensive Rebounds per Game (DRBPG, ~0.85) and Assists per Game (ASTPG, ~0.70). VORP and Field Goals per Game (FGPG) contribute moderately at ~0.50. Blocks per Game (BLKPG), Points per Game (PTSPG), and Usage Rate (USG%) have minimal weight, all under 0.15.
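The weighted-sum idea can be sketched in a few lines. This is an illustration under stated assumptions, not the fitted model: the weights reuse the absolute coefficient magnitudes quoted above as if they were all positive, the intercept is invented, and the inputs are assumed to be standardized z-scores. The three stats with weights under 0.15 are left out because their exact values are not given.

```python
import math

# Magnitudes reported above, used here as positive weights purely for illustration.
# The signs, the intercept, and the feature scaling are all assumptions.
WEIGHTS = {
    "WS": 1.85, "BPM": 1.00, "DRBPG": 0.85, "ASTPG": 0.70,
    "VORP": 0.50, "FGPG": 0.50,
}
INTERCEPT = -6.0  # hypothetical; MVP seasons are rare, so the baseline is low


def mvp_probability(z_scores):
    """Weighted sum of standardized stats, passed through the logistic function."""
    score = INTERCEPT + sum(WEIGHTS[name] * z_scores.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


# A hypothetical player who is league-average everywhere (all z-scores 0)...
average_player = {}
# ...versus one who is two standard deviations above average in every stat.
star_player = {name: 2.0 for name in WEIGHTS}

print(f"average player: {mvp_probability(average_player):.4f}")
print(f"star player:    {mvp_probability(star_player):.4f}")
```

Because the score is a plain sum, moving one standard deviation in WS shifts it almost twice as far as the same move in BPM, which is what "most influential feature" means for this model.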
Random Forest
Builds hundreds of decision trees, each one asking a series of "is this stat above or below X?" questions, and averages their answers. It handles complex relationships between stats well and is less sensitive to any one unusual data point. Think of it as a large committee of simple rules voting together.
WS again dominates at ~0.31, accounting for roughly twice the importance of the next feature. VORP (~0.15) and BPM (~0.125) rank second and third. DRBPG (~0.10), PTSPG (~0.08), BLKPG (~0.07), FGPG (~0.065), and ASTPG (~0.06) contribute in a fairly tight mid-range band. USG% is the least important at ~0.05. Compared to logistic regression, the Random Forest spreads importance more evenly across features.
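The "committee of simple rules" picture can be sketched with three hand-written trees. A real forest learns hundreds of them from the training data; the thresholds and the stat line below are hypothetical and chosen only to show how the votes are averaged.

```python
# Three hand-written "trees" standing in for the hundreds a real forest would learn.
# All thresholds are hypothetical, chosen only to illustrate the voting mechanism.
def tree_a(p):
    return 1.0 if p["WS"] > 13.0 else 0.0


def tree_b(p):
    return 1.0 if p["VORP"] > 7.0 and p["BPM"] > 8.0 else 0.0


def tree_c(p):
    return 1.0 if p["WS"] > 11.0 and p["PTSPG"] > 27.0 else 0.0


FOREST = [tree_a, tree_b, tree_c]


def forest_probability(player):
    """Average the individual tree votes into one score between 0 and 1."""
    votes = [tree(player) for tree in FOREST]
    return sum(votes) / len(votes)


candidate = {"WS": 14.2, "VORP": 6.5, "BPM": 9.1, "PTSPG": 28.4}  # made-up stat line
print(f"forest probability: {forest_probability(candidate):.2f}")
```

Two of the three trees vote yes for this stat line, so the averaged score is two thirds. One odd tree cannot swing the result much, which is why the ensemble is less sensitive to unusual data points.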
Gradient Boosting
Also uses decision trees, but builds them sequentially: each new tree focuses on correcting the mistakes the previous ones made.
This model is heavily concentrated on just two features: BPM (~0.47) and WS (~0.41) together account for roughly 88% of total feature importance. Four others (PTSPG, VORP, ASTPG, and DRBPG) contribute ~0.02–0.03 each, and BLKPG, USG%, and FGPG are effectively unused (near zero). This suggests the gradient boosting model learned that BPM and WS alone are nearly sufficient to separate MVP candidates.
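The sequential idea can be sketched with one-split trees (stumps) on toy data. This is a simplified illustration, not the model used here: it fits residuals under squared error rather than a classification loss, and the single feature, the labels, the number of trees, and the learning rate are all made up.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Build stumps one at a time, each fit to what the ensemble still gets wrong."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions


# Toy data: one hypothetical feature (think BPM) and a 0/1 MVP label.
bpm = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
mvp = [0, 0, 0, 0, 1, 1]
base, stumps, fitted = boost(bpm, mvp)
print("first split threshold:", stumps[0][0])
print("fitted values:", [round(f, 3) for f in fitted])
```

Every stump here splits on the same feature because it is the only one available. With nine features, a booster that keeps finding its best splits on BPM and WS would end up with the kind of concentrated importance described above.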
Historical Results
The models were trained on data through 2017, so every season from 2018 onward is a genuine out-of-sample test: the models have never seen these players or seasons before.
| Season | Actual MVP | LR | RF | GB |
|---|---|---|---|---|
| 2018 | James Harden | #2 | #2 | #1 ✓ |
| 2019 | Giannis Antetokounmpo | #1 ✓ | #1 ✓ | #1 ✓ |
| 2020 | Giannis Antetokounmpo | #1 ✓ | #1 ✓ | #1 ✓ |
| 2021 | Nikola Jokić | #1 ✓ | #1 ✓ | #1 ✓ |
| 2022 | Nikola Jokić | #1 ✓ | #1 ✓ | #1 ✓ |
| 2023 | Joel Embiid | #2 | #4 | #2 |
| 2024 | Nikola Jokić | #1 ✓ | #1 ✓ | #1 ✓ |
| 2025 | Shai Gilgeous-Alexander | #3 | #2 | #569 |

Top-1 accuracy: LR 5/8 · RF 5/8 · GB 6/8
Top-3 accuracy: LR 8/8 · RF 7/8 · GB 7/8
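The accuracy tallies can be rechecked directly from the ranks each model gave the eventual MVP. The sketch below copies those ranks from the results table; the `top_k_hits` helper name is just for illustration.

```python
# Rank each model gave the eventual MVP, copied from the results table above.
ranks = {
    2018: {"LR": 2, "RF": 2, "GB": 1},
    2019: {"LR": 1, "RF": 1, "GB": 1},
    2020: {"LR": 1, "RF": 1, "GB": 1},
    2021: {"LR": 1, "RF": 1, "GB": 1},
    2022: {"LR": 1, "RF": 1, "GB": 1},
    2023: {"LR": 2, "RF": 4, "GB": 2},
    2024: {"LR": 1, "RF": 1, "GB": 1},
    2025: {"LR": 3, "RF": 2, "GB": 569},
}


def top_k_hits(model, k):
    """Count seasons in which the model ranked the actual MVP k-th or better."""
    return sum(1 for season in ranks.values() if season[model] <= k)


for model in ("LR", "RF", "GB"):
    print(f"{model}: top-1 {top_k_hits(model, 1)}/{len(ranks)}, "
          f"top-3 {top_k_hits(model, 3)}/{len(ranks)}")
```

Running this reproduces the tallies reported above. With only eight test seasons, a single miss moves top-1 accuracy by 12.5 percentage points, so the gap between 5/8 and 6/8 is not strong evidence that one model is better.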
In five of the eight seasons (2019–2022 and 2024), all three models agreed on the same #1 pick and were right every time.
In 2023, every model ranked Nikola Jokić #1, and by the numbers he arguably had the better season. Joel Embiid won the award anyway, an outcome that may reflect voter narrative, voter fatigue, and team performance rather than pure statistics. In 2025, Gradient Boosting ranked Shai Gilgeous-Alexander #569, outside the top 500, while Logistic Regression and Random Forest had him at #3 and #2 respectively. I have no idea why GB did this; it is likely a bug.
Future Direction
No model is perfect, and these have known blind spots. Team record is not included, even though MVP voters have historically punished players on losing teams regardless of individual stats. Injuries and narrative don't appear in a box score. And the training data skews toward an older era; the three-point revolution and the rise of players like SGA have introduced statistical profiles the 1970s–1990s data doesn't fully capture.
Current Season Predictions (2025–26)
| Rank | LR | RF | GB |
|---|---|---|---|
| #1 | Nikola Jokić | Shai Gilgeous-Alexander | Nikola Jokić |
| #2 | Shai Gilgeous-Alexander | Nikola Jokić | Victor Wembanyama |
| #3 | Victor Wembanyama | Victor Wembanyama | Giannis Antetokounmpo |
| #4 | Luka Dončić | Giannis Antetokounmpo | Kawhi Leonard |
| #5 | Jalen Johnson | Luka Dončić | Luka Dončić |

Two of the three models have Nikola Jokić as the frontrunner. Random Forest is the dissenter, putting Shai Gilgeous-Alexander ahead. Victor Wembanyama appears in all three top 3s in just his third season, which is notable. Before running the models, I expected him to be #1 in all of them, considering how heavily the models lean on advanced stats.
Conclusion
Thank you for reading. I hope you found this interesting. Basketball Reference also has its own model if you would like to see a different result. Please do not gamble on my models!