Predicting the NBA MVP with Machine Learning

~sports.basketball Text 989 words



Thesis

Every season, basketball fans debate who deserves the MVP award, which is decided at the end of each season by a panel of voters. We built three machine learning models that attempt to answer that question using box score statistics.

Methodology

Each model is trained on every NBA season from 1974 to 2017. For each player season, it looks at nine statistics:

Points, assists, blocks, defensive rebounds, and field goals per game: the core production numbers
Win Shares (WS): an estimate of how many wins a player contributed to their team
Value Over Replacement Player (VORP): how much better a player is than a replacement-level player
Box Plus/Minus (BPM): a player's estimated net impact per 100 possessions
Usage Rate (USG%): the share of team plays that run through that player

From those nine numbers, the model learns what a typical MVP season looks like versus a non-MVP season, then applies that knowledge to current players. Each model outputs an independent probability that a given player wins MVP, not a share of a single pool, so the values do not sum to 1. Think of it as each player's individual odds.
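To make the "independent probability" point concrete, here is a minimal sketch. The player names and numbers are hypothetical, not model output; the point is only that the values can add up to more than 1 and that the ranking is what matters.

```python
# Hypothetical per-player probabilities (illustrative only, not real model output).
independent_probs = {
    "Player A": 0.82,
    "Player B": 0.64,
    "Player C": 0.07,
}


def rank_players(probs):
    """Return player names sorted from most to least likely MVP."""
    return sorted(probs, key=probs.get, reverse=True)


def as_shares(probs):
    """Optional rescaling into shares of a single pool (these do sum to 1)."""
    total = sum(probs.values())
    return {name: p / total for name, p in probs.items()}


ranking = rank_players(independent_probs)
total = sum(independent_probs.values())  # greater than 1 for this example
shares = as_shares(independent_probs)

print(ranking)
print(round(total, 2))
```

Rescaling with `as_shares` changes the numbers but not the order, which is why the results tables only report ranks.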

Three Models, One Question

Rather than relying on a single approach, the system runs three different models and lets you compare:

Logistic Regression

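The simplest of the three. Each statistic gets a weight, and a player's score is the weighted sum of their stats, which the model then converts into a probability. In effect, it draws a straight line through the data. It's easy to interpret: a coefficient with a larger magnitude means that stat matters more, provided the inputs are on a comparable scale.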

Win Shares (WS) is by far the most influential feature, with an absolute coefficient of ~1.85, nearly double the next most important feature. Box Plus/Minus (BPM) ranks second at ~1.0, followed by Defensive Rebounds per Game (DRBPG, ~0.85) and Assists per Game (ASTPG, ~0.70). VORP and Field Goals per Game (FGPG) contribute moderately at ~0.50. Blocks per Game (BLKPG), Points per Game (PTSPG), and Usage Rate (USG%) have minimal weight, all under 0.15.
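The weighted-sum idea can be sketched in a few lines. This is an illustration, not the trained model: it uses only three features, borrows the approximate coefficient magnitudes quoted above, and assumes positive signs, standardized inputs (z-scores), and an intercept of -6.0, none of which are stated in the post.

```python
import math


def sigmoid(z):
    """Squash any real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mvp_probability(z_scores, weights, intercept):
    """Weighted sum of standardized stats, passed through the sigmoid."""
    score = intercept + sum(weights[name] * z_scores[name] for name in weights)
    return sigmoid(score)


# Magnitudes are the approximate values quoted above; the positive signs
# and the intercept are assumptions made for this sketch.
weights = {"WS": 1.85, "BPM": 1.0, "DRBPG": 0.85}
intercept = -6.0

# Hypothetical players, expressed as standard deviations from the league mean.
star = {"WS": 3.0, "BPM": 3.0, "DRBPG": 1.0}
average = {"WS": 0.0, "BPM": 0.0, "DRBPG": 0.0}

p_star = mvp_probability(star, weights, intercept)
p_avg = mvp_probability(average, weights, intercept)
print(round(p_star, 3), round(p_avg, 4))
```

Because WS carries the largest weight, moving WS by one standard deviation shifts the score more than the same move in any other stat.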

Random Forest

Builds hundreds of decision trees, each asking a series of "is this stat above or below X?" questions, and averages their answers. It handles complex relationships between stats well and is less sensitive to any one unusual data point. Think of it as a large committee of simple rules voting together.

WS again dominates at ~0.31, accounting for roughly twice the importance of the next feature. VORP (~0.15) and BPM (~0.125) rank second and third. DRBPG (~0.10), PTSPG (~0.08), BLKPG (~0.07), FGPG (~0.065), and ASTPG (~0.06) contribute in a fairly tight mid-range band. USG% is the least important at ~0.05. Compared to logistic regression, the Random Forest spreads importance more evenly across features.
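The "committee of simple rules" picture can be sketched as follows. This is a toy: each "tree" asks a single question, and the thresholds are hypothetical values chosen by hand. In a real Random Forest the trees are deeper and are learned from random samples of the training data.

```python
def make_stump(feature, threshold):
    """A one-question 'tree': votes 1.0 if the stat is above the threshold."""
    def stump(player):
        return 1.0 if player[feature] > threshold else 0.0
    return stump


# Hypothetical hand-picked thresholds; a trained forest would learn these.
committee = [
    make_stump("WS", 12.0),
    make_stump("BPM", 8.0),
    make_stump("VORP", 7.0),
    make_stump("PTSPG", 27.0),
]


def forest_probability(player, trees):
    """Average the votes of every tree in the committee."""
    votes = [tree(player) for tree in trees]
    return sum(votes) / len(votes)


# Hypothetical stat lines.
candidate = {"WS": 15.0, "BPM": 9.5, "VORP": 6.0, "PTSPG": 29.0}
role_player = {"WS": 4.0, "BPM": 0.5, "VORP": 1.0, "PTSPG": 11.0}

p_candidate = forest_probability(candidate, committee)
p_role = forest_probability(role_player, committee)
print(p_candidate, p_role)
```

One tree disagreeing (here the VORP rule) lowers the candidate's score but does not sink it, which is the sense in which the forest is less sensitive to any single unusual number.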

Gradient Boosting

Also uses decision trees, but builds them sequentially: each new tree focuses on correcting the mistakes the previous ones made.

This model is heavily concentrated on just two features: BPM (~0.47) and WS (~0.41) together account for roughly 88% of total feature importance. The remaining features contribute little: PTSPG, VORP, ASTPG, and DRBPG add ~0.02–0.03 each, and BLKPG, USG%, and FGPG are effectively unused (near zero). This suggests the gradient boosting model learned that BPM and WS alone are nearly sufficient to separate MVP candidates.
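The "each new tree corrects the previous ones" idea can be sketched with one-split trees fitted to residuals. This is a simplified illustration under stated assumptions: it uses squared error on a single hypothetical feature, whereas a real MVP classifier would use all nine stats and most likely a classification loss. The data, `n_rounds`, and `learning_rate` are made up for the example.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then add stumps fitted to what is still wrong."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Hypothetical data: a BPM-like stat and a 0/1 "won MVP" label.
bpm = [1.0, 3.0, 5.0, 8.0, 10.0, 12.0]
won = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

model = boost(bpm, won)
low, high = model(2.0), model(11.0)
print(round(low, 3), round(high, 3))
```

When one or two features already separate the classes, every round keeps splitting on them, which is one plausible reason importance ends up concentrated the way it does above.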

Historical Results

The models were trained on data through 2017, so every season from 2018 onward is a genuine out-of-sample test: the models never saw any of these seasons during training.
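A minimal sketch of that split, assuming each row is a player season with a `season` field. The field names and the sample rows are hypothetical; only the 2017 cutoff comes from the post.

```python
LAST_TRAIN_SEASON = 2017  # cutoff stated above


def split_by_season(rows, last_train_season=LAST_TRAIN_SEASON):
    """Split rows by time so no test season leaks into training."""
    train = [row for row in rows if row["season"] <= last_train_season]
    test = [row for row in rows if row["season"] > last_train_season]
    return train, test


# Hypothetical rows; a real table would also hold the nine stats and the label.
rows = [
    {"season": 1974, "player": "Player A"},
    {"season": 2017, "player": "Player B"},
    {"season": 2018, "player": "Player C"},
    {"season": 2025, "player": "Player D"},
]

train_rows, test_rows = split_by_season(rows)
print(len(train_rows), len(test_rows))
```

Splitting by season rather than at random matters here: a random split would let the model see other players from the same season it is being tested on.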

Season	Actual MVP	LR	RF	GB
2018	James Harden	#2	#2	#1 ✓
2019	Giannis Antetokounmpo	#1 ✓	#1 ✓	#1 ✓
2020	Giannis Antetokounmpo	#1 ✓	#1 ✓	#1 ✓
2021	Nikola Jokić	#1 ✓	#1 ✓	#1 ✓
2022	Nikola Jokić	#1 ✓	#1 ✓	#1 ✓
2023	Joel Embiid	#2	#4	#2
2024	Nikola Jokić	#1 ✓	#1 ✓	#1 ✓
2025	Shai Gilgeous-Alexander	#3	#2	#569

Top-1 accuracy: LR 5/8 · RF 5/8 · GB 6/8

Top-3 accuracy: LR 8/8 · RF 7/8 · GB 7/8
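The accuracy figures can be recomputed directly from the ranks in the table. The function name is my own; the ranks are copied from the table above.

```python
# Rank each model gave the actual MVP, copied from the results table.
ranks = {
    2018: {"LR": 2, "RF": 2, "GB": 1},
    2019: {"LR": 1, "RF": 1, "GB": 1},
    2020: {"LR": 1, "RF": 1, "GB": 1},
    2021: {"LR": 1, "RF": 1, "GB": 1},
    2022: {"LR": 1, "RF": 1, "GB": 1},
    2023: {"LR": 2, "RF": 4, "GB": 2},
    2024: {"LR": 1, "RF": 1, "GB": 1},
    2025: {"LR": 3, "RF": 2, "GB": 569},
}


def top_k_hits(ranks, model, k):
    """Count seasons where the model ranked the actual MVP within the top k."""
    return sum(1 for season in ranks.values() if season[model] <= k)


models = ["LR", "RF", "GB"]
top1 = {m: top_k_hits(ranks, m, 1) for m in models}
top3 = {m: top_k_hits(ranks, m, 3) for m in models}

for m in models:
    print(f"{m}: top-1 {top1[m]}/{len(ranks)}, top-3 {top3[m]}/{len(ranks)}")
```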


In five seasons (2019–2022 and 2024), all three models agreed on the same #1 pick and were right every time.

In 2023, every model ranked Nikola Jokić #1, and by the numbers, he arguably had the better season. Joel Embiid won the award anyway. That kind of outcome may reflect voter narrative, voter fatigue, and team performance rather than pure statistics. In 2025, Gradient Boosting ranked Shai Gilgeous-Alexander #569, while Logistic Regression and Random Forest had him at #3 and #2 respectively. I have no idea why GB did this; it is likely a bug.

Future Direction

No model is perfect, and these have known blind spots. Team record is not included, even though MVP voters have historically punished players on losing teams regardless of individual stats. Injuries and narrative don't appear in a box score. And the training data skews toward an older era; the three-point revolution and the rise of players like SGA have introduced statistical profiles that the 1970s–1990s data doesn't fully capture.

Current Season Predictions (2025–26)

	LR	RF	GB
#1	Nikola Jokić	Shai Gilgeous-Alexander	Nikola Jokić
#2	Shai Gilgeous-Alexander	Nikola Jokić	Victor Wembanyama
#3	Victor Wembanyama	Victor Wembanyama	Giannis Antetokounmpo
#4	Luka Dončić	Giannis Antetokounmpo	Kawhi Leonard
#5	Jalen Johnson	Luka Dončić	Luka Dončić

Two of the three models have Nikola Jokić as the frontrunner. Random Forest is the dissenter, putting Shai Gilgeous-Alexander ahead. Victor Wembanyama appears in all three top 3s in just his third season, which is notable. Before running the models, I expected him to be #1 in all of them, given how heavily the models lean on advanced stats.

Conclusion

Thank you for reading. I hope you found this interesting. Basketball Reference also has its own model if you would like to see a different result. Please do not gamble on my models!

13 votes

New search engine reveals if ancestors were in Nazi party

~humanities.history Article

6 comments

BBC

April 15

23 votes
The space race (back) to the Moon: Artemis, moon bases a competition beyond orbit

~space Video 51:36

1 comment

YouTube: Perun

April 14

7 votes
Students develop faux but sexy robotic sage grouse to strut their stuff in an effort to move a Grand Teton National Park breeding-ground lek away from jets
~science
- biology
Article 1022 words, published Mar 24 2026
3 comments

wyofile.com

April 13

25 votes
Tesla's supervised self-driving software gets Dutch okay, first in Europe

~transport Article 685 words

1 comment

Reuters

April 13

14 votes
What might be going on with this indie game "fansite"?

~games Ask


I recently came across an interesting-looking indie game, Idols of Ash. Basically, you have to use a simple grapple-and-swing mechanic to descend through an eldritch underground complex while being pursued by a dangerous "murderpede" monster.

I first played it on what I thought was the official site, idolsofash.fun. It's a pretty spiffy design, with a playable web version, extensive FAQs, strategy guides, and embedded images and video of the game. But I ran into some bugs while playing -- no sound effects, weird lighting. When I mentioned these flaws on the developer's Itch.io page, they responded that they had nothing to do with the site.

Turns out it has a disclaimer at the very bottom: "Unofficial fan site. Not affiliated with or endorsed by Leafy Games." Buying and installing the actual version solved my tech issues. And in playing the game more, I noticed that the various guides on the site were subtly wrong in a lot of ways. The About page claims it's maintained by a big fan of the game, but in hindsight the whole thing seems AI-written and full of hallucinations.

Thing is, I don't get the angle here. There's no advertising on the site. It prominently links directly to the game's official Steam and Itch pages, so they're not trying to deliver malware or intercept the developer's sales. I assume the glitches are from a poor decompilation and rehosting of the original Godot engine game, but there's nothing to be gained from that. The presence of images and video suggests some level of human involvement in the site design, meaning it's not some cheap fire-and-forget thing. The URL and content are far too specific to flip into something else after gaining SEO rank. It presents (and acts) exactly like a non-commercial labor-of-love fansite (albeit one that shares the paid game for free in a broken state).

Could this be a genuine, if misguided, attempt by an actual fan to share the game using AI tools? Or is there some kind of scam I'm not seeing? Is this sort of fake AI fansite with embedded versions of the game a widespread problem with indie titles now?

5 comments

Jordan117

April 11

23 votes
Finishing the toy that Nintendo abandoned -- breathing new life into Mario Kart Live: Home Circuit

~games Article 746 words, published Apr 5 2026

6 comments

itsthejoker.github.io

April 10

34 votes
Anthropic announces deal with Google, Broadcom, says revenue has tripled
~finance
- business
Article 441 words
27 comments

Quartz

April 9

31 votes
Bitcoin’s creator has hidden behind the pseudonym Satoshi Nakamoto for seventeen years. But a trail of clues buried deep in crypto lore led to a 55-year-old computer scientist named Adam Back.
~finance
- cryptocurrency
Article
15 comments

The New York Times

April 9

27 votes
As an antidote to AI and online translation tools, a Cornell German professor gives her students a typewriter-only assignment once a semester

~humanities.languages Article published Mar 31 2026

0 comments

Associated Press

April 8

20 votes
Nation's largest urban battery is being built in Daly City, California

~enviro Link

9 comments

canarymedia.com

April 4

16 votes
Proton Meet isn't what they told you it was
~tech
- privacy
Article 1693 words
22 comments

sambent.com

April 4

25 votes
An ED resident and developer built a free, medically accurate clinical casebook for every patient of The Pitt

~tv Link

7 comments

reddit

April 5

20 votes
US imports more from Taiwan than China for first time in decades

~finance Article

8 comments

Bloomberg

February 22

20 votes
Denuvo DRM has been circumvented using hypervisor-based bypass

~games Link

16 comments

tomshardware.com

April 2

51 votes
Enjoying reading in the age of LLMs

~humanities Ask


I used to really value the art of essay writing. There seemed to be such a richness in the different ways people would construct arguments, structure those arguments, then deliver those arguments stylistically, not just from the perspective of being persuaded as a reader but also from the perspective of seeing how a given writer thinks, relates to the living tradition of language, and understands the world conceptually. But it's basically lost most of its meaning to me in this age of LLMs. The reality is, LLMs are capable of writing texts that, if you gave them to a seasoned reader 5 years ago, they'd say were well written and indicative of a truly thoughtful mind. Even if there currently exist certain tells with LLMs, those styles certainly existed in different ways in real human writing beforehand. Now, that perfectly reasonable set of styles is verboten and we have to dedicate half our deep focus to figuring out whether, or to what extent, an essay or article was written by AI. It's difficult to enjoy, let alone care about, essay writing and the writers behind it now.

I can still find value in books, though, because they were written in the past and I don't mind never reading any non-scientific book published after 2022 if it comes down to it.

6 comments

thearctic

April 2

23 votes
Why Swedish schools are bringing back books

~books Article 1383 words

12 comments

undark.org

April 2

15 votes
YouTube gets its own FAST channels

~tv Article 201 words

1 comment

lowpass.cc

April 2

6 votes
Quantum computing bombshells that are not April Fools

~science Article 513 words

1 comment

scottaaronson.blog

April 2

18 votes
Gyre
~creative
- writing
Article 2203 words, published Feb 17 2026
6 comments

xk3

March 31

15 votes
Inside the ‘self-driving’ lab revolution

~science Article 1787 words

3 comments

Nature

April 1

10 votes
AI software for smart glasses wins £1m prize for technology to help people with dementia

~health.mental Article 703 words, published Mar 18 2026

4 comments

The Guardian

March 29

10 votes
Landslide: a ghost story

~humanities Article 5432 words, published Dec 23 2025

1 comment

wrecka.ge

March 30

8 votes
Ageless Linux emerges to protest OS-level age verification laws
~tech
- linux
Article 946 words
42 comments

itsfoss.com

March 17

45 votes
Norway and Iceland have signed agreements to participate in the European Union's GOVSATCOM and IRIS2 secure communications programmes
~space
- satellites
Article 355 words, published Mar 26 2026
0 comments

europeanspaceflight.com

March 29

12 votes
Meta and YouTube found liable in landmark social media addiction trial

~health.mental Article 121 words

12 comments

BBC

March 25

53 votes
US regulator bans imports of new foreign-made routers, citing security concerns
~tech
- security.national
- security.cyber
Article 92 words
27 comments

Reuters

March 24

58 votes
Opinions wanted on regular DEXA scans
~health
- fitness
Ask (advice)

I’ve gone a bit too deep on a rabbit hole after an offhand comment about protein intake and how much protein I should actually be consuming. It turns out that the 1.6g/kg of body weight is fairly arbitrary and body weight itself is not a particularly good point to use for an estimate if you are overweight. With that in mind I have been wondering about getting a DEXA body composition scan. It would be useful, I think, because it can also tell me about visceral fat which is an area I am particularly concerned about.

It turns out that it’s pretty cheap to get done; about $45 if you sign up for quarterly scans with a company called BodySpec. Their whole thing is making things cheaper by having repeat visits; a quantity discount, if you will.

Before I decide to do this (and while I wait to hear back about if I can get one done for free with my health plan), I just wanted to get people’s opinions on them. Have you had one or a series done? And more importantly, how has it empowered you to improve your health?

In all honesty I’m not sure the results will encourage me to make any particular change in my lifestyle or routine that I wouldn’t have been able to figure out without it.

2 comments

Akir

March 25

7 votes
Fairphone released the industry’s first ever nature report - The impact of consumer electronics on nature and biodiversity

~enviro Link

5 comments

fairphone.com

March 24

24 votes
Michael Hafftka releases all of his ~3800 paintings as Creative Commons, explicitly for use in training AI
~arts
Link
6 comments

huggingface.co

March 22

23 votes
Quentin Tarantino and Sylvester Stallone are teaming for a 1930s-set series filming in black and white with “1930s cameras”

~tv Article 200 words

24 comments

tmz.com

March 20

15 votes
BYD claims five-minute electric vehicle charging with new battery tech

~transport Article 428 words, published Mar 6 2026

55 comments

autoweek.com

March 18

48 votes
What do you think about putting your driver's license in your digital wallet?
~tech
- android
- privacy
- apple
- google
Ask

I forgot my driver's license today but had my phone with me. I remembered seeing stories that Google and Apple both allow these (for some states) in the digital wallet.

Before doing this, I thought I would ask people here to weigh in on whether it is a good idea. Is it considered secure? Is it going to cause me more privacy issues than a physical card in my wallet?

This is also related to recent discussions about online age verification.

This is a related Tildes post from last year: Google Wallet adds age verification and more government ID support

44 comments

hobbes64

March 16

20 votes
Subnautica 2 publisher Krafton's CEO asked ChatGPT how to void $250 million contract, ignores lawyers, loses in court

~games Article 752 words

21 comments

404media.co

March 17

72 votes
Going to Europe this summer? Prepare for a long queue.

~travel Article 985 words

0 comments

BBC

March 17

17 votes
New technology promises to protect farmers from the next fertilizer shock
~finance
- business
Link
1 comment

acs.org

March 17

7 votes
A writing professor’s new task in the age of AI: Teaching students when to struggle
~life
- education.higher
Article 1145 words
18 comments

The Conversation

March 16

20 votes
RE//verse 2026: Hacking the Xbox One

~games Video 57:17

4 comments

YouTube: REverse Conference

March 15

14 votes
AI was eroding trust in my classroom — so I got rid of typed papers and bought my students notebooks instead
~life
- education.higher
Article 784 words, published Mar 7 2026
75 comments

Business Insider

March 12

37 votes
Rescue dog Rosie’s cancer shrinks after world-first mRNA vaccine
~health
- medicine
Link
3 comments

theaustralian.com.au

March 15

32 votes
NVIDIA forks Godot to add path tracing

~games Link

1 comment

80.lv

March 15

20 votes
Norwegian influencer buys failed property development in Spain to build ‘self-sufficient’ eco-community – Modern Eco Village plans to erect 500 homes, schools and shops
~design
- urban planning
- architecture
Article 1840 words, published Mar 2 2026
24 comments

elpais.com

March 7

23 votes
Last chance to watch: Where to stream every 2026 Oscar nominee before Sunday's big night

~movies Article 601 words

0 comments

pcmag.com

March 13

5 votes
English language music is losing its stranglehold on the charts – sixteen different languages appeared in Spotify's Global Top 50 last year, more than double the figure from 2020

~music Article 144 words

3 comments

mycketforvirrad

March 11

25 votes
The secretive company filling video game sites with gambling and AI

~games Article 4234 words

14 comments

aftermath.site

March 3

37 votes
Channel Surfer - Watch YouTube like it's cable tv

~tv Link

2 comments

channelsurfer.tv

March 11

9 votes
The first multi-behavior brain upload
~science
- biology
Article 546 words
17 comments

Substack: Dr. Alex Wissner-Gross

March 11

35 votes
US government announces pilot program for eVTOLS and ultralight aerial vehicles even without FAA certification

~transport Article 402 words

6 comments

WIRED

March 11

14 votes
Why do AI company logos look like buttholes?

~design Article 1341 words, published Apr 10 2025

1 comment

velvetshark.com

March 8

21 votes
