A common misconception is that the risk of overfitting increases with the number of parameters in the model. In reality, a single parameter suffices to fit most datasets
@lopezdeprado:
A common misconception is that the risk of overfitting increases with the number of parameters in the model. In reality, a single parameter suffices to fit most datasets: https://t.co/4eOGBIyZl9 Implementation available at: https://t.co/xKikc2m0Yf
A surprising result, nicely illustrated.
Huh. I wouldn't have expected that result at all, nor would I have guessed that it applies to pretty much any data set if you choose the representation correctly.
Beyond the abstract interest (and the clear refutation of anyone claiming parameter count maps to complexity), it seems like there might be some interesting applications for lossy compression. Probably fairly niche ones, given you'd be swapping CPU load for bandwidth/storage when the latter is generally much cheaper, but maybe useful in extremely high-latency or low-reliability signalling.
I highly doubt it. You're not magically packing more information into that one parameter; you're exploiting the fact that real numbers have infinite precision and can therefore store an unbounded amount of information. Computer approximations of reals, such as float or double, have finite precision and can't be used for that purpose.
You're right, I've just looked back at it and my initial assumption of how this worked was wrong. I was thinking it was essentially a brute-force search for an input parameter that caused the equation to converge to a given output (hence my CPU vs. storage comment), but it seems the information itself is just encoded directly in the input parameter.
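That's the crux of it. Here's a minimal sketch of the digit-packing idea (my own illustration, not the paper's exact sin²-based construction; `encode`/`decode` are hypothetical names): the "parameter" is just a number whose expansion literally contains the data, so exact rationals recover it perfectly while a finite-precision float loses everything past its mantissa.

```python
from fractions import Fraction

def encode(values, base=10):
    # Pack each value (0..base-1) as one digit after the "decimal point"
    # of a single rational parameter alpha in [0, 1).
    alpha = Fraction(0)
    for i, v in enumerate(values, start=1):
        alpha += Fraction(v, base ** i)
    return alpha

def decode(alpha, n, base=10):
    # Recover digit k by shifting k places left and taking the last digit.
    return [int(alpha * base ** k) % base for k in range(1, n + 1)]

data = [3, 1, 4, 1, 5, 9, 2, 6]
alpha = encode(data)
print(decode(alpha, len(data)))  # exact arithmetic: recovers the data

# With float64 (~15-16 significant decimal digits), the same scheme
# breaks down once the data needs more precision than the type holds:
long_data = [7] * 30
approx = float(encode(long_data))  # rounded to 53 bits of mantissa
recovered = [int(approx * 10 ** k) % 10 for k in range(1, 31)]
print(recovered == long_data)     # later digits are garbage
```

So there's no free lunch for compression: the single parameter needs as many bits of precision as the data it encodes, which is exactly why the trick works on paper with reals but not with doubles.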