5 votes

A common misconception is that the risk of overfitting increases with the number of parameters in the model. In reality, a single parameter suffices to fit most datasets

@lopezdeprado:
A common misconception is that the risk of overfitting increases with the number of parameters in the model. In reality, a single parameter suffices to fit most datasets: https://t.co/4eOGBIyZl9 Implementation available at: https://t.co/xKikc2m0Yf
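
For context, a minimal sketch of the construction (my reading of the linked paper, not the authors' reference implementation): the scalar model is f_θ(x) = sin²(2^(x·τ) · arcsin(√θ)), and because doubling shifts binary digits, θ can be chosen so that its binary expansion spells out the data, τ bits per point. It needs arbitrary-precision arithmetic (mpmath here), since θ must carry every bit of every target:

```python
# One-parameter "fit anything" sketch: encode the dataset into the binary
# digits of a single real theta, then read it back with
# f_theta(x) = sin^2(2^(x*tau) * arcsin(sqrt(theta))).
from mpmath import mp, mpf, sin, asin, sqrt, pi

tau = 16                       # bits devoted to each data point
ys = [0.25, 0.9, 0.37, 0.62]   # targets, scaled into [0, 1)

mp.prec = tau * (len(ys) + 2)  # working precision must exceed the total bit count

def encode(ys, tau):
    """Pack all targets into one real parameter theta."""
    bits = ""
    for y in ys:
        w = asin(sqrt(mpf(y))) / pi         # w lies in [0, 1/2)
        for _ in range(tau):                # take the first tau binary digits of w
            w *= 2
            b = int(w)
            bits += str(b)
            w -= b
    z = mpf(int(bits, 2)) / 2 ** len(bits)  # z = 0.b1b2b3... in binary
    return sin(pi * z) ** 2                 # theta, via the sin^2 conjugacy

def model(theta, x, tau):
    """f_theta(x): shifting by x*tau bits exposes the x-th point's digits."""
    return sin(2 ** (x * tau) * asin(sqrt(theta))) ** 2

theta = encode(ys, tau)
for x, y in enumerate(ys):
    print(x, y, float(model(theta, x, tau)))  # each y comes back to ~2^-tau
```

Note there is no fitting step at all: the "fit" is exact up to the τ-bit quantisation, which is the point of the paper.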

4 comments

  1. joelthelion

    A surprising result, nicely illustrated.

    1 vote
  2. Greg

    Huh. I wouldn't have expected that result at all, nor would I have intuitively picked up that it applies to pretty much any data set if you define the representation correctly.

    Beyond the abstract interest (and the clear refutation for anyone claiming parameter count maps to complexity), it seems like there might be some interesting applications for lossy compression. Probably fairly niche ones, given you'd be swapping CPU load for bandwidth/storage when the latter is generally much cheaper, but maybe useful in extremely high-latency or low-reliability signalling.

    1. joelthelion

      > it seems like there might be some interesting applications for lossy compression

      I highly doubt it. You're not magically saving more information into that one parameter, but rather using the fact that reals have infinite precision and thus the ability to store an infinite amount of information. Computer approximations of reals such as float or double have finite precision and can't be used for that purpose.
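
      A minimal way to see that concretely (same construction as the sketch under the topic, but run entirely in 64-bit floats, which hold ~53 significand bits): at 16 bits per point, the fourth point's digits already fall off the end of the parameter.

      ```python
      import math

      tau = 16
      ys = [0.25, 0.9, 0.37, 0.62]   # 4 points x 16 bits = 64 bits > ~53 bits in a double

      bits = ""
      for y in ys:
          w = math.asin(math.sqrt(y)) / math.pi
          for _ in range(tau):
              w *= 2
              b = int(w)
              bits += str(b)
              w -= b

      theta = math.sin(math.pi * int(bits, 2) / 2 ** len(bits)) ** 2  # rounded to a double

      for x, y in enumerate(ys):
          fx = math.sin(2 ** (x * tau) * math.asin(math.sqrt(theta))) ** 2
          print(x, y, round(fx, 4))   # the last point typically comes back visibly wrong
      ```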

      1 vote
      1. Greg

        You're right, I've just looked back at it and my initial assumption of how this worked was wrong. I was thinking it was essentially a brute-force search for an input parameter that caused the equation to converge to a given output (hence my CPU vs. storage comment), but it seems the information itself is just encoded directly in the input parameter.
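
        A quick bit-count makes that explicit (my arithmetic, not from the thread): at τ bits of precision per point, n points force θ to carry n·τ bits. For example, 1,000 points at 16 bits each mean a 16,000-bit parameter, about 2 kB, exactly the size of the raw data, so nothing is actually compressed.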

        2 votes