10 votes

The unstoppable rise of disposable ML frameworks

5 comments

  1. [4]
    skybrian
    Link
    (Instead of "disposable" I would say "specialized".) From the blog post: ... ...

    (Instead of "disposable" I would say "specialized".)

    From the blog post:

    The GGML framework is just over a year old, but it has already changed the whole landscape of machine learning. Before GGML, an engineer wanting to run an existing ML model would start with a general purpose framework like PyTorch, find a data file containing the model architecture and weights, and then figure out the right sequence of calls to load and execute it. Today it’s much more likely that they will pick a model-specific code library like whisper.cpp or llama.cpp, based on GGML.
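
    To make that “sequence of calls” concrete, a minimal sketch of the general-purpose path might look like the following (the model class, weights file name, and input shape are illustrative placeholders, not details from the post):

        import torch
        import torch.nn as nn

        # Stand-in architecture for whatever model the downloaded weights belong to.
        class TinyClassifier(nn.Module):
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

            def forward(self, x):
                return self.net(x)

        model = TinyClassifier()
        # state = torch.load("weights.pt", map_location="cpu")  # hypothetical weights file found alongside the model
        # model.load_state_dict(state)
        model.eval()

        with torch.no_grad():
            logits = model(torch.randn(1, 1, 28, 28))  # dummy input in place of real data
        print(logits.shape)

    The model-specific route mentioned above is typically closer to a checkout, a build, and a single binary run against a weights file, with no framework install at all.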

    This isn’t the whole story though, because there are also popular model-specific libraries like llama2.cpp or llama.c that don’t use GGML, so this movement clearly isn’t based on the qualities of just one framework. The best term I’ve been able to come up with to describe these libraries is “disposable”. I know that might sound derogatory, but I don’t mean it like that, I actually think it’s the key to all their virtues! They’ve limited their scope to just a few models, focus on inference or fine-tuning rather than training from scratch, and overall try to do a few things very well. They’re not designed to last forever, as models change they’re likely to be replaced by newer versions, but they’re very good at what they do.

    By contrast, traditional frameworks like PyTorch or TensorFlow try to do many different things for a lot of different audiences. They are designed to be toolkits that can be reused for almost any possible model, for full training as well as deployment in production, scaling from laptops (or even in TF’s case microcontrollers) to distributed clusters of hundreds of GPUs or TPUs. The idea is that you learn the fundamentals of the API, and then you can reuse that knowledge for years in many different circumstances.

    ...

    I was responsible for creating and maintaining the Raspberry Pi port of TensorFlow for a couple of years, and it was one of the hardest engineering jobs I’ve had in my career. It was so painful I eventually gave up, and nobody else was willing to take it on! Because TF supported so many different operations, platforms, and libraries, porting it and keeping it building on non-x86 platforms was a nightmare. There were constantly new layers and operations being added, many of which in turn relied on third-party code that also had to be ported. I groaned when I saw a new dependency appear in the build files, usually for something like an Amazon AWS input authentication pip package that didn’t add much value for the Pi users, but still required me to figure out how to install it on a platform that was often unsupported by the authors.

    The beauty of single-purpose frameworks is that they can include all of the dependencies they need, right in the source code. This makes them a dream to install, often only requiring a checkout and build, and makes porting them to different platforms much simpler.

    ...

    One reason I’m so sure is that we’ve seen this movie before. I spent the first few years of my career working in games, writing rendering engines in the PlayStation 1 era. The industry standard was for every team to write their own renderer for each game, maybe copying and pasting some code from other titles but otherwise with little reuse. This made sense because the performance constraints were so tight.

    5 votes
    1. [3]
      Pioneer
      Link Parent

      I wonder how much this is like the schism in BI tools in the mid-2010s?

      There are two schools of thought around tooling, and it's either GUI/Low-Code or Code. There are some truly insane people who insist that EVERYTHING must be coded, and there are some insane folks who insist No-Code is God.

      The same seems to be happening with ML tooling and frameworks at the moment: easy-to-throw-down packages that may come with serious cost implications, because you can't tweak what you need when and where you need it, versus the heavy-handed Data Sci guys who want to build it all from the ground up.

      Wonder where the future will take us?

      3 votes
      1. [2]
        xk3
        Link Parent

        There are some truly insane people who insist that EVERYTHING must be coded, and there are some insane folks who insist No-Code is God

        I'm curious what you imagine a healthy balance looks like? Do you have some examples? Thanks

        1. Pioneer
          Link Parent

          I don't think there is such a thing. In Data we've decided that tech & tools are 'everything' and it's bloody exhausting.

          Data Platforms / Solutions should be capable of handling almost anything thrown at them. ML is exactly the same really: some folks will use tools like Alteryx and others will use Py... so use the right tool for the job.

          1 vote
  2. DawnPaladin
    Link

    Interesting. Sometime soon I want to add some AI skills to my toolbox. I have lots of experience with JavaScript but no computer science degree and I don't remember any math beyond algebra. I've been planning to do Practical Deep Learning for Coders, which uses FastAI and PyTorch. It sounds like this author is saying that path is deprecated. Is there a better choice?

    2 votes