This is kind of exciting, as this is not a super expensive setup -- I've already got this hardware in my desktop. This is the realm of pretty high-capability modeling -- no surprise they targeted 70B, since that's the size of the largest Llama 2 model. I firmly believe that current GPT-4 capabilities will eventually be achievable at this size, although obviously not with today's techniques.
The article itself is also a really good overview of the history of training optimization and of where the current state of the art stands.