This is kind of exciting, as this is not a super expensive setup -- I've already got this hardware in my desktop. This is the realm of pretty high-capability modeling -- no surprise they targeted 70B, since that's the size of the largest Llama 2 model. I firmly believe that current GPT-4 capabilities will eventually be achievable at this size, although obviously not with today's techniques.
The article itself is also a really good overview of the history of training optimization and of where the current state of the art stands.