What useful tasks are possible with an LLM with only 3B parameters?
Playing with Llama 7B and 13B, I found that the 13B model could handle a simple task: rewriting titles in sentence case for Tildes submissions. The 7B model doesn't appear capable of the same task out of the box.
I heard about Android's new AICore available on a couple of new devices. But it sounds like Gemini Nano, which runs on-device, can only handle 2B or 3B parameters.
Is this size of model useful for real tasks? Does it only become useful after training on a specific domain? I'm a novice and want to learn a little bit about it. On-device AI is an appealing concept to me.
When you say out of the box, what do you mean? Because the raw language model is indeed not very useful, but any of the instruct-tuned 7B variants are more than capable of such a simple task as changing sentence case.
3B models are good at quite a lot of tasks. Extracting keywords, certain kinds of summarization, and content moderation are things I have applied them to. Non-LLM models such as image categorizers, or multi-modal models that can describe photos or videos in text form, can also work effectively within such limited sizes.
That is helpful, thanks! I should have clarified: I've been trying Ollama and their default llama2 model.
Looking at that page now, apparently Ollama's default uses a "Chat" variant:
I just tried the "llama2:text" (7B) variant mentioned on that page, and that was a mistake:
I should play with different training variants, it sounds like.
Update: This works better with a different prompt, on the default (Chat) 7B model:
Convert this title to sentence case without adding a period: Apple Releases macOS Sonoma 14.4.1 With Fix for USB Hub Bug
A slightly different prompt makes all the difference. Instead of returning the title unchanged, it now does what I ask. Finding the exact wording of the prompt takes some trial and error.
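For anyone who wants to repeat this kind of prompt experiment from a script instead of the terminal, here is a minimal sketch in Python. It assumes a local Ollama server on its default port (11434) and uses Ollama's `/api/generate` endpoint with streaming turned off. The helper names (`build_prompt`, `build_payload`, `sentence_case`) are my own for illustration, not part of Ollama, and the model tag is a parameter so the Chat and text variants can be compared.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(title: str) -> str:
    """Wrap a title in the prompt wording that worked on the default 7B model."""
    return f"Convert this title to sentence case without adding a period: {title}"


def build_payload(title: str, model: str = "llama2") -> dict:
    """Build the JSON body for a single, non-streaming generate request."""
    return {"model": model, "prompt": build_prompt(title), "stream": False}


def sentence_case(title: str, model: str = "llama2",
                  url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the model's text reply."""
    data = json.dumps(build_payload(title, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the full reply arrives in the "response" field.
    return body["response"].strip()
```

With the server running, `sentence_case("Apple Releases macOS Sonoma 14.4.1 With Fix for USB Hub Bug")` should return the model's rewrite, and passing `model="llama2:text"` tries the same prompt against the text variant. Changing the wording inside `build_prompt` is the quickest way to do the trial and error.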
So I definitely underestimated the power of 7B, and it's great knowing that a 3B model is going to be good enough for a lot of useful tasks as well.