Is there a free LLM frontend that works out of the box?
I want something like typingmind but for free, and that doesn't require installation. Mainly for Gemini and Mistral (or perhaps Groq too). I just want to be able to paste my API key and use it. I know about OpenWebUI and msty, but OpenWebUI requires installation and msty doesn't have an Android version.
Anyone know of something like this? (Would also be nice if it supports LaTeX.)
I'm not aware of any free utility like this. It sounds like you just want a simple web app with a generic interface for different LLM services. IIUC you don't want to run the actual models locally (which isn't even possible for e.g. Gemini). However, since this is a fairly trivial app to set up, I gave Gemini the following prompt:
And after a few additional prompts I had a single HTML file that could talk to Gemini via the API (I had to give it some of the curl commands used to talk to the API and the structure of the responses). I didn't test the Mistral part since I don't have that service.
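For reference, the core of such a file is basically one fetch call to Gemini's generateContent REST endpoint. A rough sketch of what that looks like (from memory, not the exact code it generated for me; the model name and response shape may differ from what the LLM produces for you):

```typescript
// Rough sketch of the kind of call the generated page makes. The v1beta
// endpoint, model name, and response shape here are assumptions from memory.
async function askGemini(apiKey: string, prompt: string): Promise<string> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `gemini-1.5-flash:generateContent?key=${apiKey}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{ role: "user", parts: [{ text: prompt }] }],
    }),
  });
  if (!res.ok) throw new Error(`Gemini API error: ${res.status}`);
  const data = await res.json();
  // Take the first candidate's text; real code should also handle empty
  // candidates and safety blocks.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```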
Once you have an HTML file that accomplishes what you need, you could simply upload it as a static site and access it from any device. Note that it wouldn't sync conversations across devices, but I'm sure you could prompt the LLM to use some syncing API to store your data... A bit more involved than just installing e.g. OpenWebUI somewhere, to be honest though :)
Not sure how I didn't think of that, but you're right. I used your prompt and it made a web page for it. Works great. I wonder if it can add more complex features like saving prompts.
Unless you actually understand the output and know what targeted questions to ask, it will be difficult to take it much further purely by using LLMs. They tend to lose the plot over time as the context gets too big.
It's why I think they are great tools in areas where you have the knowledge to validate their output, be critical, adjust and restart a conversation chain. For everything else they might create a decent proof of concept but I wouldn't rely on them too much.
Not to mention that without that knowledge you can't validate that what you have now works as well as you think it does. Especially where cost is concerned. I am not sure about Google, but Anthropic and OpenAI both offer what they call "prompt caching", which can reduce usage cost a lot. The way they implemented it differs between the two, so you would need to actually know about these features to implement them properly. Since these are relatively new features, the current models don't have much information about them in their training data (if any).
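To give an idea of what that looks like on the Anthropic side: caching is something you opt into per request by marking a reusable block with `cache_control`. A rough sketch from memory of their messages API (field names and the model ID may have changed, so check their docs):

```typescript
// Rough sketch of opting into Anthropic's prompt caching; field names and the
// model ID are from memory and may be out of date.
async function askClaude(apiKey: string, bigSystemPrompt: string, question: string) {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      // Marking the large, reused system prompt as cacheable is what cuts cost.
      // Code generated by a model that has never "seen" this feature will simply
      // never include it, and nothing will look broken.
      system: [
        {
          type: "text",
          text: bigSystemPrompt,
          cache_control: { type: "ephemeral" },
        },
      ],
      messages: [{ role: "user", content: question }],
    }),
  });
  if (!res.ok) throw new Error(`Anthropic API error: ${res.status}`);
  return res.json();
}
```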
That is just one example, and how much of an issue it is really depends on your use case. But I am always highly skeptical of "let an LLM just generate it" comments, to the point that I was tempted to mark the comment above as noise (I didn't), because I think using LLMs that way is bad practice to the point of being harmful.
I'm curious why, and more precisely what you mean. For me, as a software developer, I have the advantage of being able to scan the generated code and know where things seem off (and also guide the LLM to generate a valid solution). I could probably have created the same thing in maybe twice the time it took to write up the prompt and guide the LLM to a working solution (and it would probably be a bit neater and all that). But to me this application is really straightforward and boring. There is nothing new about it. It is essentially glue code. So to me a generic interface for different LLMs via API calls seems like an obvious use case that can be generated (even without much development knowledge).
So in what way is it a "bad practice to the point of being harmful to use LLMs" in this way?
Frankly, I think your advice came from a position of “curse of knowledge”.
The bolded bit highlights why I find this advice problematic. You as a developer have the ability to properly validate the code that rolls out of it and act based on the results. It is a different story for someone without development experience: they cannot do this and have to pretty much blindly trust whatever the LLM produces.
I have no doubt that you can ask an LLM to create a generic boilerplate SPA, as you suggested, that works to some degree or other. But even with that very basic request you can already get very different results. Different LLM providers have distinct API implementations (like prompt caching) that can significantly impact costs. You know how to double-check the API calls it makes against the API documentation. You can ask it to target a specific type of API call. Because you have the experience that lets you do this. Someone without that experience and knowledge of APIs cannot evaluate whether the code is making efficient use of the APIs or whether it might rack up unnecessary costs.
But, all things considered, if that had been the only part of your advice I wouldn't have had as much of an issue with it.
What really caused me concern is this part of your advice:
In my opinion this turned it into bad advice to give to someone you don't know has the experience to properly handle this.
Because your suggestion moves the entire thing from a local-only application to something they need to host with internet-exposed endpoints, introducing the need for proper authentication, security against bad actors, and data protection. Even experienced developers can find implementing these securely challenging. How can someone without development experience validate that the LLM's code properly handles these things?
Not only that, LLMs also tend to “lose the plot” in longer conversations, especially with complex code contexts. They might generate code that appears to work but has subtle bugs, or suddenly introduce regressions into code they generated earlier.
For example, an LLM might initially generate code that properly handles API authentication, but in a follow-up prompt about adding a new feature, it might regenerate that section without the security checks while maintaining an apparently working interface. A regression you likely would be able to catch quickly enough, but someone without your background and experience wouldn't know to look for it.
You might think this is far-fetched, but if you have ever had a slightly longer “conversation” with an LLM involving code, you know that they tend to break down after a while and start to “miss” things. This happens more quickly the more code you provide upfront.
It's similar to why we wouldn't advise someone to blindly trust medical advice from an LLM. The stakes might be different, but the principle is the same. Without the expertise to validate the output, you're putting a lot of faith in a system that isn't designed to be authoritative.
It really is the difference between using an LLM as a tool within your tool belt versus treating it as a developer in its own right. LLMs can be great learning tools: asking them to explain concepts, looking at their code examples to understand patterns, or using them to explore different approaches. But treating them as a replacement for development expertise, especially when you can't validate their output, is where things become problematic.
I honestly think you made your comment with the best intentions; I did not mean to imply there was malicious intent. But I did want to point out to cuteFox that there are risks involved, which I think I made clear enough.
Thanks for the clarification! I think you make a good point; I should have made it clearer that once you start doing something more involved (i.e. when it is no longer a simple app contained in a single HTML file), it's probably time to start using some dedicated software such as OpenWebUI, which can be set up to use a remote API with an API key (so you basically just run a webserver that calls e.g. ChatGPT). That was what I ended on, but I see that wasn't clearly expressed!
I would, however, trust the user to evaluate whether the app fits their requirements in the case of the SPA. And I would trust that single HTML file generated by an LLM much more than a fairly unknown service like typingmind, where you are prompted to hand over your API key! That might just be me being distrustful of SaaS providers, though.
Big AGI allows you to run it directly in the browser (use the launch button) and works fairly well in my experience. The current release doesn't support Gemini and Mistral directly, but you can use them through OpenRouter.
V2 is in development and was planned to release this month; not sure if that is still happening. I have been running it locally on the v2 branch, and it will also support Google directly once that is released.
As far as installations go, did you mean in the context of not wanting to mess with Docker on a VPS to be able to use it on mobile, or no installations at all? Because for Android there are also apps like GPTMobile (available on both F-Droid and the regular store) that work reasonably well.
Of note, I can't stop thinking about that name, it's so perfect, so prescient, so horrifically sardonically bad, that I just adore it. I would have personally called it "BIG agi" just to flip the dynamic, but it's pretty great.
How about a name that flips the dynamic even further? :)
Heh, yeah, the name certainly is a thing. Of all the LLM API front-ends, it is easily one of my favorites. Most of the other projects are frantically throwing in every gimmicky feature they can think of, with little eye for being a well-rounded product. OpenWebUI is the biggest offender: it has glaring bugs in output rendering, and I wouldn't dare host it publicly simply because I don't trust it to actually be secure.
It would be nice if Big AGI actually had account support for conversation syncing. But I have also found that I rarely need conversations on a different device, and if I do, it's mostly just the output anyway.
Thanks, Big AGI looks like exactly what I wanted! And actually, I tried it out and it does support both Gemini and Mistral. Regarding GPTMobile, I tried several other apps like that but mostly didn't like the UI (or they were missing some features). This one seems much better than the others, but still has that rounded-corners UI 😅 Nevertheless, I'll try it.
OpenRouter has a few free models you can use in their chat interface.
Oh, I didn't realise Gemini was free there as well, although it keeps showing errors when I use it; I guess they're hitting the rate limit. Thanks.
As far as I'm aware, between your three criteria of free, no installation needed, and the implied "not hopelessly limited", you will only be able to pick two. Mozilla's llamafile simplifies the process down to "download file, execute", but it's still something you have to run on your own hardware.
Well, there are many such tools, like LM Studio, Jan.ai, msty, etc., but I don't have the hardware to run any of the bigger models. The most I can run is 8B :(
Not exactly what you're asking for, but since you mentioned that you mostly use Gemini, you can use Google AI Studio for free with no limitations, with access to all of their models.