Paying for AI: Have you found it to be worth it?
I've been using AI more and more, and I'm getting some value out of it. I'm curious whether the paid tiers of the big players (in particular, ChatGPT and Claude) provide significantly better responses.
I'm aware that the paid tiers offer more features and benefits than just higher response quality. For me, those are just nice-to-haves, and not my primary concern.
My main uses of AI are software development and foreign language learning. So far, I've used the free versions of ChatGPT and Claude, as well as "proxies," including Github Copilot and Duck.ai. For both my use cases, I've found the responses usually good and helpful. I just maintain a healthy skepticism about the correctness of the answers, and challenge, test, and double check where needed (especially testing suggested code when developing software).
Have you found response quality to be noticeably and significantly better with paid tiers? It occurred to me that the cost of an AI subscription is in the same ballpark as a subscription to a language learning service like Duolingo. So, if I can get value from AI that approaches what I'd get from a dedicated language learning service (even if it doesn't quite match or exceed it), then also getting the value of general AI in the same subscription should make it quite worthwhile. Not to mention possibly getting better software development assistance in the same package.
I'm losing interest in LLMs. I find myself turning to Wikipedia instead of chat "search" (I would love a semantic search function in Wikipedia) and using AI less with software development. They are often more trouble than they are worth when writing code - with the exception of boilerplate and autocomplete. I feel like students and junior developers should be cautioned against using LLMs for more than just API documentation search.
That's exactly how I feel and, at least at my job, you're rarely actually writing boilerplate and an LSP server handles autocomplete just fine.
The one thing I've been using it for has been when I'm trying to do certain script-y things in the shell or in Python, like "how do I do <x> in bash" type questions, where <x> is relatively simple, and it can be decent.

Yep, that's pretty much how I use it, and I kinda see myself using it like that for a while. The AI things that try to automate everything or try to solve too many problems at once always seem to fail to deliver.
I actually wrote a CLI tool (as many others have) that fits my workflow specifically for this. Honestly, the APIs between the models are fairly consistent, so it was straightforward to do. I highly recommend building your own for your own custom needs. If you do build one, be wary of building on top of other people's tools/libraries; there's a lot of cruft out there that tries to seem like it does everything for everyone. My engineering philosophy is to do only as much abstraction as absolutely necessary, and it has served me well so far.
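For what it's worth, here is a minimal sketch of what such a CLI can look like. It assumes an OpenAI-style chat-completions endpoint, which most hosted providers expose; BASE_URL, MODEL, and the DRY_RUN switch are my own placeholders, not the commenter's actual tool, and a real version would JSON-escape the prompt.

```shell
# Minimal chat CLI sketch. Assumes an OpenAI-style /chat/completions
# endpoint; BASE_URL, MODEL, and API_KEY are placeholders.
ask() {
  BASE_URL="${BASE_URL:-https://api.openai.com/v1}"
  MODEL="${MODEL:-gpt-4o-mini}"
  # Build the request body (no JSON escaping here; a real tool should escape the prompt)
  body=$(printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$MODEL" "$*")
  if [ -n "$DRY_RUN" ]; then
    printf '%s\n' "$body"      # show the request instead of sending it
  else
    curl -sS "$BASE_URL/chat/completions" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d "$body"
  fi
}
```

Pointing the same function at a different provider is then mostly a matter of changing BASE_URL and MODEL, which is what makes a small personal tool like this cheap to maintain.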
I see your point, but I myself never use AI for autocomplete or autopaste. I mainly use it to help debug by pasting error messages, etc., and working with it to hunt down a bug or config problem, etc. I always work with the conversation UI, and not an auto-filling IDE integration. I'll admit its track record is not perfect, but, on the other hand, I also have to admit that it has helped me -- probably at least half the time. I find that corporate AI accounts help by allowing, say, Copilot to have the whole closed-source codebase(s) of the company available for searching and analysis. When working across timezones, and colleagues are no longer around to get help from, AI can be a good second-best option.
Yeah I've used Cursor to semantic search in a new codebase before with success.
Edit: As a freelancer it makes me appear more competent as I can finish a job asking fewer questions about basic stuff like "Where does this name come from?" or "Where is this operation performed?". The cleanest set of interactions are such that I collect business/design/implementation requirements and then handle everything else on my own.
I'm leaning the same way myself. I'm tired of being confidently lied to by them when asking about documentation. It's usually worst when asking about software that has had frequent UI changes. I'm constantly having to remind them what version I'm using, and even then they spit out convincing-sounding, incorrect answers that I then have to waste time confirming are incorrect.
I do. Deep research (in its various iterations) is fantastically useful and I still feel like I'm only scratching the surface of what it could be used for. And using the regular models, as well, has been helpful, especially if I make an effort to stay in their wheelhouses. Trip planning, for example, can be fantastically helpful. (Models are good at it because it's text based and there's so much data on trips on the internet).
To expand on that a bit: I have always been a good trip planner. I am good at logistics. I can find good information via well-executed searches. And even so, using an LLM makes me more efficient -- really, I think this is the scenario where models add the most value. It's not when you don't know what you're doing, it's when you do know what you're doing, you know what you're looking for, and want to offload very specific tasks to the model.
Any example workflows/tasks you're using LLMs for with trip planning?
For travel, my family spends most of our time debating destinations, flights, hotels, and transportation. We don't spend much time planning itineraries once we have a basic idea of the activities available around our destination.
For sure. Here's a recent example -- I'd asked Gemini to give me some options for getting into central London during a (very) long layover. Here's what it said (and as far as I can tell, everything ended up being correct):
Unfortunately other trips were more than 90 days ago so it hasn't retained the conversations, but I also found it extremely useful for finding hotels meeting our criteria in a series of cities I fed it based on our itinerary. (An itinerary which I also used Gemini to help craft). I asked it to give me some options for luxury hotels that were still family friendly in [European capital city]; it gave me a few; I asked which was best, and it responded that they were all more or less the same, but one was much closer to a park with a playground. That's legitimately helpful advice, and we ended up staying in that hotel.
Point is, if I just said "give me hotels," it's going to flounder. But giving it some criteria and then asking it to refine works quite well. I've found giving it price points is imperfect, but it does okay.
I found mindtrip.ai to be useful for trip planning actually. It gives you a shared itinerary and you can modify it via group chat and private chat. You can also just paste in hotel/train/plane receipts and it fills them in to your itinerary.
Yes, the advanced reasoning models are substantially more accurate and useful. I have a paid account for both ChatGPT and Claude and have good experiences with both.
As an aside, what languages are you learning with AI? I’m developing a language learning AI app at the moment. Would you like a free account to test it?
I'm currently focusing on Korean, but it's possible I might also start (or continue) learning other languages, too, like French and Japanese. I'm generally happy to help do software testing, especially for small startups and open source projects -- however, if by "app", you mean something closed-source that needs to be installed (whether mobile or desktop), I'm afraid I'll have to pass (though I appreciate the offer). If your service is accessible purely through a web interface, sure, feel free to throw me a DM.
Oh, and thank you for sharing your experience and opinion on paid AI.
It’s both an app and a web page. Really, it’s a web page wrapped as an app since you can’t really put powerful LLMs on device yet, but you can access it via a regular desktop browser as well. I’ll shoot you a DM with the details.
I have access to ChatGPT via work, and I find both deep research and regular use of the premium thinking models really useful for investigative work. Since you have experience with both, can you compare these features with the premium Claude offering for me?
The things I like most about Claude pro are:
I don’t use the Google workspace connection or the code system directly. I do use it for coding though, in copilot agent mode.
I will second Claude projects. They sounded like a gimmick to me at first, but they've turned out to be super helpful so far: they eliminate the overhead of specifying the same context over and over with each new chat. You can definitely work around this manually—just ask really detailed questions. But it's so nice to upload (for example) a .txt of some software's help file and have it automatically just know all that.
You can also use AI to help generate context documents. Start a new chat: "Let's pretend you're a new person on my team. I'll start describing the project, and you ask questions about what you don't understand." Then have a back-and-forth conversation. After a while, you ask it to write an "onboarding doc" in Markdown, you do some editing on your own, and then you upload the doc to the project. It still takes effort; the AI can't magically know your situation, and you still have to type all the details. But you can do it in a more casual manner, and you don't waste time describing things the AI already knows.
Learning Italian currently and I am definitely interested in a test account.
I’ll DM you some details.
I got Gemini Pro for a year with my phone. It's kinda cool and I use it occasionally, but I definitely won't renew it. I use LLMs very little in my life; I just haven't found them to be particularly useful versus a quick Kagi search.
So you haven't tried doing things with AI beyond informational search? My use cases (software dev, language learning) seem well-suited to having back-and-forth with an AI to solve a problem, or refine details and nuances.
I find they more frequently go in circles, getting worse and worse with the increasing token count.
For example, I got a basic awk script to do a simple filter, but it couldn't refine that into a more complex awk script that counted and summed multiple instances the way I wanted. I got a bit closer to what I needed after restarting from scratch. Sometimes the time spent refining the perfect prompt would be better spent reading documentation...
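For what it's worth, the count-and-sum step is the kind of thing the documentation route gets you fairly quickly. Here is a sketch of such a script, with an invented input format (key in column 1, amount in column 2, keep only positive amounts); the field layout is my own illustration, not the commenter's actual data.

```shell
# Filter rows, then count and sum per key with awk.
# Assumed layout: $1 = key, $2 = amount; keep rows with $2 > 0.
count_and_sum() {
  awk '$2 > 0 { count[$1]++; total[$1] += $2 }
       END { for (k in count) printf "%s %d %d\n", k, count[k], total[k] }'
}
```

For example, `printf 'a 1\nb 2\na 3\n' | count_and_sum` prints one line per key with its row count and total. Note that awk's `for (k in count)` iterates in arbitrary order, so pipe through sort if the order matters.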
Most AI providers have a developer playground where you can use the more advanced models for little / no cost:
OpenAI (ChatGPT) and Anthropic (Claude) are quite cheap to use (pay as you go). AI studio (Gemini) is completely free.
The only downside is that the UIs are more complex compared to something like chatgpt.com
If you’re impressed with the results you can upgrade to a monthly subscription (or just keep using the developer playground) :)
I set up open-webui, perplexica, and automatic1111 in my homelab to learn/play.
I've been running Stable Diffusion and LLMs locally on my 4080, and also using openrouter when I want to play with paid models.
I'm not a programmer at $job, so I can take it or leave it, usually. I've used it to summarise a few notes and draft up documents, but only to break the 'blank page' problem to get me started. I usually start from scratch and springboard off the AI. To me, in that regard, it's all hype that won't be replacing my job.
I have been fairly impressed by Claude 3.7 for some hobby one-off programming stuff, but I find it quickly loses the plot if I try to do anything large/significant once I have a codebase foundation.
Well, in my case, I rarely ask AI to do high-level or large-scale software design. I just turn to it for stuff like getting unstuck with things, or to give syntax help, or how to use common libraries or packages.
I pay for t3.chat. It’s cheaper than the standard $20/month and has a much better interface. And you can select many different models with it. I also pay for cursor.
If you still have an edu email address, you can get Gemini pro for a year for free.
I’ve been using AI for a month now. I use the free version of ChatGPT. And I’ve paid for this thing called Suno that generates music.
The music thing is mostly a distraction. I create songs in the style of artists and sometimes with a movie in mind. The most psychotic thing I’ve done is upload a photograph of myself and asked ChatGPT to create a song about me as if it were from Lana Del Rey.
I suppose I just don’t do stuff that actually requires it. I know someone that uses it to generate practice sentences and games for her speech therapy practice. So for me it’s not worth it in that sense. I understand ChatGPT has replaced Google for a lot of people but I don’t think that the information is consistently accurate enough to do that with it.
It’s helpful in organizing lists for me though.
We're using it heavily at work. We use Claude Sonnet via AWS Bedrock, and use that to run Cline and Claude Code. We also have Copilot in VS Code with Claude Sonnet enabled. We also have Gemini Pro. (I know, crazy.)
I like Cline better than copilot for larger tasks. Example: write me a script that queries my cloudformation stack for all the resources that have logs, select one, and tail the logs from it to the terminal.
Sometimes I make a task too complicated and hit the limits of the model in Cline, and I have to start over.
I like copilot for smaller in-place edits where I know exactly what and where I want it to do. Example: make me a for loop that iterates this list (I can never remember the syntax for JS).
We tried CursorAI and it seemed pretty good, but several people had trouble running it in Linux, and I didn't really like it being a forked / nerfed version of VS code.
Claude Code runs from the terminal, so it's harder to see what it's doing than with Cline, but it's pretty smart about inspecting the code base, as well as running commands, inspecting error outputs, and iterating on them. Example: "This CloudFormation stack has errors when deploying. Run it, inspect the errors, and propose changes to fix them."
I like the Gemini Pro chat interface for real chat interactions and research, as well as double checking things that sound fishy in the other models.
Right now I'm not really doing code personally, but if I was, I'd probably choose an Anthropic subscription for sonnet, with a monthly budget for Claude access for bigger tasks, then use it with Cline and Claude Code.
One thing I'm on the lookout for is a tool to run at the command line where I can say, rebase the commits on my branch after main onto the last commit on branch X. Github copilot has a tool like this that I use, but I don't like the UX and I wish I could configure it to use a different model.
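For what it's worth, the underlying git command already does that rebase in one line: `git rebase --onto <newbase> <upstream> <branch>` replays the commits of <branch> that are not in <upstream> onto <newbase>. Here is a runnable sketch; the branch names and the throwaway repo are invented purely so the example is self-contained.

```shell
# Demonstrate `git rebase --onto` in a throwaway repo.
# Branch names (main, my-feature, branch-x) are placeholders.
demo_rebase() {
  dir=$(mktemp -d)
  g() { git -C "$dir" -c user.name=demo -c user.email=demo@example.com "$@"; }
  g init -q
  g checkout -q -b main
  echo base > "$dir/base.txt";  g add base.txt;  g commit -qm base
  g checkout -q -b my-feature
  echo feat > "$dir/feat.txt";  g add feat.txt;  g commit -qm feature-work
  g checkout -q -b branch-x main
  echo x > "$dir/x.txt";        g add x.txt;     g commit -qm x-work
  # The one-liner: replay my-feature's commits after main onto branch-x
  g rebase -q --onto branch-x main my-feature
  g log --format=%s    # newest first: feature-work, x-work, base
}
```

A natural-language wrapper would just need to translate the request into those three arguments, which is presumably what the Copilot tool is doing under the hood.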
I use Microsoft Copilot from a M365 Enterprise subscription at work and think it's phenomenal. It's trickled into my daily life with my M365+Copilot subscription (they raised the cost, folded it in, and let me opt out; I kept it, figuring I can drop it later). I don't use it all the time outside of work, but I like Copilot for a few things:
Simple scripts, like writing a sox-based script to re-encode samples, or writing a function with a variable containing a large pool of data that'll be formatted.

Conversational search. LLMs are excellent at reading a bit between the lines and providing results I would've been a bit hard-pressed to find on my own. Since I started using Copilot, I've been able to solve issues faster with direct links to forum posts. I still use it for basic scripts if I don't remember a specific PowerShell verb at work, but 80% of the time it's "Can you find me a link to <corporate documentation> about <thing>?"
Rubber ducky type stuff. If I'm in a creative funk I can bounce song ideas off of it. The yes-manning is a bit annoying, but I can bounce a concept off of it, or ask it for an idea and build from its description. I can throw code I've given up on at it for a fix, if it's sufficiently simple.
I don't trust LLMs for everything, and try not to use them for the answer (though, Perplexity is pretty awesome), but they're excellent as search engines on steroids when allowed to peruse the internet.
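As a concrete sketch of the "simple scripts" use above, a sox-based re-encode script really can be this small. The file naming, the 44.1 kHz / 16-bit target, and the DRY_RUN switch are my own illustration, not the commenter's actual script.

```shell
# Re-encode each .wav argument to 44.1 kHz / 16-bit with sox.
# DRY_RUN=1 prints the commands instead of running them.
reencode() {
  for f in "$@"; do
    cmd="sox $f -r 44100 -b 16 out/${f%.wav}-44k.wav"
    if [ -n "$DRY_RUN" ]; then
      echo "$cmd"
    else
      mkdir -p out && $cmd
    fi
  done
}
```

For example, `DRY_RUN=1 reencode kick.wav snare.wav` prints the two sox commands it would run, which is a handy way to sanity-check an LLM-generated script before letting it touch your samples.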
Yes, those are two particular use cases that I've found AI is good at at work. Documentation search (though, Confluence search is a really low bar to exceed, frankly); and summarizing stuff, especially meeting transcripts. Great for catching up on key points of meetings you were absent from.
ChatGPT is 100% more useful with the premium subscription. My university purchased premium subscriptions for all students, and the difference in the quality of the output between the premium models and the free models is night and day. The reasoning models make far fewer mistakes.
If you are unsure, you can just pay the $20 for a month of premium with one of the large AI companies. You’ll quickly see if it’s worth it for the kinds of prompts you input.
I pay for cursor and find it extremely worth it. I'm an experienced engineer so I can give very specific instructions, and my code is really well organized into patterns AI understands quickly. The end result is that I can make changes across many layers at once (schema, infrastructure, API spec, implementation, front end, unit tests) in the time it takes me to bullet out how each needs to change. Then it can run my project and end to end test it for me.
I'm scared of AI, but until it takes my job, cursor makes me far, far more productive. And I hate to say it, but this is always how I wanted to code. I like thinking about how to put the code together, but being able to have a tool do the rote part is nice. It feels like I'm spending my time on the parts I should be spending my time on.
I already posted this week about using Riffusion to make music, but I also use claude code to prototype ideas on occasion.
I have been comparing the ChatGPT Plus subscription against the Gemini Pro one for my personal use and to understand their capabilities for my work. I don’t know if I use either enough to justify their subscription price but because I have them, I find myself going to them first just to see if they can do something. I particularly like Gemini’s integration with my Google account where I can just tell it to add stuff to my calendar with some details and I don’t need to do that myself. I also think Gemini seems better for a majority of my use except that I don’t think it manages history as well as ChatGPT. You should consider something that has a bunch of different models you can use like Perplexity where you pay $20 and basically get Gemini Pro and GPT Plus.
I tried ChatGPT and Claude (paid for both) but eventually landed on a combination of Kagi questions for simpler queries and perplexity for more advanced questions/back and forth.
Having a space in perplexity (same as a project in Claude) is also very useful. I’d upload a technical book or other documents and get a specialist intern which quotes its sources so I can verify what it says.
I haven't found a need to reach for paid models. I use small local models for glorified rubber duck debugging and light image editing (removing backgrounds, image segmentation, the odd touch up). They're great for highlighting blind spots (the you don't know what you don't know problem), discovering jargon, and providing a jumping off point. But the nice thing about small local models is they don't remove the friction. I usually have to modify code to make it work, the models aren't great at reading my mind - I actually have to articulate and provide context if I want better results. Now that sounds like a negative, but I find that's where personal growth is, in the friction.