I don't get who this is for... Specifically, they failed to sell me on what it does that warrants a $180 price increase over the $20 Plus subscription. And $20 is already quite a lot unless you use it an insane amount, since for most usage you are probably cheaper off using the API (though that does require setting up something like Open WebUI).
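To put rough numbers on that subscription-vs-API comparison (all figures here are placeholders, not OpenAI's actual rates):

```python
# All figures are placeholders, not actual OpenAI pricing.
price_per_mtok = 5.00        # dollars per million tokens (hypothetical)
monthly_tokens = 1_500_000   # tokens per month, a fairly heavy chat habit (hypothetical)
subscription = 20.00         # dollars per month for Plus

api_cost = price_per_mtok * monthly_tokens / 1_000_000
print(f"API ${api_cost:.2f}/mo vs Plus ${subscription:.2f}/mo")  # API $7.50/mo vs Plus $20.00/mo
```

Under assumptions like these, the API stays cheaper until usage gets well past the casual range, which is the point being made.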
Something something more compute, o1 being "better".
I'm trying to figure out if the $200/month price-tag is there because it's high margin and they think they can really squeeze enterprise users - or if they're so deep into diminishing returns they need to 10x the price to pay for marginal improvements.
Also consider that they’ve been selling their existing offerings well below cost to try and capture market share, so who knows, maybe this is their first offering that reflects the real service cost.
That would be an interesting point, though I would think that you would hear more rumblings from other parties hosting LLMs in various ways as well. Although part of the cost is of course the insane amount of VC money that has been thrown at the development of these models.
I don't know about the others, but OpenAI is either already having, or will in the future have, an incredibly bad time with their financials. See a previous topic here for more background info.
I think it's feasible for them to double or triple their income in the next several years.
The potential use cases really are endless, whether that's good or bad.
Microsoft is using GPT, and I don't see them just dropping this partnership after they've done so much to bake it into their OS and search platform.
Here is how I use it at work:
My K-12 public school is encouraging teachers to use it (and paying for it) so that we can spend our time more efficiently. They even trained us on appropriate and inappropriate uses.
It's a lifesaver at work. Once I started using it, I noticed how much time I was wasting on tedious little things such as wording something properly. That 1-2 minutes you spend wordsmithing really adds up when you send 40 emails and grade 100 papers. I bullet-point some feedback, GPT gives me a digestible paragraph, I revise it quickly. Done. It's my work, I'm more than capable of writing a paragraph, but I used a tool to do it in half the time with no typos. In addition, I noticed I was spending a lot of time searching Google for the perfect image to add to a text for my SPED and non-native English speaking students. Now, if Google doesn't have a good image, I just make one. I recently added pictures to every page of a fiction horror story we read for ELA. It helped the kids a lot. I also use it to help generate lesson ideas. I know the content like the back of my hand, but man, GPT spits out some clever lesson ideas I would have never considered.
To be clear, it doesn't do my job for me. A random person couldn't just start teaching properly with AI. But, if you already know what you're doing and review all AI output carefully, it can really streamline things for you in a way that makes an overworked teaching job downright manageable. Those little time savings add up to hours every week.
Okay, now apply this to every profession. I'm not saying it's inherently good or without flaws. In fact, I think it's quite dangerous and I think inappropriate use of this tool can and will destroy lives. One irresponsible doctor or lawyer can absolutely fuck someone by letting AI do their job for them.
However, appropriate use will enhance performance and productivity. It has the potential to be good for workers and shareholders alike.
I honestly don't think it's going anywhere. It'll keep improving, it'll become more lightweight and efficient, and it'll be wonderful and terrible all at once just like the internet has been.
You’ve laid out a not terribly unrealistic plan to increase their user count and by extension their revenue. I do not at all deny the practicality and numerous use cases that already exist or will exist.
But if nothing fundamental changes about their business model, i.e. the LLMs’ training and operational cost, having more users for OpenAI means losing more money, as each user is a loss – cost is higher than what they bring in. Growth is almost an undesirable factor to them!
Great post! First-hand experience using AI as a force multiplier instead of a labor replacement. I have not really had a chance to talk to anyone that uses AI for work. Mostly it has been doom and gloom about it replacing people's jobs, but nothing concrete.
Full disclosure: I have a lot of concerns around AI safety and where this all could end up. I have less often considered what AI might do if it is not a humanity-ending event. In this case AI is helping by giving you the ability to offload part of the "thinking" in your work. For example, when you talk about wordsmithing: it's kind of like the handwritten card vs. the Hallmark card. It is much faster to choose the prewritten card that matches what you are thinking, without having to come up with every word yourself. Same with the AI-generated email: it's easier to tell it the general theme and have it pick the words, then you can just "buy the card." By not having to spend as much of your time doing the communication part of your job, you free up more time to do the actual work.
I could see this saving huge amounts of time in my day as well. Most of my day is spent trying to make sure the person getting my email/communication will understand what I'm trying to say. Speeding up effective communication could have huge potential gains for productivity.
I can also see your point about this being like the internet though. Nothing is free and everything has a cost. The internet gave us the ability to really connect and learn from each other. Now it seems to be more of a way to divide and distract us.
So let's look at that side of the AI coin too. What do we lose to gain this effective communication tool?
Frankly, both of them sound plausible, possibly even both at the same time. Unless their marketing is truly terrible at explaining what warrants a 900% increase in cost compared to current offerings. Which I doubt; if past announcements are anything to go by, they have highlighted everything they feel they can highlight. Which isn't much...
My guess is that they're going to keep scaling the free and $20 ones down until they're basically not usable for companies. Maybe something even like, "wow that will take considerable computing power. Wait 1 hr for result or get your company to upgrade for answer now"
Reasoning: back in the day I was able to ask the free version something like, give me 100 words that scramble from letters [list], and then recently it told me it can't do that anymore without paying. Which is funny because I took my (non) business back to the old letter scramble bots.
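For what it's worth, the letter-scramble task never needed an LLM; the old bots are a few lines of deterministic code. A minimal sketch (the word list here is a stand-in, a real tool would load a proper dictionary file):

```python
from collections import Counter

def unscramble(letters, dictionary):
    """Return the dictionary words that can be built from the given letters."""
    available = Counter(letters.lower())
    # A word qualifies if it needs no letter more often than it is available;
    # Counter subtraction drops non-positive counts, so an empty result means "fits".
    return [w for w in dictionary if not Counter(w.lower()) - available]

# Tiny stand-in word list; a real tool would load /usr/share/dict/words or similar.
words = ["pale", "leap", "pales", "apple", "plate", "sleep", "tap"]
print(unscramble("aelpst", words))  # ['pale', 'leap', 'pales', 'plate', 'tap']
```

"apple" and "sleep" are rejected because they need a second p and a second e that the input doesn't have.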
Did ChatGPT itself tell you that it can't do something without you paying? That absolutely sounds like a mistake by it that you shouldn't trust. In general it and its free version have only gotten better over time, and to my knowledge it's never been instructed to try to upsell people in conversation. Just start a new conversation and ask it again. (Also consider checking the ChatGPT memories in the settings page to make sure it hasn't recorded a misleading memory about the previous conversation.)
You may have planted the idea that you need to pay more into the conversation and it was too agreeable with that. It is a common issue for ChatGPT to believe something to a fault once it has been said in the conversation.
darn near so, yeah. After reading your comment I wasn't sure if I hallucinated that "conversation" or if I somehow set it up to sell me stuff. So I tried it again, with this being the conclusion
The upsell is in a box, not technically part of the chat, which makes it more obvious that it's the humans who programmed it who intended for it to be shown, and not part of the LLM-generated content.
From a previously failed attempt to get 10 unique jokes, I know it will happily go on forever failing the task. So maybe I did set it up wrong by asking if something's wrong with the prompt, but I didn't want to go on much longer knowing it isn't a tool that learns or understands what I'm actually asking anyway. Any suggestions for dealing with the impasse of repeated failures confidently presented as passes?
From a previously failed attempt to get 10 unique jokes, I know it will happily go on forever failing the task. So maybe I did set it up wrong by asking if something's wrong with the prompt,
LLMs are bad at math (no joke), since at their core they do next-word prediction (this is very simplified). My guess is that lists of 10 items are common enough in the training data that they can easily do those. For anything else it simply doesn't "know" how many words it has.
It's a common trap for people to walk into, as it isn't quite obvious they are weak in this area and they sure as hell will not tell you.
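The next-word-prediction framing can be made concrete with a toy model. Real LLMs use neural networks over tokens, but a bigram counter (corpus invented for illustration) shows the shape of it, and why the model has no running total of "words so far":

```python
from collections import defaultdict, Counter

def train_bigrams(corpus):
    """Count, for each word, which words follow it and how often."""
    model = defaultdict(Counter)
    tokens = corpus.split()
    for a, b in zip(tokens, tokens[1:]):
        model[a][b] += 1
    return model

def predict_next(model, word):
    """Most frequent follower; note there is no notion of list length anywhere."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

model = train_bigrams("the cat sat on the mat and the cat ran")
print(predict_next(model, "the"))  # cat ("cat" follows "the" twice vs "mat" once)
```

Each step only asks "what usually comes next", which is why counting constraints get dropped.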
My guess is that lists of 10 items are common enough in the training data that they can easily do those.
But it couldn't. I'm starting to think these things hardly do anything useful at all. It was unable to pull ten jokes from their infinite sources of training material. If it had any useful (general) intelligence at all, it'd pull up an old reddit thread and just copy-paste the top ten comments, maybe re-phrase things a bit.
I certainly won't find it payment-level useful until it stops being confidently wrong about things. I'd be much happier with a narrow-use chatbot that reports "I don't know how to do that" frequently, rather than something that gives you oranges when you ask for apples and is pleased as punch about being dead wrong. It's even good at insidiously and unexpectedly failing simple fetch requests such as "is ___ a noun" or "is (place) an island" or "give me the help text of this command". It doesn't know what it knows, and it certainly doesn't know what it doesn't know.
But it couldn't.
Oh, I read it as if it could get 10 items but just not good jokes, my bad. Further reinforces the point though: don't use LLMs for number-related things.
As for the quality of the jokes, LLMs can't really write good unique material either. They are effectively predicting patterns based on all the training material. It's why I have a hard time believing all those claims about them being used to generate anything substantial.
What they are good for (in my opinion) is mostly as tools to help you with your own materials. With a big asterisk that they do require you to know their limitations and idiosyncrasies. For technical stuff I wrote about it here. But also for non-technical things I find them useful as more generic tools.
But the overall theme is that I rarely, if ever use them to create something directly. Heck, I barely even ask them to modify something and use that.
LLMs can actually be decent at making jokes depending on the prompt. I think the trick is to get it out of writing in the "assistant" voice and to get it to lean into absurd mash-ups. Here are my results of telling Claude 3 (and GPT-4, which wasn't as good) to write tweets in the style of dril, a popular Twitter user: https://bsky.app/profile/macil.tech/post/3kpcvicmirs2v
My vague impression is that OpenAI o1 is very expensive to run. They probably only have enough capacity to serve "whales" who want to try out the latest and aren't price sensitive. Who is it for? People with expense accounts who can justify it as research.
Has anyone had a good experience with the o1 preview? The preview just looked like it was stepping through its own iterations using the 4o model (or something similar) and didn't feel like it gave me answers that were better than just a few manual iterations with 4o. That and it's kind of unnecessarily verbose, but maybe that's the point and I'm just not using it for the right kind of questions.
But if the o1 preview is supposed to be a selling point, it doesn't feel like it does a very good job of selling the step up. At least in my experience so far.
The o1 model has special training that 4o doesn't to help it be more productive at making progress in solving problems while it writes text to itself, but your experience matches mine. From the benchmarks it appears there are certain kinds of multi-step problems that o1 uniquely excels at, but for a lot of other stuff it ends up being only as good as 4o while being slower.
It’s for companies. Mostly things like software companies. Even for a modestly paid software engineer, if they cost like $120/hr, then if this software can save them 2 hours of work a month, you’ve made your money back already and then some.
No, it is not. There is nothing in the announcement that indicates that this is performing 900% better compared to the plus subscription offering.
Besides, for $180 you get a lot of API usage, which is much more beneficial for software engineers as they can plug it into all sorts of tooling. Not just chat.
The $20/month Plus subscription has strict limits on o1 and Advanced Voice Mode usage. Some users have tasks that only these features are able to accomplish. The new $200/month Pro subscription gives you more than 10x those limits (unlimited).
Comparing the old cost to the new cost isn't that useful in this context. Let's say hypothetically the old version is a net negative in productivity for any work task, so it's not worth using even if it were free. If the new version is a net positive in productivity for some work tasks, it could be worth a relatively high price depending on how many hours it saves. The charts claim a significant increase in accuracy for some tasks. I can imagine that might make the difference in some situations.
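The break-even arithmetic here is simple to sketch; the $120/hr rate and hours saved are the hypothetical figures from the earlier comment, not measured numbers:

```python
hourly_cost = 120    # fully loaded engineer cost in $/hr (hypothetical)
hours_saved = 2      # hours saved per month (hypothetical)
subscription = 200   # $/month for the Pro tier

net = hourly_cost * hours_saved - subscription
print(f"net gain: ${net}/month")  # net gain: $40/month
```

Even two saved hours a month covers the subscription under these assumptions; the open question is whether the hours are actually saved.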
Well, my main point is that something doesn't have to be 10x better to be worth paying 10x the price. Although I can't say whether this particular product is worth its asking price.
However, unless I'm interpreting things wrong, half of the linked announcement is charts demonstrating improved accuracy and reliability in some benchmarks. I have no idea how realistic these claims are, but they are claiming it.
I agree that it doesn't have to be a 1:1 relation in price increase. What I am having trouble wrapping my head around is that the improvements they are claiming are fairly typical as to the claims with previous model improvements. Those announcements had similar graphs, etc. In fact, those announcements often came with demos as well showcasing a lot of these capabilities.
In that context this announcement, from my perspective, is outright timid as far as OpenAI announcements go. Certainly when you also factor in the price increase.
We know we can increase accuracy by making the LLM check its own work, but it costs a ton more compute for a modest increase in accuracy. Totally speculation, but based on the announcement saying that it "thinks longer" I assume this is it, and they're not advertising it too hard because the general public would obviously balk at such a price.
I use ChatGPT a lot for various work tasks, especially coding small scripts. I don't need a model that's "smarter" or "thinks harder". I really need something that I can throw a lot of data at, like database tables, CSV files, and text documents, and have it understand the data without a lot of tedious mapping of columns, setting up definitions and rules, etc.
How long is this whole "the new model is smarter and thinks harder" thing going to last? What's the point of improving the model and leaving the "worse" ones available to use? Are they trying to feign additional value? As a layman, I just don't get it.
4o is much cheaper and quicker than o1 while being just as good for a lot of tasks. o1 is only better for certain tasks.
On the developer side, there's been a lot of trade-offs in choosing between different LLMs for a while, with a whole spectrum from cheap+quick+dumb models to expensive+slow models. I've been surprised that the consumer applications (ChatGPT, Claude, Gemini) have stuck to a single subscription option for so long instead of offering multiple price points as ChatGPT is doing now.
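That trade-off often ends up encoded directly in application code as a routing rule. A sketch (the thresholds and model names are invented for illustration, not any vendor's API):

```python
def pick_model(complexity, latency_budget_s):
    """Toy router: default to the cheap, fast model and escalate only when the
    task is hard enough and the caller can tolerate a slow reasoning model."""
    if complexity > 0.8 and latency_budget_s >= 30:
        return "expensive-reasoning-model"
    return "cheap-fast-model"

print(pick_model(0.9, 60))  # expensive-reasoning-model (hard task, patient caller)
print(pick_model(0.3, 2))   # cheap-fast-model (easy interactive task)
```

A multi-tier subscription is essentially this decision surfaced to the end user instead of the developer.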
This comes shortly after media reports about OpenAI considering putting ads in the free version. If they need revenue this badly, I'd expect the api pricing to increase as well.
Not that I’m a fan of AI for this sort of problem, but llama 3.2 via ollama (local LLM wrapper) was fine at this task, so it might work as an alternative. It’s pretty quick to download and get running, too.
Is it "computer programmer quick" or "everyone else quick"?
If you're on a Mac and you already use Homebrew, I think you get it running by putting this into the terminal:
brew install ollama
ollama run llama3.2
I'm in software dev though, so I don't know if that's already assuming too much. I run NixOS at home and it's just as "easy", but I feel suddenly very self-conscious about how high off the ground my bar is...
That's "computer programmer" quick lol.
Maybe I'll try something like that a year from now. Thanks.
https://jan.ai/ is an open source user friendly option to run things locally.
Thanks ;)
Ah, that's a shame, but no worries. I'd note that there are some other, more friendly options available (LM Studio for example) which are pretty much click-and-run, but I can't speak to them from hands-on experience.
However, it's definitely cheaper than ChatGPT!
You can also just hit "Download" on the page you linked above and it'll walk you through with a nice friendly GUI, which is probably more comfortable for a lot of users :) (cc @lou - although the https://jan.ai/ link looks pretty nice too, I hadn't come across that one!)
I just used this https://wordunscrambler.me/
Did ChatGPT itself tell you that it can't do something without you paying? That absolutely sounds like a mistake by it that you shouldn't trust. In general it and its free version have only gotten better over time, and to my knowledge it's never been instructed to try to upsell people in conversation. Just start a new conversation and ask it again. (Also consider checking the ChatGPT memories in the settings page to make sure it hasn't recorded a misleading memory about the previous conversation.)
You may have planted the idea that you need to pay more into the conversation and it was too agreeable with that. It is a common issue for ChatGPT to believe something to a fault once it has been said in the conversation.
darn near so, yeah. After reading your comment I wasn't sure if I hallucinated that "conversation" or if I somehow set it up to sell me stuff. So I tried it again, with this being the conclusion
The up sell is in a box, not technically part of the chat, which makes it more obvious that it's the humans who programmed it who intended for it to be shown, and not part of the LLM generated content.
Here's the rest of the chat 1, 2, 3
From a previously failed attempt to get 10 unique jokes, I know it will happily go on forever failing the task. So maybe I did set it up wrong by asking if something's wrong with the prompt, but I didn't want to go on much longer knowing it isn't a tool that learns or understands what I'm actually asking anyway. Any suggestions for dealing with the impasse of repeated failures confidently presented as successes?
LLMs are bad at math (no joke), since at their core they do next-word prediction (this is very simplified). My guess is that lists of 10 items are common enough in the training data that they can easily do those. For anything else it simply doesn't "know" how many words it has.
It's a common trap for people to walk into, as it isn't at all obvious that they are weak in this area, and they sure as hell will not tell you.
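That "next word prediction" idea can be sketched with a toy bigram model (my own illustrative sketch, nothing like the neural networks real LLMs use). The point is in the loop: each word is picked one at a time from learned statistics, and nothing in the mechanism counts words or checks a constraint like "exactly 10 items":

```python
from collections import defaultdict, Counter

# Tiny training "corpus" and a bigram table: which word follows which.
corpus = "the cat sat on the mat and the cat ran".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    # Greedily pick the most frequent follower (a stand-in for sampling).
    followers = bigrams[word]
    return followers.most_common(1)[0][0] if followers else None

def generate(start, steps):
    # One word at a time; no counter, no arithmetic, no self-checking.
    out = [start]
    for _ in range(steps):
        nxt = predict_next(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out

print(generate("the", 4))  # → ['the', 'cat', 'sat', 'on', 'the']
```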
But it couldn't. I'm starting to think these things hardly do anything useful at all. It was unable to pull ten jokes from its infinite sources of training material. If it had any useful (general) intelligence at all, it'd pull up an old reddit thread and just copy-paste the top ten comments, maybe re-phrase things a bit.
I certainly won't find it payment-level useful until it stops being confidently wrong about things. I'd be much happier with a narrow-use chatbot that reports "I don't know how to do that" frequently, rather than something that gives you oranges when you ask for apples and is pleased as punch about being dead wrong. It's even good at insidiously and unexpectedly failing simple fetch requests such as "is ___ a noun", "is (place) an island", or "give me the help text of this command". It doesn't know what it knows, and it certainly doesn't know what it doesn't know.
Oh, I read it as if it could get 10 items but just not good jokes, my bad. It further reinforces the point though: don't use LLMs for number-related things.
As for the quality of the jokes, LLMs can't really write good unique material either. They are effectively predicting patterns based on all the training material. It's why I have a hard time believing all those claims about them being used to generate anything substantial.
What they are good for (in my opinion) is mostly as tools to help you with your own materials. With a big asterisk that they do require you to know their limitations and idiosyncrasies. For technical stuff I wrote about it here. But also for non-technical things I find them useful as more generic tools.
But the overall theme is that I rarely, if ever use them to create something directly. Heck, I barely even ask them to modify something and use that.
LLMs can actually be decent at making jokes depending on the prompt. I think the trick is to get it out of writing in the "assistant" voice and to get it to lean into absurd mash-ups. Here are my results of telling Claude 3 (and GPT-4, which wasn't as good) to write tweets in the style of dril, a popular Twitter user: https://bsky.app/profile/macil.tech/post/3kpcvicmirs2v
My vague impression is that OpenAI o1 is very expensive to run. They probably only have enough capacity to serve "whales" who want to try out the latest and aren't price sensitive. Who is it for? People with expense accounts who can justify it as research.
Has anyone had a good experience with the o1 preview? The preview just looked like it was stepping through its own iterations using the 4o model (or something similar) and didn't feel like it gave me answers that were better than just a few manual iterations with 4o. That and it's kind of unnecessarily verbose, but maybe that's the point and I'm just not using it for the right kind of questions.
But if o1-preview is supposed to be a selling point, it doesn't feel like it does a very good job of selling the step up. At least in my experience so far.
The o1 model has special training that 4o doesn't to help it be more productive at making progress in solving problems while it writes text to itself, but your experience matches mine. From the benchmarks it appears there are certain kinds of multi-step problems that o1 uniquely excels at, but for a lot of other stuff it ends up being only as good as 4o while being slower.
It’s for companies. Mostly things like software companies. Consider even a modestly paid software engineer: if they cost something like $120/hr, and this software saves them 2 hours of work a month, you’ve made your money back already and then some.
No, it is not. There is nothing in the announcement that indicates that this is performing 900% better compared to the plus subscription offering.
Besides, for $180 you get a lot of API usage, which is much more beneficial for software engineers, as they can plug it into all sorts of tooling. Not just chat.
The $20/month Plus subscription has strict limits on o1 and Advanced Voice Mode usage. Some users have tasks that only these features are able to accomplish. The new $200/month Pro subscription gives you more than 10x those limits (unlimited).
Which type of users would that be exactly?
Comparing the old cost to the new cost isn't that useful in this context. Let's say hypothetically the old version is a net negative in productivity for any work task, so it's not worth using even if it were free. If the new version is a net positive in productivity for some work tasks, it could be worth a relatively high price depending on how many hours it saves. The charts claim a significant increase in accuracy for some tasks. I can imagine that might make the difference in some situations.
If it really is that much more accurate, I would imagine they would have focused significantly more on that aspect.
Well, my main point is that something doesn't have to be 10x better to be worth paying 10x the price. Although I can't say whether this particular product is worth its asking price.
However, unless I'm interpreting things wrong, half of the linked announcement is charts demonstrating improved accuracy and reliability in some benchmarks. I have no idea how realistic these claims are, but they are claiming it.
I agree that it doesn't have to be a 1:1 relation in price increase. What I am having trouble wrapping my head around is that the improvements they are claiming are fairly typical, similar to the claims made with previous model improvements. Those announcements had similar graphs, etc. In fact, those announcements often came with demos as well, showcasing a lot of these capabilities.
In that context this announcement, from my perspective, is outright timid as far as OpenAI announcements go. Certainly when you also factor in the price increase.
We know we can increase accuracy by making the LLM check its own work, but it costs a ton more compute for a modest increase in accuracy. Totally speculation, but based on the announcement saying that it "thinks longer" I assume this is it, and they're not advertising it too hard because the general public would obviously balk at such a price.
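That "check its own work" loop has roughly this shape (a hypothetical sketch of the general technique, not OpenAI's actual method; `ask_model` stands in for a real LLM API call and is stubbed here so the control flow runs):

```python
def ask_model(prompt: str) -> str:
    # Stub: a real implementation would call an LLM API here.
    # Each call to this function is an extra round of compute --
    # which is where the "ton more compute" goes.
    if prompt.startswith("Critique"):
        return "OK"  # pretend the draft passed review
    return "draft answer"

def answer_with_self_check(question: str, max_rounds: int = 3) -> str:
    # Generate a draft, have the model critique it, revise until it
    # passes (or we give up). 1 call becomes up to 1 + 2*max_rounds.
    draft = ask_model(question)
    for _ in range(max_rounds):
        verdict = ask_model(f"Critique this answer to '{question}': {draft}")
        if verdict == "OK":
            return draft
        draft = ask_model(f"Revise using this critique: {verdict}")
    return draft

print(answer_with_self_check("What is 2+2?"))  # → draft answer
```

Even in this toy version you can see the cost structure: accuracy gains come from multiplying the number of model calls per question, which scales the bill much faster than it scales the quality.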
Discussion on HN: https://news.ycombinator.com/item?id=42330732
I use ChatGPT a lot for various work tasks, especially coding small scripts. I don't need a model that's "smarter" or "thinks harder". I really need something that I can throw a lot of data at it, like database tables, CSV files, and text documents, and have it understand the data without a lot of tedious mapping of columns, setting up definitions and rules, etc.
How long is this whole "the new model is smarter and thinks harder" thing going to last? What's the point of improving the model while leaving the "worse" ones available to use? Are they trying to feign additional value? As a layman, I just don't get it.
4o is much cheaper and quicker than o1 while being just as good for a lot of tasks. o1 is only better for certain tasks.
On the developer side, there's been a lot of trade-offs in choosing between different LLMs for a while, with a whole spectrum from cheap+quick+dumb models to expensive+slow models. I've been surprised that the consumer applications (ChatGPT, Claude, Gemini) have stuck to a single subscription option for so long instead of offering multiple price points as ChatGPT is doing now.
This comes shortly after media reports about OpenAI considering putting ads in the free version. If they need revenue this badly, I'd expect the API pricing to increase as well.