-
AI slop is killing our channel
36 votes -
Why do LLMs freak out over the seahorse emoji?
50 votes -
Merriam-Webster has unveiled their latest and greatest LLM to date
67 votes -
Why language models hallucinate
27 votes -
Is it possible to easily finetune an LLM for free?
So Google's AI Studio used to have an option to finetune Gemini Flash for free by simply uploading a CSV file, but it seems they have removed that option, so I'm looking for something similar. I know models can be finetuned on Colab, but the problem with that is it's way too complicated for me; I want something simpler. I think I know enough Python to be able to prepare a dataset, so that shouldn't be a problem.
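For context, the hard part is often just the data format: most hosted fine-tuning services (OpenAI's, for example) expect JSONL chat records rather than a raw CSV. A minimal sketch of the conversion, assuming a CSV with hypothetical `prompt` and `response` columns:

```python
import csv
import json

# Convert a CSV of prompt/response pairs into the JSONL chat format that
# several hosted fine-tuning services accept. The column names "prompt"
# and "response" are assumptions; adjust them to match your file.
with open("pairs.csv", newline="", encoding="utf-8") as src, \
        open("train.jsonl", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        record = {
            "messages": [
                {"role": "user", "content": row["prompt"]},
                {"role": "assistant", "content": row["response"]},
            ]
        }
        dst.write(json.dumps(record, ensure_ascii=False) + "\n")
```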
21 votes -
Deep Think with Confidence
9 votes -
AI tokens are getting more expensive
10 votes -
Claude Opus 4 and 4.1 can now end a rare subset of conversations
15 votes -
Social media probably can’t be fixed
38 votes -
Evaluating GPT5's reasoning ability using the Only Connect game show
18 votes -
Is chain-of-thought reasoning of LLMs a mirage? A data distribution lens.
28 votes -
Reddit will block the Internet Archive
58 votes -
Question - how would you best explain how an LLM functions to someone who has never taken a statistics class?
My understanding of how large language models work is rooted in my knowledge of statistics. However, a significant number of people have never been to college, and statistics is a required course only for some degree programs.
How should ChatGPT and similar tools be explained to the public at large to avoid the worst problems that are emerging from widespread use?
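One statistics-free approach is a toy demonstration of next-word prediction by counting, which captures the "fancy autocomplete" intuition without any jargon. A deliberately oversimplified sketch (real LLMs learn neural networks from vast corpora, not lookup tables):

```python
from collections import Counter, defaultdict

# Toy illustration: predict the next word by counting which word most
# often followed the current one in the "training" text. Real LLMs are
# vastly more sophisticated, but the core task is the same: predict the
# next token given the context so far.
text = "the cat sat on the mat and the cat slept"
words = text.split()

follows = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

# The most common word after "the" in our tiny corpus:
print(follows["the"].most_common(1))  # [('cat', 2)]
```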
37 votes -
AI industry horrified to face largest copyright class action ever certified
63 votes -
The great LLM scrape
24 votes -
Persona vectors: monitoring and controlling character traits in language models
13 votes -
OpenAI can rehabilitate AI models that develop a “bad boy persona”
14 votes -
The future of forums is lies, I guess
63 votes -
No, of course I can! Refusal mechanisms can be exploited using harmless fine-tuning data.
9 votes -
AI coding tools make developers slower but they think they're faster, study finds
40 votes -
Pay up or stop scraping: Cloudflare program charges bots for each crawl
46 votes -
Cats confuse reasoning LLM: Query-agnostic adversarial triggers for reasoning models
24 votes -
TikTok is being flooded with racist AI videos generated by Google’s Veo 3
35 votes -
Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task
54 votes -
Echo Chamber: A context-poisoning jailbreak that bypasses LLM guardrails
34 votes -
Is pop culture a form of "model collapse?"
Disclaimer: I do not like LLMs. I am not going to fight you if you say LLMs are shit.
One of the things I find interesting about conversations on LLMs is when I have a critique about them and someone says, "Well, it's no different than people." People are only as good as their training data, people misremember / misspeak / make mistakes all the time, people will listen to you and affirm you as you think terrible things. My thought is that not being reliably consistent is a verifiable issue for automation. Still, I think it's excellent food for thought.
I was looking for new music venues the other day. I happened upon several, and as I looked at their menus and layouts, it occurred to me that I had eaten there before. Not there, exactly, but in my city, and in others. The Stylish-Expensive-Small-Plates-Record-Bar was an international phenomenon. And more than that, I couldn't shake the feeling that it was a perversion of the original, alluring concept: to be in a somewhat secretive record bar in Tokyo where you'll be glared into the ground if you speak over the music.
It's not a bad idea. And what's wrong with evoking a good idea, especially if the similarity is unintentional? Isn't it helpful to be able to signal to people that you're like-that-thing instead of having to explain how you're different? Still, the idea of going just made me assume it would be not simply like something I had experienced before, but played out and "fake." We're not in Tokyo, and people do talk over the music. And even if they didn't, there's silverware and such clanging. It makes me wonder if this permutation is a lossy estimation of the original concept, just chewed up, spat out, slurped, regurgitated, and expensively funded.
Other forms of conceptual perversion:
- Matters of body image: is it a sort of collapse when we go from wanting 'conventional beauty' to frankensteining features onto ourselves? Think fox-eye surgeries, buccal fat removal, etc. Rather than wanting to be conventionally attractive, we aim for the related concept of looking like people who are famous.
- (still thinking)
15 votes -
Disney files landmark case against AI image generator
16 votes -
The Common Pile v0.1: An 8TB dataset of public domain and openly licensed text
26 votes -
Six-month-old, solo-owned vibe coder Base44 sells to Wix for $80M cash
13 votes -
Is the AI bubble about to burst?
35 votes -
OpenAI featured chatbot is pushing extreme surgeries to “subhuman” men
35 votes -
LLMs and privacy
Hello to everyone who's reading this post :)
LLMs are increasingly useful these days (after careful review of their generated answers, of course), but I'm concerned about sharing my data, especially very personal questions and my thought processes, with large tech giants that seem rather sketchy in terms of their privacy policies.
What are some ways I can keep my data private but still harness this amazing LLM technology? Also, what are some legitimate and active forums for discussion of this topic? I have looked at Reddit but haven't found it genuinely useful or trustworthy so far.
I am excited to hear your thoughts on this!
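One common answer is to run an open-weights model locally, so prompts never leave your machine. A minimal sketch using the Hugging Face `transformers` library (the model named here is only a tiny demonstration model; substitute whatever open model your hardware can run):

```python
from transformers import pipeline

# Local inference: the prompt is processed entirely on your own machine.
# "gpt2" is a small demonstration model; swap in any open-weights model
# your hardware can handle.
generator = pipeline("text-generation", model="gpt2")
out = generator("One way to keep LLM prompts private is", max_new_tokens=40)
print(out[0]["generated_text"])
```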
33 votes -
Which translation tools are LLM free? Will they remain LLM free?
Looking at the submission rules for Clarkesworld Magazine, I found the following:
Statement on the Use of “AI” writing tools such as ChatGPT
We will not consider any submissions translated, written, developed, or assisted by these tools. Attempting to submit these works may result in being banned from submitting works in the future.
EDIT: I assume that Clarkesworld means a popular, non-technical understanding of AI, meaning post-ChatGPT LLMs specifically, and not a broader definition of AI that is more academic or pertinent to the computer science field.
I imagine that other magazines and websites have similar rules. As someone who does not write directly in English, I find that concerning. I have never translated without assistance in my life. In the past I used both Google Translate and Google Translator Toolkit (the latter no longer exists).
Of course, no machine translation is perfect; it was only a first pass that I would change, adapt, and fix extensively and intensely. In the past I have used the built-in translation feature in Google Docs. However, now that Gemini is integrated into Google Docs, I suspected that it uses AI for translation instead. So I asked Gemini, and it said that it does. I am not sure if Gemini is correct, but if it doesn't use AI now, it probably will in the future.
That poses a problem for me, since, in the event that I wish to submit a story to an English-speaking magazine or website, I will have to find a tool that is guaranteed to be dumb. I am sure they exist, but for how long? Will I be forced to translate my stories like a caveman? Is anyone concerned with keeping non-AI translation tools available, relevant, and updated? How can I even be sure that a translation tool does not use AI?
28 votes -
Duolingo is replacing human workers with AI
34 votes -
Large Language Models are more persuasive than incentivized human persuaders
14 votes -
Some ChatGPT users are developing delusional beliefs that are reinforced by the large language model
53 votes -
When ChatGPT broke an entire field: An oral history
14 votes -
State Bar of California admits it used AI to develop exam questions, triggering new furor
25 votes -
OpenAI is a systemic risk to the tech industry
35 votes -
Kagi Assistant is now available to all users
44 votes -
Russia seeds chatbots with lies. Any bad actor could game AI the same way.
33 votes -
Anubis works
35 votes -
The ARC-AGI-2 benchmark could help reframe the conversation about AI performance in a more constructive way
The popular online discourse on Large Language Models’ (LLMs’) capabilities is often polarized in a way I find annoying and tiresome.
On one end of the spectrum, there is nearly complete dismissal of LLMs: an LLM is just a slightly fancier version of the autocomplete on your phone’s keyboard, there’s nothing to see here, move on (dot org).
This dismissive perspective overlooks some genuinely interesting novel capabilities of LLMs. For example, I can come up with a new joke and ask ChatGPT to explain why it’s funny or come up with a new reasoning problem and ask ChatGPT to solve it. My phone’s keyboard can’t do that.
On the other end of the spectrum, there are eschatological predictions: human-level or superhuman artificial general intelligence (AGI) will likely be developed within 10 years or even within 5 years, and skepticism toward such predictions is “AI denialism”, analogous to climate change denial. Just listen to the experts!
There are inconvenient facts for this narrative: when asked in surveys, the majority of AI experts give much more conservative timelines for AGI and disagree with the idea that scaling up LLMs could lead to AGI.
The ARC Prize is an attempt by prominent AI researcher François Chollet (with help from Mike Knoop, who apparently does AI stuff at Zapier) to introduce some scientific rigour into the conversation. There is a monetary prize for open source AI systems that can perform well on a benchmark called ARC-AGI-2, which recently superseded the ARC-AGI benchmark. ("ARC" stands for "Abstraction and Reasoning Corpus".)
ARC-AGI-2 is not a test of whether an AI is an AGI or not. It’s intended to test whether AI systems are making incremental progress toward AGI. The tasks the AI is asked to complete are colour-coded visual puzzles like you might find in a tricky puzzle game. (Example.) The intention is to design tasks that are easy for humans to solve and hard for AI to solve.
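For a concrete sense of the format, the public ARC tasks are distributed as JSON files: a few "train" demonstration pairs plus "test" inputs, where each grid is a 2D list of integers 0-9 mapped to colours. A minimal sketch (the grids below are invented for illustration, not taken from the benchmark):

```python
import json

# Shape of a (made-up) ARC-style task: demonstration input/output grid
# pairs, plus a test input whose output the solver must produce.
task = json.loads("""
{
  "train": [
    {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}
  ],
  "test": [
    {"input": [[1, 1], [0, 0]]}
  ]
}
""")

for pair in task["train"]:
    print("demo input :", pair["input"])
    print("demo output:", pair["output"])
```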
The current frontier AI models score less than 5% on ARC-AGI-2. Humans score 60% on average, and 100% of tasks have been solved by at least two humans in two attempts or fewer.
For me, this helps the conversation about AI capabilities because it gives a rigorous test and quantitative measure to my casual, subjective observations that LLMs routinely fail at tasks that are easy for humans.
François Chollet was impressed when OpenAI’s o3 model scored 75.7% on ARC-AGI (the older version of the benchmark). He emphasizes the concept of “fluid intelligence”, which he seems to define as the ability to adapt to new situations and solve novel problems. Chollet thinks that o3 is the first AI system to demonstrate fluid intelligence, although it’s still a low level of fluid intelligence. (o3 also required thousands of dollars’ worth of computation to achieve this result.)
This is the sort of distinction that can’t be teased out by the polarized popular discourse. It’s the sort of nuanced analysis I’ve been seeking out, but which has been drowned out by extreme positions on LLMs that ignore inconvenient facts.
I would like to see more benchmarks that try to do what ARC-AGI-2 does: find problems that humans can easily solve and frontier AI models can't. These sorts of benchmarks can help us measure AGI progress much more usefully than the typical benchmarks, which play to LLMs' strengths (e.g. massive-scale memorization) and don't challenge them on their weaknesses (e.g. reasoning).
I long to see AGI within my lifetime. But the super short timeframes given by some people in the AI industry feel to me like they border on mania or psychosis. The discussion is unrigorous, with people pulling numbers out of thin air based on gut feeling.
It’s clear that there are many things humans are good at doing that AI can’t do at all (where the humans vs. AI success rate is ~100% vs. ~0%). It serves no constructive purpose to ignore this truth and it may serve AI research to develop rigorous benchmarks around it.
Such benchmarks will at least improve the quality of discussion around AI capabilities, insofar as people pay attention to them.
Update (2024-04-11 at 19:16 UTC): François Chollet has a new 20-minute talk on YouTube that I recommend. I've watched a few videos of Chollet talking about ARC-AGI or ARC-AGI-2, and this one is beautifully succinct: https://www.youtube.com/watch?v=TWHezX43I-4
10 votes -
Using Claude and undocumented Google Calendar features to automate event creation
4 votes -
Tracing the thoughts of a large language model
10 votes -
Review: Cræft, by Alexander Langlands
4 votes -
Please stop externalizing your costs directly into my face
121 votes -
FOSS infrastructure is under attack by AI companies
39 votes -
LLM crawlers continue to DDoS SourceHut
11 votes