Tildes

Showing only topics with the tag "language models.large".

An AI agent published a hit piece on me

~tech Article 1489 words

22 comments

theshamblog.com

February 12

48 votes
My personal AI assistant project

~tech Ask

Let me start off by saying that I'm exhausted by AI hype. Being interested in LLM agent technology (AI agent hereafter for brevity) means skimming over a lot of hype for one or two useful, semi reality based, bits of information. Maybe the part that I find the most frustrating is how effective the hype is. I don't know if there's ever been a hype cycle like this. Probably a big part of the reason for that is the internet has already proven, within living memory for most people, that technological revolutions really can change everything. Or mess everything up. Either way they generate a lot of economic activity.

So this post is not that. I'm not going to tell you about how AI agents are the second coming of Christ. I'm not selling anything.

Fairly early into learning about AI agents I wanted a way to connect to the agent remotely without hosting it somewhere or exposing ports to the internet. I settled on tailscale and a remote terminal and moved on, but I rarely used it. Somehow the tiny friction of "Turn on tailscale, open terminal app, connect, run agent" was enough to make it not feel worth it.

I know I'm far from the only person who had the same "I want it remote" thought. The best evidence: OpenClaw. It's just one of those things that everyone naturally converges on.

If you're not familiar with OpenClaw, the TLDR is: Former founder with more money than he'll ever need vibecodes a bridge between instant messenger apps and LLM APIs. Nothing about it is technically challenging or requires solving any particularly hard problems. It almost immediately becomes the fastest growing GitHub repo of all time and is currently at number 14 for number of stars. It blew up the (tech) internet like very few things ever have. Within months he was hired by OpenAI.

OpenClaw now does more than just connect messaging and agents, but I believe that one piece is the killer feature. My tailscale terminal solution, combined with a scheduled task or a cron job and some context files, could already do all of the things that OpenClaw can do, and countless people had already implemented similar solutions. But I think it was the tiny bit of friction OpenClaw removed that was responsible for a lot of its popularity.

I thought that was interesting but I have no interest in the security nightmare that is OpenClaw, or the "sentience" vibe for that matter, so I built my own tool.

Essentially it's just a light secondary harness combined with a bridge between Signal and Claude Code. It does some other things too, things I wished existing harnesses did, some memory and guidelines, automated prompts and reminders to wake the agent up and have it do stuff, some context to give the agent some level of persistence, make it less LLMy, less annoying. None of that is particularly interesting though.

Once I got it working (MVP took less than a day) and started playing with it, the OpenClaw phenomenon made a lot more sense. Somehow having the agent in a chat interface, with almost zero friction (just open the chat and send something) was cooler than it had any reason to be.

I can't explain it any better than that at the moment. Not only was it kinda fun, it lent itself to a whole range of "what ifs". What if it could do X? What if I wrote a tool that gave it Y capability? I've been experiencing that for some time, but somehow an agent in your pocket has a different feeling.

Here's an example of a "what if". What if it could do our grocery shopping? I definitely want that. I already had a custom browser tool that I built for agent coding assistance so I was most of the way there. It was just a matter of teaching the agent to log in to and navigate a website, something they're already trained to do. Some hand holding, a few helper scripts, and an evening's worth of hours later and I had it working. The agent can respond to a shopping request by building a shopping list based on our most recent orders, presenting it to us for approval/edits in a Signal group chat, doing searches for any additional product requests and adding the finalized order to the cart. It could also check out the order and schedule the delivery time but I'm doing the last 2 clicks manually for the time being. It's an idiot savant, so it seems like a bad idea to give it access to my credit card. Maybe eventually.
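
For readers who want something concrete, here is a minimal sketch of the list-building and approval steps described above. It is only an illustration, not the actual tool: it assumes past orders are available as plain lists of item names, and the function names, the `min_orders` threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def build_shopping_list(recent_orders, min_orders=2):
    """Suggest items that appear in at least `min_orders` of the recent orders.

    Hypothetical helper: `recent_orders` is assumed to be a list of orders,
    each order being a list of item names.
    """
    # Count each item once per order so a double purchase does not skew the tally.
    counts = Counter(item for order in recent_orders for item in set(order))
    return sorted(item for item, n in counts.items() if n >= min_orders)


def apply_edits(shopping_list, add=(), remove=()):
    """Apply approval-stage edits: drop removed items, then append new requests."""
    removed = set(remove)
    kept = [item for item in shopping_list if item not in removed]
    # dict.fromkeys de-duplicates the additions while preserving their order.
    extras = [item for item in dict.fromkeys(add) if item not in kept]
    return kept + extras


# Made-up sample data for illustration only.
recent_orders = [
    ["milk", "eggs", "bread"],
    ["milk", "coffee", "eggs"],
    ["milk", "bread", "apples"],
]

draft = build_shopping_list(recent_orders)
final = apply_edits(draft, add=["olive oil"], remove=["bread"])
print(draft)
print(final)
```

In a setup like the one described, the draft would be what gets posted to the group chat for approval, and the edited list would be what the agent adds to the cart. Checkout stays a manual step.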

The fact that I can handle shopping with a couple of Signal messages feels effortless in a way that handling shopping by connecting to my PC remotely via a tailscale terminal wouldn't have. Especially when I can include people in the loop who have no interest in tailscaling anywhere. Everyone can use messaging apps.

I imagine before long solutions like this will be built in, either in the grocery websites and apps, or into the frontier harnesses themselves. There will probably be agents everywhere, for better or worse. Probably I'll wish that the agents would all fuck off. In the meantime it's exciting how easy it is to get these tools to do useful things.

13 comments

post_below

February 23

33 votes
AI’s memorization crisis (gifted link)

~tech Article 2298 words, published Jan 9 2026

19 comments

The Atlantic

3 days ago

24 votes
Anthropic rejects latest US Pentagon offer: ‘We cannot in good conscience accede to their request’

~tech Article

39 comments

CNN

4 days ago

61 votes
New accounts on Hacker News ten times more likely to use em-dashes
~tech
- social media
Article 255 words
44 comments

marginalia.nu

6 days ago

53 votes
Updating Eagleson's Law in the age of agentic AI

~comp Ask

Eagleson's Law states

"Any code of your own that you haven't looked at for six or more months might as well have been written by someone else."

I keep reading how fewer and fewer of the brightest developers are writing code and are instead letting their AI agents do it all. How do they know what's really happening? Does it matter anymore?

Curious to hear this community's thoughts.

9 comments

hamitosis

5 days ago

11 votes
Ladybird chooses Rust as its successor language to C++, with help from AI
~comp
Article 609 words
18 comments

ladybird.org

February 23

33 votes
The Claude C Compiler: what it reveals about the future of software

~tech Article 3173 words

9 comments

modular.com

February 23

16 votes
Child’s play

~tech Link

0 comments

harpers.org

February 24

11 votes
Why doesn’t Anthropic use Claude to make a good Claude desktop app?

~tech Article 591 words

41 comments

manualdousuario.net

February 23

27 votes
The AI disruption has arrived, and it sure is fun (gifted link)

~tech Article

52 comments

The New York Times

February 18

29 votes
AI is coming for culture

~tech Article 7495 words, published Aug 25 2025

5 comments

The New Yorker

February 21

11 votes
AI fails at 96% of jobs (new study)

~tech Video 12:49

16 comments

YouTube: ColdFusion

February 13

28 votes
Something big is happening

~tech Article 4865 words, published Feb 9 2026

51 comments

shumer.dev

February 13

33 votes
A "Real BMO" local AI Agent with a Raspberry Pi and Ollama

~tech Video 23:16, published Feb 2 2026

2 comments

YouTube: brenpoly

February 6

14 votes
Building a C compiler with a team of parallel Claudes

~tech Article 2465 words

12 comments

anthropic.com

February 5

20 votes
Is the detachment in the room? - Agents, cruelty, and empathy
~tech
- social media
Article 1999 words
16 comments

hailey.at

February 7

15 votes
Passing question about LLMs and the Tech Singularity

~tech Ask

I am currently reading my way thru Ted Chiang's guest column in the New Yorker, about why the predicted AI/Tech Singularity will probably never happen (https://www.newyorker.com/culture/annals-of-inquiry/why-computers-wont-make-themselves-smarter). ETA: I just noticed that article is almost 5 years old; the piece is still relevant, but worth noting.

Good read. Still reading, but so far, I find I disagree with his explicit arguments, but at the same time, he is also brushing up very closely to my own reasoning for why "it" might never happen. Regardless, it is thought-provoking.

But, I had a passing thought during the reading.

People who actually use LLMs like Claude Code to help write software, and/or, who pay close attention to LLMs' coding capabilities ... has anyone actually started experimenting with asking Claude Code or other LLMs that are designed for programming, to look at their own source code and help to improve it?

In other words, are we (the humans) already starting to use LLMs to improve their code faster than we humans alone could do?

Wouldn't this be the actual start of the predicted "intelligence explosion"?

Edit to add: To clarify, I am not (necessarily) suggesting that LLMs -- this particular round of AI -- will actually advance to become some kind of true supra-human AGI ... I am only suggesting that they may be the first real tool we've built (beyond Moore's Law itself) that might legitimately speed up the rate at which we approach the Singularity (whatever that ends up meaning).

30 comments

Eric_the_Cerise

February 4

19 votes
llOOPy lOOPs
~comp
- programming.object oriented
Article 1431 words, published Feb 3 2026
4 comments

autonoma.ca

February 6

12 votes
Youtube channel ServeTheHome describes how they use a locally running LLM to automate data collection, allowing them to forgo a planned hire
~tech
- social media
Video 17:15
6 comments

YouTube: ServeTheHome

January 30

20 votes
Supporting Markdown search for LLMs

~tech Link

21 comments

build.ms

February 2

15 votes
Evaluating LLMs by finding werewolves

~tech Link

2 comments

kaggle.com

February 3

18 votes
How AI assistance impacts the formation of coding skills

~tech Article 2140 words

5 comments

anthropic.com

January 31

18 votes
Pi: The minimal agent within OpenClaw

~tech Article 2003 words

4 comments

pocoo.org

January 31

13 votes
Wilson Lin on FastRender: a browser built by thousands of parallel agents
~tech
- browsers
Article 2033 words
10 comments

simonwillison.net

January 24

18 votes
Show HN: I wrapped the Zorks with an LLM

~games Link

1 comment

ycombinator.com

January 28

16 votes
Blocking Claude

~comp Article 213 words

4 comments

aphyr.com

January 27

28 votes
Why does ssh send 100 packets per keystroke?
~tech
- security
Article 1298 words
2 comments

eieio.games

January 22

28 votes
The assistant axis: situating and stabilizing the character of large language models

~tech Article 2160 words

23 comments

anthropic.com

January 20

15 votes
exe.dev, a service for creating Linux virtual machines and vibe-coding in them
~tech
- linux
Link
12 comments

exe.dev

December 25, 2025

23 votes
Apple to partner with Google for Gemini access on iPhones, Apple Intelligence to power on device assistant
~tech
- ios
- apple.intelligence
Link
12 comments

blog.google

January 13

29 votes
HiTeX Press: A spam factory for AI-generated books

~books Article 869 words, published Sep 10 2025

1 comment

le-brun.eu

January 13

15 votes
You are a better writer than AI (yes, YOU!)
~creative
- writing
Video 40:20, published Feb 8 2025
44 comments

NaraVara

January 3

25 votes
China drafts world’s strictest rules to end AI-encouraged suicide, violence

~tech Article 320 words

8 comments

Ars Technica

December 30, 2025

22 votes
The truth about AI (specifically LLM powered AI)

~tech Ask

The last couple of years have been a wild ride. The biggest parts of the conversation around AI for most of that time have been dominated by absurd levels of hype. To go along with the cringe levels of hype, a lot of people have felt the pain of dealing with the results of rushed and forced AI implementation.

As a result the pushback against AI is loud and passionate. A lot of people are pissed, for good reasons.

Because of that it would be understandable for people casually watching from a distance to get the impression that AI is mostly an investor fueled shitshow with very little real value.

The first part of the sentiment is true, it's definitely a shitshow. Big companies are FOMOing hard, everyone is shoehorning AI into everything they can in hopes of capturing some of that hype money. It feels like crypto, or Web 3.0. The result is a mess and we're nowhere near peak mess yet.

Meanwhile in software engineering the conversation is extremely polarized. There is a large, but shrinking, contingent of people who are absolutely sure that AI is something like a scam. It only looks like a valid tool and in reality it creates more problems than it solves. And until recently that was largely true. The reason that contingent is shrinking, though, is that the latest generation of SOTA models are an undeniable step change. Every day countless developers try using AI for something that it's actually good at and they have the, as yet nameless but novel, realization that "holy shit this changes everything". It's just like every other revolutionary tech tool, you have to know how to use it, and when not to use it.

The reason I bring up software engineering is that code is deterministic. You can objectively measure the results. The incredible language fluency of LLMs can't gloss over code issues. It either identified the bug or it didn't. It either wrote a thorough, valid test or it didn't. It's either good code or it isn't. And here's the thing: It is. Not automatically, or in all cases, and definitely not without careful management and scaffolding. But used well it is undeniably a game changing tool.

But it's not just game changing in software. As in software if it's used badly, or for the wrong things, it's more trouble than it's worth. But used well it's remarkable. I'll give you an example:

A friend was recently using AI to help create the necessary documents for a state government certification process for his business. If you've ever worked with government you've already imagined the mountain of forms, policies and other documentation that were required. I got involved because he ran into some issues getting the AI to deliver.

Going through his session the thing that blew my mind was how little prompting it took to get most of the way there. He essentially said "I need help with X application process for X certification" and then he pasted in a block of relevant requirements from the state. The LLM agent then immediately knew what to do, which documents would be required and which regulations were relevant. It then proceeded to run him through a short Q and A to get the necessary specifics for his business and then it just did it. The entire stack of required documentation was done in under an hour versus the days it would have taken him to do it himself. It didn't require detailed instructions or .md files or MCP servers or artifacts, it just did it.

And he's familiar with this process, he has the expertise to look at the resulting documents and say "yeah this is exactly what the state is looking for". It's not surprising that the model had a lot of government documentation in its training data, it shouldn't even really be mind blowing at this point how effective it was, but it blew my mind anyway. Probably because not having to deal with boring, repetitive paperwork is a miraculous thing from my perspective.

This kind of win is now available in a lot of areas of work and business. It's not hype, it's objectively verifiable utility.

This is not to say that it's not still a mess. I could write an overly long essay on the dangers of AI in software, business and to society at large. We thought social media was bad, that the digital revolution happened too fast for society to adapt... AI is a whole new category of problematic. One that's happening far faster than anything else has. There's no precedent.

But my public service message is this: Don't let the passionate hatred of AI give you the wrong idea: There is real value there. I don't mean this in a FOMO way, you don't have to "use AI or get left behind". The truth is that 6 months from now the combination of new generations of models and improved tooling, scaffolding and workflows will likely make the current iteration of AI look quaint by comparison. There's no rush to figure out a technology that's advancing and changing this quickly because most of what you learn right now will be about solving problems that will be solved by default in the near future.

That being said, AI is the biggest technological leap since the beginning of the public, consumer facing, internet. And I was there for that. Like the internet it will prove to be both good and bad, corporate consolidation will make the bad worse. And, like the internet, the people who are saying it's not revolutionary are going to look silly in the context of history.

I say this from the perspective of someone who has spent the past year casually (and in recent months intensively) learning how to use AI in practical ways, with quantifiable results, both in my own projects and to help other people solve problems in various domains. If I were to distill my career into one concept, it would be: solving problems. So I feel like I'm in a position to speak about problem solving technology with expertise. If you have a use for LLM powered AI, you'll be surprised how useful it is.

79 comments

post_below

December 22, 2025

58 votes
Science, large language models, and goal displacement

~science Article 2678 words

1 comment

Substack: Kevin Baker

December 24, 2025

7 votes
What I learned building pi, an opinionated and minimal coding agent

~tech Article 6472 words

1 comment

mariozechner.at

December 22, 2025

9 votes
AI might not be coming for lawyers’ jobs anytime soon

~tech Article 1713 words

3 comments

MIT Technology Review

December 22, 2025

7 votes
JustHTML is a fascinating example of vibe engineering in action

~tech Article 880 words

24 comments

simonwillison.net

December 14, 2025

47 votes
Useful patterns for building HTML tools

~tech Article 2981 words

1 comment

simonwillison.net

December 14, 2025

7 votes
Weird generalization and inductive backdoors: new ways to corrupt LLMs

~tech Article

2 comments

arXiv

December 12, 2025

17 votes
Animals versus ghosts

~tech Article 1776 words, published Oct 1 2025

3 comments

bearblog.dev

November 30, 2025

6 votes
GPT-5 has come a long way in mathematics

~tech Article 3904 words

24 comments

ritchot.me

November 23, 2025

21 votes
LLMs are bullshitters. But that doesn't mean they're not useful.

~tech Article 1919 words, published Nov 19 2025

3 comments

kagi.com

November 23, 2025

20 votes
Is trying to become an author insane in times of LLMs?

~tech Ask

A simple question. I know LLMs are currently not a replacement for authors. Will that remain true in 5 to 10 years?

EDIT: No. I never expected to earn a living either mostly or exclusively by selling books. There are however many "side gigs" in my country that can greatly benefit from being published by a real company. Ultimately though, I'm not in it primarily for the money. But I wonder what the future holds for fiction as a whole.

16 comments

lou

November 16, 2025

21 votes
The worlds on fire. So lets just make AI porn.

~tech Article 5247 words

20 comments

itstoday.site

November 20, 2025

23 votes
Part of me wishes it wasn't true but: AI coding is legit

~tech Ask

I stay current on tech for both personal and professional reasons but I also really hate hype. As a result I've been skeptical of AI claims throughout the historic hype cycle we're currently in. Note that I'm using AI here as shorthand for frontier LLMs.

So I'm sort of a late adopter when it comes to LLMs. At each new generation of models I've spent enough time playing with them to feel like I understand where the technology is and can speak about its viability for different applications. But I haven't really incorporated it into my own work/life in any serious way.

That changed recently when I decided to lean all the way in to agent assisted coding for a project after getting some impressive boilerplate out of one of the leading models (I don't remember which one). That AI can do a competent job on basic coding tasks like writing boilerplate code is nothing new, and that wasn't the part that impressed me. What impressed me was the process, especially the degree to which it modified its behavior in practical ways based on feedback. In previous tests it was a lot harder to get the model to go against patterns that featured heavily in the training data, and then get it to stay true to the new patterns for the rest of the session. That's not true anymore.

Long story short, add me to the long list of people whose minds have been blown by coding agents. You can find plenty of articles and posts about what that process looks like so I won't rehash all the details. I'll only say that the comparisons to having your own dedicated junior or intern who is at once highly educated and dumb are apt. Maybe an even better comparison would be to having a team of tireless, emotionless, junior developers willing to respond to your requests at warp speed 24/7 for the price of 1/100th of one developer. You need the team comparison to capture the speed.

You've probably read, or experienced, that AI is good at basic tasks, boilerplate, writing tests, finding bugs and so on. And that it gets progressively worse as things get more complicated and the LoCs start to stack up. That's all true but one part that has changed, in more recent models, is the definition of "basic".

The bit that's difficult to articulate, and I think leads to the "having a nearly free assistant" comparisons, is what it feels like to have AI as a coding companion. I'm not going to try to capture it here, I'll just say it's remarkable.

The usual caveats apply: if you rely on agents to do extensive coding, or handle complex problems, you'll end up regretting it unless you go over every line with a magnifying glass. They will cheerfully introduce subtle bugs that are hard to catch and harder to fix when you finally do stumble across them. And that's assuming they can do the thing you're asking them to do at all. Beyond the basics they still abjectly fail a lot of the time. They'll write humorously bad code, they'll break unrelated code for no apparent reason, they'll freak out and get stuck in loops (that one surprised me in 2025). We're still a long way from agents that can actually write software on their own, despite the hype.

But wow, it's liberating to have an assistant that can do hundreds of basic tasks you'd rather not be distracted by, answer questions accurately and knowledgeably, scan and report clearly about code, find bugs you might have missed and otherwise soften the edges of countless engineering pain points. And brainstorming! A pseudo-intelligent partner with an incomprehensibly wide knowledge base and unparalleled pattern matching abilities is guaranteed to surface things you wouldn't have considered.

AI coding agents are no joke.

I still agree with the perspectives of many skeptics. Execs and middle managers are still out of their minds when they convince themselves that they can fire 90% of their teams and just have a few seniors do all the work with AI. I will read gleefully about the failures of that strategy over the coming months and years. The failure of their short sightedness and the cost to their organizations won't make up for the human cost of their decisions, but at least there will be consequences.

When it comes to AI in general I have all the mixed feelings. As an artist, I feel the weight of what AI is doing, and will do, to creative work. As a human I'm concerned about AI becoming another tool to funnel ever more wealth to the top. I'm concerned about it ruining the livelihoods of huge swaths of people living in places where there aren't systems that can handle the load of taking care of them. Or aren't even really designed to try. There are a lot of legitimate dystopian outcomes to be worried about.

Despite all that, actually using the technology is pretty exciting, which is the ultimate point of this post: What's your experience? Are you using agents for coding in practical ways? What works and what doesn't? What's your setup? What does it feel like? What do you love/hate about it?

62 comments

post_below

November 16, 2025

50 votes
Former PM Katrín Jakobsdóttir has said the Icelandic language could be wiped out in as little as a generation due to the sweeping rise of AI and encroaching English language dominance

~humanities.languages Article 815 words

4 comments

The Guardian

November 16, 2025

18 votes
Duck Duck Go search AI curiously cited Tildes

~tech Ask

I was trying to find out why Lidarr wasn't matching my copy of The Cure's Greatest Hits. Found out I've got some bootleg Russian release that's catalogued on discogs (I eventually found the musicbrainz release and updated my profile to include bootlegs). So I search "Lidarr use specific discogs release" and the duck duck go search assist spat out some text about Lidarr not using discogs and cited this Tildes post.

It's curious because that post is 3yrs old and doesn't talk about discogs integration in Lidarr, just one mention of discogs in the post and some folks talking about Lidarr in the comments (It did cite a relevant GitHub issue about it though). The AI response mentioned that some users track new releases with Lidarr and downloads disabled, while covered in the post, it seems fairly tangential to my query.

I'm curious why it decided to check or cite a Tildes post. No Tildes posts came up in the first couple pages of search results. I use Tildes from the same location, though on my phone, whereas this query was on my desktop, and I have done a couple of DDG queries using "site:tildes.net" on my phone.

Has anyone else seen a search assist cite an unexpected site? Not unexpected as in irrelevant, that's all too common, but small and specific sources.

7 comments

Carrow

November 15, 2025

29 votes
How has AI positively impacted your life?

~tech Ask (survey)

I've been trying to get a more rounded understanding of the impacts that "AI" has had since ChatGPT went viral back in 2022.

I've found it easy to gather a list of negative impacts, but have struggled to point to many positives.

I was curious if there were folks who have used any of these AI tools, and would be willing to share any positive impacts those tools have had in their lives. I'm particularly interested in the text, audio, image, and video generation tools that have appeared since ChatGPT went viral, but please share anything else that you think fits.

80 comments

zoroa

November 10, 2025

50 votes