-
9 votes
-
AI: Where in the loop should humans go?
18 votes -
Vibe coding is just the return of Excel/Access, with more danger
I probably triggered some PTSD right there.
Was just in a meeting at work, where we listed off everything that makes software development hard and slow. An exercise for the thread would be to replicate that list. It turned out that Claude helps with like 1/5th or less of it... especially in a collaborative environment.
So, the situation we're now encountering is that random business areas can vibe code out something, tell nobody, throw it in AWS, have it become a critical part of a business process that fails when they quit, and nobody even has access to look at what was made.
It gives me comfort that in about 5 years there will be a new surge in demand for programmers to rein in all the rogue applications that need to be shut down because of the immense risk they pose to the continued operation of a company, from data leaks to broken payroll.
It'll be Y2K all over again.
45 votes -
Static analysis, dynamic analysis, and stochastic analysis
For a long time programmers have had two types of program verification tools, static analysis (like a compiler's checks) and dynamic analysis (running a test suite). I find myself using LLMs to analyze newly written code more and more. Even when they spit out a lot of false positives, I still find them to be a massive help. My workflow is something like this:
- Commit my changes
- Ask Claude Opus "Find problems with my latest commit"
- Look through its list and skip over false positives.
- Fix the true positives.
- git add -A && git commit --amend --no-edit
- Clear Claude's context
- Back to step 2.
I repeat this loop until all of the issues Claude raises are dismissible. I know there are a lot of startups building a SaaS for things like this (CodeRabbit is one I've seen before; I didn't like it too much), but I feel that just doing the above procedure is plenty good enough, and it catches a lot of issues that would take more time to uncover through manual testing.
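The loop above can be sketched as a small driver. This is only a sketch under assumptions: `ask_reviewer`, `triage`, and `fix` are hypothetical callables standing in for "ask the model with a fresh context", "a human dismisses false positives", and "a human fixes the rest"; none of them is a real API, and `max_rounds` is an added safety cap that the original workflow does not mention.

```python
import subprocess
from typing import Callable, List


def amend_commit() -> None:
    """Fold the fixes into the commit under review."""
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "--amend", "--no-edit"], check=True)


def review_loop(
    ask_reviewer: Callable[[], List[str]],       # hypothetical: returns the model's findings
    triage: Callable[[List[str]], List[str]],    # hypothetical: keeps only true positives
    fix: Callable[[List[str]], None],            # hypothetical: applies the fixes
    amend: Callable[[], None] = amend_commit,
    max_rounds: int = 10,                        # assumption: cap so the loop always ends
) -> int:
    """Repeat review -> triage -> fix -> amend until nothing real is left.

    Returns the number of rounds in which something was fixed.
    """
    rounds = 0
    for _ in range(max_rounds):
        findings = ask_reviewer()   # each call is assumed to start from a cleared context
        real = triage(findings)     # skip over false positives
        if not real:
            break                   # everything raised was dismissible
        fix(real)
        amend()
        rounds += 1
    return rounds
```

Keeping the triage step as a separate callable makes the point of the workflow explicit: the model only proposes problems, and a person decides which ones are real.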
It's also been productive to ask for any problems in an entire repo. It will of course never be able to perform a completely thorough review of even a modestly sized application, but highlighting any problem at all is still useful.
Someone recently mentioned to me that they use vision-capable LLMs to perform "aesthetic tests" in their CI. The model takes screenshots of each page before and after a code change and throws an error if it thinks something is wrong.
10 votes -
AI Coding agents are the opposite of what I want
I've been thinking a lot about LLM assisted development, and in particular why I keep dropping the available tools after a few attempts at using them.
I realized recently that it's taking away the part of software development I enjoy: the creative problem solving that comes with writing code. What's left is code review tasks, testing, security checks, etc. Important tasks, but they all primarily involve heavy concentration, and much less creativity.
Why aren't agents focused on handling the mundane tasks instead? Tell me if I've just introduced a security vulnerability or a runtime bug. Generate realistic test data and give me info on what the likely output would be. Tell me that the algorithm I just wrote is O(n^2).
Those tasks are so much more applicable to matching against existing data, something LLMs should be extremely good at, rather than trying to get them to write something novel, which so far they've been mostly bad at, at least in my experience.
46 votes -
Gemma needs help
31 votes -
Designing an agent reading test
10 votes -
Google releases Gemma 4
28 votes -
Executing programs inside transformers with exponentially faster inference
14 votes -
Can coding agents relicense open source through a “clean room” implementation of code?
51 votes -
Is it worthwhile to run local LLMs for coding today?
I've made the decision to purchase a new M5 Macbook Air because of the memorypocalypse. My current M1 model is already upgraded to the same amount of memory and storage as the current base model, and I'm wondering if it's worth spending the extra 2-4 hundred dollars on memory upgrades today.
My current computer is more than good enough for today but I figure I should probably future proof just in case. I was thinking the 16GB would be enough, but I also know that I'm kind of falling behind by not embracing AI coding agents. According to my research the maximum 32GB is recommended for most coding-relevant models - almost as a minimum.
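For a rough sense of where figures like 16GB versus 32GB come from: a model's weights take roughly (parameter count × bits per weight ÷ 8) bytes. Below is a minimal sketch of that arithmetic. The model sizes and the 4-bit quantization are illustrative assumptions, not recommendations, and the estimate ignores the context cache, the operating system, and everything else sharing the same memory.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights only, in GB (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Hypothetical model sizes, in billions of parameters, at an assumed 4-bit quantization.
for size in (7, 14, 32):
    print(f"{size}B at 4-bit: ~{weight_memory_gb(size, 4):.1f} GB of weights")
```

By this estimate a 32B-parameter model at 4 bits needs about 16 GB for the weights alone, before anything else is loaded, so a 16GB machine would realistically be limited to smaller models.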
I work in education, so coding is not actually much of a need, and obviously there are cloud providers I could use if I end up needing them in the future. I also have less than a teacher's salary because I work part time, which is the main reason I'm sticking with the 16GB base for the moment, but other than that I also don't run many memory-intensive programs. But I thought I would get some recommendations before they start shipping.
I'd also be interested in people's opinions on trading in my old one, since it'll only get me ~$275 back. I'm considering reneging on that part and keeping it around to act as a web server, or giving it to my husband, who has a computer that still runs Windows 7 and barely uses it.
35 votes -
Hacker used Anthropic's Claude chatbot to attack multiple government agencies in Mexico
21 votes -
microgpt - GPT in 200 lines
32 votes -
Updating Eagleson's Law in the age of agentic AI
Eagleson's Law states
"Any code of your own that you haven't looked at for six or more months might as well have been written by someone else."
I keep reading how fewer and fewer of the brightest developers are writing code, and are instead letting their AI agents do it all. How do they know what's really happening? Does it matter anymore?
Curious to hear this community's thoughts.
11 votes -
Ladybird chooses Rust as its successor language to C++, with help from AI
33 votes -
llOOPy lOOPs
12 votes -
Blocking Claude
28 votes -
Can AI tell if I'm writing AI slop? A machine learning journey.
21 votes -
Defeating nondeterminism in LLM inference
15 votes -
GPT 5 released
30 votes -
Applying Chinese Wall Reverse Engineering to LLM Code Editing
8 votes -
I wrote my first Chrome extension to simplify Wikipedia articles
15 votes -
User-friendly and privacy-friendly LLM experience?
I've been thinking perhaps I'll need to get one of the desktop LLM UIs. I've been out of touch with the state of the art in end-user LLM products, as I've been using LLMs exclusively via API, but tech-y people (who are not developers) mostly talk about the end-user products, which I know little about.
Ethical problems aside, the problem with non-API usage is that, even if you pay, I can't find one that has a better privacy policy than the API. And the problem with the API is that it is not as good as the finished apps unless you want to reinvent the wheel. The apps may also include ads in the future, while the API technically cannot, as that would affect some downstream use cases.
| Provider | Data Retention (API) | Data Retention (Consumer) | UI-only features |
|---|---|---|---|
| ChatGPT Plus | 30 days, no training | Training opt-out, 30 days for temp. chat, unknown retention otherwise | Voice, Canvas, Image generation in chat, Screensharing, Mobile app |
| Google AI Pro | 0 | 72 hours if you disable history, or up to 3 years and trained upon otherwise | Android assistant, Canvas, AI in Google Drive/Docs, RAG (NotebookLM), Podcast generation, Browser use (Mariner), Coding (Gemini CLI), Screensharing |
| Gemini in Google Workspace | See above | 0-18 months, but no human review/training | See above |
| Claude Pro | 30 days | Up to 2 years (no training without opt-in) | Coding, Artifact, Desktop app, RAG, MCP |

Because this is a dual-use technology, the table doesn't include the extra retention period that applies if they detect abuse. Additionally, if you click on thumbs up/down, it may also be recorded for the provider's employees to review.
I don't think OpenWebUI, self-hosted models, etc. would suffice if they are not built to the same quality as the first-party products. I know I'm probably asking for something that doesn't exist here, but at least I hope it will bring to people's attention that even if you're paying for the product, you might not get the same privacy protection as API users.
15 votes -
Personalized software really is coming, but not today. Maybe tomorrow?
13 votes -
Block AI scrapers with Anubis
27 votes -
The Long Context - Interactive fiction driven by an LLM
12 votes -
Bartosz Milewski - Understanding Attention in LLMs
6 votes -
Funny output from Claude.ai: a codeblock in a question about an early renaissance theological work
I asked Claude for references to some ideas the AI had output earlier about a 16th century cardinal (Sadoleto). Its response was "Let me analyze the letter", and then this:

    async function searchForSadoletoLetter() {
      try {
        const files = ['sadoleto_letter.txt', 'sadoleto.txt', 'sadoleto_to_geneva.txt', 'letter_to_geneva.txt'];
        for (const filename of files) {
          try {
            const content = await window.fs.readFile(filename, { encoding: 'utf8' });
            console.log(`Found file: ${filename}`);
            console.log("First 500 characters:");
            console.log(content.substring(0, 500));
            return content;
          } catch (err) {
            // File not found, try the next one
          }
        }
        console.log("Could not find any files containing Sadoleto's letter");
        return null;
      } catch (err) {
        console.error("Error searching for files:", err);
        return null;
      }
    }

    const content = await searchForSadoletoLetter();

    // Since we don't have the letter available as a file, let's search for published versions online
    console.log("The letter isn't available as a file. We would need to rely on scholarly knowledge about Sadoleto's letter to Geneva (1539).");
    console.log("Sadoleto's letter was published in 'A Reformation Debate: Sadoleto's Letter to the Genevans and Calvin's Reply' (ed. John C. Olin).");

This strikes me as odd? Did Claude write a script to generate its own response? Have coders put something like this in as a guardrail?
edit: details about earlier convo
15 votes -
Building a personal, private AI computer on a budget
24 votes -
Task-Specific LLM Evals that Do & Don't Work
4 votes -
"Mechanistic interpretability" for LLMs, explained
6 votes -
Researchers describe how to tell if ChatGPT is confabulating
24 votes -
I will fucking piledrive you if you mention AI again
119 votes -
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
9 votes -
"Badness 0", a suckerpinch/tom7 video dive into typesetting, LLMs, and Donald Knuth
29 votes -
How it feels to get an AI email from a friend
56 votes -
HeavyIQ: Understanding 220M flights with AI
2 votes -
Slop is the new name for unwanted AI-generated content
52 votes -
React, Electron, and LLMs have a common purpose: the labour arbitrage theory of dev tool popularity
31 votes -
When provided with CVE descriptions of 15 different vulnerabilities and a set of tools useful for exploitation, GPT-4 was capable of autonomously exploiting 13 of which, yielding an 87% success rate
17 votes -
On GitHub Copilot
23 votes -
Linus Torvalds on the state of Linux today and how AI figures in its future
26 votes -
Jina AI releases first open source 8k embedding model
8 votes -
Teaching LLMs to divide and conquer problems with hierarchical question decomposition
8 votes -
How I think about LLM prompt engineering: Prompting as searching through a space of vector programs
11 votes -
Language is a poor heuristic for intelligence
37 votes -
Using Redis VSS as a Retrieval Step in an LLM Chain
2 votes -
Working with GPT
7 votes -
Play Chess against GPT-2
@theshawwn: I am preparing to release a notebook where you can play chess vs GPT-2. If anyone wants to help beta test it:
1. visit https://t.co/CpWrFvtnY2
2. open in playground mode
3. click Runtime -> Run All
4. Scroll to the bottommost cell and wait 6 minutes
If you get stuck, tell me.
5 votes