Subliminal learning: Language models transmit behavioral traits via hidden signals in data
21 votes -
AI coding tools make developers slower but they think they're faster, study finds
40 votes -
The Common Pile v0.1: An 8TB dataset of public domain and openly licensed text
26 votes -
Counting all of the yurts in Mongolia using machine learning
15 votes -
Waymos are getting more assertive. Why the driverless taxis are learning to drive like humans.
45 votes -
Intelligent Agent Technology: Open Sesame! (1993)
7 votes -
Visualising how AI training avoids getting stuck in local minima
18 votes -
US scientists are using machine learning to find new treatments among thousands of old medicines
12 votes -
Tracing the thoughts of a large language model
10 votes -
Combining machine learning and homomorphic encryption in the Apple ecosystem
9 votes -
Show Tildes: we built the world's first legal AI API
22 votes -
Funny output from Claude.ai: a code block in a question about an early Renaissance theological work
I asked Claude for references to some ideas the AI had output earlier about a 16th-century cardinal (Sadoleto). Its response was "Let me analyze the letter," and then this:
async function searchForSadoletoLetter() {
  try {
    const files = ['sadoleto_letter.txt', 'sadoleto.txt', 'sadoleto_to_geneva.txt', 'letter_to_geneva.txt'];
    for (const filename of files) {
      try {
        const content = await window.fs.readFile(filename, { encoding: 'utf8' });
        console.log(`Found file: ${filename}`);
        console.log("First 500 characters:");
        console.log(content.substring(0, 500));
        return content;
      } catch (err) {
        // File not found, try the next one
      }
    }
    console.log("Could not find any files containing Sadoleto's letter");
    return null;
  } catch (err) {
    console.error("Error searching for files:", err);
    return null;
  }
}

const content = await searchForSadoletoLetter();

// Since we don't have the letter available as a file, let's search for published versions online
console.log("The letter isn't available as a file. We would need to rely on scholarly knowledge about Sadoleto's letter to Geneva (1539).");
console.log("Sadoleto's letter was published in 'A Reformation Debate: Sadoleto's Letter to the Genevans and Calvin's Reply' (ed. John C. Olin).");
This strikes me as odd. Did Claude write a script to generate its own response? Have coders put something like this in as a guardrail?
edit: details about earlier convo
15 votes -
Overfitting to theories of overfitting
10 votes -
What trustworthy resources are you using for AI/LLMs/ML education?
Every company is trying to shoehorn AI into every product, and many online materials give off a general snake-oil vibe, which makes them increasingly difficult to parse. So far, my primary sources have been GitHub, Medium, and some YouTube.
My goal is to better understand the underlying technology so that I can work with it better, train models, and use it as effectively as possible. This goes beyond just experimenting with prompts and trying to overcome guardrails. It also includes running models locally, e.g. with Ollama on my M1 Max, which I'm not opposed to.
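For the local route, a minimal sketch of talking to an Ollama server over its default REST endpoint, assuming Ollama is installed and a model has been pulled; the model name and prompt here are placeholders, not a recommendation:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama server and return its reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server): generate("llama3", "Explain overfitting in one sentence.")
```

Working against the raw endpoint like this, rather than a wrapper library, is also a decent way to learn what the model API actually consists of.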
8 votes -
Task-Specific LLM Evals that Do & Don't Work
4 votes -
Someone made a dataset of one million Bluesky posts for 'machine learning research'
20 votes -
When Machine Learning Tells the Wrong Story
6 votes -
Real-time speech-to-speech translation
Has anyone used a free, offline, open-source, real-time speech-to-speech translation app on under-powered devices (i.e., older smartphones)? There are a few libraries that have been written that purportedly can do, or help with, local speech-to-speech:
- https://github.com/ictnlp/StreamSpeech
- https://github.com/k2-fsa/sherpa-onnx
- https://github.com/openai/whisper
I'm looking for a simple app that can listen for English, translate into Korean (and other languages), then perform speech synthesis on the translation. Although real-time would be great, a short delay would work.
RTranslator is awkward (couldn't get it to perform speech-to-speech using a single phone). 3PO sprouts errors like dandelions and requires an online connection.
Any suggestions?
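Whatever engines end up fitting on the device, the app described above is essentially a three-stage pipeline: speech recognition, text translation, then speech synthesis. A minimal Python sketch of that composition, with each stage left as a pluggable callable (the stage implementations are assumptions — e.g. a whisper or sherpa-onnx wrapper for ASR — and the stubs below exist only to show the wiring):

```python
from typing import Callable

def translate_speech(
    audio: bytes,
    asr: Callable[[bytes], str],      # speech -> English text (e.g. whisper)
    translate: Callable[[str], str],  # English text -> Korean text
    tts: Callable[[str], bytes],      # Korean text -> synthesized speech
) -> bytes:
    """Chain the three stages; each can be swapped for a lighter model on old phones."""
    text = asr(audio)
    korean = translate(text)
    return tts(korean)

# Stub demonstration (real engines would replace these lambdas):
out = translate_speech(
    b"raw-pcm-audio",
    asr=lambda a: "hello",
    translate=lambda t: "annyeonghaseyo",
    tts=lambda t: t.encode("utf-8"),
)
```

Keeping the stages decoupled like this also makes the "short delay is fine" relaxation easy: each stage can run on complete utterances instead of streaming chunks.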
6 votes -
GSM-Symbolic: Understanding the limitations of mathematical reasoning in large language models
15 votes -
On the path to delivering next generation UK weather forecasts
7 votes -
The LLMentalist effect: how chat-based large language models replicate the mechanisms of a psychic's con
29 votes -
Six distinct types of depression identified in Stanford Medicine-led study
51 votes -
"Mechanistic interpretability" for LLMs, explained
6 votes -
Can I have some advice on the neural net I've been working on?
Apologies if this isn't an appropriate place to post this.
Inspired by a paper I found a while back (https://publications.lib.chalmers.se/records/fulltext/215545/local_215545.pdf), I tried my hand at implementing a program (in C#) to create ASCII art from an image. It works pretty well, but, as they observed in the paper, it's pretty slow to compare every tile against 90-some glyphs. In the paper, they build a decision tree to replicate this process at a faster speed.
Recently, I revisited this. I thought I'd try making a neural net, since I found the idea interesting. I've watched some videos on neural nets, and refreshed myself on my linear algebra, and I think I've gotten pretty close. That said, I feel like there's something I'm missing (especially given the fact that the loss isn't really decreasing). I think my problem is specifically during backpropagation.
Here is a link to the TrainAsync method in GitHub: https://github.com/bendstein/ImageToASCII/blob/1c2e2260f5d4cfb45443fac8737566141f5eff6e/LibI2A/Converter/NNConverter.cs#L164C59-L164C69. The forward and backward propagation methods are below it.
If anyone can give me any feedback or advice on what I might be missing, I'd really appreciate it.
14 votes -
I will fucking piledrive you if you mention AI again
119 votes -
Extracting interpretable features from Claude 3 Sonnet
13 votes -
Hallucination-free RAG: Making LLMs safe for healthcare
12 votes -
Turning old maps into 3D digital models of lost neighborhoods
9 votes -
MDN’s AI Help and lucid lies
7 votes -
Stability AI reportedly ran out of cash to pay its bills for rented cloudy GPUs
28 votes -
Noam Chomsky: The false promise of ChatGPT
30 votes -
What useful tasks are possible with an LLM with only 3B parameters?
Playing with Llama 7B and 13B, I found that the 13B model was capable of doing a simple task, rewriting titles in sentence case for Tildes submissions. The 7B model doesn't appear capable of the same task, out of the box.
I heard about Android's new AICore available on a couple of new devices. But it sounds like Gemini Nano, which runs on-device, can only handle 2B or 3B parameters.
Is this size of model useful for real tasks? Does it only become useful after fine-tuning on a specific domain? I'm a novice and want to learn a little bit about it. On-device AI is an appealing concept to me.
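As a sanity check for the title task, a trivial non-LLM baseline is easy to write; a 2B-3B model only earns its keep if it beats something like this (the `keep` set of acronyms/proper nouns is a stand-in for the world knowledge the model would otherwise supply):

```python
def sentence_case(title: str, keep: frozenset = frozenset()) -> str:
    """Naive sentence-casing: capitalize the first word, lowercase the rest,
    leaving any word listed in `keep` (acronyms, proper nouns) untouched."""
    out = []
    for i, word in enumerate(title.split()):
        if word in keep:
            out.append(word)
        elif i == 0:
            out.append(word[:1].upper() + word[1:].lower())
        else:
            out.append(word.lower())
    return " ".join(out)

# sentence_case("Tracing The Thoughts Of A Large Language Model")
# -> "Tracing the thoughts of a large language model"
```

The hard part the baseline punts on, recognizing which words are proper nouns or acronyms, is exactly where a small model might add value, so that's the case worth evaluating.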
12 votes -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
21 votes -
Polymath - Toolkit to automatically segment music tracks and convert to MIDI
10 votes -
What are some interesting machine learning research papers you found?
Here's a place to share machine learning research papers that seem interesting to you. I'm no expert, but sometimes I skim them, and maybe there are some folks on Tildes who know more than I do?
One paper per top-level post, and please link to arXiv (if relevant) and quote a bit of the abstract.
11 votes -
Google Bard is now Gemini; Gemini Advanced launched
24 votes -
Google's Gemini 1.5 Pro is a new, more efficient AI model
10 votes -
Vesuvius Challenge 2023 Grand Prize awarded: we can read the first scroll!
34 votes -
Why autonomous trucking is harder than autonomous rideshare
12 votes -
"The AI revolution is rotten to the core"
27 votes -
Machine learning creates a massive map of smelly molecules
14 votes -
The unstoppable rise of disposable ML frameworks
10 votes -
Return of the AI Megathread (#13) - news of chatbots, image generators, etc
I haven't done one of these since early July, but it seems like there's an uptick in news. Here's the previous one.
28 votes -
Show Tildes: how I built the largest open database of Australian law
28 votes -
Jina AI releases first open source 8k embedding model
8 votes -
FedFingerprinting: A federated learning approach to website fingerprinting attacks in Tor networks
6 votes -
Meta is releasing AudioCraft: Generative AI for audio made simple and available to all
34 votes -
Megathread #12 for news/updates/discussion of AI chatbots and image generators
Haven't done one of these in a while, but there's a bit of news, so here's another. Here's the previous thread.
36 votes -
A jargon-free explanation of how AI large language models work
40 votes