6 votes -
Real-time speech-to-speech translation
Has anyone used a free, offline, open-source, real-time speech-to-speech translation app on under-powered devices (e.g., older smartphones)? A few libraries purportedly can do or help with local speech-to-speech:
- https://github.com/ictnlp/StreamSpeech
- https://github.com/k2-fsa/sherpa-onnx
- https://github.com/openai/whisper
I'm looking for a simple app that can listen for English, translate into Korean (and other languages), then perform speech synthesis on the translation. Although real-time would be great, a short delay would work.
RTranslator is awkward (couldn't get it to perform speech-to-speech using a single phone). 3PO sprouts errors like dandelions and requires an online connection.
Any suggestions?
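For reference, the flow described above (recognize English speech, translate the text, synthesize the translation) can be sketched as three pluggable stages. This is only a minimal sketch: `SpeechToSpeechPipeline` and its three callables are hypothetical names, not the API of StreamSpeech, sherpa-onnx, or Whisper, and each stage would need to be backed by whichever offline engine actually runs on the device.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechToSpeechPipeline:
    """Hypothetical three-stage pipeline; every stage is supplied by the caller."""

    recognize: Callable[[bytes], str]        # audio -> source-language text
    translate: Callable[[str, str], str]     # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]  # (text, target language) -> audio

    def run(self, audio: bytes, target_lang: str) -> bytes:
        text = self.recognize(audio)
        # Skip translation and synthesis when nothing was recognized.
        if not text.strip():
            return b""
        translated = self.translate(text, target_lang)
        return self.synthesize(translated, target_lang)
```

Keeping the stages separate means a short delay comes from running them back to back on each chunk of audio, and any one engine can be swapped without touching the others.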
6 votes -
"Mechanistic interpretability" for LLMs, explained
6 votes -
Can I have some advice on the neural net I've been working on?
Apologies if this isn't an appropriate place to post this.
Inspired by a paper I found a while back (https://publications.lib.chalmers.se/records/fulltext/215545/local_215545.pdf), I tried my hand at implementing a program (in C#) to create ASCII art from an image. It works pretty well, but as the authors observed in the paper, it's pretty slow to compare every tile to 90-some glyphs. In the paper, they build a decision tree to replicate this process at a faster speed.
Recently, I revisited this. I thought I'd try making a neural net, since I found the idea interesting. I've watched some videos on neural nets, and refreshed myself on my linear algebra, and I think I've gotten pretty close. That said, I feel like there's something I'm missing (especially given the fact that the loss isn't really decreasing). I think my problem is specifically during backpropagation.
Here is a link to the TrainAsync method in GitHub: https://github.com/bendstein/ImageToASCII/blob/1c2e2260f5d4cfb45443fac8737566141f5eff6e/LibI2A/Converter/NNConverter.cs#L164C59-L164C69. The forward and backward propagation methods are below it.
If anyone can give me any feedback or advice on what I might be missing, I'd really appreciate it.
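One common way to test a suspected backpropagation bug is a numerical gradient check: compare the analytic gradient against a finite-difference estimate of the same loss. The sketch below is not the code from the linked repository. It assumes a single softmax layer with cross-entropy loss, written in plain Python, and all function names are illustrative.

```python
import math


def softmax(z):
    # Subtract the max for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(W, b, x):
    # Class probabilities for logits = W x + b.
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(logits)


def loss(W, b, x, label):
    # Cross-entropy for a single example with an integer class label.
    return -math.log(forward(W, b, x)[label])


def analytic_grad_W(W, b, x, label):
    # For softmax + cross-entropy, dL/dlogits = p - onehot(label),
    # so dL/dW[i][j] = (p[i] - y[i]) * x[j].
    p = forward(W, b, x)
    delta = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [[d * xi for xi in x] for d in delta]


def numeric_grad_W(W, b, x, label, eps=1e-5):
    # Central finite differences, perturbing one weight at a time.
    grad = [[0.0] * len(x) for _ in W]
    for i in range(len(W)):
        for j in range(len(x)):
            original = W[i][j]
            W[i][j] = original + eps
            loss_plus = loss(W, b, x, label)
            W[i][j] = original - eps
            loss_minus = loss(W, b, x, label)
            W[i][j] = original  # restore the weight
            grad[i][j] = (loss_plus - loss_minus) / (2 * eps)
    return grad


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

If `max_abs_diff(analytic_grad_W(...), numeric_grad_W(...))` is not tiny (well below `1e-6` for inputs of ordinary scale), the analytic gradient is the likely culprit. The same comparison can be applied layer by layer to a deeper network.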
14 votes -
I will fucking piledrive you if you mention AI again
119 votes -
MDN’s AI Help and lucid lies
7 votes -
What useful tasks are possible with an LLM with only 3B parameters?
Playing with Llama 7B and 13B, I found that the 13B model was capable of doing a simple task, rewriting titles in sentence case for Tildes submissions. The 7B model doesn't appear capable of the same task, out of the box.
I heard about Android's new AICore available on a couple of new devices. But it sounds like Gemini Nano, which runs on-device, can only handle 2B or 3B parameters.
Is a model of this size useful for real tasks? Does it only become useful after training on a specific domain? I'm a novice and want to learn a little bit about it. On-device AI is an appealing concept to me.
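As a rough aid for thinking about on-device limits, the memory needed just to hold a model's weights is parameter count times bits per parameter. The sketch below is back-of-the-envelope arithmetic only: the function name is made up for illustration, the bit widths are common choices rather than anything confirmed about Gemini Nano, and activations, caches, and runtime overhead are ignored.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30


if __name__ == "__main__":
    # Compare a few model sizes at 16-bit, 8-bit, and 4-bit weights.
    for params in (3e9, 7e9, 13e9):
        for bits in (16, 8, 4):
            gib = weight_memory_gib(params, bits)
            print(f"{params / 1e9:.0f}B params at {bits}-bit: {gib:.2f} GiB")
```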
12 votes -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
21 votes -
Polymath - Toolkit to automatically segment music tracks and convert to MIDI
10 votes -
Vesuvius Challenge 2023 Grand Prize awarded: we can read the first scroll!
34 votes -
The unstoppable rise of disposable ML frameworks
10 votes -
Show Tildes: how I built the largest open database of Australian law
28 votes -
Jina AI releases first open source 8k embedding model
8 votes -
Numerically Stable RWKV Language Model
11 votes -
GradIEEEnt half decent: The hidden power of imprecise lines
9 votes -
Will Floating Point 8 Solve AI/ML Overhead?
6 votes -
Infinite AI Array
3 votes -
Introducing Whisper (OpenAI speech recognition model)
16 votes -
DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
5 votes -
An experiment to test GitHub Copilot's legality
11 votes -
GitHub Copilot - Your AI pair programmer
20 votes -
Uppestcase and Lowestcase Letters [advances in derp learning]
11 votes -
Exploiting machine learning models distributed as Python pickle files, and introducing Fickling: a new tool for analyzing and modifying pickle bytecode
3 votes -
Nx (Numerical Elixir) is now publicly available
7 votes -
Researching the potential of using machine learning to predict random number generation
11 votes -
Musings on Typicality
3 votes -
Neuroevolution of Self-Interpretable Agents
4 votes -
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
5 votes -
A new model and dataset for long-range memory
7 votes -
When artificial intelligence lost in translation is
9 votes -
Play Chess against GPT-2
@theshawwn: I am preparing to release a notebook where you can play chess vs GPT-2. If anyone wants to help beta test it: 1. visit https://t.co/CpWrFvtnY2 2. open in playground mode 3. click Runtime -> Run All 4. Scroll to the bottommost cell and wait 6 minutes If you get stuck, tell me.
5 votes -
A Look at Cerebras Wafer-Scale Engine: Half Square Foot Silicon Chip
6 votes -
OpenAI releases the largest version (1.5B parameters) of their GPT-2 language model, along with code and model weights
11 votes -
OpenAI Plays Hide and Seek…and Breaks The Game!
19 votes -
GPT-2 is not as dangerous as OpenAI thought
5 votes -
Specification Gaming Examples in AI
10 votes -
Puffer, a machine learning research study by Stanford University which allows you to stream live TV in your browser
13 votes -
Ludwig: Uber open sourced a config-based deep learning tool
4 votes -
Facebook and Carnegie Mellon's "Pluribus", the first AI to defeat professionals in 6-player poker
8 votes -
Generative Adversarial Networks - The story so far
6 votes -
Generating YouTube Titles Using Image Captioning
4 votes -
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models
4 votes -
Synthetic Sensors: Towards General-Purpose Sensing
4 votes -
Tutorial on Automatic Machine Learning (NeurIPS2018)
5 votes -
The Foundations of Algorithmic Bias
8 votes -
Humble Bundle: Machine Learning by O'Reilly
15 votes -
Possible Python rival? Programming language Julia is winning over developers
12 votes -
Google Translate's deep dream: some translation requests yield weird religious prophesies
2 votes -
The Federalist Papers: Author Identification Through K-Means Clustering
12 votes -
Ways to think about machine learning
6 votes