• Activity
  • Votes
  • Comments
  • New
  • All activity
  • Showing only topics in ~comp with the tag "machine learning". Back to normal view / Search all groups
    1. Real-time speech-to-speech translation

      Has anyone used a free, offline, open-source, real-time speech-to-speech translation app on under-powered devices (i.e., older smart phones)? There are a few libraries that written that...

      Has anyone used a free, offline, open-source, real-time speech-to-speech translation app on under-powered devices (i.e., older smart phones)? There are a few libraries that written that purportedly can do or help with local speech-to-speech:

      I'm looking for a simple app that can listen for English, translate into Korean (and other languages), then perform speech synthesis on the translation. Although real-time would be great, a short delay would work.

      RTranslator is awkward (couldn't get it to perform speech-to-speech using a single phone). 3PO sprouts errors like dandelions and requires an online connection.

      Any suggestions?

      6 votes
    2. Can I have some advice on the neural net I've been working on?

      Apologies if this isn't an appropriate place to post this. Inspired by a paper I found a while back (https://publications.lib.chalmers.se/records/fulltext/215545/local_215545.pdf), I tried my hand...

      Apologies if this isn't an appropriate place to post this.

      Inspired by a paper I found a while back (https://publications.lib.chalmers.se/records/fulltext/215545/local_215545.pdf), I tried my hand at implementing a program (in C#) to create ASCII art from an image. It works pretty well, but like they observed in the paper, it's pretty slow to compare every tile to 90-some glyphs. In the paper, they make a decision tree to replicate this process at a faster speed.

      Recently, I revisited this. I thought I'd try making a neural net, since I found the idea interesting. I've watched some videos on neural nets, and refreshed myself on my linear algebra, and I think I've gotten pretty close. That said, I feel like there's something I'm missing (especially given the fact that the loss isn't really decreasing). I think my problem is specifically during backpropagation.

      Here is a link to the TrainAsync method in GitHub: https://github.com/bendstein/ImageToASCII/blob/1c2e2260f5d4cfb45443fac8737566141f5eff6e/LibI2A/Converter/NNConverter.cs#L164C59-L164C69. The forward and backward propagation methods are below it.

      If anyone can give me any feedback or advice on what I might be missing, I'd really appreciate it.

      14 votes
    3. What useful tasks are possible with an LLM with only 3B parameters?

      Playing with Llama 7B and 13B, I found that the 13B model was capable of doing a simple task, rewriting titles in sentence case for Tildes submissions. The 7B model doesn't appear capable of the...

      Playing with Llama 7B and 13B, I found that the 13B model was capable of doing a simple task, rewriting titles in sentence case for Tildes submissions. The 7B model doesn't appear capable of the same task, out of the box.

      I heard about Android's new AICore available on a couple of new devices. But it sounds like Gemini Nano, which runs on-device, can only handle 2B or 3B parameters.

      Is this size of model useful for real tasks? Does it only become useful after training on a specific domain? I'm a novice and wanting to learn a little bit about it. On-device AI is an appealing concept to me.

      12 votes
    4. Play Chess against GPT-2

      @theshawwn: I am preparing to release a notebook where you can play chess vs GPT-2. If anyone wants to help beta test it: 1. visit https://t.co/CpWrFvtnY2 2. open in playground mode 3. click Runtime -> Run All 4. Scroll to the bottommost cell and wait 6 minutes If you get stuck, tell me.

      5 votes