What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
In a follow-up to my comment from last week, I've successfully written my script!

I wrote a resilient (it can pick up where it left off if interrupted) pipeline script, in the only shitty language I know (`bash`). The script provides automated, AI-powered summaries of my weekly DnD sessions. There are only two steps to use it:
Step 1. Use the "Craig" Discord bot to record per-user audio of our DnD session
Step 2. Leave the .zip of all the .flac files in a directory and wait for cron to spawn the script
The script will:

1. Run `whisperx` on each file to convert the speech to a text transcript
2. Send the transcripts to the 2.5-pro model, along with instructions to write up a Discord-formatted recap of the session

I recorded last week's session with Craig, and after a few days of tinkering, I have it fully functional! We got a nice recap of last week's session. Unfortunately, this week is our last session of the nearly three-year campaign, so now I need to wait for another campaign to continue to use it. But I plan to dockerize it and publish it to GitHub/DockerHub, as I think it's actually something useful!
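The resumability mostly comes down to checking for an existing transcript before invoking whisperx on a file. Here's a minimal sketch of that step, written in Go purely for illustration (the actual script is bash, and the whisperx flags and output layout here are assumptions):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// transcribeAll runs whisperx over every .flac in dir, skipping files that
// already have a transcript next to them; that skip is what lets the
// pipeline pick up where it left off after an interruption.
func transcribeAll(dir string) error {
	flacs, err := filepath.Glob(filepath.Join(dir, "*.flac"))
	if err != nil {
		return err
	}
	for _, f := range flacs {
		txt := strings.TrimSuffix(f, ".flac") + ".txt"
		if _, err := os.Stat(txt); err == nil {
			continue // transcript already exists, nothing to do
		}
		cmd := exec.Command("whisperx", f, "--output_dir", dir) // flags are an assumption
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("whisperx failed on %s: %w", f, err)
		}
	}
	return nil
}

func main() {
	if err := transcribeAll("./session"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```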
(I had already done what I'm writing about here before last week's recurring post went up, but didn't have the energy at the time to write about it.)

One of my favorite command line tools that sees near-daily use is also pretty much the only one I've written myself: `trl`. You can translate stuff with it. It can be piped into (for example from within editors!!) and otherwise follows the shell mannerisms you'd expect to see. About two years ago, I wrote it so I wouldn't have to go and open up a browser plus website or a desktop app/widget every time I wanted to briefly get a translation for a single word. (And because, by chance, I had found out about the DeepL API's generous 500k-characters-per-month free tier.) Back then, on a measly 2015 Intel laptop chip, that context switching meant real (felt) downtime, lol.

Up until last week, the script was also not much more than a dead simple wrapper around that API. I've now rewritten it so I'll be able to fairly easily add more translation services, in anticipation of a future public Kagi Translate API. As a proof of concept (and sanity check) for the feature, I've added basic support for locally self-hosted LibreTranslate, which did work nicely.
But what I'm really looking forward to is just piping a paragraph, or even entire pages/the whole document, into it (from within my beloved Helix editor, of course) for a future "proofread API": that is, language in equaling language out, possibly with annotations too. Earlier this year, I wrote my first longer paper/thesis in my second language (English) and already used `trl` there to self-check vocabulary questions by going EN (my phrasing) → first language → EN (machine translation), which, while nice because I could do it with the surrounding context in sight, still felt a bit clunky.

I furthermore had this idea I'm not sure about yet: I could also use a classic LLM CLI tool/API wrapper like `aichat` with a corresponding system prompt/"role" to simulate yet another translation service, which could trivially be expanded to include the proofreading mode. So far I haven't done that, mostly because it feels like a cop-out and a bit removed from the original intention of having a small, translation-focused tool with few to no dependencies, and because I set it up in the rewrite to always expect a (remote or local) base URL. I think `aichat` does come with the ability to spin up its own web server locally, but it'd be neater if I could avoid needing one there. :P

Another use case this rewrite has enabled (theoretically) is easy comparison of different translation providers. In practice, you'd have to write a short automation wrapper around the script to make it not suck for more complex comparisons. Maybe with another LLM to grade the different providers' results?
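To make the multi-provider rewrite concrete: it basically boils down to a small provider interface with one implementation per backend, selected by config. A rough sketch of that shape in Go (`trl` itself is a script, and all of these names are made up):

```go
package main

import "fmt"

// Translator is the common surface every backend implements.
type Translator interface {
	Translate(text, sourceLang, targetLang string) (string, error)
}

// Each provider wraps its HTTP API behind a configurable base URL.
type libreTranslate struct{ baseURL string }
type deepL struct{ baseURL, apiKey string }

func (l libreTranslate) Translate(text, src, dst string) (string, error) {
	// POST text to l.baseURL's translate endpoint (omitted in this sketch)
	return "", fmt.Errorf("not implemented in this sketch")
}

func (d deepL) Translate(text, src, dst string) (string, error) {
	// POST text to d.baseURL's translate endpoint using d.apiKey (omitted)
	return "", fmt.Errorf("not implemented in this sketch")
}

// newTranslator picks a backend from config; adding Kagi Translate (or an
// aichat-backed pseudo-provider) later would just be one more case here.
func newTranslator(name, baseURL, key string) (Translator, error) {
	switch name {
	case "libretranslate":
		return libreTranslate{baseURL: baseURL}, nil
	case "deepl":
		return deepL{baseURL: baseURL, apiKey: key}, nil
	default:
		return nil, fmt.Errorf("unknown provider %q", name)
	}
}

func main() {
	t, err := newTranslator("libretranslate", "http://localhost:5000", "")
	if err != nil {
		panic(err)
	}
	out, err := t.Translate("hallo welt", "de", "en")
	fmt.Println(out, err)
}
```

The nice part of this shape is that the comparison use case falls out for free: loop over providers, call the same method, diff the outputs.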
I think it was some comment on Tildes that got me hooked on FarmRPG about 2 months ago.

As with many RPGs full of quests, I've dreamt of making an auto-solver for these games. I tried making one for the second League of Old School RuneScape, but I didn't like building the dataset, so it didn't work out.

Turns out for FarmRPG there's a site called Buddy.farm that has all the info I need. They even have an open GraphQL endpoint, but with how much I'm querying, I avoid it and use the cached JSON instead.
The first solver I made when I started playing was an OpenWebUI tool that fetches data, so I could ask the LLM questions about the game. It worked for the really early game (before sawmill silver comes online). As with any LLM, it loses context fast, and OpenWebUI tool calling is either non-native or buggy.

Then I tried n8n, since it turns out the quest and inventory lists are important. I used Gemini 2.0 to extract the quest list and item list from HTML (which I copied in by hand) into a Google Sheet, then fed them to Gemini 2.5 Flash to help me solve quests, with Buddy.farm as tools. This approach was very unreliable, and I think feeding the entire inventory to an LLM just doesn't work well.

The third version I wrote in Go; it connects to the Sheets I made and tries to solve quests using hand-written item costs. It worked really well and I made a lot of progress in game quickly. I then realized the bottleneck was that the game data goes stale very fast, and the Gemini step is too slow and manual.

The fourth and current version is a Firefox extension. It runs a content script to parse the pages (it has no UI and doesn't send any requests to the game, only to Buddy.farm). I then rewrote the solver with recursion: for example, if a quest needs White Parchment, it will look into the crafting recipe and know that obtaining a Feather slightly advances the quest. It does have limitations, but I'm satisfied with it, knowing that I don't have an immediate solution to many of them without exploding the search tree.
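The recursion itself fits in a few lines. A toy version of the idea in Go (the real solver lives in the extension; the recipe contents here are made up):

```go
package main

import "fmt"

// recipes maps a craftable item to the ingredients it needs (made-up data).
var recipes = map[string]map[string]int{
	"White Parchment": {"Feather": 1, "Wood": 2},
}

// baseNeeds recursively expands a requirement down to uncraftable items, so
// the solver knows that gathering Feathers advances a quest that wants
// White Parchment.
func baseNeeds(item string, count int, out map[string]int) {
	ingredients, craftable := recipes[item]
	if !craftable {
		out[item] += count
		return
	}
	for ing, n := range ingredients {
		baseNeeds(ing, n*count, out)
	}
}

func main() {
	needs := map[string]int{}
	baseNeeds("White Parchment", 3, needs)
	fmt.Println(needs) // map[Feather:3 Wood:6]
}
```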
Currently I'm working on a void-avoidance feature. The game has a per-item inventory cap, which the solver ignores, so lots of items get permanently lost. The way I think I'll solve this is to find a way to remove those items from your inventory first, then retry the same action. It turns out that balancing the item sinks is very hard: Buddy.farm doesn't expose item sell prices (it does, but only in GraphQL, not in the JSON), and some actions don't generate any tracked resource (giving items to NPCs for relationship XP, for example).
The code is on my GitHub by the way.
Per my last topic here, I'm pounding away at getting Cosmos Cloud set up locally so that I can eventually open it up, either directly or with a VPN.

I'm at the stage where I need to figure out what the actual architecture should be, and I need to look into that. Ideally I'd like it to ALWAYS work locally, because that's kinda the whole point: even if the web is down, I should be able to access whatever I want on it.

From there, it would be nice to be able to access it remotely when the internet is working. I know I can VPN in, but it's a question of whether the few other people I want to give access to can manage that, and if not, what a more reasonable workaround would be.
I made a little Rust experiment of `library fsadd --fs` and published it as library-rs. I still don't really know what I'm doing in Rust, so a lot of it is vibe-coded. I spent a lot of time playing around with compiling libmagic on Windows (via GitHub Actions; I don't have access to a Windows machine). After trying both an msys2 and a vcpkg approach for a few hours, not to mention that multi-threading with libmagic is a bit fiddly, I gave up and went with `tree-magic-mini`, which works but has some shortcomings.

I tried a bunch of different concurrency approaches via ChatGPT and Gemini, but in the end the program is not quite as fast as the Python version, which was somewhat surprising. For disks with a lot of really small files, `library-rs` was meaningfully faster, but most of the time it's a pretty close tie.

I wrote an Atom feed generator for my alma mater's student newspaper, since they recently changed CMS and no longer provide their own RSS/Atom feed. I originally wrote it in Python with BeautifulSoup, but figured I'd give it a shot in Go to improve in that language a bit. I'd welcome any feedback. Here's the code.
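Not the linked code, but for a sense of the Go side of such a generator: once the articles are scraped, `encoding/xml` gets you most of the way to a valid Atom document. A minimal sketch (the entry data is placeholder, and the real generator's structure may differ):

```go
package main

import (
	"encoding/xml"
	"os"
	"time"
)

// Minimal Atom structures: just enough for a title, link, id, and timestamp.
type atomLink struct {
	Href string `xml:"href,attr"`
}

type atomEntry struct {
	Title   string    `xml:"title"`
	Link    atomLink  `xml:"link"`
	ID      string    `xml:"id"`
	Updated time.Time `xml:"updated"`
}

type atomFeed struct {
	XMLName xml.Name    `xml:"feed"`
	Xmlns   string      `xml:"xmlns,attr"`
	Title   string      `xml:"title"`
	Updated time.Time   `xml:"updated"`
	Entries []atomEntry `xml:"entry"`
}

func main() {
	feed := atomFeed{
		Xmlns:   "http://www.w3.org/2005/Atom",
		Title:   "Student newspaper (unofficial feed)",
		Updated: time.Now(),
		Entries: []atomEntry{{
			Title:   "Example article", // would come from the scraped article list
			Link:    atomLink{Href: "https://example.edu/article"},
			ID:      "https://example.edu/article", // Atom wants a stable id per entry
			Updated: time.Now(),
		}},
	}
	out, err := xml.MarshalIndent(feed, "", "  ")
	if err != nil {
		panic(err)
	}
	os.Stdout.Write(append([]byte(xml.Header), out...))
}
```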
I have my audio player case working close enough. It fits everything now that I've taken a Dremel and cut some areas away. I am having some issues with librespot not always detecting my USB audio device. I think the culprit might be the DIY cable I made or the micro-USB breakout connectors I am using. It is getting kind of frustrating, as I doubt the culprit is software, but I have tried most things on the hardware side and am getting nowhere.
Edit: Came up with a new plan. Test it out using an OTG cable. If that works, then I can take my dissected micro USB to USB-A cable, plug the USB-A into the OTG cable and use my multimeter to double check that I am connecting each cable to the right pin.
So I have done some more troubleshooting and am starting to hit a wall on the cause of this. Here is a list of things I have tried:

I am unsure of the culprit, but my suspicion is that the Pi is struggling to do handshakes with USB devices directly.

Edit: After some further thinking, my theory is that the USB controller on my Pi Zero 2W is broken. The USB hub probably has its own USB controller, so it works when connected to the hub, but not directly. Odds are that I shorted out a cable in my initial wiring of the system, so I will replace it and buy a new RPi. The frustrating part is that shipping costs just as much as the RPi itself.