What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
I have two esoteric projects that I work on back and forth. Right now I'm in between them and considering which one I should pick back up.
I have a sorta-working hobby OS that is designed to target my retrocomputers (mid-90s to early 2000s era). I don't really have an end game, other than that I'd like to be able to have it boot on my 386 and have some sort of functionality, e.g. keeping a journal, text-based games, etc.
I have a MUD that I decided to finally pull the trigger on and make somewhat public. It is listed on a MUD listing website and people can actually register and play, but it's nowhere near complete. So, I soft-launched it, and then I got sidetracked. Maybe I'll put this one back at the top of the priority list...
I accomplished a small but fairly satisfying project: semi-automatically renaming downloaded scientific papers with unhelpful filenames into "The Actual Title of the Paper.pdf".
In order to do that, I've written a specification of how I want to rename the files in English, and then Claude Sonnet 3.5 did most of the work. It is still mind-boggling to me that the LLMs can take a specification in English and turn it into a working Python script faster (and, arguably, better) than I could have done it myself in the same amount of time it took me to write the spec. I used to think LLMs just weren't capable enough to do that kind of thing, but I guess it pays to just try and see if maybe they are.
The file renaming itself works as follows:
If the filename starts with a valid Arxiv.org ID, hit the Arxiv API and get the paper title that way.
Otherwise, pdftotext is used to extract the first page of the PDF, which usually contains the title of the paper. This step can fail to return readable text, so I have a manual confirmation here before proceeding.
The paper's title is non-trivial to extract from the exported text though, because (a) it's not always right at the top and (b) it can be slightly corrupted (linebreaks, L E T T E R S P A C I N G, etc.).
To get around that problem, I just add more AI (of course): the script pipes the extracted text into Simon Willison's llm tool, which makes a call to gpt-4o-mini to extract the title of the paper.
So far I've renamed over 200 PDFs that way, and GPT extracted every single paper title flawlessly, as long as pdftotext worked. The API cost for this was about 1 cent per 100 processed documents.
Nice! I've also had this revelation multiple times over the past few weeks: if your idea/project doesn't tend to have performance bottlenecks at specific spots, LLMs can bridge the "syntax gap" between natural language and basic code in, e.g., Python fairly well... most of the time. Which is why it's not replacing programmers anytime soon. At most, maybe coders, but coding IMO is the smallest (and often most enjoyable) part of programming or software engineering as a whole, as in architecture/design.
That tangent aside, have there been many/any failures at all? If so, for which reasons – malformed PDFs or rather strange text alignments or similar such issues within the files?
It does save a lot of time. Finding and reading the API documentation for Arxiv alone would have taken me at least 10-20 minutes, and Claude just did it within seconds, XML parsing and all (though I suppose I could have found a snippet on StackOverflow instead). The code is also nicely documented and commented, even though I did not ask for that specifically.
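For readers curious what the workflow described earlier in the thread might look like, here is a minimal sketch, not the actual script. It assumes new-style arXiv IDs (e.g. 1706.03762) at the start of the filename and the public arXiv query endpoint that returns an Atom feed. The helper names, the loose ID regex, the prompt text, and the filename sanitizing rule are all made up for illustration. The two subprocess helpers assume pdftotext and the llm CLI are installed and on the PATH.

```python
import re
import subprocess
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"          # namespace of the Atom feed
ARXIV_ID = re.compile(r"^(\d{4}\.\d{4,5})(v\d+)?")  # loose check, new-style IDs only
PROMPT = "Return only the title of the paper in this text."  # hypothetical prompt


def arxiv_id_from_filename(name):
    """Return the arXiv ID at the start of a filename, or None."""
    match = ARXIV_ID.match(name)
    return match.group(1) if match else None


def query_url(arxiv_id):
    """Build the arXiv API query URL for a single ID."""
    return "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(
        {"id_list": arxiv_id}
    )


def title_from_atom(xml_text):
    """Pull the first entry's title out of an Atom feed and collapse whitespace."""
    entry = ET.fromstring(xml_text).find(f"{ATOM}entry")
    if entry is None:
        return None
    title = entry.findtext(f"{ATOM}title")
    return " ".join(title.split()) if title else None


def fetch_arxiv_title(arxiv_id):
    """Network call: ask the arXiv API for the paper title."""
    with urllib.request.urlopen(query_url(arxiv_id)) as response:
        return title_from_atom(response.read().decode("utf-8"))


def first_page_text(pdf_path):
    """Fallback, step 1: extract only page 1 with pdftotext, writing to stdout."""
    result = subprocess.run(
        ["pdftotext", "-f", "1", "-l", "1", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def title_from_llm(page_text):
    """Fallback, step 2: pipe the page text into the llm CLI."""
    result = subprocess.run(
        ["llm", "-m", "gpt-4o-mini", "-s", PROMPT],
        input=page_text, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def safe_filename(title):
    """Drop characters that are awkward in filenames and add the extension."""
    return re.sub(r'[\\/:*?"<>|]', "", title).strip() + ".pdf"
```

A driver would try arxiv_id_from_filename first and only fall back to first_page_text plus title_from_llm when it returns None, showing the extracted text for manual confirmation before renaming anything.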
I thought using the LLM to extract the paper title would be the most error-prone part of the whole thing, but it turns out that it was pdftotext. PDF is a famously complicated file format, and some of the documents don't actually contain text but have scanned page-images instead (I suppose I could slot in OcrMyPDF for those), but others fail for mysterious reasons and pdftotext outputs a jumbled mess of random characters. There were four or five of those in the batch of 200 papers that had accumulated in my Downloads folder.
I've been using LLMs to write simple browser scripts for me, and I love it! I'll manually use the inspector to find the selectors I care about and then prompt something like:
and then it just DOES it!!!! sometimes i need to make a few extra modifications but usually i will just append what i want to the request and rerun it rather than doing something as banal as adding a style modifier to the element myself. I write WAY more userscripts now because I can just describe what I want and spend 20 seconds in the element inspector rather than spend 20 minutes to write a bunch of annoying js.
tableflip.gif
About two and a half years ago, I posted this comment:
I've been trying to solve this issue on-and-off since then, trying to figure out what went wrong. Due to me being laid off, I decided to start work on my own booru again, since I had free time. After finishing about half of what I was planning for the barebones version of it, I tried fixing the old booru again, and found that somehow, a default setting was switched between versions of Rails around the time that I updated, ruining the way Rails uses cookies in my setup.
Fixing the issue basically involved deleting a single # character.
Two and a half years' worth of time for that fix.
I hate programming sometimes.
On another note, I'm not sure if I want to continue with my own implementation.
I have a theory that, in general, the longer you spend looking for a bug, the smaller the actual fix ends up being.
Last year for TiMaSoMo I made a pentominoes app, and at the end of the month it had a bug where clicking on mobile acted as if you were hovering it, and then you had to unhover by clicking somewhere else, it was kind of annoying.
I just fixed it last week: I had to change onMouseEnter/Leave to OnCursorEnter/Leave. It took like 10s of asking ChatGPT and then searching to verify this was the real answer and not a hallucination of what a fake-and-way-too-good-to-be-true answer would be. I facepalmed a bit.
I bought a minipc for travel use, we travel to places where network connections are either crap or intermittent, so I want some local media that always works.
Syncing my music collection is easy, just rsync from my NAS via Tailscale. Syncing movies and tv-shows for the Plex running on the minipc is a bit harder. I can't just sync everything, it takes way too much space. I could sync singular shows and movies (set up a directory with symlinks manually, rsync that), but it's hard for shows with a ton of episodes most of which I have watched already.
So I started building a script that uses the Plex API to list unwatched episodes from shows, grabs the filename, converts it to an actual path and rsyncs those via Tailscale to the minipc.
Gemini tried hard to give me Go libraries to use, but failed. I told it to switch to Python and it got 90% of the way there in the first go.
Now I'm making it prettier and configurable so that I can put it on the minipc's crontab to sync the oldest unwatched episodes of shows and remove the ones I've seen. ...or maybe run it on the NAS and create a symlink collection I could sync without having logic running on the minipc. 🤔
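A rough sketch of the selection and path-conversion part of such a script, assuming the episode list has already been fetched from the Plex API (that call is left out here). The Episode record, the function names, and the example roots are hypothetical. The idea is to pick the oldest few unwatched episodes per show, strip the library root as Plex reports it, and hand the relative paths to rsync via its --files-from option.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Episode:
    """Hypothetical record holding what the script needs from Plex."""
    show: str
    season: int
    number: int
    path: str       # file path as reported by the Plex server
    watched: bool


def oldest_unwatched(episodes, per_show=3):
    """Pick up to per_show unwatched episodes per show, oldest first."""
    picked = []
    counts = {}
    for ep in sorted(episodes, key=lambda e: (e.show, e.season, e.number)):
        if ep.watched:
            continue
        if counts.get(ep.show, 0) < per_show:
            picked.append(ep)
            counts[ep.show] = counts.get(ep.show, 0) + 1
    return picked


def relative_paths(episodes, plex_root):
    """Turn Plex's absolute paths into paths relative to the library root."""
    return [str(PurePosixPath(ep.path).relative_to(plex_root)) for ep in episodes]


def rsync_command(nas_host, nas_root, dest):
    """Build an rsync call that reads the relative file list from stdin."""
    return ["rsync", "-av", "--files-from=-", f"{nas_host}:{nas_root}/", dest]
```

The relative paths would be joined with newlines and passed as stdin to the rsync command, e.g. with subprocess.run(cmd, input="\n".join(paths), text=True). Keeping the Plex root and the NAS root as separate parameters covers the case where Plex sees the library under a different mount point than the one rsync reads from.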
One thing I've discovered this week is how to use beautifulsoup a bit better. Before I was using tag.next_sibling in a lot of places but this can skip a lot of text content. For my situation, extracting text between links (capturing link text in a different step), it only makes sense to use tag.next_sibling once when looking forward in the document and to use tag.next_element everywhere else:
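A small illustration of that difference, written as a sketch rather than the original snippet. The sample HTML and the function name are made up. next_sibling on the starting link skips over the link's own text, while next_element afterwards walks into nested tags so their text is not lost.

```python
from bs4 import BeautifulSoup, NavigableString, Tag


def text_between_links(start_link):
    """Collect the text after start_link up to the next <a> tag."""
    pieces = []
    # next_sibling once: jump past the link itself, including its own text.
    node = start_link.next_sibling
    while node is not None:
        if isinstance(node, Tag) and node.name == "a":
            break  # reached the next link, stop collecting
        if isinstance(node, NavigableString):
            pieces.append(str(node))
        # next_element everywhere else: descend into nested tags like <b>.
        node = node.next_element
    return "".join(pieces)


html = '<p><a href="/a">A</a> between <b>bold</b> text <a href="/b">B</a></p>'
soup = BeautifulSoup(html, "html.parser")
print(repr(text_between_links(soup.find("a"))))  # ' between bold text '
```

Walking with next_sibling alone would only see " between ", the b tag, and " text " at the top level, so the word inside the b tag has to be recovered separately. One limitation of this sketch: if the starting link is the last child of its parent, next_sibling is None and nothing is collected.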
I posted this last time, but I just did an update for PrimeCert, my for-fun name-a-prime app that gives you a (relatively) easy-to-verify proof that your huge integer is prime.
https://primecert.guissmo.com
Also wrote a companion blog post for it.
https://guissmo.com/blog/primecert-v2-0/
In the future I plan on implementing a feature to print the full certificate as a PDF, etc.
I'm having such an annoyance using Nuxt. I love Vue and am building a fullstack app for my portfolio, however, I'm having some real problems with the auto imports not working. Like I'm trying to use defineNuxtRouteMiddleware in the Nuxt server middleware and it just keeps coming in as undefined. If I try importing it normally, it yells at me, if I don't the app crashes because it's undefined. Has anyone else fought with this?