What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
I created the Library of Babel.
Okay I jest, of course, but it turns out that making a "Library of Babel" is not obvious at all.
If you haven't read the short story by Jorge Luis Borges, you probably should (it's pretty short), but the basic premise is that there is an astronomically large, possibly infinite library out there containing nothing but shelves and shelves of books, seemingly containing every possible arrangement of letters and punctuation. Most of these books appear totally useless, but if the library is complete then it has somewhere a compendium of all scientific discovery, every novel that has ever been written, and a transcript of every conversation that all people will ever have.
Naturally, we can't build a physical version of this (to my knowledge), but we can simulate the idea by creating an index into this library of pure information, taking in some input and returning a virtual book sampled from the range of all possible such books.
You may already be familiar with a website that already did this, but if you read the "About" section carefully, you'll find that this online creation doesn't actually cover the full breadth of the Library of Babel. Namely, it only offers every possible page, which are compiled together to form books but wouldn't cover (for example) if an identical page were found in two different books.
If your brain is scrambled in the same way as mine, you know that this shortcoming is unacceptable and demands a complete solution.
Referring back to the original material, a book is 410 pages, each of which contains 40 lines of 80 characters from a symbol set of 25 characters.
In order to exactly recreate the Library in a way that is at least kind of interesting, we need a way to index into this set of books in a way that is:
Complete (so you can access every book, i.e. not just using a random number generator to extend the input into book-length*)
Unpredictable (to simulate the "ocean of information" feel, since otherwise just outputting the input would indeed access every book, but obviously not in the way we want).
*
If you read even more carefully into the About section of the Library of Babel Website you'll realize that the website author literally does use a random number generator. However, this was a specially crafted generator that had enough state to actually generate the full range required to represent every possible page. Most random number generators will only generate 2^64 or 2^256 or even 2^19937-1 different values, which while good enough for normal uses, isn't enough for us here.
These constraints will immediately give us a few caveats. For one, pesky Information Theory dictates that no matter what we use to index the Library of Babel, our input will contain at least enough information to encode a book in that library. That is, most indices to books in the Library will essentially be books themselves.
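To put rough numbers on that, here is a small sketch of the arithmetic. The page, line, and character counts come from the story as quoted above; the 5-bits-per-character figure assumes the 32-symbol alphabet used further down in this post, and 19937 is the exponent of the largest generator period mentioned in the footnote.

```python
import math

# Book dimensions from the story: 410 pages x 40 lines x 80 characters.
PAGES, LINES, CHARS_PER_LINE = 410, 40, 80
chars_per_book = PAGES * LINES * CHARS_PER_LINE

# Minimum number of bits needed to name one book out of 25**chars_per_book.
bits_25_symbols = math.ceil(chars_per_book * math.log2(25))

# With a 32-symbol alphabet every character costs exactly 5 bits.
bits_32_symbols = chars_per_book * 5

# Exponent of the largest generator period mentioned in the footnote.
LARGEST_RNG_BITS = 19937

print(f"characters per book: {chars_per_book}")
print(f"index bits (25 symbols): about {bits_25_symbols}")
print(f"index bits (32 symbols): {bits_32_symbols}")
print(f"a 19937-bit state falls short by a factor of roughly "
      f"{bits_25_symbols // LARGEST_RNG_BITS}")
```

So any complete index needs on the order of six million bits, which is why an off-the-shelf generator cannot reach every book.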
This is a little annoying (funnily enough Borges also wrote a short story about this exact scenario), but it is a necessary cost if we want a complete library.
For another, just retrieving a book will likely be expensive in memory and/or time depending on the final algorithm. A "Complete Library of Babel" website with any meaningful server-side activity will have ridiculous DOSability potential, so we won't focus on any online experience for now.
With these constraints in mind we can reuse the random number generator idea from before and just make an extremely long and unwieldy one, but as is often the case with mathy things we can instead find out that if we state the problem in math terms it's actually a well known problem.
We return to those two requirements from before, and restate our indexing system as a function from some input space that we can specify to the set of books (the set of strings of length 1312000 containing characters from a set of size 25*).
*
You might notice that 25 is notably not a power of 2 and less than the English alphabet length of 26. Apparently this character set is alluded to in a previous essay and is the result of cutting some "unnecessary" letters out of the Spanish alphabet to arrive at 22 letters and 3 punctuation characters.
In fact, the Library of Babel website uses 29 characters (26 letters + 3 punctuation) and for the sake of convenience we'll be fudging this alphabet size as well.
The function needs to be:
Surjective
Indistinguishable from a Random Oracle for an adversary given a number of queries and time polynomial in the input size
That second requirement is a mouthful, but captures what we mean by "Unpredictable": nobody should be able to figure out any patterns in this function, so much so that they can't tell it apart from random noise, even if they test it with some sample inputs or think about it for a while. Of course, this is also (almost) the definition of a crypto object, namely the Pseudorandom Function.
We now come across something that will actually solve our problem, the Pseudorandom Permutation, which is like a Pseudorandom Function (actually it provably is one) but is bijective. What is a common way to think about Pseudorandom Permutations? As a block cipher, of course! That is, to simulate going from our index to some book in the Library of Babel, we just encrypt the input with the appropriate encryption algorithm, like AES.
Well that's not the whole story, because there is no block cipher that directly operates on blocks the size of books. AES itself only operates on blocks of 128 bits, which returns to our problem of being not big enough. Block ciphers generally get around this size limit by using chaining, i.e. by slightly changing the cipher between blocks so that the output remains unpredictable between blocks. This actually doesn't solve our problem, because part of the way these algorithms stay secure is using randomization, which we can't do (lest the shelves of the Library shift and shuffle from beneath our feet), and otherwise if you change a part of the input, only part of the output changes (all blocks before the part that changed stay the same) which violates our unpredictability requirement.
My solution to this is to create my own scheme, which is a cardinal sin for cryptography but is fine for me since I don't hide my secrets in near-infinite libraries. It's practically the bare minimum to have a small change in the input potentially affect every bit of the output, but it's enough to be nontrivial to crack.
With that, we arrive at an index for the Library of Babel:
Fix two 256-bit keys K1, K2 and two 128-bit initialization vectors IV1, IV2. Note that these should stay fixed between runs to keep the index consistent. You should also pre-select some character set of 32 symbols to get a book out at the end.
Take a bit string of length 6560000 (1312000 characters at 5 bits each) as input. (I just filled the initial bits with ASCII input and then padded the rest with zeros.)
Encrypt with AES-256 under the CBC mode with K1 and IV1 to get a ciphertext C1.
Reverse C1 bitwise and encrypt with AES-256 under the CBC mode with K2 and IV2 to get the bit string C2.
Interpret the resulting bits in blocks of 5 using the predefined size-32 character set to retrieve some book of 410 pages with 40 lines of 80 characters from the Library of Babel.
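The steps above can be sketched in Python roughly as follows. This is not my C++ code and it does not use AES: Python's standard library has no AES, so the block cipher here is a stand-in 4-round Feistel network built on SHA-256 with the same 16-byte block size. The keys, IVs, alphabet, and the 80-byte demo size are all made-up values for illustration; a full book would be 820000 bytes.

```python
import hashlib

BLOCK = 16  # bytes per block, the same block size as AES

# Hypothetical fixed parameters; they only need to stay constant between runs.
K1, K2 = b"first fixed key", b"second fixed key"
IV1, IV2 = bytes(BLOCK), bytes(range(BLOCK))
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,.!?;"  # assumed 32-symbol set


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Feistel round function: a stand-in for AES, not a vetted cipher.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[: BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def reverse_bits(data: bytes) -> bytes:
    # Reverse the whole bit string: flip byte order, then the bits in each byte.
    return bytes(int(f"{byte:08b}"[::-1], 2) for byte in reversed(data))


def bits_to_text(data: bytes) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))


def text_to_bits(text: str) -> bytes:
    bits = "".join(f"{ALPHABET.index(ch):05b}" for ch in text)
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def lookup(index: bytes) -> str:
    """Index -> book: CBC pass, bitwise reversal, second CBC pass, 5-bit symbols."""
    c1 = cbc_encrypt(K1, IV1, index)
    c2 = cbc_encrypt(K2, IV2, reverse_bits(c1))
    return bits_to_text(c2)


def search(book: str) -> bytes:
    """Book -> index: undo the steps of lookup in reverse order."""
    c1 = reverse_bits(cbc_decrypt(K2, IV2, text_to_bits(book)))
    return cbc_decrypt(K1, IV1, c1)


SIZE = 80  # demo size in bytes; must be a multiple of 16 bytes and of 5 bits
index = b"hello babel".ljust(SIZE, b"\0")  # ASCII input padded with zeros
book = lookup(index)
print(book)
print(search(book) == index)
```

The reversal is what spreads a change everywhere: a single CBC pass only propagates a change forward, but the last block of C1 depends on the whole input, and after reversing it becomes the first block fed into the second pass.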
And it works! I made a C++ script that did this which I probably won't publish (for now; it kind of sucks) but it is enough to demonstrate that it's possible, and this algorithm is invertible! You can "search" the Library of Babel for a certain text and find out "where" it is in the index (though this index will literally be book-length, so be warned).
I did return to the "restate this in mathy terms" and found out that "variable length block ciphers" are a thing that people have thought about, cf. this and this.
One day when I feel bothered enough by the possible cipher-insecurity of rolling my own crypto and/or interested in the idea again (website that's basically all client-side? non-awful source code?) I might come back to it.
Nothing nearly as complex as others (or even as complex as what I am capable of), but I started coding a portfolio website for myself as I may have to search for work soon. It isn't for a development-based job, so I'm just using static HTML so far, hosted on GitHub Pages, with Bootstrap to speed up the development. I am having to learn Figma to help me design it because the license I got through school for Balsamiq is no longer valid. For simple GUI mockups, I do prefer Balsamiq over the free tier of Figma, so that has been slowing me down a bit (as well as just trying to design it, as UI design is not a strength of mine).
This is a perfect example of this classic XKCD about automation.
What started as a little project to make an auto-clicker for iOS games has turned into a moderately large and rather silly project to automatically play every aspect of Idle Apocalypse, which is a resource management game. For those who don't know, automating things on iOS is actually non-trivial without jailbreaking -- you can't really interact with other apps in any way. The solution I've come up with is to use a Raspberry Pi that acts as a screen mirroring host to get a video feed, and a Bluetooth pointer device to provide the inputs.
Fun elements of the system so far:
I'm currently debating whether I should implement a simulator for the purpose of building a deep reinforcement learning based controller.
I did something like this in college for an idle game. My strategy was to preload a series of "canary pixels" by taking screenshots at every event that I wanted to handle and doing the big crunchy math of comparing all of the screenshots outside of the game to find unique combinations that would be able to tell me which event was happening. It really lowered the time spent processing the image during gameplay.
Every time I took a screenshot, I would check the first canary pixel for each event, and then for any positives I would check an additional handful to ensure that the correct thing was happening, and pass off control to the event handler until the canary pixels went away.
I vouch for idle game manipulation being one of the best ways to practice practical programming. You have defined goals, you will inevitably want to extend the functionality, and it is very obvious when it is working.
The pixel trick is a good idea. It has some similarities to Haar cascades, in that they both use a collection of what amount to weak learners to achieve fast identification. It has some difficulties in this scenario because of the screen mirroring -- there's a moderate amount of compression artifacting, so many if not most pixels are somewhat variable. I'm mostly using a combination of OpenCV's matchTemplate and some small scikit-learn models.
I wrote a script that screenshots whatever game I'm playing every 2 minutes and then uses GPT-4 to say something about the screenshot. Its messages are pretty lame so far, but I'm impressed it understands the games I'm playing this well (I queued for an Overwatch match, switched to play as the healer Mercy, etc). My goal is to make it work so I can use it while I'm streaming on Twitch and have it be interesting for me and chatters to interact with. (I don't regularly stream on Twitch, but I like the idea of doing it to push me to make presentable software like this.)
I'm still experimenting in trying to make it say things that are more interesting. It sees its previous messages, a couple of the most recent screenshots, and my messages in the chat too when generating a message. At first I kept changing its hard-coded system prompt and restarting it to adjust its behavior, but then I finally realized I could just tell it in chat to act differently. As a lifelong programmer, it's still so trippy to me to chat with a program I made to change how it works.
It's written in Typescript using Deno, the discordeno library, the official openai npm library, and also a small Powershell script I wrote with ChatGPT's help to use the Windows API to screenshot a specific program.
I am learning zig and its build system. Its standard library is... lacking... UDP networking support beyond raw sockets. Which isn't necessarily bad, but I'd like DTLS.
I explored its build system and have gotten ENet working in it, which is nice, and I can say so far I certainly like Zig more than C/C++ for dealing with this kind of thing.
However I also realize that I always build encryption out on top of ENet, and it would be nice to have a way to handle encryption and server authentication built into the networking layer.
I don't think I'm quite ready to implement DTLS at this point in my zig journey - instead I am working on a pseudo-DTLS protocol with zig's standard libraries std.os.socket and std.crypto. I'm doing the same handshake as DTLS but I'm not bothering with the standard packets and headers... just throwing []u8 around with std.mem for ease. It's been a good exercise so far, I've learned a lot!
Handshake is working and I can send encrypted packets, but there's no authentication or timeout or retransmission handling. I'm currently digging into the rabbit hole that is certificates and message signing to understand how authentication is to be implemented. I also want to do a little more research into ENet's protocol to see how it handles packet reordering and multi-channels communication. My sense is there's a nice way to represent channels with comptime zig and enums, but I'm not sure if/how to make it robust to malicious packets.
Once I've explored all this a bit more I think I'll be in a place to try working on a DTLS implementation. There is already std.crypto.tls, so I'll definitely look to that for inspiration once I understand the problem better.
I have been working on a side project that I decided will use Hono and Bun to do everything - routes, templates (via JSX), domain. But then yesterday a friend of mine mentioned I should look at Phoenix and Elixir since I was talking about some of the popularity of Rails and I have to say, I'm pretty sold. I am going to look at moving my little project to Phoenix and see if it has the legs I think it would.
It's nice to step out of the JS/TS world when I can but I am just so drawn to its ease-of-use.
I have still been working on taskrabbit. I can’t remember if I posted about it here before. It’s a plugin to Commerce7, a POS that is common in the wine industry to provide automatic notifications and tasks for problematic orders. I was building it first with Vue, since that is what I was familiar with from audiobookcovers.com and audiobookshelf.
I watched a video from Theo the other day about how react was not made for the web. It is designed to be a more general purpose rendering and template engine, and react-dom is what is used to render to the web. That video also highlights a bunch of cool projects like react-pdf, react-email, a react framework to render mp4 video, and a react framework to render to three.js, which is basically OpenGL bindings for JavaScript.
This planted a seed that I couldn’t quite shake. I really would like the ability to create pdfs and emails programmatically, so react seemed like a skill I should learn eventually. Not to mention that the job market for react is much better.
In creating taskrabbit, I had a few false starts around components and styling. Commerce7 provides a react component library so your app can match their styles. I tried a few plugins to use react components in Vue, but I couldn’t get any to work. I resigned myself to building my own similar components in Vue, matching the styles using the web inspector. It worked well enough, but it wasn’t perfect.
I had that seed planted for wanting to learn react, and I had an issue that react seemed to solve. The only issue was I had already built a bunch of the taskrabbit front end in Vue. Despite that, I started playing with react, and now I have rebuilt taskrabbit in react to the same level of functionality as it was in Vue (still not finished though).
I have a confession to make: I think I like react better than Vue. Vue seemed much more simple than react at first, which is why I started with it. But now that I have used both with some level of competency, I like how react handles things better. I don’t really have any specifics to put into words, but react just feels better to me.
Just a suggestion about the name — Taskrabbit is already a rather well known multinational freelance labour company that has been around since 2008.
https://en.wikipedia.org/wiki/Taskrabbit
Here's that video for anyone interested. I had not heard of Theo before, and very much enjoyed the video. Thanks for sending me down this interesting rabbit hole! :)
That’s the one! Thanks for linking it. I only found Theo recently. In some of his longer videos, he starts to ramble and they lose focus, but he has lots of good information in there. I use them as background noise while I am coding, and tune in when something sounds interesting.
I've bought a Galaxy Watch6 because I wanted it... for reasons? And now that I have it, it's a neat little device and I love it, and I have no use for it.
I've mostly used fitness trackers like the Mi Band 6 in the past, because I just need to tell the time, quickly check notifications, control media playback, and count my daily steps (trying to do the bare minimum of 6000 steps per day, with moderate success), and the Mi Band series has been perfect over the years.
However, a while back I started getting a rash from the polymer band of the Mi Band 6, even though I cleaned and disinfected it regularly, so I stopped using it, and recently I felt the need to have a watch again.
So now I have this little gizmo that's way too overpowered for my needs. It also comes with a polymer band, so I ordered a fabric one and I'm waiting for it to arrive.
In the meanwhile I've been playing with it, trying to figure out new use cases. So far I guess I can take calls and reply to messages? Not enjoying the minuscule keyboard though, and I hate voice messages to death and wouldn't wish them even on my worst enemies - so I've been finding myself reaching for my phone anyway.
I can also ask Bixby for stuff, I guess? Google Assistant for Wear OS is not available in Romania for some godforsaken reason, so I'm stuck with Bixby's inferior functionalities. It can sometimes tell me the weather, if it feels like listening to my voice. I also got a lot of "I can't do that right now, but I'm working on it", which is cool.
So while I wait for the band to arrive, what uses do y'all have for your smartwatches? Anything interesting?
P.S. No, I'm not returning it, because if I do, I'll get the urge to buy a new watch later on, and I'll end up in this situation again. I could have just ordered a new band for my fitness tracker though...
Looking at google maps while cycling for directions, without having to whip out my phone (which isn't impossible but not as convenient)
Reading texts/answering calls while my hands are busy/dirty using gestures
An open source Sokoban client you can run on your phone and also reprogram on your phone.
I've made a fridge. You can talk to it on Telegram. 20 minute adventure. t.me/refrigerabot
It'll be live until 18.03 or so, so talk to it while it's ~~hot~~ cool
I wrote an Anti-Spam/Fraud management plugin to fill gaps in other software and security plugins I use for WordPress. Things like analyzing customer email addresses for potential spam, order velocity checking, geolocation checks, customer behavior etc. It helped me score an A+ when undergoing security audits, so I decided to turn it into more of a professional tool and I'm working on submitting it to the WordPress repository for others to check out.
Still working on my FraXiNUs image sorter I talked about here:
https://tildes.net/~comp/1eei/what_programming_technical_projects_have_you_been_working_on#comment-c3ak
It has roughly 150,000 images in it and 30,000 metadata and trash objects right now. The tag system is up and running and I have screens to view unrated and untagged galleries and also some ability to browse by tag. I built a query DSL that can handle boolean expressions of tags and attribute values but I am continuing to expose that capability to the user.
It has no thumbnail generator; on the LAN it is not a problem to use the original size images for everything. On the go I find it cumbersome, so a Thumbnailer is on the agenda. If I go there I expect to add a celery server. Right now a series of batch jobs crawl the web, extract content and such, and I would like to use something like celery to run them on my 16 core computer, which should be able to, say, rapidly scan many HTML documents to test new data extraction routines.
I think it was a mistake to write it for asyncio, the more I think about it, the sooner I want to rewrite it to use Flask which should not be too hard (the code looks like a flask app) but will be a bigger task the more I invest in the current system. (YOShInOn did not stress the web server so much)
No progress on A.I. ambitions, although the same toolkit YOShInOn uses, https://sbert.net/, should work for both images and the multilingual text my system ingests. Going to an all-synchronous architecture, though, should make code portable between the application and Jupyter notebooks, which is a help.
I'm currently writing a tabata/HIIT training app, mostly for myself. It's almost finished, and I really like it so far. What sets it apart from all competitors I found so far is that it's clean. It's only got the functionality I actually need. Also, I think it looks nice.