What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
I made the dumbest Wordle solver and I love it.
I play the daily NYT Wordle on hard mode. Sometimes you get weird combinations of right letter, wrong location and can struggle to think of any word that fits all the criteria. The web-based Wordle solvers either give you spoilers or implement the logic incorrectly. E.g., you could enter the letter A as a correct letter in the wrong spot, say location 3, and the solvers will then suggest a dozen words with the letter A in the third location.
I decided to write my own solver. I wanted to be able to use it from my phone, didn't want to deal with cross compiling to a mobile app, and didn't want to make yet another container that I self host.
So I made it in Excel, using straight Excel functions, generating a massive sparse boolean matrix and chaining boolean logic to compute the intersections of sets. I used one of the official Scrabble word lists for the dictionary, and scraped the shortlist of Wordle answers from the JS array of Wordle itself.
So now I have an Excel file on my phone that is a simple easy to use Wordle solver that actually properly implements the set logic to suggest only viable answers.
When I've wanted to cheat at Wordle, I've hacked together a Unix shell pipeline for it (which would be difficult, though not necessarily impossible, to do on a phone):
This contrived example finds the single word "batch".
Way too much detail explaining all that, for anyone who really wants to damage their brain with Unix shell:
"[…] cat" principle: opening the pipeline with cat makes the entire thing syntactically uniform (an important consideration in a language as obtuse as shell) at the cost of a single fork/exec, something which is difficult to even measure in the context of running a shell script.
egrep (sometimes spelled grep -E) is grep with "extended" regex syntax. At the limits of regex (way past what is advisable), the capabilities of extended regex are greater than those of regular, but I use it because the syntax requires a lot less escaping.
^.....$ means "start of string, any letter (x5), end of string" and matches any five-letter word in the wordlist. There are some decent online regex explorer tools it's worth playing around with. Replace each . (meaning "any single letter") with either [^ab] ("none of these letters") or c (this specific letter) as you gain information from continued play.
fgrep (sometimes spelled grep -F) is grep with literal matching only (i.e. the match expression is not a regex). I use it habitually when I'm not intending to write a regex just to prevent accidents. The chain fgrep 'a' | fgrep 'b' ensures that both a and b are present in all the matched words.
egrep -v means "everything which doesn't match". [de] matches everything with a d or e, so egrep -v '[de]' means "everything with neither d nor e". (Be careful about what you're negating; [^de] is tempting, but means "match anything with any letter which is not d or e", which is definitely not what we want for this purpose.)
Shell's expansion rules are arcane, so I opt out of them whenever I'm not intentionally trying to invoke them (and even then I often write things like 'prefix'"${inclusion}"'suffix') to avoid mistakes. (Double quotes mean "expand stuff inside these but don't do word splitting" while single quotes mean "don't mess with whatever's in here".)
Yes, I am. There are some real good rants about Unicode out there; go read one of them if that's what you're looking for (I'm partial to eevee's, myself).
Wordle specifically is played with a pretty banal subset of English words, so you can get away with assuming that letter ⇔ "character" ⇔ Unicode codepoint ⇔ UTF-8 byte.
In all honesty, don't. Write this program in Python, instead. Shell is a very useful tool, but it's also fragile, dangerous, obtuse, difficult to learn, and basically impossible to use well. I don't know of a good way to learn it; I learned it as well as I did by having a job whose product was largely written in it, an approach which I strongly discommend.
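For anyone taking that advice, the grep stages described above map fairly directly onto Python. What follows is a minimal sketch, not a translation of the original pipeline: the function name filter_words, its parameters, and the tiny word list are all made up for illustration.

```python
import re

def filter_words(words, pattern=".....", required="", excluded=""):
    """Keep words that match `pattern`, contain every letter in
    `required`, and contain no letter in `excluded`."""
    shape = re.compile(pattern)
    return [
        w for w in words
        if shape.fullmatch(w)                     # like egrep '^.....$'
        and all(ch in w for ch in required)       # like fgrep 'a' | fgrep 'b'
        and not any(ch in w for ch in excluded)   # like egrep -v '[de]'
    ]

# Hypothetical word list; a real run would read a dictionary file instead.
WORDS = ["batch", "beach", "table", "abbot", "cabin", "bacon"]

# Assumed clues: first letter is not 'a', last letter is 'h',
# 'a' and 'b' both appear somewhere, and neither 'd' nor 'e' appears.
matches = filter_words(WORDS, pattern="[^a]...h", required="ab", excluded="de")
print(matches)
```

Using fullmatch stands in for the ^ and $ anchors, so the pattern only has to describe the five positions.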
You'd have to ask @krellor. ;) I'm predisposed to dislike Excel spreadsheets due to a career of trying to extract data from them (n.b. don't do this if you can possibly avoid it, make your clients give you data in almost literally any other format), but for making programmable worksheets, it's definitely not the worst tool.
The dirty secret of shell is that about 90% of it could be an awk script, and would probably be better that way, as well. This pipeline evolved as I needed to do more kinds of filtering on my wordlist, rather than being engineered as a whole in any way, and sticking on more greps is a lot easier than rewriting the thing in awk.
Thanks for the fun reply! I've definitely hacked together some dubious shell scripts in my time.
As much as I've been exasperated by finding out an organization's "database" is really just a massive, fragile Excel workbook with a hot mess of VB script under the hood, I do appreciate it for what it does well. Which is basic calculations with tabular data, and in that sense it fit the bill for my Wordle solver.
To contrast the Excel with the shell script, the "user interface" part is definitely nicer for a non-technical user in that you don't need to know regex to filter possible words. In fact, it's really not any different than the web interfaces out there. It has a row of cells for correct letter, correct position, a row of cells for right letter, wrong position, and a row for wrong letters. As you fill the cells in, two word lists below start filtering out, one with all possible words and one with the most likely words.
Building the behind-the-scenes worksheet wasn't hard, but it isn't pretty. You have a matrix with about 12,000 rows, one for each five-letter word. Then you have about 36 columns, one for each user input field (right letter right place first position, etc).
There are formulas that create a 1 in each cell when the given word in that row matches the criteria represented by that column. You then just select the rows with the most entries, and those are your eligible words. Then I made a second set to show likely suggestions of the possible words.
I have a few more staging columns, but that is the gist of it.
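The matrix idea can be sketched outside Excel too. Below is a rough Python analogue, not the actual workbook logic: it assumes one 0/1 column per filled-in input cell and treats a word as viable when its row has a 1 in every column. The names criteria_columns and viable_words, the clue encoding, and the word list are invented for illustration, and repeated letters are ignored for simplicity.

```python
def criteria_columns(green, yellow, gray):
    """Build one predicate ("column") per filled-in input cell.

    green:  {index: letter} for right letter, right place
    yellow: {index: letter} for right letter, wrong place
    gray:   letters that are not in the answer
    """
    cols = []
    for pos, ch in green.items():
        cols.append(lambda w, pos=pos, ch=ch: w[pos] == ch)
    for pos, ch in yellow.items():
        # The letter must be present, but NOT where it was tried.
        cols.append(lambda w, pos=pos, ch=ch: ch in w and w[pos] != ch)
    for ch in gray:
        cols.append(lambda w, ch=ch: ch not in w)
    return cols

def viable_words(words, cols):
    # Each row of the "matrix" holds a 1 where the word satisfies a column.
    matrix = {w: [int(col(w)) for col in cols] for w in words}
    # A word is viable when its row sums to the number of columns.
    return [w for w, row in matrix.items() if sum(row) == len(cols)]

# Hypothetical clues: 'h' is right in the last spot, 'a' was tried in
# the third spot (index 2) and belongs elsewhere, 'e' is not in the word.
WORDS = ["batch", "beach", "table", "cabin"]
cols = criteria_columns(green={4: "h"}, yellow={2: "a"}, gray="e")
print(viable_words(WORDS, cols))
```

Note that "beach" is rejected by the yellow column because its A sits in the third location, which is exactly the case the web solvers mentioned earlier get wrong.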
I suspect the boolean calculations would throw most of today's programmers who aren't used to those sorts of tricks. But having cut my teeth on low-level languages, bit arithmetic is pretty natural to me.
However, I agree that for a real tool, Python is the better solution. Or C#. Or pretty much anything that isn't the shell or a giant sparse matrix. But we can all agree that the worst way would be Windows batch script. 🙂
Been converting all my old websites to Next.js/Tailwind. It's a love/hate relationship right now. I feel like it will be better once I'm done. But fucking so many little pain points are driving me up a wall.
If you don't mind my curiosity- what are you converting from? Another framework stack, or vanilla js/css?
Pretty much: Twitter Bootstrap for the CSS, and the backend is Django.
Still working on my forum / wiki thing (Keeper). Currently distracted by figuring out how best to test it. I like the fast-check library for property-based testing, but it can be difficult to generate valid examples of complicated data structures that use unique IDs.
The library seems easier to use when your test data has a lot of pieces that independently vary. When you have constraints between the pieces, you can filter, but it's inefficient.
In this kind of library, there are "arbitraries" that represent sets of potential test values and methods like map() and chain() that let you work with individual values within the callback function. It seems easier to work with individual test values and generate more when you need them in the test. The somewhat obscure fc.gen function lets you do this.
I have done quite a bit of research in fuzz testing, though I'm not familiar with this library or that experienced with JS/TS.
The constraint problem for complex inputs was always challenging because the more you shape the fuzzing inputs, the more you bias the tests and/or reimplement the application logic. One of the value propositions for fuzzing is that it finds different bugs than functional testing, but the deeper you go in the constraints the less different it becomes.
One of the ideas we got a fair amount of traction with was the idea of invariants – what is the code NOT supposed to do. If you can model the invariants as a simpler outer bound of the functional behavior, then the fuzzing inputs can be less constrained. Anything that violates an invariant is a test failure, and the rest is GIGO.
For example, the simplest invariant is "doesn't crash" (though it might not be applicable to JS given the boundaries of the runtime, idk). Another invariant would be things like speed limit. If you are supposed to be able to set a speed limit, then you fuzz the rest of the command inputs and see if you can get it to violate the speed limit.
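As a toy illustration of the speed-limit invariant, here is a sketch in Python. Nothing in it comes from a real system: the Vehicle controller, its command names, and the value ranges are all invented. The point is that the fuzzer throws arbitrary commands at the controller and checks only the invariant, never the exact resulting state.

```python
import random

class Vehicle:
    """Hypothetical controller, invented for this sketch."""

    def __init__(self, speed_limit):
        self.speed_limit = speed_limit
        self.speed = 0.0

    def apply(self, command, value):
        if command == "accelerate":
            self.speed += value
        elif command == "brake":
            self.speed -= value
        elif command == "set_limit":
            self.speed_limit = max(value, 0.0)
        # Unknown commands change nothing: garbage in, garbage out.
        # Clamp after every command so the invariant should always hold.
        self.speed = max(0.0, min(self.speed, self.speed_limit))

def fuzz_speed_limit(make=Vehicle, runs=200, steps=50, seed=0):
    """Return True if no random command sequence breaks the invariant."""
    rng = random.Random(seed)  # seeded so any failure is reproducible
    commands = ["accelerate", "brake", "set_limit", "bogus"]
    for _ in range(runs):
        vehicle = make(speed_limit=rng.uniform(0, 100))
        for _ in range(steps):
            vehicle.apply(rng.choice(commands), rng.uniform(-50, 150))
            # The only check is the invariant, not the exact behavior.
            if not (0.0 <= vehicle.speed <= vehicle.speed_limit):
                return False
    return True

print(fuzz_speed_limit())
```

The inputs deliberately include nonsense (negative accelerations, unknown commands), since the invariant is the only thing being modeled.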
Admittedly, we were doing this work with safety critical applications, so we did have safety requirements to follow in modeling invariants. Not sure how well it would translate to app development.
I'm actually doing functional testing, but with a fuzzer to make it exercise more possibilities without having to write them out. There are similar considerations. A classic example that it works well for is something like parsing and serialization, where you can test that an arbitrary input round-trips.
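The round-trip property can be sketched without any particular library. The example below is plain Python rather than fast-check, with a hand-rolled generator standing in for an arbitrary; random_value and check_round_trip are names made up for this sketch, and JSON is used only because it is a convenient serializer to test against.

```python
import json
import random

def random_value(rng, depth=0):
    """Generate a random JSON-compatible value."""
    kinds = ["int", "str", "bool", "none"]
    if depth < 3:  # cap nesting so generation always terminates
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        # Include characters that need escaping, plus a non-ASCII one.
        alphabet = "abc \"\\\n\u00e9"
        return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "none":
        return None
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {f"k{i}": random_value(rng, depth + 1) for i in range(rng.randint(0, 4))}

def check_round_trip(runs=500, seed=0):
    """Return the first value that fails to round-trip, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = random_value(rng)
        if json.loads(json.dumps(value)) != value:
            return value  # counterexample found
    return None

print(check_round_trip())
```

A property-testing library adds shrinking and better value distributions on top of this, but the shape of the test is the same: generate, serialize, parse, compare.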
However, I do want to test specific cases too. There's a question of how explicitly to do it; I could write one test and assume the fuzzer will find the corner cases, or write multiple tests to make sure it tests each one at least once.
I've also been doing more testing this week. Wrote something to fail my CI when I don't have tests for a specific file:
Last month I wrote this: cptree.py to try and solve the problem:
And while it created the test files, it didn't give me the motivation to actually write tests so now I have to actually fill in the empty files. Maybe I'll switch to actual code coverage in a few years...
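A check along those lines could look roughly like the sketch below. It is not the script from the post: the src/tests layout, the test_<name>.py naming convention, and the function names missing_tests and main are all assumptions.

```python
from pathlib import Path

def missing_tests(src_dir, test_dir):
    """List source files under src_dir that have no matching
    test_<name>.py at the same relative path under test_dir."""
    src_dir, test_dir = Path(src_dir), Path(test_dir)
    missing = []
    for src in sorted(src_dir.rglob("*.py")):
        if src.name == "__init__.py":
            continue  # package markers don't need their own tests
        rel = src.relative_to(src_dir)
        expected = test_dir / rel.with_name(f"test_{src.name}")
        if not expected.is_file():
            missing.append(rel)
    return missing

def main(src_dir="src", test_dir="tests"):
    gaps = missing_tests(src_dir, test_dir)
    for rel in gaps:
        print(f"no test file for {rel}")
    return 1 if gaps else 0  # a non-zero exit code fails the CI job
```

In a CI step this would be run as a script that ends with raise SystemExit(main()). It only checks that a test file exists, so an empty file still passes.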
I bought a Prusa MK4 but had been stalled assembling it and the enclosure, so I made a recent push to get that done. It's amazing how much of an improvement over the MK3 it is.
Prusa Research has a cloud tool called Prusa Connect that supports remote control and monitoring of printers. I had been dithering about using it because there is no camera support built into the control board, so if I was going to dedicate a Raspberry Pi to uploading camera images, I might as well use OctoPrint and get full streaming video, vs. one image every 10s. But OTOH, setting up multiple OctoPrint instances meant figuring out how to have a single TLS proxy in front of them so they are all securely accessible outside my firewall. I don't have to deal with any of that with Prusa Connect.
I decided I was going to give Prusa Connect a try because this kind of integration is probably where future development is going. Given how capable the MK4 is, I can't imagine that I won't either retire/sell the MK3 or do an MK3.9 upgrade, which makes it make even more sense.
They recently released an official version of an ESP32-Cam camera board build that automatically uploads a picture to the API. I had a different ESP32 camera board (Freenove WROVER board), so I ported the firmware over and designed a magnetic adjustable mount for it. It was a neat foray into the Arduino space, which I haven't spent much time in.
Super glad I set the camera up, because right after, I had a print fail. I was able to catch it before it spewed out too much spaghetti or trashed the hot end.
I like Prusa Connect because I can set up a print queue and it will automatically start the next print once I mark the printer ready. It also streams the print file to the printer and starts printing as soon as it has enough buffer, so less downtime waiting on transfers.
I read in the forums that they are working on implementing STUN so that Prusa Connect in the browser can connect directly to a camera stream without their servers mediating the traffic. Once that is in, having full motion video will probably make Prusa Connect feature complete for me.
Working on, and now publicly released, a sequential task runner in Python - boku.
This was made as a personal tool to help me automate recurring tasks without having to define them with code or makefiles. Nothing revolutionary, but it's done to my spec which always feels nice.
For example, when I need to create a fresh Python project with Mise, I use a task that looks like this:
I developed and released a chess vision exercise trainer, made with Flask: https://github.com/3d12/rookognition
This is mainly significant to me as it's the first project I've released in v1.0 form. It's exactly as feature-complete as I first imagined it, and I completely defeated scope creep this time, turning the whole project around in 2 days. I have a long list of improvements in mind, but I'm comfortably letting them brew in the backlog while I play with my new toy. I can only hope this training helps me fare better against my local club members next month...
Finishing up my client projects, both PHP web applications; the deadline is this coming weekend.
I've been meaning to work on an open source idea but not getting enough time for that! Ironically, the project idea is about a time management app (facepalm!).
Rewiring my 3D printer for cable management. I'm putting cable chains on all 3 axes and moving all the electronics to a dedicated electronics case. It's a bit nerve-wracking because it involves a lot of printed parts, and now that I've started, if I need to reprint a part for any reason, I can't do it myself. Thankfully I have friends with printers.