8 votes

What programming/technical projects have you been working on?

This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?

15 comments

  1. [3]
    krellor

    I made the dumbest Wordle solver and I love it.

    I play the daily NYT Wordle on hard mode. Sometimes you get weird combinations of right letter, wrong location and can struggle to think of any word that fits all the criteria. The web-based Wordle solvers either give you spoilers or implement the logic incorrectly. E.g., you could mark the letter A as correct but in the wrong spot, say position 3, and the solvers will still suggest a dozen words with the letter A in the third position.

    I decided to write my own solver. I wanted to be able to use it from my phone, didn't want to deal with cross compiling to a mobile app, and didn't want to make yet another container that I self host.

    So I made it in Excel, using straight Excel functions, generating a massive sparse boolean matrix and chaining boolean logic to compute the set intersections. I used one of the official Scrabble word lists for the dictionary, and scraped the shortlist of Wordle answers from the JS array of Wordle itself.

    So now I have an Excel file on my phone that is a simple, easy-to-use Wordle solver that actually implements the set logic properly and suggests only viable answers.

    6 votes
    1. [2]
      whbboyd

      When I've wanted to cheat at Wordle, I've hacked together a Unix shell pipeline for it (which would be difficult, though not necessarily impossible, to do on a phone):

      # This wordlist is present on most Unixes, and isn't the Wordle wordlist, but
      # works well enough
      cat /usr/share/dict/words | \
      	# Specific letters: '[^ab]' if a letter is known to be present but not one
      	# of the above (yellow), 'c' if the letter is known to be the specific
      	# letter, '.' otherwise
          egrep '^..[^ab]c.$' | \
      	# Known present (i.e. yellow) letters: there's probably a better way to
      	# express this, but meh
          fgrep 'a' | fgrep 'b' | \
      	# Known absent (i.e. grey) letters:
          egrep -v '[de]'
      

      This contrived example finds the single word "batch".


      Way too much detail explaining all that, for anyone who really wants to damage their brain with Unix shell:

      • I strongly disagree with the "useless use of cat" principle: opening the pipeline with cat makes the entire thing syntactically uniform (an important consideration in a language as obtuse as shell) at the cost of a single fork/exec, something which is difficult to even measure in the context of running a shell script.
      • egrep (sometimes spelled grep -E) is grep with "extended" regex syntax. At the limits of regex (way past what is advisable), the capabilities of extended regex are greater than those of basic regex, but I use it because the syntax requires a lot less escaping.
      • The regex ^.....$ means "start of string, any letter (x5), end of string" and matches any five-letter word in the wordlist. There are some decent online regex explorer tools that are worth playing around with. Replace each . (meaning "any single letter") with either [^ab] ("none of these letters") or c (this specific letter) as you gain information from continued play.
      • fgrep (sometimes spelled grep -F) is grep with literal matching only (i.e. the match expression is not a regex). I use it habitually when I'm not intending to write a regex just to prevent accidents. The chain fgrep 'a' | fgrep 'b' ensures that both a and b are present in all the matched words.
      • egrep -v means "everything which doesn't match". [de] matches everything with a d or e, so egrep -v '[de]' means "everything with neither d nor e". (Be careful about what you're negating; [^de] is tempting, but means "match anything with any letter which is not d or e", which is definitely not what we want for this purpose.)

      Why the single quotes?

      Shell's expansion rules are arcane, so I opt out of them whenever I'm not intentionally trying to invoke them (and even then I often write things like 'prefix'"${inclusion}"'suffix' to avoid mistakes). (Double quotes mean "expand stuff inside these but don't do word splitting" while single quotes mean "don't mess with whatever's in here".)

      You're playing fast and loose with what you mean by "letter".

      Yes, I am. There are some real good rants about Unicode out there; go read one of them if that's what you're looking for (I'm partial to eevee's, myself).

      Wordle specifically is played with a pretty banal subset of English words, so you can get away with assuming that letter ⇔ "character" ⇔ Unicode codepoint ⇔ UTF-8 byte.

      How would you recommend learning how to do this?

      In all honesty, don't. Write this program in Python, instead. Shell is a very useful tool, but it's also fragile, dangerous, obtuse, difficult to learn, and basically impossible to use well. I don't know of a good way to learn it; I learned it as well as I did by having a job whose product was largely written in it, an approach which I strongly discommend.
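
      For anyone taking that advice, here is a minimal sketch of the same three filters in Python, using only the standard library. The function names and the tiny inline word list are made up for illustration; point load_words at whatever wordlist you actually have.

```python
import re
from pathlib import Path


def filter_words(words, pattern, present="", absent=""):
    """Keep words that match `pattern` in full, contain every letter in
    `present`, and contain none of the letters in `absent`."""
    regex = re.compile(pattern)
    return [
        w for w in words
        if regex.fullmatch(w)                 # positional constraints (egrep step)
        and all(c in w for c in present)      # yellow letters (fgrep chain)
        and not any(c in w for c in absent)   # grey letters (egrep -v step)
    ]


def load_words(path="/usr/share/dict/words"):
    """Read a one-word-per-line list, lowercased."""
    return [line.strip().lower() for line in Path(path).read_text().splitlines()]


# Tiny inline list so the example runs without a system dictionary.
sample = ["batch", "beach", "cabin", "match", "badly", "watch"]
matches = filter_words(sample, r"..[^ab]c.", present="ab", absent="de")
print(matches)
```

      fullmatch stands in for the ^ and $ anchors, so the pattern is the same as in the pipeline minus those two characters.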

      Wait, you think this is worse than Excel?

      You'd have to ask @krellor. ;) I'm predisposed to dislike Excel spreadsheets due to a career of trying to extract data from them (n.b. don't do this if you can possibly avoid it, make your clients give you data in almost literally any other format), but for making programmable worksheets, it's definitely not the worst tool.

      This could all be an awk script.

      The dirty secret of shell is that about 90% of it could be an awk script, and would probably be better that way, as well. This pipeline evolved as I needed to do more kinds of filtering on my wordlist, rather than being engineered as a whole in any way, and sticking on more greps is a lot easier than rewriting the thing in awk.

      1 vote
      1. krellor

        Thanks for the fun reply! I've definitely hacked together some dubious shell scripts in my time.

        As much as I've been exasperated by finding out an organization's "database" is really just a massive, fragile Excel workbook with a hot mess of VB script under the hood, I do appreciate it for what it does well. Which is basic calculations with tabular data, and in that sense it fit the bill for my Wordle solver.

        To contrast the Excel with the shell script, the "user interface" part is definitely nicer for a non-technical user in that you don't need to know regex to filter possible words. In fact, it's really not any different than the web interfaces out there. It has a row of cells for correct letter, correct position, a row of cells for right letter, wrong position, and a row for wrong letters. As you fill the cells in, two word lists below start filtering out, one with all possible words and one with the most likely words.

        However, building the behind-the-scenes worksheet wasn't hard, but it isn't pretty. You have a matrix with about 12,000 rows, one for each five-letter word. Then you have about 36 columns, one for each user input field (right letter right place first position, etc).

        There are formulas that create a 1 in each cell when the given word in that row matches the criteria represented by that column. You then just select the rows with the most entries, and those are your eligible words. Then I made a second set to show likely suggestions of the possible words.

        I have a few more staging columns, but that is the gist of it.
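
        Outside of Excel, the same idea can be sketched in a few lines of Python. This is not the actual workbook logic, just an illustration under some assumptions: each active criterion contributes one 0/1 entry per word, and a word counts as eligible when every entry is 1. The function names and sample words are made up, and repeated-letter edge cases are ignored.

```python
def criteria_row(word, green, yellow, grey):
    """Return one 0/1 entry per active criterion for a single word.

    green  -- dict {position: letter}, right letter in the right place
    yellow -- list of (position, letter), right letter in the wrong place
    grey   -- string of letters known to be absent
    """
    row = []
    for pos, letter in green.items():
        row.append(int(word[pos] == letter))
    for pos, letter in yellow:
        # The letter must occur somewhere, but NOT at the position it was tried.
        row.append(int(letter in word and word[pos] != letter))
    for letter in grey:
        row.append(int(letter not in word))
    return row


def eligible_words(words, green, yellow, grey):
    """Keep the words whose row is all ones (sum equals the criterion count)."""
    matrix = {w: criteria_row(w, green, yellow, grey) for w in words}
    return [w for w, row in matrix.items() if sum(row) == len(row)]


words = ["crane", "react", "trace", "cater", "brave"]
# Hypothetical game state: 'c' is green at index 0, 'a' was yellow at index 2, 'b' is grey.
result = eligible_words(words, green={0: "c"}, yellow=[(2, "a")], grey="b")
print(result)
```

        Note that "crane" is rejected even though it contains an A, because the A sits in the exact position where it was marked yellow. That is the case the web solvers get wrong.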

        I suspect the boolean calculations would throw most of today's programmers who aren't used to those sorts of tricks. But having cut my teeth on low-level languages, bit arithmetic is pretty natural to me.

        However, I agree that for a real tool, Python is the better solution. Or C#. Or pretty much anything that isn't the shell or a giant sparse matrix. But we can all agree that the worst way would be Windows batch script. 🙂

  2. [3]
    supported

    Been converting all my old websites to NextJs/Tailwind. It's a love/hate relationship right now. I feel like it will be better once I'm done. But fucking so many little pain points are driving me up a wall.

    5 votes
    1. [2]
      lynxy

      If you don't mind my curiosity- what are you converting from? Another framework stack, or vanilla js/css?

      1 vote
      1. supported

        pretty much, twitter bootstrap for css and backend is django

        1 vote
  3. [3]
    skybrian

    Still working on my forum / wiki thing (Keeper). Currently distracted by figuring out how best to test it. I like the fast-check library for property-based testing, but it can be difficult to generate valid examples of complicated data structures that use unique IDs.

    The library seems easier to use when your test data has a lot of pieces that independently vary. When you have constraints between the pieces, you can filter, but it's inefficient.

    In this kind of library, there are "arbitraries" that represent sets of potential test values and methods like map() and chain() that let you work with individual values within the callback function. It seems easier to work with individual test values and generate more when you need them in the test. The somewhat obscure fc.gen function lets you do this.
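
    To make the "generate more when you need them" idea concrete, here is a rough sketch in plain Python with only the random module. It is not fast-check's API, and the post/parent shape is a made-up stand-in for whatever the real data looks like. The point is that the generator builds valid data step by step, so IDs are unique and references are valid by construction and nothing has to be filtered out.

```python
import random


def gen_posts(rng, max_posts=8):
    """Build a list of posts whose ids are unique and whose parent ids
    always refer to an earlier post, so no filtering is needed."""
    posts = []
    for post_id in range(rng.randint(1, max_posts)):
        # Ask for the next random choice only when it is needed.
        parent = rng.choice([None] + [p["id"] for p in posts])
        posts.append({"id": post_id, "parent": parent})
    rng.shuffle(posts)  # ids stay unique; order is arbitrary
    return posts


def check_valid(posts):
    """Property under test: ids are unique and every parent exists."""
    ids = [p["id"] for p in posts]
    assert len(ids) == len(set(ids))
    assert all(p["parent"] is None or p["parent"] in ids for p in posts)


# Run the property over many seeded examples.
for seed in range(200):
    check_valid(gen_posts(random.Random(seed)))
```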

    4 votes
    1. [2]
      first-must-burn

      I have done quite a bit of research in fuzz testing, though I'm not familiar with this library or that experienced with JS/TS.

      The constraint problem for complex inputs was always challenging because the more you shape the fuzzing inputs, the more you bias the tests and/or reimplement the application logic. One of the value propositions for fuzzing is that it finds different bugs than functional testing, but the deeper you go in the constraints the less different it becomes.

      One of the ideas we got a fair amount of traction with was the idea of invariants – what is the code NOT supposed to do. If you can model the invariants as a simpler outer bound of the functional behavior, then the fuzzing inputs can be less constrained. Anything that violates an invariant is a test failure, and the rest is GIGO.

      For example, the simplest invariant is "doesn't crash" (though it might not be applicable to JS given the boundaries of the runtime, idk). Another invariant would be things like speed limit. If you are supposed to be able to set a speed limit, then you fuzz the rest of the command inputs and see if you can get it to violate the speed limit.
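
      A toy version of the speed limit example, as a sketch only. The controller class, the limit value, and the command range are all invented for illustration. The inputs are left unconstrained on purpose, and the only thing checked is the invariant.

```python
import random

SPEED_LIMIT = 50.0  # hypothetical limit for the toy controller


class ToyController:
    """Stand-in system under test: accepts speed commands and clamps them."""

    def __init__(self, limit):
        self.limit = limit
        self.speed = 0.0

    def command(self, delta):
        # Apply the requested change, then clamp to [0, limit].
        self.speed = max(0.0, min(self.limit, self.speed + delta))


def fuzz_speed_invariant(seed, steps=1000):
    """Feed unconstrained random commands; return the first violating step, or None."""
    rng = random.Random(seed)
    ctrl = ToyController(SPEED_LIMIT)
    for step in range(steps):
        ctrl.command(rng.uniform(-1e6, 1e6))
        if ctrl.speed > ctrl.limit:  # the invariant: never exceed the limit
            return step
    return None


violations = [fuzz_speed_invariant(seed) for seed in range(20)]
print(violations)
```

      Whether a given command sequence makes functional sense is deliberately not checked here; anything that stays under the limit passes.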

      Admittedly, we were doing this work with safety critical applications, so we did have safety requirements to follow in modeling invariants. Not sure how well it would translate to app development.

      2 votes
      1. skybrian

        I'm actually doing functional testing, but with a fuzzer to make it exercise more possibilities without having to write them out. There are similar considerations. A classic example that it works well for is something like parsing and serialization, where you can test that an arbitrary input round-trips.
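
        As a rough illustration of the round-trip property, here is a stdlib-only Python sketch that uses JSON as the serializer. The hand-rolled random_value generator is a stand-in for what a property-testing library would provide, and floats are left out so equality stays exact.

```python
import json
import random
import string


def random_value(rng, depth=0):
    """Generate an arbitrary JSON-compatible value (no floats)."""
    kinds = ["int", "str", "bool", "none"]
    if depth < 3:
        kinds += ["list", "dict"]  # only nest a few levels deep
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        return "".join(rng.choice(string.printable) for _ in range(rng.randint(0, 8)))
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "none":
        return None
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {f"k{i}": random_value(rng, depth + 1) for i in range(rng.randint(0, 4))}


def round_trips(value):
    """The property: decoding the encoded value gives back the original."""
    return json.loads(json.dumps(value)) == value


results = [round_trips(random_value(random.Random(seed))) for seed in range(500)]
print(all(results))
```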

        However, I do want to test specific cases too. There's a question of how explicitly to do it; I could write one test and assume the fuzzer will find the corner cases, or write multiple tests to make sure it tests each one at least once.

  4. xk3

    I've also been doing more testing this week. Wrote something to fail my CI when I don't have tests for a specific file:

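    import os
    from pathlib import Path

    import pytest

    # `modules` comes from elsewhere in the project (not shown here); its keys
    # are assumed to be dotted paths that end in a function name.
    unique_modules = list(set(s.rsplit(".", 1)[0] for s in modules.keys()))  # chop off function names

    def get_test_name(s):
        # Map a dotted module path under xklb to its test file under tests/
        path = s.replace("xklb.", "tests.", 1).replace(".", "/")
        parent, name = os.path.split(path)
        path = os.path.join(parent, "test_" + name + ".py")
        return path

    @pytest.mark.parametrize("path", [get_test_name(s) for s in unique_modules])
    def test_pytest_files_exist(path):
        # Create the file if it is missing, then fail while it is still empty
        Path(path).touch(exist_ok=True)
        assert os.path.getsize(path) > 0, f"Pytest file {path} is empty."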
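
    To show what the name mapping produces, here is a small standalone run. The dotted names in the sample dict are hypothetical; the real contents of modules are not shown in the post.

```python
import os


def get_test_name(s):
    # Same mapping as in the snippet above: xklb.<pkg>.<mod> -> tests/<pkg>/test_<mod>.py
    path = s.replace("xklb.", "tests.", 1).replace(".", "/")
    parent, name = os.path.split(path)
    return os.path.join(parent, "test_" + name + ".py")


# Hypothetical dotted names standing in for the real keys of `modules`.
modules = {
    "xklb.utils.strings.clean": None,
    "xklb.utils.strings.wrap": None,
    "xklb.media.scan": None,
}
# sorted() is used here only to make the printed order predictable.
unique_modules = sorted(set(s.rsplit(".", 1)[0] for s in modules))
paths = [get_test_name(s) for s in unique_modules]
print(paths)
```

    With POSIX path separators this maps xklb.media to tests/test_media.py and xklb.utils.strings to tests/utils/test_strings.py, with the two functions in xklb.utils.strings collapsing into one entry.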
    

    Last month I wrote this: cptree.py to try and solve the problem:

    cptree.py xklb/ tests/ -v --simulate --ext py --file-prefix test_ | grep -v '__'
    

    And while it created the test files, it didn't give me the motivation to actually write tests so now I have to actually fill in the empty files. Maybe I'll switch to actual code coverage in a few years...

    2 votes
  5. first-must-burn

    I bought a Prusa MK4 but had been stalled assembling it and the enclosure, so I made a recent push to get that done. It's amazing how much of an improvement over the MK3 it is.

    Prusa Research has a cloud tool called Prusa Connect that supports remote control and monitoring of printers. I had been dithering about using it because there is no camera support built into the control board, so if I was going to dedicate a Raspberry Pi to uploading camera images, I might as well use OctoPrint and get full streaming video, vs one image every 10s. But OTOH, setting up multiple OctoPrint instances meant figuring out how to have a single TLS proxy in front of them so they are all securely accessible outside my firewall. I don't have to deal with any of that with Prusa Connect.

    I decided I was going to give Prusa Connect a try because this kind of integration is probably where future development is going. Given how capable the MK4 is, I can't imagine that I won't either retire/sell the MK3 or do an MK3.9 upgrade, which makes it make even more sense.

    They recently released an official version of an ESP32-Cam camera board build that automatically uploads a picture to the API. I had a different ESP32 camera board (Freenove WROVER board), so I ported the firmware over and designed a magnetic adjustable mount for it. It was a neat foray into the Arduino space, which I haven't spent much time in.

    Super glad I set the camera up, because right after, I had a print fail. I was able to catch it before it spewed out too much spaghetti or trashed the hot end.

    I like Prusa Connect because I can set up a print queue and it will automatically start the next print once I mark the printer ready. It also streams the print file to the printer and starts printing as soon as it has enough buffer, so less downtime waiting on transfers.

    I read in the forums that they are working on implementing STUN so that Prusa Connect in the browser can connect directly to a camera stream without their servers mediating the traffic. Once that is in, having full motion video will probably make Prusa Connect feature complete for me.

    2 votes
  6. hxii

    Working on, and now publicly released, a sequential task runner in Python - boku.

    This was made as a personal tool to help me automate recurring tasks without having to define them with code or makefiles. Nothing revolutionary, but it's done to my spec which always feels nice.

    For example, when I need to create a fresh Python project with Mise, I use a task that looks like this:

    version: 0.0.4
    information: |
        Create a Mise config in the folder.
    variables:
        config: |
            [tools]
            python = "latest"
            
            [env]
            _.python.venv = { path = ".venv", create = true }
    tasks:
        create_config:
            run: |
                cat > .mise.toml << EOL
                variables.config
                EOL
        trust:
            run: mise trust
    
    1 vote
  7. pyeri

    Finishing up my client projects, both PHP web applications, with a deadline this coming weekend.

    I've been meaning to work on an open source idea but not getting enough time for that! Ironically, the project idea is about a time management app (facepalm!).

  8. Toric

    Rewiring my 3D printer for cable management. I'm putting cable chains on all 3 axes and moving all electronics to a dedicated electronics case. It's a bit nerve-wracking because it involves a lot of printed parts and now that I've started, if I need to reprint a part for any reason, I can't do it myself. Thankfully I have friends with printers.

  9. 3d12

    I developed and released a chess vision exercise trainer, made with Flask: https://github.com/3d12/rookognition

    This is mainly significant to me as it's the first project I've released in v1.0 form. It's exactly as feature-complete as I first imagined it, and I completely defeated scope creep this time, turning the whole project around in 2 days. I have a long list of improvements in mind, but I'm comfortably letting them brew in the backlog while I play with my new toy. I can only hope this training helps me fare better against my local club members next month...