What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
I sort of threw up my hands in frustration at the state of open source, self hostable, shared notes apps and decided to build my own.
For context, I've been an Obsidian user for a few years now, and I used the Self-Hosted LiveSync plugin to have my notes sync across my own personal devices, as well as with my wife's devices for our shared notes.
The two big issues are:
The thing is, this is like my whole thing. I led the team at the NYT that built their collaborative rich text editor, I maintain a popular rich text editing library, and I do consulting for companies that need help with collaborative text editors.
But I ALSO have a newborn, so I'm very strapped on free time. So I set a time limit for myself of one week to get a working mobile app that had working collaborative markdown editing (and folder-based note organization). Today is the last day of that week, and I've got about an hour left of work before I feel comfortable trying to use this with my wife.
Feeling pretty good about myself. Also just managed to get my totally-losing-it newborn to go down for a nap, and she's been asleep for 45 minutes so far, so that's a double win.
I saw this the other day on Tildes and it might be helpful:
I empathize. I feel this. Sometimes a lot. It's a great feeling when tools that we write work 100% of the time! But there are a couple of tools that I rely on frequently (which I wrote for myself) which are nefarious and vexing. It's easy to blame the software that they interact with, e.g. Firefox's caching sometimes empties out what I'd rather keep permanently in Local Storage... Software which talks incrementally over the network is also difficult to write correctly... But sometimes the simplest of programs can vex you when all troubleshooting yields that the program is correct but your feeling of what the output should be does not quite match what the output is...
Oh, yeah, the Ink & Switch folks are fantastic! Been following/chatting with them for a while now. Luckily, since I'm using Markdown (which is technically plaintext), I don't need anything as complex as Peritext. Plain old operational transforms work just fine! I'm using CodeMirror and its collaborative editing plugin for this, which the author goes into in depth in this blog post.
I ended up just using CouchDB as the backend. It's very simple to self host, and it's one of the very few databases with a realtime subscription API built in. Not what I would choose for a production app, but perfect for a little self hosted notes app!
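In case anyone is curious what that looks like in practice, here's roughly how you listen to CouchDB's built-in _changes feed over plain HTTP. This is just an illustrative sketch: the instance URL, credentials, and database name are placeholders, not the app's actual code.

```typescript
// Minimal sketch of subscribing to CouchDB's realtime _changes feed.
// feed=continuous keeps the connection open and streams one JSON object per line.
const COUCH = "http://localhost:5984" // placeholder instance
const DB = "notes"                    // placeholder database
const AUTH = "Basic " + btoa("admin:password") // placeholder credentials

async function watchChanges(onChange: (doc: unknown) => void) {
  const res = await fetch(
    `${COUCH}/${DB}/_changes?feed=continuous&include_docs=true&since=now&heartbeat=30000`,
    { headers: { Authorization: AUTH } },
  )
  const reader = res.body!.getReader()
  const decoder = new TextDecoder()
  let buffer = ""
  for (;;) {
    const { value, done } = await reader.read()
    if (done) break
    buffer += decoder.decode(value, { stream: true })
    let newline: number
    while ((newline = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, newline).trim()
      buffer = buffer.slice(newline + 1)
      if (!line) continue // heartbeats arrive as blank lines
      const change = JSON.parse(line)
      if (change.doc) onChange(change.doc)
    }
  }
}

// Usage: log every edited note as it syncs in
watchChanges(doc => console.log("changed:", doc))
```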
In case you (or anyone) is interested, here's the repo for the app: https://gitlab.com/smoores/little-notes. It ended up being pretty tidy!
Tangent: Peritext (and Automerge, which is where the team's latest iteration of their rich text CRDT lives) is very cool, and I love it because it demonstrates two things:
If you aren't actually building a collaborative editor with no central server, there's no reason at all to use CRDTs. Truly, the ProseMirror collab algorithm is better in every single metric!
Could you elaborate on why CRDTs are a bad fit for rich text?
Absolutely! To start, I think the best resource for this is actually the original blog post announcing Peritext. It's quite long and detailed, but it stays high-level enough that I think it should be pretty approachable for most folks.
The very short version is: semantics. The slightly longer version is:
CRDTs rely on the notion that any two document states can be deterministically merged. It turns out that, in a rich text editor, the best merge algorithm is highly context-dependent. That is, you need to merge text insertions differently from formatting changes, and formatting changes differently from hyperlinking, and all of these differently from commenting. This becomes even more complex when you add complex nesting rules, like a document structure that allows the insertion and editing of quizzes with multiple groupings of questions and answers. Merging divergent document states becomes very challenging!
Note: I keep saying "document", but it's worth noting that CRDTs like Y.js and Automerge more or less store the entire edit history in the document. So when we "merge a document", we're really merging edit histories.
To be clear, this semantics issue is just a hard problem — CRDTs are not uniquely bad at it. The other primary category of collaborative editing algorithms, Operational Transformation (OT), is also poorly suited for rich text. That's why ProseMirror's collab plugin uses a pseudo-OT approach that relies on a centralized server for ordering document changes. This allows clients to receive a stream of a minimal set of ordered changes, and then use a git-style rebase algorithm: undo their local changes, apply the new remote changes, and then re-apply their local changes on top of the resulting document.
In practice, this results in semantics that are about as good as you can get without two people intentionally coordinating their changes, because it's essentially the same process as if two people were typing in the same document on the same machine with two different keyboards (cue NCIS firewall breach scene). This can also be a performance improvement over CRDTs, which, like I mentioned, essentially store the entire document history (it is compressible to an extent) in the state. In ProseMirror's pseudo-OT approach, clients can load just the current state of the document from the central authority, without needing to download any past history.
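For the curious, here's a rough sketch of the client side of that loop using the documented prosemirror-collab API. The fetch transport and the /collab endpoints are made up for illustration; a real app would probably use WebSockets.

```typescript
// Rough sketch of ProseMirror's pseudo-OT client loop with prosemirror-collab.
import { EditorState } from "prosemirror-state"
import { Step } from "prosemirror-transform"
import { schema } from "prosemirror-schema-basic"
import { collab, sendableSteps, receiveTransaction, getVersion } from "prosemirror-collab"

let state = EditorState.create({
  schema,
  plugins: [collab({ version: 0 })], // the document version the authority handed us
})

// Push our unconfirmed local steps to the central authority.
async function push() {
  const sendable = sendableSteps(state)
  if (!sendable) return
  await fetch("/collab/steps", { // hypothetical endpoint
    method: "POST",
    body: JSON.stringify({
      version: sendable.version,
      clientID: sendable.clientID,
      steps: sendable.steps.map(s => s.toJSON()),
    }),
  })
}

// Pull newly confirmed steps; the collab plugin rebases any pending local
// steps on top of them (undo local, apply remote, re-apply local).
async function pull() {
  const res = await fetch(`/collab/steps?since=${getVersion(state)}`) // hypothetical endpoint
  const { steps, clientIDs } = await res.json()
  const tr = receiveTransaction(
    state,
    steps.map((s: unknown) => Step.fromJSON(schema, s)),
    clientIDs,
  )
  state = state.apply(tr)
}
```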
Aaand finally, ProseMirror's pseudo-OT approach is much easier to inspect and debug than most existing CRDT implementations, which is great for development!
Hopefully that was helpful, and hopefully I haven't misrepresented anything. One of my clients also wrote a great blog post about how weird and complicated the conversations around collaborative rich text editing are at the moment. Also worth a read!
Thank you for taking the time to write a thoughtful reply while you’re preoccupied with a newborn! Your comment is the exact level of detail I was hoping for and you didn’t disappoint.
I wasn’t imaginative enough to think of the different contexts between text insertions, I’m pleasantly intrigued by the number of ways to crack this egg. Your client’s blogpost is enlightening, I also thought collaborative/offline editing was a solved problem. Except I believed CRDTs were the silver bullet doing all the solving. I’m worried I’ve been nerdsniped and doomed to see how far this rabbit hole goes.
Now that I know a tad more than nothing at all I agree wholeheartedly you’ve chosen the right tools for the job.
You're very welcome! Hahaha welcome to the rabbit hole, it's spacious in here.
If you're interested in further reading, I think Marijn Haverbeke (author of ProseMirror and CodeMirror) has some of the most approachable writing on the subject. Here he explains the faux OT system that ProseMirror (rich text) uses, and here he explains the actual OT system that CodeMirror (plaintext) uses.
Also, if you really just want to spend an afternoon with a headache, here's a somewhat less approachable article that argues that OT and CRDTs are just different implementations of essentially the same underlying principles.
A few more screenshots of my VTT progress this past week.
It has been an interesting week, as many entries on my to-do list that I kept pushing back turned out, just like last week, to actually be pretty easy to accomplish, such as displaying some resizable and movable divs above their related Fog of War region position on the canvas and syncing the changes with every client and the game database. I mean, I already knew how to do it conceptually, since translating canvas coordinates to window coordinates and vice versa is basically the core of this entire project, but I expected the road to making it work properly to be bumpier.
This Sunday we have a Call of Cthulhu game again, and I am planning on using my VTT this time, as at this point I'm only missing a few core features, which should be ready by then. Namely, finishing the Token Library and reusing it similarly for the Map Library.
I still haven't started on the drawing tools, but those are not necessary for Sunday's game.
Token menu and nameplate editing
Menu for tokens attached to other tokens
Fog of War editing menu view
Update: the Token editing window is now basically done.
That means I can finally use it for Token creation and then reuse it for the Map Library and finish the final steps to make this VTT usable for our next game.
I'm excited.
Coming from a 5-hour session of Call of Cthulhu today!
For 99% of the session, everything went fine. It was mainly minor issues I was able to easily fix afterwards, such as two client crashes caused by a layer-switching feature I wrote this morning, which didn't take into account that if an object is hidden, it obviously doesn't exist on the player's side.
The other issues are rendering related. I made the mistake of drawing some things twice at the same time when a remote player was moving something around, such as tokens or their pointing-cursor, resulting in almost twice the GPU load.
The rest was mainly related to unnecessary greyscale filters for hidden objects causing excessive load for the GM or players who CAN see them. I also reduced object draw function calls by only requesting draws for objects that are actually in the visible area.
On my device, on a very cluttered map with multiple large (map-related) images, I reduced the GPU usage while someone else is moving a token around from 47% when zoomed in (less to draw) to 23% when fully zoomed out (all the available objects in view).
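Conceptually, the visible-area check is just an AABB test against the camera's world-space rectangle. Here's a minimal sketch of the idea; the Camera type and object shape are simplified placeholders, not my actual code.

```typescript
// Minimal viewport culling for a 2D canvas, assuming every object keeps an
// axis-aligned bounding box in world coordinates.
interface Box { x: number; y: number; w: number; h: number }
interface Camera { x: number; y: number; zoom: number; viewW: number; viewH: number }
interface Drawable { box: Box; draw: (ctx: CanvasRenderingContext2D) => void }

// The rectangle of world space currently visible through the camera.
function visibleWorldRect(cam: Camera): Box {
  return { x: cam.x, y: cam.y, w: cam.viewW / cam.zoom, h: cam.viewH / cam.zoom }
}

function intersects(a: Box, b: Box): boolean {
  return a.x < b.x + b.w && a.x + a.w > b.x && a.y < b.y + b.h && a.y + a.h > b.y
}

// Only request draws for objects that actually overlap the visible area.
function drawVisible(ctx: CanvasRenderingContext2D, cam: Camera, objects: Drawable[]) {
  const view = visibleWorldRect(cam)
  for (const obj of objects) {
    if (intersects(obj.box, view)) obj.draw(ctx)
  }
}
```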
Someone on one of the social media platforms I am on posted that they wished an FF extension existed that replaced all Youtube/Vimeo embedded videos on webpages with URLs/links and asked if there was any way to get something like this working in FF.
In the process of trying to help, I located an MIT-licensed userscript that does this (which you can install in Tampermonkey or similar extensions), and decided to try my hand at converting the userscript into an FF extension.
Well, having gone through the process, I realize now that, though it probably depends on what the script does, plenty of scripts can be rather easily converted to an extension without much modification, other than surrounding them with the right meta stuff (manifest.json, icons, etc.). So that's neat. I personally like and use userscripts, but I like the idea of making standalone FF extensions that accomplish the same thing, which makes that functionality available to a wider range of people, often less tech-inclined.
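For anyone wanting to try the same thing: the wrapping mostly amounts to a manifest that loads the script as a content script. Something roughly like this, where the name, icon path, script filename, and match patterns are all placeholders:

```json
{
  "manifest_version": 2,
  "name": "Replace Video Embeds With Links",
  "version": "1.0",
  "description": "Replaces YouTube/Vimeo embeds on webpages with plain links",
  "icons": { "48": "icons/icon-48.png" },
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["replace-embeds.js"],
      "run_at": "document_idle"
    }
  ]
}
```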
So here is the resulting extension.
I don't know if I'll update it, but I'd potentially like to make it toggleable and so on; I just haven't figured out how to code the extra pieces needed to accomplish that. I've found some info on it, but I'm definitely realizing that extension development doesn't have a clear and easy way to learn "how do I do X". Some of Mozilla's documentation is good, but sometimes it feels like there's a learning piece missing: knowing how to put it all together. There need to be more involved/better tutorials for this kind of thing.
Like many, I am conflicted about the current rise of AI assisted coding agents and tools. A lot is written about the pros and cons already, but the thing that bothers me most when actually using something like Claude Code is that by letting the tool write the code, the developer is not really internalizing the problem and the solution that is implemented. In the short term, that is great because it takes less effort and concentration to build something useful. In the long term, it is disastrous because it is the quickest route to an immense, unmanageable pile of technical debt.
So I had the idea for a coding agent that would do all the things you'd expect, except for one thing: it cannot edit your code. Instead, it would present suggestions, snippets, advice, etc. that you would need to add to the code base yourself. UI is crucial here, obviously. It must be easy, but not too easy, to implement the suggested changes. But the good thing is: you can use the editor/IDE you already know to edit the code.
Unfortunately, I am very much a backend person, so progress is slow. But I can say that the most basic prototype I have now (a chatbot + tools to read files) already sparks more joy than using something like Claude Code, even though featurewise this is comparing a paper plane to a fighter jet.
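To give an idea of the shape of it, the tool surface is deliberately read-only. A stripped-down sketch (the Tool type and tool names are placeholders, not the actual prototype):

```typescript
// A read-only tool surface for a coding agent: it can inspect the project,
// but there is intentionally no way for it to modify files.
import { readFile, readdir } from "node:fs/promises"

type Tool = {
  name: string
  description: string
  run: (args: { path: string }) => Promise<string>
}

export const tools: Tool[] = [
  {
    name: "read_file",
    description: "Return the contents of a file in the project",
    run: async ({ path }) => readFile(path, "utf8"),
  },
  {
    name: "list_dir",
    description: "List the entries of a directory",
    run: async ({ path }) => (await readdir(path)).join("\n"),
  },
  // No write_file or edit_file on purpose: the agent can only suggest changes
  // in chat, and the developer applies them by hand in their own editor.
]
```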
Hopefully, that will give me the momentum to continue working on it.
So... uhhh... I did a thing in the past two days. In the middle of a rewrite within a rewrite, I did another rewrite.
My previous rewrite was using Deno and Fresh. I have gone back to Node.js and React. Honestly, I am happier with the new tech stack. Deno is a fantastic runtime. I absolutely love Deno. If I can, I will continue to use Deno in the future. But there are some things that Deno does that I think make things worse.
First: node_modules. I know the meme is that the node_modules folder is terrible, but I think that is only true with npm. I have tried a few different package managers, and I think pnpm is the best. pnpm caches downloads globally, so my shitty internet can install project dependencies in a reasonable time. I have a Mac with APFS, which supports copy-on-write, so installing new projects doesn't take any extra disk space. This negates Deno's promise of a unified global folder of packages. And not having a node_modules kinda sucks for debugging. I think this is a tooling issue, but I don't want to wait for the tools to get better: if I command-click a function to view its implementation with Deno, my IDE just throws its hands up and shows me nothing. Deno's handling of node_modules also makes Dockerfiles more complicated. With pnpm, you just install the dependencies in a separate build stage and copy the node_modules folder. With Deno, you have to make sure that Deno's cache is going to a certain folder and copy that to your final Docker stage. If you mess this up, Deno will install the packages at runtime, increasing cold start times and making you susceptible to supply chain attacks. Deno is simpler for scaffolding simple projects, but more complex for handling complex projects.
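For reference, the pnpm pattern is roughly this kind of multi-stage Dockerfile; the base image, file names, and start command are illustrative, not my actual setup:

```dockerfile
# Build stage: install dependencies and build the app.
FROM node:22-slim AS build
RUN corepack enable && corepack prepare pnpm@latest --activate
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm build

# Runtime stage: copy the already-installed node_modules and the build output.
# Nothing gets installed at runtime, so cold starts stay fast.
FROM node:22-slim
WORKDIR /app
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
CMD ["node", "dist/server.js"]
```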
Second: Node compatibility. I found an incompatibility with Deno that causes it to not correctly close a websocket to a Postgres server, and the node-postgres package makes it so that you can't connect to Postgres over SSL. Not having SSL on a database connection just isn't an option. Period. I found a workaround, but with Node I didn't need one.
Third: Compatibility with shadcn/ui and other component helper libraries. I like writing code, but I am not good at making a pretty component library. Shadcn gives me a place to start, and the ability to modify as needed. This part was the straw that broke the camel's back. I spent a few hours trying to reimplement the asChild property of the shadcn Button component within Fresh. Then I started looking at my options for using just plain react.
Before I changed frameworks, I learned a ton about website design. Since I am a new web developer, I had never adequately learned how things like HTML forms work. Since Fresh is server-rendered (with some exceptions), I got to learn how those work. I will say, compared to a full-stack framework, it was indeed "fresh". But now I want my powerful tools back. Also, the implementation in my new framework isn't very different, so it was pretty easy to switch.
What framework did I choose? Tanstack Start. I wanted true React, no vercel integration, and shadcn compatibility. Tanstack fits all of these requirements. And it has some really cool features. I can define a function that only runs on the server, and easily call it from the client. And, as long as you are careful, the server functions can coexist with client code and not leak server secrets.
Here is an example file
This is an admin route. The database calls are only used in this route, so there wasn't a good reason to encapsulate them in a different file. With these server functions, I can run SQL code directly in the same file as my frontend code. It may sound weird, but being able to have this code coexist in the same file is fantastic. They are for the same task, why should they be in different files?
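A stripped-down sketch of the pattern, following the createServerFn shape from the Tanstack Start docs (exact import paths and call shape can differ between versions; the db client and users table are placeholders):

```tsx
import { createFileRoute } from "@tanstack/react-router"
import { createServerFn } from "@tanstack/start"
import { db } from "../db" // placeholder database client

// Runs only on the server; the client just gets an RPC stub that calls it.
const getUsers = createServerFn({ method: "GET" }).handler(async () => {
  return db.query("SELECT id, name FROM users ORDER BY name")
})

export const Route = createFileRoute("/admin")({
  loader: () => getUsers(),
  component: AdminPage,
})

function AdminPage() {
  const users = Route.useLoaderData() as { id: number; name: string }[]
  return (
    <ul>
      {users.map(u => (
        <li key={u.id}>{u.name}</li>
      ))}
    </ul>
  )
}
```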
In other news, I guess I am now a vibe coder. I have used AI for coding since ChatGPT was released, but it was always just copy and paste between the AI and the code, and usually asking the AI for advice or sample implementations and writing the actual code myself. I never thought that the AI code editors could be worth it. But I tried out Cursor.
I really like the WebStorm editor, but it has one major flaw. On macOS, the first click on a non-focused window should ONLY focus the window, and not interact with it. VS Code has a setting to do this. It should be the default, but at least it is possible. After it annoyed me for months, I finally asked support if there was a setting I could change. As a side note, I am currently using a free educational license for JetBrains, and I was HIGHLY considering paying for it once I get a job again. But support responded to my question by linking a public support request from eight years ago that was still unfulfilled. I didn't think that following the macOS Human Interface Guidelines was a big request for an IDE that costs hundreds of dollars each year, but apparently that is too big of a request. So I decided that WebStorm with this serious flaw isn't worth paying for, and started looking for a new editor.
VS Code is almost really good, and doesn't have the click-through issue that WebStorm has, but I didn't really want to go back to it. On a whim, I decided to try Cursor, since it is based on VS Code. It's actually really good. The AI still hallucinates like normal, but Cursor keeps it grounded very well: when the code it writes causes a TypeScript or linter error, Cursor will send the error to the AI and have it rewrite the code until the errors are gone. Remember how I had most of the site already written in Fresh? For multiple parts of this rewrite, I would open a blank tsx file, give the AI some small instructions and a copy of my old Fresh code, and it would write the file without issues. Could I have written that code myself? Yep. Did I want to rewrite it again in another framework? Not really. After my trial is over, I will very likely be paying for Cursor. Over the course of two days, with not much spare coding time, my Tanstack Start rewrite has feature parity with my Fresh rewrite, and even some extra features beyond that.
For anyone else who is using AI but hasn't tried an AI code editor yet, give Cursor a try. I can't say it's perfect, but it is probably better than you think.
I've been using Cursor at work which has actually done a much better job than I expected. I think because I sort of refused to use AI for a few years my mind is still stuck in 2022. Especially after the Claude 4 release, I was shocked to see how well it was able to write code.
To the doubters, let me give you a scenario that Cursor has massively assisted me with:
I'm a newer member of the team and have had some work to do in larger legacy codebases with a framework I'm not super familiar with. Cursor saved me probably hours of reading documentation and writing boilerplate code to implement a new feature. Now, I still did a good chunk of the feature implementation, but Cursor helped me get there so much faster than if I did it myself. I made a joke in the Cloudflare outage thread the other day about writing unit tests, but I'm being serious now, Cursor again saved me hours of writing unit tests and it actually did a good job. Again, I have been very surprised by its performance.
It felt like I was cruising through the code. "Cruise coding" if you will.
We tried Cursor at work, and I liked it pretty well, but I didn't like it being a fork of VS Code. Their way of distributing it for Linux (as an AppImage) is absurd.
We have a pretty robust/complicated devcontainer configuration that leans heavily on VS Code's ability to run in the container context. This is possible with Cursor, but it was buggy and required a lot of extra steps.
Right now we're trying Cline backed by Anthropic's models running on AWS Bedrock. It seems to be a similar experience, but Cline is just a VS Code plugin. We've been hitting some issues with rate limits, especially with Claude. I'd rather use Claude than Sonnet, and it is the company dime. If anyone has experience with a better provider for Anthropic's models, I'm all ears.
I made the Everything Remote, an ESP32 WiFi-based remote that works with Home Assistant.
I'm not the original author; I just printed the enclosure, got the PCB made, sourced the parts, soldered it together, and then set it up.
It is a really nice thing! You flash the ESP with the ESPHome firmware via USB, and from then on you write YAML for it and flash it over WiFi. Then you set up Home Assistant to react to button presses with whatever actions you want. There is nothing premade: you can control your robot vacuum with it, or your lights, or your TV... I control a Music Assistant speaker (basic play/pause) and my TV (which uses Kodi), and I still have a few buttons unused.
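To give a flavor, the ESPHome YAML for a single button is only a handful of lines. Something like this, where the board, pin, and names are placeholders rather than the actual Everything Remote config:

```yaml
esphome:
  name: everything-remote

esp32:
  board: esp32dev

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

api:   # exposes the device (and its button events) to Home Assistant

ota:
  - platform: esphome   # lets you flash new YAML over WiFi after the first USB flash

binary_sensor:
  - platform: gpio
    name: "Play/Pause Button"
    pin:
      number: GPIO4
      mode: INPUT_PULLUP
      inverted: true
```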
I love the idea and the execution!
And I also learned that having an ESP32 running ESPHome gives you kinda limitless possibilities.
I've been trying Copilot agents a bit more.
They haven't been doing very well in my main code base, but that code base is a bunch of bespoke stuff in C++ that heavily uses Vulkan, as well as some patterns LLMs don't love. For example, LLMs love std::vector, so much that I've had them emit vectors of vectors for 2D arrays. While the LLMs like that, my CPU and RAM don't, so I'm often doing slightly lower-level stuff to avoid such patterns. Copilot does not play very nicely with such things. I can't say I'm surprised it doesn't like this project, though. It's far enough from the beaten path that I've triggered GPU driver bugs that no one else has encountered, or at least not that I could find any evidence of in my searches.
I was curious how it would handle a more isolated domain that isn't real-time code and that I care less about: implementing a transpiler for a little DSL in my project. This it has been doing remarkably well on. I started with some example code in the DSL and basically told an agent to start writing a transpiler that would convert that to C++. I've found it very useful to guide it along through TDD, because the agent is capable of running tests, looking at the failure, and fixing whatever the failure says. The code it's writing isn't great, but the transpiler results are good.
This is the first time I think I've had AI-written code take less time to produce something acceptable than just doing it myself from scratch. Usually the AI code is too low-quality for what I'm looking for in real-time graphics code. It just does too many silly things, like allocating a bunch of extra memory every frame when trying to upload data to the GPU, rather than doing more sensible stuff like rearranging execution order to minimize needed allocations, reusable scratch buffers, whatever else. It also doesn't help that I usually know what the resulting code should be. With a real junior, letting them implement it "wrong" and then talking them through how to make it "right" has value that is worth the time of that slower approach, but with AI that's just a way to achieve the result slower than writing the code myself.
Not really starting, just thinking about the how: been thinking about making a mini-solar system model with an Arduino. It'd probably be way more work than you'd expect, but still. I've been wanting to do something with my hands lately.
I'll probably be ordering parts sometime soon. But first need to think a bit more about the how.
Trying, finally, to get ArchLinux running on a Snapdragon X Elite device!
I've installed the latest ArchLinuxArm generic aarch64 image to a USB stick, I've arch-chrooted into it using QEMU (x86 -> aarch64), I've spent an amount of wasted time building and inserting the DTB files for the device from the Ubuntu Concept sources, I've updated the aarch64 kernel from the generic image (which was as old as version 6.2.1-1! they really haven't been keeping up well), I've realised that the updated kernel includes the DTBs I needed all along, and now I'm attempting to patch and build the linux-aarch64 sources with a patchset for this particular device in order to fix an issue with the OLED display which the Ubuntu concept already fixed but has yet to upstream.
I'm learning a bunch, but I've also wasted so much time.