What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
TL;DR: After being ignored for 2 years and dismissed for six months, I finally got to demo a PoC of my idea.
I don't know if projects for work count, but I'm really excited about this one. Sorry for the lengthy rant, but this one is emotional for me.
About two and a half years ago, I was moved to a cyber security compliance team. I was supposed to do the compliance work, but once they saw how much difference scripting out automation could make, that part became my sole job.
I was able to make a lot of small improvements that had positive impacts, but one big issue was lingering. Details aside, their current process had them storing log files in lengthy nested directories, with none of the information within them being searchable or cross-referenceable. This meant they had to spend hours upon hours manually poring over thousands of lines across dozens of documents.
I had a vision to fix this. To properly parse the log files into tables, to build a relational database, and to surface that to the team. Something they could query, something where they could see data by any attribute, not just by date or server name. For two years I tried to get the green light on this. I laid out the project to three different managers, all to no avail. It was never considered a big enough priority. I ended up getting moved to a sister team, but I never lost this dream.
Six months ago, after a horrendous audit that devastated the compliance team emotionally and mentally, senior leadership suddenly decided it was time to fix the problem. My team is responsible for helping to fix it. I was excited! Until... I found that one of my teammates and our manager were pushing to move everything over to our... ticketing system. They wanted to cram it all into a proprietary cloud-based CMDB product. A product not built for this. A product that doesn't solve the real issue. I've tried for six months to explain why this is a bad idea. Why using a product not built for this purpose would require a bunch of ugly hacks and would end up a disaster. They aren't developers, and my expertise in that aspect hasn't been respected. I've been dismissed out of hand at best, and ignored at worst.
But finally, my other coworker (we are a team of three) got involved in the project, and he asked me to create a proof of concept and demo it to our team and the compliance team. After being straight up shut down for so long I was really excited!
Tuesday I gave the demo, and it was a HUGE SUCCESS!!! The compliance team was super impressed, they loved it, and they were really excited about it. One of the arguments I constantly run up against is that I'm the only one with a developer background, so building something ourselves isn't an option because no one else can maintain it. But for the demo I used a Microsoft product called Power BI for the front end, which is super user-friendly and very realistic for anyone to learn. The back end is really simple Python. Even my manager, who has been dismissive, is open to it and congratulated me on the demo. I was nearly in tears over this.
My other coworker still seems pretty set against it (he almost seems annoyed that it's being seriously considered now). The decision between my vision and the proprietary CMDB option (which costs an additional 50k btw) hasn't been made yet. The CMDB could still be selected, for political reasons (sunk cost fallacy) if nothing else, but I'm just happy I was at least given the opportunity to show off my vision, and the validation of others loving it has helped me to feel a lot better.
I'm currently hunting for a cheap/free 3D SLAM library that I can use with a depth camera and IMU (like a D435i or OAK-D).
I've got my twitter clone project that is crawling along. Of course I had big dreams when I started, but man there are so many little issues to solve that add up. I work on it all the time but rarely do big changes show up in the UI. So many little background and data integrity issues to always iron out :D
gittuf, which I've mentioned before, is fast approaching an alpha release. We've applied to join the OpenSSF as a sandbox project, which we think should happen in the next few weeks!
Edit: gittuf is a security layer for Git repositories that can handle things like key distribution and write access control policies for Git repositories in a distributed and transparent way.
How do the permissions work? I don’t understand what that means for a git repo.
(Edited: missing “don’t”)
We started with a variant of TUF-style delegations. In a gittuf policy, you define rules specifying the authorized signing keys for the namespaces you want to protect. For example,
protect-main: {git:refs/heads/main} -> (1, {Alice, Bob, Charlie})
is a rule that says you authorize one of Alice, Bob, or Charlie (rather, their keys) to modify the state of your main branch, and this is verified against a signed entry in the repository's reference state log (akin to an authenticated and synced reflog). For file policies, the current (pre-alpha) implementation applies similar mechanisms to raw commit signatures. This doesn't scale to all scenarios and can't handle more complex workflows that need multiple authorizations, so we have support for in-toto / SLSA source attestations (still being developed) on our roadmap.
Edit: We have a demo here: https://github.com/gittuf/demo. Note that gittuf needs Go 1.20 or higher.
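The semantics of a rule like that can be sketched roughly as follows. This is a minimal illustration in Python, not gittuf's actual implementation; the key names, data shapes, and function names are invented for the example:

```python
# Sketch of a TUF-style delegation check: a rule protects a namespace
# and lists a signature threshold plus the authorized keys.
# Illustrative only; gittuf's real policy format and matching differ.

from fnmatch import fnmatch

# protect-main: {git:refs/heads/main} -> (1, {Alice, Bob, Charlie})
RULES = [
    {
        "name": "protect-main",
        "patterns": ["git:refs/heads/main"],
        "threshold": 1,
        "authorized_keys": {"alice-key", "bob-key", "charlie-key"},
    },
]

def is_authorized(namespace, valid_sig_keys):
    """Check whether enough authorized keys signed a change to `namespace`."""
    for rule in RULES:
        if any(fnmatch(namespace, p) for p in rule["patterns"]):
            matching = valid_sig_keys & rule["authorized_keys"]
            return len(matching) >= rule["threshold"]
    # No rule covers this namespace; the real policy decides the default.
    return False

# One valid signature from Bob satisfies the threshold of 1:
print(is_authorized("git:refs/heads/main", {"bob-key"}))  # True
```

The threshold is what lets a single policy express both "any one maintainer" (threshold 1) and "two-person review" (threshold 2) without changing the rule shape.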
Sorry, I meant more the basics. I assume this is all about signing commits? It seems like you could always do what you like in a git repo, you just can’t sign it. When are signatures checked?
I’ve never used commit signing, though I know git has that feature.
What is a “TUF-style delegation?”
Ah my bad, I got misled by the do / don't.
I'm not sure I fully understand this.
More generally, you can sign Git commits and tags (and pushes, but setting that aside for now...) using a few different options. The default is PGP, but Git also supports X.509 (used by gitsign, for example) and, as of a few months ago, SSH keys. However, Git doesn't let you define policies around who can issue commit or tag signatures. That's left entirely to the user: you've got to use the web of trust to determine verification keys if the sigs are from GPG keys, and look elsewhere for the other options. Even when using the web of trust, you arguably don't have a clear association between the identity claimed by a key and the repository's developers. To restate: you can have a valid signature from a key, but it's not obvious that key is the right key for the repository.
This means that Git signatures are largely not used at all, or if commits are in fact signed, verification is quite rare outside of possibly some major repos like the kernel. Git has 'verify-commit' and 'verify-tag', but those again rely on you to populate your keyring with the right public keys. (Also, verifying signatures across different methods, say you use GPG but someone signed using gitsign on a repo, is broken, as your default signing mechanism is used to verify all signatures. Overall, the defaults for signing and the verification story for commit / tag signatures are less than ideal.)
What gittuf does is use a subset of the TUF spec (which is a secure software delivery framework) to solve these key management issues. While TUF has an overall focus on software distribution, it embeds several targeted PKI-like features. So you can use TUF to distribute, rotate, and revoke trusted keys, and in gittuf, the keys apply to the repository itself. In gittuf, this is expressed as "policy" files, which are embedded in the Git repo in question.
In addition to the key distribution and management, TUF has a notion of roles and delegations. My example above:
protect-main: {git:refs/heads/main} -> (1, {Alice, Bob, Charlie})
is an example of a delegation. A "role" that is trusted to write to a namespace may "delegate" a subset of this trust to another party. Here, the policy delegates trust for the main branch to Alice, Bob, and Charlie. gittuf and TUF differ a fair bit in how delegations work, but the overall idea is about the same.
By default, in Git, you can use 'verify-commit' and 'verify-tag'. 'git merge' has a '--verify-signatures' option, which is inherited by 'git pull'. But see above about policy management being left to the verifier. With gittuf, we want to associate the repo with its keys, so 'gittuf verify-commit' uses the policy to verify signatures, also solving the multiple-signing-methods problem in the meantime.
I hope this is clearer!
Thanks! Yes, that agrees with my understanding that verifying git commits is rare, so signing them doesn't do much.
I guess this all depends on people running 'gittuf verify-commit?' It seems like making that happen automatically as part of 'git clone' would be the next step.
I'm reminded of how Go developer tools do checksum verification. There's a checksum database of all Go modules, but it's 'trust on first use,' where the first use is whenever anyone asked the proxy to download that version of a Go module for the first time. The chain of trust ends at GitHub (or similar).
That's pretty good, but it assumes GitHub is secure and none of the developers' GitHub accounts got broken into. Maybe that chain of trust could be pushed back earlier in the process, so GitHub is just a cache.
It depends on the specific application / workflow. 'gittuf verify-commit' and so on are helpers to align with one's own workflow, but we're building in some clone and fetch capabilities to verify transparently. The thing is, on one end you may want verification by all users all the time, but that's just hard without building it into Git itself. On the other end of the spectrum, I think there's a lot to be gained from even just the maintainers using gittuf to verify all the time. Related is also Guix's attempt to embed GPG keys trusted for their repository: https://arxiv.org/pdf/2206.14606.
Absolutely! I think there’s also real scope to allow for signing Go releases using the underlying Git signatures, because that’s the overwhelming majority of Go packages anyway.
IMO there’s a lot of scope to work with GitHub, GitLab etc. for gittuf once we start exploring policy transparency and auditability of historic policy compliance. I’m hoping the OpenSSF is a good venue to sketch all of this out with the right people. :)
Already posted about Bot Typist in the AI megathread.
Trying to learn the web framework svelte(kit) by building a web implementation of the game Phutball. My goal is to at least have some sort of usable web interface so I can show weird abstract strategy games to my friends. The main roadblock right now is that I have no idea what I am doing.
Also somewhere in my brain's orbit is the thought of diving deeper into Veilid which seems to be a pretty interesting communication protocol with an oddly forgettable name (I heard of it some time in July and then forgot its name and had to wait until DEFCON to learn more about it. My mnemonic is Veil-ID which hooks into its whole privacy-oriented theme even if it's not the true etymology). Right now if I have the motivation the most obvious first step is just spinning up a node, though I haven't found the time to figure it out yet.
I'm working on a drawing/note-taking app that I plan to use on e-ink stylus tablets - like what the Mobiscribe/ReMarkable etc. have, but open-source. I still don't know how I'm going to do the lasso tool*, but that's a future problem. I'm having fun so far.
It's part of an overarching project of mine: to write a replacement FOSS e-note stack (the other part of the project will be some sort of homescreen/launcher, tailored specifically for the e-note use-case). Everyone who hacks an e.g. ReMarkable either uses the default proprietary stack, or wants to turn it into a desktop instead of drawing. There's technically Xournal++, but that thing's interface has so many buttons it could go toe-to-toe with LibreOffice - it seems massively overcomplicated and unpolished, and is apparently quite slow on e-ink tablets. That said, it is possible I'm NIHing.
*the lasso tool works as follows (please note: I'm not asking for help here, it's just an interesting problem):
A lasso is a line that you use to surround other lines (you can cheat and connect the start and the end of the lasso line, so it makes a polygon). A line is an array of points. Lots of points, so treating it as a polygon is ill-advised.
Any line can be an arbitrary shape, and the lasso can be non-convex (because it's an arbitrary line). Is there a good solution that doesn't check whether every single point of every single line on the page is within the 1000-gon lasso-line? Also, the user is sloppy and might nick the edges of a line they're lassoing, so there's a fuzzy, I-don't-know-the-exact-definition cut-off for how much of the line can be outside of the lasso. It might be 90%, but maybe distance from the lasso matters? I don't know.
For bonus points, the lasso should update in real-time as new points are added to the end of the line (not that it's relevant; that's not a feature well-suited to e-ink).
The key questions here are about what assumptions I can make about the line: for example, can I assume the dots of the line are more-or-less evenly spaced along the line?
This is mostly a question about performance, which I really ought to measure before making decisions. I'm having fun thinking about it, though - my first bet is probably to try to put a handful of rotated rectangular bounding boxes around segments of the line, such that I can check if... it intersects? Wait, that doesn't help for lassos, only erase-by-line.
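For what it's worth, one common shape for this kind of test is a cheap bounding-box rejection followed by a ray-casting point-in-polygon check, counting what fraction of a stroke's points land inside the lasso. A minimal sketch - the function names and the 90% coverage threshold are invented for illustration, not taken from the project:

```python
# Sketch: lasso selection via bounding-box prefilter + ray casting.
# The 90% coverage threshold is an arbitrary illustration of the
# "sloppy user" cut-off described above.

def point_in_polygon(px, py, polygon):
    """Ray-casting test: does (px, py) lie inside the closed polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (px, py) cross edge (x1,y1)-(x2,y2)?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def bbox(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

def lasso_selects(stroke, lasso, coverage=0.9):
    """True if at least `coverage` of the stroke's points fall in the lasso."""
    # Cheap rejection: if the stroke's bounding box doesn't overlap the
    # lasso's, no point of the stroke can possibly be inside it.
    sx1, sy1, sx2, sy2 = bbox(stroke)
    lx1, ly1, lx2, ly2 = bbox(lasso)
    if sx2 < lx1 or lx2 < sx1 or sy2 < ly1 or ly2 < sy1:
        return False
    inside = sum(point_in_polygon(x, y, lasso) for x, y in stroke)
    return inside / len(stroke) >= coverage

# A unit-square lasso selecting a stroke that sits inside it:
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
stroke = [(0.2, 0.2), (0.5, 0.5), (0.8, 0.8)]
print(lasso_selects(stroke, square))  # True
```

The bounding-box check is what saves you from testing every point of every stroke on the page against the 1000-gon: most strokes get rejected in constant time, and only the ones whose boxes overlap the lasso pay for the full test.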
(1) RSS reader: implemented better controls for the order of my social media submission queue and a better internal API for ArangoDB transaction handling, and I'm thinking about better recommendation algorithms. But the real thing on my mind is that I find a lot of articles that are syndicated, and I often manually trace back the original sources, check to see if they are open access, and substitute the original source. I previously thought of doing that in one of the batch jobs, but I may code something up that does it for a single page when I hit a button. Currently the system doesn't keep track of where a feed item came from because I thought I only needed the URL, but when I get links from something like the Tildes RSS feed, my system doesn't know it came from Tildes, so I need to rework the import system but don't really feel like it.
(2) Blog. Little progress. Still rewriting the first post.
(3) Three-sided cards. Finished an art project that was hanging over my head.
https://mastodon.social/@UP8/111052075781351382
I recently upgraded the QR codes so that they are unique for each card, but I am not really doing anything based on it. I was thinking of keeping a database on my local computer to keep track of individual cards, using ArangoDB or SQLite to keep track of the prints, but I'm also thinking of using DynamoDB, which will cost a few $ a month but will be useful for making public services based on the data. (One thing I’m thinking of is a parody of NFTs where people could register cards and trade them with other people.)
After a lot of effort, including fixing my tripod and trying some different cameras, I gave up on my project of making stereograms of flowers. To regroup from that, I need to try another stereography project, and that will lead to more software work on the stereogram processing software.
I'm learning Perl and how to manage about 3 million user identities.
Should be fun.
Pulling Google Analytics data into BigQuery. That one's a pain because all of the parameters are in arrays, so I'm learning how to use window functions to make it usable.
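For anyone unfamiliar with why the arrays are a pain: the GA4 export stores each event's parameters as an array of key/value structs, so querying any one parameter means unnesting and pivoting first. A rough Python sketch of what that flattening does (the sample data is made up; in BigQuery itself this is what UNNEST-based queries accomplish in SQL):

```python
# Sketch of what "unnesting" GA4-style event parameters means.
# Each event carries its params as an array of {key, value} structs;
# flattening pivots them into one column per key. Sample data is made up.

events = [
    {
        "event_name": "page_view",
        "event_params": [
            {"key": "page_location", "value": {"string_value": "/home"}},
            {"key": "engagement_time_msec", "value": {"int_value": 1200}},
        ],
    },
]

def flatten(event):
    """Pivot the param array into flat columns on the event row."""
    row = {"event_name": event["event_name"]}
    for param in event["event_params"]:
        value = param["value"]
        # GA4 stores each value in one of several typed fields;
        # take whichever is populated.
        if value.get("string_value") is not None:
            row[param["key"]] = value["string_value"]
        else:
            row[param["key"]] = value.get("int_value", value.get("float_value"))
    return row

rows = [flatten(e) for e in events]
print(rows[0]["page_location"])         # /home
print(rows[0]["engagement_time_msec"])  # 1200
```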
Also working in Python with the HubSpot API. Why I can't just get this stuff into JSON directly is beyond me. It's all paginated, so I guess I need to set up a loop to pull the next page's URL and somehow merge it all into one JSON document.
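That pagination loop can be factored out into a small helper. A sketch - the cursor field names mirror the `paging.next.after` shape that HubSpot's v3 CRM endpoints use, but treat the exact response structure as an assumption and verify against the API docs; `fetch_page` stands in for the actual HTTP call:

```python
# Sketch of cursor pagination: keep requesting pages until the response
# stops including a next cursor, accumulating results into one list.
# `fetch_page` stands in for an HTTP call (e.g. requests.get against a
# HubSpot endpoint like /crm/v3/objects/contacts with an `after` param).

def fetch_all(fetch_page):
    """Follow `paging.next.after` cursors until exhausted."""
    results = []
    after = None  # no cursor on the first request
    while True:
        page = fetch_page(after)
        results.extend(page["results"])
        next_info = page.get("paging", {}).get("next")
        if not next_info:
            return results  # last page: no next cursor present
        after = next_info["after"]

# Fake two-page API for demonstration:
def fake_api(after):
    if after is None:
        return {"results": [1, 2], "paging": {"next": {"after": "cursor-1"}}}
    return {"results": [3, 4]}  # final page carries no paging.next

print(fetch_all(fake_api))  # [1, 2, 3, 4]
```

Passing the fetch function in keeps the loop testable without a network and reusable across endpoints; the real version would just wrap the HTTP request and pass the cursor as a query parameter.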
This is all because I'm tired of manually exporting HubSpot, Google Analytics, Google Ads, Google Search Console into tables to do anything useful with the data. Trying to automate bringing all this into one DB to build my reports & dashboards. If anyone has any shortcuts I'd love to know. GA4 seems to have a good bit of YouTube tutorials in BigQuery but the HubSpot API doesn't seem to have any in-depth tutorials. I'm fairly new to both BigQuery SQL and Python.
I made a thread here recently about alternatives to Virtualhere. Got some good leads on how to use usbip. I figured out how to get it to do what I need, but it's a little clunky to get going after a reboot so I'm currently working on a GUI to detect and allow for connecting/disconnecting devices on usbip. I'd like to share it if I ever get it somewhere I'm happy with.
I'm working on building a riddle game in the style of Notpron or Oddpawn. I loved those games and no one makes them anymore.
Learning Godot with some tutorials. Planning on making maybe a small little action/platformer to learn the ropes or something. Hopping off the Unity ship that I've been dabbling with for the past few years.