DataWraith's recent activity
-
Comment on Has anyone ever used NixOS as daily-drive distro? in ~tech
DataWraith I ran NixOS and home-manager on all my machines for about a year using flakes and deploy-rs.
The upsides were nice:
- NixOS has one of the largest package repositories, so what you want to install is likely already available, and it often takes just a single flag to install something with a sane base configuration. This is especially nice for self-hosted stuff (e.g. an RSS reader). Instead of messing with Docker containers (though you can do that too), you just flip a flag, and it installs Postgres and whatever other dependencies are needed.
- Configuration can be versioned with git and is mostly shareable between machines, so you can have the same shell configuration and whatnot even when SSHing into a server, and deploying the configuration to said server is trivial with `deploy-rs`.
However, compatibility was a big problem for me:
- CUDA and PyTorch and, really, anything involving the GPU was a PITA to get working and would break once in a while for non-obvious reasons.
- Installing software that is not yet packaged is a bit arcane. I've had a lot of trouble with third-party python packages in particular, which sometimes resulted in hours of wasted effort before I gave up.
- NixOS doesn't adhere to the Filesystem Hierarchy Standard (FHS), so third-party software will usually not work, or only work after a lot of trial and error. This includes projects packaged as AppImage, which normally run everywhere because they bundle all their dependencies.
I migrated off NixOS a few months ago and switched to Bluefin, an immutable distribution based on Fedora Silverblue, and I'm very happy with that decision.
Just like NixOS, the system root is immutable, and it automatically updates to new versions in the background using OSTree. Almost all apps I need are available via `homebrew` or Flatpaks; everything else is either distributed as an AppImage (which just works) or in the usual way (e.g. Python pip), and if something is really missing, `distrobox` can run packages from other distributions in containers, though I haven't needed that so far.

While I can't speak to the long term yet, so far the distro has been rock-solid. Easily the fewest problems I've ever had with a Linux installation -- less trouble than Ubuntu, or even Debian and, god forbid, Arch; it just works.
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith I've been learning the Zig language during the past few weeks. Overall, it's a nice language that bills itself as "a better C", but I do occasionally miss the safety provided by the Rust borrow checker. I've made mistakes with pointers multiple times -- mistakes that the Rust compiler would have caught -- and gotten obscure error messages or segfaults as a consequence. Segfaults are kind of hard to debug for someone without the skill to wield a proper debugger, because all you get is a "Segfault at address xyz" message and then the program aborts.
The standard library is very cool and makes heavy use of the `comptime` feature. Zig allows you to execute emulated Zig code at compile time, and that can be used to operate on types. For example, the built-in `std.ArrayList` function takes a type parameter (e.g. `u32`) and then gives you a specialized `ArrayList(u32)` type that can store unsigned 32-bit integers, similar to Rust's `Vec<u32>`. It somehow feels more elegant than Rust's approach to generics, though I can't quite say why.

My learning project was to implement yet another Battlesnake, because I tend to do that for every new programming language I try. It's a nice way to exercise a language's features and ecosystem, because you need JSON parsing, a webserver, and game-tree search logic.
Besides the occasional segfault, everything was very smooth with regard to the language, though I'm disappointed that the playing strength of the Zig snake is not quite where I was at with the Rust version. I've got a few scripts that run matches between different versions of my snakes and incrementally calculate ratings with a new variant of Elo, which I find to be fairly accurate when compared to a (much slower) maximum-likelihood estimate of the ratings. It's endlessly frustrating to write code that you think should be superior to what you had before, and then it just under-performs even in the local league, though.
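For context on the rating scripts: the incremental update at the heart of any Elo-style scheme is small enough to sketch. This is plain textbook Elo, not the specific new variant mentioned above, and the K-factor of 32 is an arbitrary illustrative choice:

```rust
/// Expected score of player A against player B under the Elo model.
fn expected_score(rating_a: f64, rating_b: f64) -> f64 {
    1.0 / (1.0 + 10f64.powf((rating_b - rating_a) / 400.0))
}

/// Update A's rating after a match. `score` is 1.0 for a win,
/// 0.5 for a draw, and 0.0 for a loss; `k` controls the step size.
fn elo_update(rating_a: f64, rating_b: f64, score: f64, k: f64) -> f64 {
    rating_a + k * (score - expected_score(rating_a, rating_b))
}

fn main() {
    // Two equally-rated snakes; the winner gains half of K.
    let new_rating = elo_update(1000.0, 1000.0, 1.0, 32.0);
    println!("{new_rating}"); // 1016
}
```

The appeal of incremental schemes like this is that each match only touches two ratings, so they can be recomputed cheaply after every game.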
Right now, I'm implementing a way to tune the evaluation function using an Evolution Strategy (DiscoveredES). That has worked well with Rust, so I'm hoping that this will close the performance gap when I finish it in Zig.
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith It does save a lot of time. Finding and reading the API documentation for Arxiv alone would have taken me at least 10-20 minutes, and Claude just did it within seconds, XML parsing and all (though I suppose I could have found a snippet on StackOverflow instead). The code is also nicely documented and commented, even though I did not ask for that specifically.
I thought using the LLM to extract the paper title would be the most error-prone part of the whole thing, but it turns out that it was `pdftotext`. PDF is a famously complicated file format, and some of the documents don't actually contain text but scanned page-images instead (I suppose I could slot in OCRmyPDF for those), but others fail for mysterious reasons, and `pdftotext` outputs a jumbled mess of random characters. There were four or five of those in the batch of 200 papers that had accumulated in my Downloads folder.
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith I accomplished a small, but fairly satisfying project: semi-automatically renaming downloaded scientific papers with unhelpful filenames into `The Actual Title of the Paper.pdf`.

In order to do that, I wrote a specification in English of how I want the files renamed, and then Claude 3.5 Sonnet did most of the work. It is still mind-boggling to me that LLMs can take a specification in English and turn it into a working Python script faster (and, arguably, better) than I could have done it myself in the amount of time it took me to write the spec. I used to think LLMs just weren't capable enough for that kind of thing, but I guess it pays to just try and see if maybe they are.
The file renaming itself works as follows:
If the filename starts with a valid Arxiv.org ID, hit the Arxiv API and get the paper title that way.
Otherwise, `pdftotext` is used to extract the first page of the PDF, which usually contains the title of the paper. This step can fail to return readable text, so I have a manual confirmation here before proceeding.

The paper's title is non-trivial to extract from the exported text though, because (a) it's not always right at the top and (b) it can be slightly corrupted (linebreaks, L E T T E R S P A C I N G, etc.).
To get around that problem, I just add more AI (of course): the script pipes the extracted text into Simon Willison's `llm` tool, which makes a call to `gpt-4o-mini` to extract the title of the paper.

So far I've renamed over 200 PDFs that way, and GPT extracted every single paper title flawlessly, as long as `pdftotext` worked. The API cost for this was about 1 cent per 100 processed documents.
-
Comment on Game and programming exercise based on The Prisoner's Dilemma (need Beta Testers) in ~games
DataWraith (edited) I've cobbled together a strategy in Rust, and I thought I'd provide some feedback. Most of these are me-problems, but I thought I'd share my thoughts.
- The Mario messages are silly. It's probably for fun, but string matching also feels very error-prone (capitalization, accidentally using a different kind of apostrophe). Maybe use single-letter messages next time? Or at least make the messages have a distinct first character, so that one can match on that.
- The requirement for JSON strings is similarly onerous. Why not accept a bare `Cooperate` (or even just `C`) on the server side, without it being wrapped in quotation marks? Abbreviating `Cooperate` to `C` would also make it much harder to misspell the word.
- I tested my strategy using `curl` POST requests, and it seems to work fine, but the server says that it does not receive a response from the strategy for the game-start message. I added some more logging to try and catch what is going on, but it would be nice if there was a "test this strategy" button that plays a 5-turn game with the strategy when clicked. Maybe gated behind the strategy's secret key.
- The documentation says in multiple places that the secret key is provided to you on registration, but I could enter my own secret key when registering?
- Removing badly-performing strategies might apply some evolutionary pressure, but I think it'd be nice to actually be able to play as long as you're willing to host the strategy. People are going to drop out of the rotation naturally over time anyway.
- The documentation says games happen every 4h, but the actual frequency appears to be 1h.
Edit: Figured out why it wasn't working using my logs. Apparently the `opponent_id` is sent as an integer instead of the String suggested in the documentation (the `opponent_id` value is enclosed in quotation marks in the example).
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith I would have liked to try that, but I don't think it is feasible when using a single computer only. Multi-agent reinforcement learning is supposedly even more difficult to get working than normal reinforcement learning, and I don't think I have enough compute to train a good policy anyway.
Running neural networks in Rust is also a big pain -- I've looked into using decision trees instead, but didn't get any useful results.

I have a few more ideas for how to improve the win-rate, but getting the bot into a state that is ready for submission to the ladder will likely take two or three more weeks.
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith I've been working on a bot for the StarCraft II Micro-Arena on sc2ai.net.
Since I'll likely never get a proper bot for the whole game done, I figured I'd try the micro league for now, though now that I've been working on it for a couple of weeks, I have my doubts about my ability to even write something that works well on this small scale...
The MicroArena starts you off with a handful of units and the objective to destroy the enemy's pylon on the other side of the (small) map.
I'm currently developing my bot by pitting it against the simplest possible strategy, one that just attack-moves towards my pylon.

Unfortunately, I just can't seem to get a win-rate higher than 62% even against this simple opponent. And 62% is a pretty bad win-rate, since the attack-move bot basically loses automatically whenever it spawns on the left side of the map: in that case, the built-in pathfinding will, probably due to an asymmetry in the map, go through the ravine in the center of the map, where it can be shot at with impunity. Factoring out those guaranteed wins, that's an abysmal effective win-rate of 24%.
Some of it comes down to randomness due to the way the units pathfind and bump into each other. Much of it is due to bad decision making though. I often see behavior that is objectively bad, but I can't really think of a way to teach the bot not to do that. Or if I find a way to teach it not to do that, something else breaks in a neverending cycle.
The bot itself is made up of a few components:
- I'm using a Timer lawn to schedule the decisions of my bot (e.g. a unit may be busy with attacking and can't take orders right now, so we check it again in 5 frames), though that would be much more useful in a bot for the full game.
- The bot uses its own pathfinding algorithm based on a fast variant of Dijkstra's algorithm that does not need to keep a priority queue. It basically works exactly like the timer lawn in that it keeps multiple FIFO queues and dequeues the next item by checking all of them for the highest priority item. It builds a map of distances to the opponent's pylon and uses that whenever it is not in combat to advance towards the objective.
- The pathfinding algorithm uses Otsu's Thresholding method to partition the map into high ground and low ground, since high ground is advantageous to stand on when the enemy doesn't. StarCraft terrain can range from height 0 to 255, so histogram thresholding works perfectly here, though there's probably a simpler way to do this. High ground gets a cost bonus, so the pathfinder will automatically avoid going through the ravine, even when starting from the left.
- The combat behavior uses a forward simulation of who is firing at what and tries to avoid overkill. I've been tweaking the heuristics endlessly, but I can't seem to break the 62% winrate barrier. I'd love to use something like linear programming to improve this, but I'm worried about the runtime -- despite being written in Rust, the bot is taking 5ms per frame already as it is, which is kind of slow.
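The priority-queue-free Dijkstra variant described above can be sketched with a bucket queue -- this is the classic Dial's-algorithm idea on a toy graph, my own illustration rather than the bot's actual code, and it relies on edge weights being small non-negative integers:

```rust
use std::collections::VecDeque;

/// Dijkstra with a bucket queue (Dial's algorithm): instead of a
/// priority queue, nodes go into FIFO buckets indexed by their
/// tentative distance, and buckets are drained in ascending order.
fn bucket_dijkstra(adj: &[Vec<(usize, usize)>], start: usize, max_weight: usize) -> Vec<usize> {
    let n = adj.len();
    let mut dist = vec![usize::MAX; n];
    dist[start] = 0;

    // Sized so that every reachable tentative distance fits.
    let mut buckets: Vec<VecDeque<usize>> = vec![VecDeque::new(); n * max_weight + 1];
    buckets[0].push_back(start);

    for d in 0..buckets.len() {
        while let Some(u) = buckets[d].pop_front() {
            if d > dist[u] {
                continue; // stale entry; a shorter path was found earlier
            }
            for &(v, w) in &adj[u] {
                if d + w < dist[v] {
                    dist[v] = d + w;
                    buckets[d + w].push_back(v);
                }
            }
        }
    }

    dist
}

fn main() {
    // Toy graph: 0 -> 1 (cost 2), 0 -> 2 (cost 5), 1 -> 2 (cost 1)
    let adj = vec![vec![(1, 2), (2, 5)], vec![(2, 1)], vec![]];
    println!("{:?}", bucket_dijkstra(&adj, 0, 5)); // [0, 2, 3]
}
```

Because buckets are plain FIFO queues, there is no heap bookkeeping at all; the trade-off is memory proportional to the maximum path cost, which is fine on a small map with bounded terrain costs.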
-
Comment on Backdoor in upstream libxz targeting sshd in ~comp
DataWraith There's the `lzip` format, which is supposed to be that. Focused on simplicity and security, `lzip` is supposed to be a drop-in replacement for `gzip` with a better compression ratio (via the LZMA algorithm) and a more modern file format. It was introduced because of weaknesses in the `.xz` format, but from what I can tell, it is not very popular; most users either seem to continue to use `.xz` or have switched to `zstd`.
-
Comment on Signal messenger releases 'usernames' so you no longer need to tell someone your phone number in order for them to message you in ~tech
DataWraith
> I refuse to believe that it isn't possible to make a secure messaging system that doesn't require a phone or phone number.

I was debating whether or not to post this in the thread, but I guess I may as well; the post can always be ignored if it isn't helpful.
The Simplex.chat messenger uses neither usernames nor phone numbers nor any persistent (human-readable) identifiers. That's the upside. The price you pay is that it isn't as convenient to use, because you need to invite every person you want to chat with via a QR code or hyperlink. It also seems to mostly be a one-man show rather than having the backing of a corporation, so the bus factor is quite low.
I quite like how they actually deliver the encrypted messages: you send a message to one or more random public relay servers, and only the recipient knows which one(s) to query and can then decrypt it. But because the system is somewhat difficult to use, nobody I know was willing to give it a try...
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith Ah, okay. Then you've probably tried Naive Bayes filters already, and those have somewhat comparable accuracy.
I mostly mentioned the algorithm because it is so easy to implement, and it deals with context better than the usual implementation of Naive Bayes filters because it uses short phrases instead of single words for classification.
There does not appear to be a well-maintained, ready-to-use implementation of it, however, so the point is moot.
-
Comment on What programming/technical projects have you been working on? in ~comp
DataWraith
> When you have a high positive rate people ignore the channel more often than not, making it a dead tool. Additionally, the amount of false positives with a simple word filter will already be several factors higher anyway, as you can only input those words where you are absolutely sure some shit has hit the fan.
I wonder whether a more effective filter than regular expressions would help here.
To be honest, I've only used this on toy projects, but a Winnow classifier with Orthogonal Sparse Bigrams as features was both very simple to implement and fairly accurate; this paper (PDF) describes the approach in the context of email spam.
The basic idea is to track small word groups and adjust their weights upwards or downwards whenever the classifier is wrong about a message. That way, you might be able to train a cheaper alternative to GPT-4 using GPT-4 feedback over time.
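A minimal sketch of that idea, as I understand it (my own toy illustration, not the paper's exact algorithm; the window size, threshold, and promotion factor are arbitrary choices):

```rust
use std::collections::{HashMap, HashSet};

/// Orthogonal Sparse Bigrams: all word pairs (w_i, w_j) with
/// j - i <= window, tagged with their distance, so "free money"
/// and "free ... money" become distinct features.
fn osb_features(text: &str, window: usize) -> HashSet<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    let mut features = HashSet::new();
    for i in 0..words.len() {
        for d in 1..=window {
            if i + d < words.len() {
                features.insert(format!("{} <{}> {}", words[i], d, words[i + d]));
            }
        }
    }
    features
}

/// Winnow classifier: weights start at 1.0, the score is the sum of
/// weights of the active features, and weights are multiplied
/// (promoted) or divided (demoted) by `alpha` on each mistake.
struct Winnow {
    weights: HashMap<String, f64>,
    alpha: f64,
}

impl Winnow {
    fn new(alpha: f64) -> Self {
        Winnow { weights: HashMap::new(), alpha }
    }

    fn predict(&self, features: &HashSet<String>) -> bool {
        let score: f64 = features
            .iter()
            .map(|f| self.weights.get(f).copied().unwrap_or(1.0))
            .sum();
        // Threshold: the score of an example with all-default weights.
        score > features.len() as f64
    }

    fn train(&mut self, features: &HashSet<String>, label: bool) {
        if self.predict(features) != label {
            let factor = if label { self.alpha } else { 1.0 / self.alpha };
            for f in features {
                *self.weights.entry(f.clone()).or_insert(1.0) *= factor;
            }
        }
    }
}

fn main() {
    let mut clf = Winnow::new(2.0);
    for _ in 0..3 {
        clf.train(&osb_features("buy cheap pills now", 2), true);
        clf.train(&osb_features("meeting notes attached here", 2), false);
    }
    println!("{}", clf.predict(&osb_features("cheap pills now", 2))); // true
}
```

The mistake-driven multiplicative updates are what makes Winnow cheap: a correctly classified message changes nothing, so a trained filter mostly just sums a few weights per message.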
-
Comment on Are any of you AI gurus? in ~comp
DataWraith
The project sounds interesting, so I'll chime in with what I think may be helpful, though I lack direct experience with some of these. Take it as a springboard for further research.
Analyzing faces in Videos
Finding (frontal) faces in still images, and hence videos, is relatively simple using Haar Cascades (also known as the Viola-Jones method).
Knowing whether an image or video frame contains a face, and how many, may already be valuable on its own. You can use the detected positions to crop the image and then use something more sophisticated (e.g. a neural network) to actually find out who it is.
If you have reference images, you might be able to leverage research (or already available open source projects) that do Face verification to check if the person on screen matches a reference, though I haven't done anything in this space yet.
General Object-Detection
There are a bunch of available object-detection methods (YOLO, RCNN), but I don't know too much about the recent developments here. These tend to be able to identify a lot of different object classes (cars, boats, trees, etc.).
Audio Transcription
You'll probably want to extract what is said in the videos -- I was very impressed with OpenAI's Whisper speech recognition. There is an open-source project, Whisper.cpp, that you can use to run the models.
In my testing I've had slight problems with hallucinations on the smaller models, but for the most part, it has been fairly reliable. You can output timestamps and the recognized text; it should certainly be good enough for a search engine or the like.
OCR
I'm aware of many older attempts at doing video OCR. They generally try to use adaptive binarization to separate foreground text from background and then run a stock OCR engine like Tesseract. However, I'm sure there are also newer deep learning-based methods for this.
LLaVA
One interesting thing that has appeared lately is LLaVA, which is a language model you can interrogate about the contents of a specific image; the results range from mildly infuriating to very impressive -- it can do a lot of OCR by itself and recognizes some celebrities, for example.
There is an extension that applies it to video, but I have not looked into that yet.
Labeling Functions
This may not be useful, so I'm listing it last. However, given the scale of your project, it may be handy to let the computer combine 'weak' labeling signals into a stronger classifier. You could then classify the different scenes with greater ease (e.g. indoor vs. outdoor).
I've worked with Snorkel on toy projects, but I don't see a reason why it wouldn't work for something like your project.
The idea behind Snorkel is that you combine many weak and possibly conflicting signals into a strong one. For example, you could combine a simple detector that checks if the audio transcript contains a word with estimates from a YOLO object detector and/or with the scene description from something like LLaVA or even the opinions of different people who may manually label a subset of the data.
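As a toy illustration of combining weak signals (plain majority voting with abstentions; Snorkel itself learns per-source accuracies rather than weighting all sources equally):

```rust
/// A weak label: Some(true)/Some(false) is a vote, None means the
/// labeling function abstains on this example.
type WeakLabel = Option<bool>;

/// Combine weak labels by majority vote, ignoring abstentions.
/// Returns None when the votes tie or everyone abstains.
fn combine(votes: &[WeakLabel]) -> Option<bool> {
    let pos = votes.iter().filter(|v| **v == Some(true)).count();
    let neg = votes.iter().filter(|v| **v == Some(false)).count();
    match pos.cmp(&neg) {
        std::cmp::Ordering::Greater => Some(true),
        std::cmp::Ordering::Less => Some(false),
        std::cmp::Ordering::Equal => None,
    }
}

fn main() {
    // e.g. a transcript keyword check votes "outdoor", an object
    // detector disagrees, and the scene-description model abstains:
    let label = combine(&[Some(true), Some(false), None]);
    println!("{:?}", label); // None (a tie)
}
```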
-
Comment on Syncthing on a VPS questions in ~comp
DataWraith
> I've been just editing the config.xml file and restarting it. It feels clunky editing it in nano especially when I have to delete a folder or remove a device.
I usually just forward the web-interface port through SSH. Then I can just access Syncthing's web-interface and configure it that way.
`ssh -L 8080:localhost:8384 username@server`
(Note that this will also start a normal SSH session, so you will see the shell prompt of the remote server in your terminal.)
With this command, SSH forwards the local port 8080 to the remote port 8384, where the web-interface should be listening by default. You can then access the remote web-interface at `http://localhost:8080/`.

From what I understand, the web-interface only listens on 127.0.0.1, so having it enabled is not a security problem: nobody can access it unless they are using an SSH tunnel or working on the machine itself. You can also set a password for the web-interface if that is not good enough.
-
Comment on Google's Say What You See - Come up with a prompt to match an already generated image in ~comp
DataWraith This was called the ESP game, in case anyone is curious.
-
Comment on Day 20: Pulse Propagation in ~comp.advent_of_code
DataWraith Thank you for your kind words. I've slept on it for a night now, and realize that I probably was too harsh on myself with the "no spoilers" rule I had apparently made up for myself. After all, even the "wrong answer" page links to the subreddit, so I guess looking for hints is legitimate.
I also hadn't thought I could just continue with a star missing (thanks, @wycy), hence my huge disappointment. I knew the next day's puzzle automatically unlocks, but somehow I didn't think of just skipping this part 2. I legitimately thought I was out for the year. And while the puzzles can be frustrating, I genuinely enjoy working on them, even if it takes hours where other people spend minutes.
-
Comment on Day 20: Pulse Propagation in ~comp.advent_of_code
DataWraith Well, today is where I jump ship. Part 1 was a lot of fun, and I was done quickly, but part 2 was too hard for me to solve within five hours. :(
I guess I could look up the solution, but I think that defeats the purpose, so I'll just give up here.
Thoughts
Part 1 was fun, but I have no idea how to solve part 2. I made a visualization of my graph, which shows several nasty cycles, and noticed that every second button press seems to repeat the same sequence for my input, but none of my attempts to suss out any regularity beyond that came to fruition. I tried to see if there are any cycles in sub-graphs, but that didn't work, and I made all kinds of crazy guesses -- up to and including importing a SAT solver library to try and work backwards from the output, before taking a step back and realizing that that can't possibly be required...
Part 1 (Rust)
```rust
pub fn part1(input: PuzzleInput) -> isize {
    let mut sim = CircuitSim::new(input);
    let mut low_pulses = 0;
    let mut high_pulses = 0;

    for _ in 0..1000 {
        sim.button();

        while sim.tick() {
            // Nothing to do here
        }

        let pulses = sim
            .pulse_sequence
            .iter()
            .map(|(_, pulse, _)| pulse)
            .collect::<Vec<_>>();

        low_pulses += pulses.iter().filter(|&pulse| **pulse == Pulse::Low).count();
        high_pulses += pulses
            .iter()
            .filter(|&pulse| **pulse == Pulse::High)
            .count();
    }

    (low_pulses * high_pulses) as isize
}

#[derive(Clone, Debug)]
pub struct CircuitSim {
    pub modules: HashMap<String, Module>,
    pub queue: VecDeque<(String, Pulse, String)>,
    pub pulse_sequence: Vec<(String, Pulse, String)>,
    pub button_presses: usize,
}

impl CircuitSim {
    pub fn tick(&mut self) -> bool {
        if self.queue.is_empty() {
            return false;
        }

        let (from, pulse, name) = self.queue.pop_front().unwrap();
        self.pulse_sequence
            .push((from.clone(), pulse.clone(), name.clone()));

        if let Some(module) = self.modules.get_mut(&name) {
            for output in &module.outputs {
                match module.state {
                    ModuleState::Broadcaster => {
                        self.queue.push_back((
                            "broadcaster".to_string(),
                            pulse.clone(),
                            output.clone(),
                        ));
                    }
                    ModuleState::FlipFlop { ref mut on } => {
                        if pulse == Pulse::Low {
                            if *on {
                                self.queue
                                    .push_back((name.clone(), Pulse::Low, output.clone()));
                            } else {
                                self.queue
                                    .push_back((name.clone(), Pulse::High, output.clone()));
                            }
                        }
                    }
                    ModuleState::Conjunction {
                        memory: ref mut cur_memory,
                    } => {
                        cur_memory.insert(from.clone(), pulse.clone());

                        let pulse = if cur_memory.values().all(|pulse| pulse == &Pulse::High) {
                            Pulse::Low
                        } else {
                            Pulse::High
                        };

                        self.queue.push_back((name.clone(), pulse, output.clone()));
                    }
                }
            }

            match module.state {
                ModuleState::Broadcaster => {}
                ModuleState::FlipFlop { ref mut on } => {
                    if pulse == Pulse::Low {
                        *on = !*on;
                    }
                }
                ModuleState::Conjunction { memory: ref mut m } => {}
            }
        }

        true
    }
}
```
-
Comment on Day 19: Aplenty in ~comp.advent_of_code
DataWraith After yesterday's frustrating math homework, today's part 1 was great fun. Unfortunately I then proceeded to waste hours on a simple bug in part 2 that I could just not find. I hate it when that happens; you waste hours debugging everything except the small piece of code where the bug actually is... That greatly diminished the fun.
Seems to be a theme for me this year: Figure out the correct algorithm quickly, then completely bungle its implementation.
Rust
```rust
pub fn part2(input: PuzzleInput) -> isize {
    let mut flows = HashMap::default();
    let mut accepted = Vec::new();
    let workflows = Rc::new(input.workflows);

    let flow = Flow {
        part: PartRange {
            x: 1..4001,
            m: 1..4001,
            a: 1..4001,
            s: 1..4001,
        },
        current_workflow: "in".to_string(),
        current_index: 0,
        workflows: workflows.clone(),
        accepted: 4000 * 4000 * 4000 * 4000,
    };

    flows.insert(flow, 1);

    loop {
        let mut new_flows = utility_belt::misc::state_iteration(&flows, transition, ());

        for (flow, count) in new_flows.iter_mut() {
            if flow.current_workflow == "A" {
                accepted.push(flow.clone());
            }
        }

        new_flows.retain(|f, _| f.current_workflow != "A" && f.current_workflow != "R");

        if new_flows.is_empty() {
            break;
        }

        flows = new_flows;
    }

    accepted.iter().map(|x| x.accepted).sum::<usize>() as isize
}

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct PartRange {
    pub x: Range<usize>,
    pub m: Range<usize>,
    pub a: Range<usize>,
    pub s: Range<usize>,
}

#[derive(Clone, Debug)]
pub struct Flow {
    pub part: PartRange,
    pub current_workflow: String,
    pub current_index: usize,
    pub workflows: Rc<BTreeMap<String, Workflow>>,
    pub accepted: usize,
}

impl Hash for Flow {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.part.hash(state);
        self.current_workflow.hash(state);
        self.current_index.hash(state);
    }
}

impl PartialEq for Flow {
    fn eq(&self, other: &Self) -> bool {
        self.part == other.part
            && self.current_workflow == other.current_workflow
            && self.current_index == other.current_index
    }
}

impl Eq for Flow {}

fn transition(from: &Flow, input: &()) -> Vec<Flow> {
    let mut result = Vec::default();
    let workflow = from.workflows.get(&from.current_workflow).unwrap();
    let rule = workflow.rules.get(from.current_index);

    if rule.is_none() {
        let mut new_flow = from.clone();
        new_flow.current_workflow = workflow.default.clone();
        new_flow.current_index = 0;
        result.push(new_flow);
        return result;
    }

    let rule = rule.unwrap();

    let ((lower_next, lower_next_index), (upper_next, upper_next_index)) = match rule.comparison {
        Comparison::LessThan => (
            (rule.next.clone(), 0),
            (from.current_workflow.clone(), from.current_index + 1),
        ),
        Comparison::GreaterThan => (
            (from.current_workflow.clone(), from.current_index + 1),
            (rule.next.clone(), 0),
        ),
    };

    match rule.category {
        Category::X => {
            let full_count = from.part.x.end - from.part.x.start;

            let (lower_range, upper_range) = match rule.comparison {
                Comparison::LessThan => {
                    (from.part.x.start..rule.value, rule.value..from.part.x.end)
                }
                Comparison::GreaterThan => (
                    from.part.x.start..(rule.value + 1),
                    (rule.value + 1)..from.part.x.end,
                ),
            };

            if !lower_range.is_empty() {
                let lower_count = lower_range.end - lower_range.start;
                result.push(Flow {
                    part: PartRange {
                        x: lower_range,
                        m: from.part.m.clone(),
                        a: from.part.a.clone(),
                        s: from.part.s.clone(),
                    },
                    current_workflow: lower_next,
                    current_index: lower_next_index,
                    accepted: from.accepted * lower_count / full_count,
                    ..from.clone()
                });
            };

            if !upper_range.is_empty() {
                let upper_count = upper_range.end - upper_range.start;
                result.push(Flow {
                    part: PartRange {
                        x: upper_range,
                        m: from.part.m.clone(),
                        a: from.part.a.clone(),
                        s: from.part.s.clone(),
                    },
                    current_workflow: upper_next,
                    current_index: upper_next_index,
                    accepted: from.accepted * upper_count / full_count,
                    ..from.clone()
                });
            };
        }
        Category::M => { [...] }
        Category::A => { [...] }
        Category::S => { [...] }
    }

    result
}
```
The only slightly notable thing about this is that I abstracted the "Iterate a state transition function" into a re-usable library function after day 12. I didn't think I'd need it so soon again.
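The helper itself isn't shown in the comment; a generic "iterate a transition function over a counted multiset of states" might look roughly like this (my own sketch; the real `utility_belt::misc::state_iteration` signature may differ):

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Apply a transition function to every state in a counted multiset,
/// merging the counts of identical successor states.
fn state_iteration<S, I, F>(states: &HashMap<S, usize>, transition: F, input: I) -> HashMap<S, usize>
where
    S: Clone + Eq + Hash,
    F: Fn(&S, &I) -> Vec<S>,
{
    let mut next: HashMap<S, usize> = HashMap::new();
    for (state, count) in states {
        for successor in transition(state, &input) {
            *next.entry(successor).or_insert(0) += count;
        }
    }
    next
}

fn main() {
    // Toy example: each number transitions to two successors.
    let mut states = HashMap::new();
    states.insert(1u32, 2usize);

    let next = state_iteration(&states, |s, _: &()| vec![*s * 2, *s * 2 + 1], ());
    println!("{:?}", next.get(&2)); // Some(2)
}
```

Merging identical states and carrying counts is what keeps this tractable when many flows collapse into the same successor, as in the day 12 puzzle mentioned above.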
-
Comment on Day 17: Clumsy Crucible in ~comp.advent_of_code
DataWraith (edited) A pathfinding puzzle! My wish from yesterday was granted! :D
Algorithm
This is a very straightforward application of pathfinding (Dijkstra, A*, etc.), but the state transition function was a little more complicated than usual for grid pathfinding.
I wanted to experiment with a search algorithm called Anytime Focal Search (PDF), which doesn't improve much over plain A* here, but there was no way of knowing that without trying it. (Edit: I just realized that the version of AFS I implemented is exactly equivalent to A* here, because I don't have a more sophisticated heuristic than the Manhattan distance.)
Anytime Focal Search is good when there are many paths of similar cost, or when you need an answer quickly, even if it is sub-optimal. Giving the algorithm more time to run then successively improves the path found, until it is optimal.
I'm also using a RadixHeap, a monotone priority queue useful for pathfinding problems, because that's slightly faster than the BinaryHeap that comes with the standard library.
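For context, the standard-library alternative it replaces is a `BinaryHeap` used as a min-heap via `Reverse`. A minimal sketch of that pattern (my illustration, not code from the solution):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap is a max-heap, so wrapping the priority in `Reverse` makes
    // `pop` return the smallest f-value first, as Dijkstra/A* require.
    let mut queue = BinaryHeap::new();
    queue.push(Reverse((7, "c")));
    queue.push(Reverse((2, "a")));
    queue.push(Reverse((5, "b")));

    let mut order = Vec::new();
    while let Some(Reverse((cost, node))) = queue.pop() {
        order.push((cost, node));
    }

    // Nodes come out in ascending cost order.
    assert_eq!(order, vec![(2, "a"), (5, "b"), (7, "c")]);
}
```

A RadixHeap additionally requires that pushed keys never decrease below the last popped key (a monotone queue), which Dijkstra-style searches with non-negative edge costs satisfy; that restriction is what makes it faster.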
```rust
use std::rc::Rc;
use std::{cmp::Reverse, hash::Hash};

use ahash::HashSet;
use glam::IVec2;
use radix_heap::RadixHeapMap;

use crate::{
    parser::{Grid2D, HeatLoss, PuzzleInput},
    part1::Direction,
};

// Search state
#[derive(Clone, Debug)]
pub struct State {
    grid: Rc<Grid2D<HeatLoss>>, // Grid2D is a wrapper around ndarray::Array2
    coordinate: IVec2,
    direction: Option<Direction>,
    minimum_straight_moves_remaining: u8,
    maximum_straight_moves_remaining: u8,
}

impl PartialEq for State {
    fn eq(&self, other: &Self) -> bool {
        self.coordinate == other.coordinate
            && self.direction == other.direction
            && self.minimum_straight_moves_remaining == other.minimum_straight_moves_remaining
            && self.maximum_straight_moves_remaining == other.maximum_straight_moves_remaining
    }
}

impl Eq for State {}

impl Hash for State {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.coordinate.hash(state);
        self.direction.hash(state);
        self.minimum_straight_moves_remaining.hash(state);
        self.maximum_straight_moves_remaining.hash(state);
    }
}

// States reachable from `s` and the cost of moving there from s (the heat loss incurred)
fn successors(s: &State) -> Vec<(State, usize)> {
    let mut result: Vec<(State, usize)> = Vec::new();

    let mut add_successor = |direction: Direction| {
        let mut new_state = s.clone();
        let going_straight = s.direction == Some(direction) || s.direction.is_none();

        if going_straight && s.maximum_straight_moves_remaining == 0 {
            return;
        }

        if !going_straight && s.minimum_straight_moves_remaining > 0 {
            return;
        }

        new_state.direction = Some(direction);

        match new_state.direction {
            Some(Direction::North) => new_state.coordinate.y -= 1,
            Some(Direction::South) => new_state.coordinate.y += 1,
            Some(Direction::West) => new_state.coordinate.x -= 1,
            Some(Direction::East) => new_state.coordinate.x += 1,
            None => unreachable!("Direction should never be None"),
        }

        if going_straight {
            new_state.maximum_straight_moves_remaining -= 1;
            new_state.minimum_straight_moves_remaining =
                new_state.minimum_straight_moves_remaining.saturating_sub(1);
        } else {
            new_state.maximum_straight_moves_remaining = 10 - 1;
            new_state.minimum_straight_moves_remaining = 4 - 1;
        }

        if new_state.coordinate.x >= 0
            && new_state.coordinate.x < s.grid.width
            && new_state.coordinate.y >= 0
            && new_state.coordinate.y < s.grid.height
        {
            let x = new_state.coordinate.x;
            let y = new_state.coordinate.y;

            result.push((
                new_state,
                s.grid.data.get([y as usize, x as usize]).unwrap().0 as usize,
            ));
        }
    };

    if s.direction.is_none() {
        add_successor(Direction::South);
        add_successor(Direction::East);
    } else {
        add_successor(s.direction.unwrap());
        add_successor(s.direction.unwrap().turn_left());
        add_successor(s.direction.unwrap().turn_right());
    }

    result
}

// Are we at the goal?
fn success(s: &State) -> bool {
    s.coordinate.x == s.grid.width - 1
        && s.coordinate.y == s.grid.height - 1
        && s.minimum_straight_moves_remaining == 0
}

// Manhattan distance heuristic. I'm sure this could probably be improved, since
// we have to turn around if we're facing the wrong direction.
fn heuristic(s: &State) -> usize {
    (s.coordinate.x.abs_diff(s.grid.width - 1) + s.coordinate.y.abs_diff(s.grid.height - 1))
        as usize
}

// Pathfinding algorithm, similar to A*
pub fn anytime_focal_search<FN, IN, FS>(
    start: &State,
    mut successors: FN,
    mut success: FS,
) -> Option<usize>
where
    FN: FnMut(&State) -> IN,
    IN: IntoIterator<Item = (State, usize)>,
    FS: FnMut(&State) -> bool,
{
    let mut focal = RadixHeapMap::new();
    let mut seen = HashSet::default();

    let mut cur_bound = usize::MAX;
    let mut cur_sol = None;

    focal.push(Reverse(0), (0, start.clone()));

    loop {
        cur_bound -= 1;

        let mut cost = usize::MAX;

        while !focal.is_empty() {
            let (f, (g, n)) = focal.pop().unwrap();

            if f.0 > cur_bound {
                continue;
            }

            if success(&n) {
                cost = g;
                break;
            }

            for (s, c) in successors(&n).into_iter() {
                let f = g + c + heuristic(&s);

                if f <= cur_bound && seen.insert(s.clone()) {
                    focal.push(Reverse(f), (g + c, s));
                }
            }
        }

        if cost == usize::MAX {
            return cur_sol;
        }

        cur_sol = Some(cost);
        cur_bound = cost;
    }
}

// Just initialize the first state and call AFS
pub fn part2(input: PuzzleInput) -> isize {
    let initial_state = State {
        grid: Rc::new(input.grid),
        coordinate: IVec2::new(0, 0),
        maximum_straight_moves_remaining: 10,
        minimum_straight_moves_remaining: 4,
        direction: None,
    };

    anytime_focal_search(&initial_state, successors, success).unwrap() as isize
}
```
Performance
My solution was really slow at first, because I forgot to filter out duplicate states, but now it is reasonably fast:
```
day_17_bench    fastest   │ slowest  │ median   │ mean     │ samples │ iters
├─ parse        28.09 µs  │ 62.11 µs │ 28.24 µs │ 28.78 µs │ 100     │ 100
├─ part1        37.71 ms  │ 63.05 ms │ 44.29 ms │ 45.14 ms │ 100     │ 100
╰─ part2        124.9 ms  │ 142.1 ms │ 125.5 ms │ 125.9 ms │ 100     │ 100
```
-
Comment on Day 14: Reflector Dish in ~comp.advent_of_code
DataWraith
I think I came up with an interesting way to solve this.
My first attempt at implementing the idea didn't work, so I gave up and solved it the way everybody else did. Today I had a bit of time to revisit this puzzle, and I managed to make my initial idea work.
Algorithm
If you think about the problem of applying all the iterations as a walk through a graph with cycles, the natural thing to reach for is a cycle-detection algorithm, such as Brent's. My idea is to instead apply a trick from the pathfinding literature: contractions.
For example, given a path from A to E, `A -> B -> C -> D -> E`, we can start by moving from `A` to `B`, and then from `B` to `C`. Now that we know where the path leads, we can add a shortcut from `A` to `C`, skipping `B`.
Short-cuts are themselves subject to being short-cut: when we're at `A` again due to the cyclic nature of the problem, we move through the short-cut `A--->C`. If there is already a short-cut `C--->E`, we can combine the shortcuts to form a new shortcut `A--->E`.
We also store the length of the path that each shortcut shortcuts in order to track how far along the overall 'path' of applying the required number of iterations we are.
If a shortcut overshoots the target, we clear all shortcuts and continue building new, shorter shortcuts from the current position until we reach our target number of iterations.
The path length covered by shortcuts increases quickly until we overshoot -- at that point we start again with a small step size (no shortcuts) and build up momentum (increasingly longer shortcuts) until we reach the goal or overshoot again.
```rust
pub fn path_contraction<N, FN>(start: &N, mut successor: FN, iterations: usize) -> N
where
    N: Eq + Hash + Clone,
    FN: FnMut(&N) -> N,
{
    let mut shortcuts: HashMap<N, (N, usize)> = HashMap::default();

    let mut cur = start.clone();
    let mut cur_iter = 0;

    loop {
        if cur_iter == iterations {
            return cur;
        }

        // Step 1
        let (next1, iters_step1) = if let Some((next1, length1)) = shortcuts.get(&cur) {
            (next1.clone(), *length1)
        } else {
            let next1 = successor(&cur);
            (next1, 1)
        };

        // Step 2
        let (next2, iters_step2) = if let Some((next2, length2)) = shortcuts.get(&next1) {
            (next2.clone(), *length2)
        } else {
            let next2 = successor(&next1);
            (next2, 1)
        };

        // Combine
        if cur_iter + iters_step1 + iters_step2 <= iterations {
            shortcuts.insert(cur, (next2.clone(), iters_step1 + iters_step2));
            cur = next2;
            cur_iter += iters_step1 + iters_step2;
        } else if cur_iter + iters_step1 <= iterations {
            shortcuts.insert(cur, (next1.clone(), iters_step1));
            cur = next1;
            cur_iter += iters_step1;
        } else {
            let next = successor(&cur);
            cur = next;
            cur_iter += 1;
            shortcuts.clear();
        }
    }
}
```
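As a sanity check, here is a self-contained toy run: the function above, reproduced with std's `HashMap` swapped in for the ahash one, applied to a made-up cyclic successor (counting modulo 5):

```rust
use std::collections::HashMap;
use std::hash::Hash;

// path_contraction as in the comment, with std's HashMap for self-containment.
pub fn path_contraction<N, FN>(start: &N, mut successor: FN, iterations: usize) -> N
where
    N: Eq + Hash + Clone,
    FN: FnMut(&N) -> N,
{
    let mut shortcuts: HashMap<N, (N, usize)> = HashMap::new();
    let mut cur = start.clone();
    let mut cur_iter = 0;

    loop {
        if cur_iter == iterations {
            return cur;
        }

        // Take one step (or follow an existing shortcut) from `cur`...
        let (next1, len1) = match shortcuts.get(&cur) {
            Some((n, l)) => (n.clone(), *l),
            None => (successor(&cur), 1),
        };

        // ...and another from `next1`.
        let (next2, len2) = match shortcuts.get(&next1) {
            Some((n, l)) => (n.clone(), *l),
            None => (successor(&next1), 1),
        };

        // Combine the two steps into a longer shortcut if it doesn't overshoot.
        if cur_iter + len1 + len2 <= iterations {
            shortcuts.insert(cur, (next2.clone(), len1 + len2));
            cur = next2;
            cur_iter += len1 + len2;
        } else if cur_iter + len1 <= iterations {
            shortcuts.insert(cur, (next1.clone(), len1));
            cur = next1;
            cur_iter += len1;
        } else {
            // Overshoot: discard all shortcuts and resume with single steps.
            cur = successor(&cur);
            cur_iter += 1;
            shortcuts.clear();
        }
    }
}

fn main() {
    // After 1_000_003 single steps of x -> (x + 1) % 5 starting from 0, the
    // state must equal 1_000_003 % 5 = 3; the shortcuts get us there without
    // performing a million successor calls.
    let result = path_contraction(&0u64, |&x| (x + 1) % 5, 1_000_003);
    assert_eq!(result, 3);
}
```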
I considered mentioning `appimage-run`, but the comment was long as it was.
It didn't work for me: the software I wanted to run with it (pCloud) uses libFUSE, which was broken under `appimage-run`. I spent quite a bit of time trying to fix that. pCloud is also packaged in NixOS, but (used to be?) in a permanently broken state.