DataWraith's recent activity

  1. Comment on What programming/technical projects have you been working on? in ~comp

    DataWraith
    Link

    TL;DR: I cheat at a choose-your-own-adventure book using a computer.

    In an attempt to get back into reading, I looked through my shelves for something I haven't read in a while.

    What I settled on was Das Mandala der Donnerdrachen, a choose-your-own-adventure book published before the turn of the century. I don't think it was ever translated into English, but the name of the book is a reference to the native name of Bhutan -- the plot has you travel to the Himalaya region in order to bring about a new era of peace on Earth. It's a wonderful book.

    The idea is that you have to make a choice at the end of many of the ~250 chapters on how to proceed. "If you want to do X, go to chapter Y".
    In addition, there are sometimes pre-conditions of the form "if you have/haven't read chapter X, then go to chapter Y".
    I used Graphviz dot to make a map of the book fifteen years ago, but at the time I didn't include the pre-conditions in said map, so it was a mere curiosity to see the structure of the chapters.

    This time I wanted to leverage my now more-developed programming skills to answer a few questions:

    • How many paths in total are there?
    • What is the shortest path that loses?
    • What is the longest path that loses?
    • What is the shortest path that wins?
    • What is the longest path that wins?

    My plan was to just generate all possible paths through the book using a breadth-first search. Can't be that many, right? Right??

    After going through the book twice to extract and proof-read the chapter transitions, I started the program and waited. And waited. And waited.

    I expected a few thousand to ten thousand paths through the book, but unfortunately, due to combinatorial explosion, there are at least hundreds of millions (that's when I aborted the program...).

    Hm. I should have done some back-of-the-envelope math first. If only every 4th of the ~250 chapters has a binary choice, that already makes for roughly 2^62 ≈ 4.6 × 10^18 distinct paths... 🤦.

    Now what?

    Well, looks like I can't answer the first question, but maybe the others will yield to a Monte Carlo approach: just play through the book randomly one hundred million times and then extract the desired statistics from that.
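
    A minimal sketch of that approach in Rust -- the toy transitions map and the winning set are made up for illustration, and the real program would also have to track the "if you have read chapter X" pre-conditions:

    use rand::seq::SliceRandom;
    use std::collections::{HashMap, HashSet};
    
    fn main() {
        // Toy chapter graph: chapter -> possible next chapters.
        // Chapters without an entry are terminal.
        let transitions: HashMap<u32, Vec<u32>> =
            HashMap::from([(1, vec![2, 3]), (2, vec![4, 5]), (4, vec![5])]);
    
        // Hypothetical set of terminal chapters that count as a win.
        let winning: HashSet<u32> = HashSet::from([5]);
    
        let mut rng = rand::thread_rng();
        let mut games: HashSet<Vec<u32>> = HashSet::new();
    
        // Random playthroughs; the HashSet filters out duplicate games.
        for _ in 0..100_000 {
            let mut path = vec![1];
            while let Some(choices) = transitions.get(path.last().unwrap()) {
                path.push(*choices.choose(&mut rng).unwrap());
            }
            games.insert(path);
        }
    
        let shortest_win = games
            .iter()
            .filter(|p| winning.contains(p.last().unwrap()))
            .min_by_key(|p| p.len());
    
        println!("{} distinct games", games.len());
        println!("Shortest winning path: {:?}", shortest_win);
    }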

    Results

    There were about 1 million distinct games after filtering out duplicates.

    What is the shortest path that loses?

    Huh. The shortest path is 1->3, the one where you immediately reject the call to action in your hero's journey. Never did that in a read-through.

    Among the sampled games, what is the longest path that loses?

    The longest path that still loses has 144 chapters.

    1 13 10 6 2 14 17 20 15 22 59 53 64 23 51 43 61 70 62 45 25 34 41 72 67 27 50 36 38 31 33 44 54 40 73 79 113 109 76 111 100 82 104 95 75 98 78 88 83 112 97 110 103 94 114 116 92 105 84 99 117 130 149 178 123 156 146 165 118 122 170 179 151 163 132 126 137 173 175 147 167 177 154 171 158 164 162 180 190 204 197 198 193 182 187 183 181 199 192 206 189 188 194 191 195 209 216 225 215 212 223 219 220 214 222 227 211 217 211 214 222 227 220 219 216 225 223 212 215 213 218 221 210 217 226 224 228 244 229 247 243 242 237 232

    Among the sampled games, what is the shortest path that wins?

    The shortest path that still wins has 99 chapters.

    1 13 10 4 15 22 28 64 30 24 26 65 49 69 25 55 34 52 72 67 27 50 36 38 31 33 44 54 40 73 79 113 109 76 111 100 82 95 115 98 106 88 83 102 97 110 107 91 80 116 92 81 117 130 149 138 156 176 174 118 135 147 177 154 171 145 129 180 190 204 197 182 200 206 208 184 188 194 186 209 220 214 222 217 211 217 226 224 228 244 229 247 243 242 236 233 246 235 249

    Among the sampled games, what is the longest path that wins?

    The longest winning path has 148 chapters.
    (Fun fact: finding the longest path in a graph is, in general, NP-hard)

    1 9 2 5 17 7 11 22 59 35 64 30 68 71 62 45 25 55 34 41 72 67 50 36 58 31 33 44 42 54 40 73 100 113 79 76 90 109 82 104 95 75 98 106 88 83 112 97 110 103 91 80 116 92 105 117 161 178 123 156 176 124 165 120 144 170 179 172 151 148 126 137 173 175 147 139 127 177 143 154 131 142 171 145 169 180 190 185 197 198 193 182 187 183 205 199 192 206 208 184 188 194 186 209 220 219 216 212 223 225 215 213 218 221 210 217 210 221 218 213 215 212 223 225 216 219 220 214 211 227 222 217 226 224 228 244 240 247 243 242 236 238 241 248 233 246 235 249

    This is the order I want to use when I re-read the book. :)

    6 votes
  2. Comment on Prototyping group decision making with automatic delegation in ~tech

    DataWraith
    Link Parent

    I'm interested in auto-delegation as an alternative to representative democracy, where you have to trust people who are going to represent you in the future (sometimes for 4 years).

    I admit that it does sound fascinating on paper, but the delegation algorithm would have to be rock-solid, and even Netflix and Amazon with their army of data scientists screw recommendations up constantly, so I'm a little skeptical whether it is possible at all.

    The biggest problem I see is that, even given a perfect algorithm, votes alone will not necessarily reveal people's opinions before they suddenly diverge from mine, especially on orthogonal topics. Maybe the system delegates my vote to Mr. X, who happens to agree with me on tax policy or education so far, but will absolutely vote for, say, a ban on end-to-end encryption because he "thinks of the children". Retroactively un-trusting Mr. X isn't going to help once the law is passed.

    You could split the votes into multiple shares and delegate them to different people in a PageRank-style system to solve this, but that is even harder to understand than delegating a full vote to a single person...

    Unlike LinkLonk, where a wrong recommendation is easily rectified with a downvote, I'd be pissed if a vote were registered on the "wrong" side of a political issue I cared about. Yes, maybe I should have voted directly in that case, but maybe I couldn't for whatever reason (depression, sheer volume of things to vote on, etc.).
    When I abstain from voting currently, at least I don't do extra harm by potentially auto-voting against my interests.

    That said, the idea is fascinating, and while I have difficulty believing that it can be done well, it is not impossible that someone will come up with a good system that solves most of these concerns. I'd definitely follow your progress with interest.

    1 vote
  3. Comment on Prototyping group decision making with automatic delegation in ~tech

    DataWraith
    Link Parent

    To help me understand which case this is, could you please comment on LinkLonk under an item that you don't expect to be recommended to you? I am pretty sure it is case #1 and if so I should be able to fix it.

    Hm, it's probably not case #2 because the recommended articles tend to come from the same few feeds (e.g. this one), most of them food-related. I must've downvoted them a hundred times or so. I've commented on one of the articles.

    In Quadratic Voting you are able to purchase votes for money, but the price for each additional vote increases quadratically, so you cannot have unlimited power. This may try to solve some issues with voting but it introduces other issues: in such a system the vote is no longer democratic (1 person = 1 vote) and this may make people doubt the legitimacy and fairness of such a decision-making mechanism.

    Yes, you're right about that. If the system is not trusted, it does not matter how much better it is in a game-theoretic sense. But doesn't that exact same criticism apply to automatic vote delegation?

    That said, I've looked into it a bit more after I posted: while quadratic voting would -- in theory -- work better than current democratic voting systems, because it is more robust and more truth-revealing, it apparently doesn't work as well at small, i.e. book-club, scales.
    Quadratic voting may have been a bad suggestion. :/

    My goal is to try out the autodelegation idea in a toy project. There are a lot of things to figure out - just like with the bugs in LinkLonk - and it would be good to have some real-life use-case for it. The goal of autodelegation is to cultivate trust in the decision making process, find people who are good at representing the interests of other people and let them use that expertise for good.

    I still think auto-delegation would be a problem though. It'd be fine to suggest who I could delegate my vote to, so I can check out their political views, but I wouldn't want the system to vote/delegate for me automatically. You wouldn't trust Quadratic Voting, and I wouldn't trust auto-delegation -- so, hm. I'm not sure what to make of it.

    Except for the auto-delegation though, the idea sounds a lot like Liquid Democracy. Google has a nice write-up on an experiment they did on that, where you could manually delegate your votes to someone else. Their first use case was finding good meal plans for on-campus cafes.

    1 vote
  4. Comment on Prototyping group decision making with automatic delegation in ~tech

    DataWraith
    Link

    I like LinkLonk, but I wouldn't really trust it to vote for me, not even for book club stuff. It keeps showing articles from feeds I consistently downvote, for example.

    If the goal is to make a voting system, I'd probably just give the users 'vote credits' à la Quadratic Voting. If someone doesn't care about an issue, they could just hoard their votes for next time, or maybe directly give the credits to someone else. It might be nice to have a "suggest a way to spend my credits given my previous voting behavior" button though, because quadratic voting is a bit onerous.
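
    The quadratic cost rule itself is tiny -- a minimal sketch, with an arbitrary example budget:

    fn cost(votes: u32) -> u32 {
        // Casting n votes on a single issue costs n^2 credits
        votes * votes
    }
    
    fn max_votes(credits: u32) -> u32 {
        // Influence grows only with the square root of the budget
        (credits as f64).sqrt().floor() as u32
    }
    
    fn main() {
        println!("3 votes cost {} credits", cost(3)); // 9
        println!("100 credits buy at most {} votes", max_votes(100)); // 10
    }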

    How to invite to the group - by sending email or a link.

    Why not both? Display the link and put a button there that automatically sends an email, maybe with a textbox for a customized message.

    5 votes
  5. Comment on When will we learn? in ~comp

    DataWraith
    Link Parent

    Thanks, I will do that. Some of his older posts were interesting, but lately it's all inflammatory and non-helpful stuff.

    4 votes
  6. Comment on When will we learn? in ~comp

    DataWraith
    Link Parent

    Yes, but again, who pays for the effort? There will be some volunteers, but I'd bet that there's orders of magnitude more code than volunteers can feasibly review -- which likely leaves the smaller software packages that are most likely to be backdoored unreviewed.

    That said, I think something like CREV has some promise. The idea is that everybody who reviews code (which, in theory, they have to do anyway to make sure there's no backdoor) can then vouch for that code, and you can choose to trust a given reviewer or organization. That way (again, in theory) the burden is spread among more shoulders.

    3 votes
  7. Comment on When will we learn? in ~comp

    DataWraith
    Link

    Yawn. Another polemic by Drew DeVault.

    I mean, sure, in an ideal world, every package should be peer reviewed and all commits carefully scrutinized, tested, fuzzed, etc. The problem is, who is going to do that? On whose time/dime?

    At some point you have to trust someone (see Reflections on Trusting Trust). DeVault says a Linux distribution package manager is a good candidate for that. This argument may be a bit obvious, but I'll make it anyway: a lot of stuff you need isn't actually packaged by any distribution -- this is especially apparent in machine learning stuff for me. It is unlikely that someone is going to thoroughly review and package and maintain forever every obscure piece of software you need for a one-off project, even if you ask nicely.

    Yes, if you run untrusted code, you have a problem if it turns out to be malicious. But there's really no way around that other than not using code from the internet -- even somewhat well-reviewed code could potentially contain hidden backdoors. Drew seems to be a fan of "I don't need dependencies, I write everything in C myself", but this post still seems disingenuous.

    14 votes
  8. Comment on Why store code as text files? in ~comp

    DataWraith
    Link

    Code is usually version controlled nowadays in git or some other VCS. These typically operate on text files and record the changes applied to the files over their history. One drawback from this is that formatting of the code can introduce changes to the files that make no semantic difference, e.g. newlines are added/removed, indentation is altered etc.

    I'm cursorily following the Unison programming language, which is stored in content-addressable AST form. It is a neat idea.

    A given hash uniquely identifies a given piece of code, and the encoding makes it very simple to, e.g., rename variables, because names are a non-essential detail that the language itself doesn't need. Different people could even call the same code by different names, and because the language is content-addressed, it is trivial to distribute code in a cloud environment (which is one of the stated goals of Unison).
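
    The core idea fits in a few lines. This is not Unison's actual encoding, just an illustration: hash an expression tree that stores variables as binder indices instead of names, so renaming cannot change the hash.

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    
    #[derive(Hash)]
    #[allow(dead_code)]
    enum Expr {
        Var(usize),                // de Bruijn index instead of a variable name
        Lam(Box<Expr>),            // binds one variable
        App(Box<Expr>, Box<Expr>), // function application
    }
    
    fn content_hash(expr: &Expr) -> u64 {
        let mut hasher = DefaultHasher::new();
        expr.hash(&mut hasher);
        hasher.finish()
    }
    
    fn main() {
        // \x -> x and \y -> y are the same term, so they get the same hash
        let id = Expr::Lam(Box::new(Expr::Var(0)));
        println!("{:016x}", content_hash(&id));
    }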

    Instead of source code files, they use a database of code (SQLite, I think) that can be edited by making .u scratch files that get incorporated into the local database once saved. Interestingly, I think they did have some problems with sharing code effectively, though, because GitHub & Co. are so text-file-centric.

    11 votes
  9. Comment on Do you track your time? in ~life

    DataWraith
    Link

    I kind of have to track hours at work, but privately I've never really stuck to tracking my time beyond a couple of weeks. I started a new attempt yesterday though, so it's quite funny that the thread came up just in time, so to speak.

    There are really only three options for time tracking:

    1. Manually clock in and out
    2. Run a completely automatic time tracker like arbtt
    3. Run a semi-automatic TagTime-style stochastic tracker

    Number 1 never worked for me outside of work because I would get distracted and forget to clock in or out.

    Number 2 kind of works, but you'll have to be able to deduce exactly what you are doing based on the windows that are open, which can be tricky. It's also a bit of a privacy liability because it tracks everything you do on the computer.

    Number 3 is what I'm currently using. The idea is that the program randomly pings you (with a pop-up on the computer or a notification on your phone), and you 'tag' what you're currently doing (e.g. "YouTube" or "Tildes"). Since the pings come at somewhat random intervals, you won't get minute-accurate reports, but it'll tell you, for example, how much time you tend to spend on YouTube in the long run.

    It does take a while until enough pings accumulate and you can start drawing conclusions, but for someone trying to improve productivity by seeing where the time-wasters are, I think this is certainly the easiest way to do it.
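
    The sampling behind such a tracker is simple. This is just a sketch of the idea, not TagTime's actual algorithm (TagTime derives its pings from a fixed seed so all devices agree, IIRC), using its default mean gap of 45 minutes:

    use rand::Rng;
    
    // Draw the gap until the next ping from an exponential distribution.
    // Memorylessness means the next ping is never predictable, yet each
    // ping still represents `mean` minutes on average.
    fn next_gap_minutes(mean: f64) -> f64 {
        let u: f64 = rand::thread_rng().gen_range(0.0..1.0);
        -mean * (1.0 - u).ln() // inverse-CDF sampling
    }
    
    fn main() {
        for _ in 0..5 {
            println!("next ping in {:.1} minutes", next_gap_minutes(45.0));
        }
    }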

    3 votes
  10. Comment on Keep your numbers off of me: Why tournaments support better communities than ladders in ~games

    DataWraith
    Link

    I agree that winning a tournament (even if it is just one out of thousands going on) probably feels better than an equivalent win-streak on the ladder, but I think we do need the ladder for when players can't conform to a fixed tournament schedule.

    A tournament uses your win-rate as a proxy for skill and matchmaking; if you think about it, the ladder just scales that concept up by using your skill-rating as a proxy for skill, because that works better mathematically than raw win-rate.
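
    To make "skill-rating as a proxy for skill" concrete, here is a minimal Elo-style update; neither the article nor I named a specific system, but real ladders use some Elo/Glicko/TrueSkill-flavored variant of this idea:

    fn expected_score(rating_a: f64, rating_b: f64) -> f64 {
        // Probability that A beats B under the Elo model
        1.0 / (1.0 + 10f64.powf((rating_b - rating_a) / 400.0))
    }
    
    fn update(rating_a: f64, rating_b: f64, a_won: bool) -> f64 {
        let k = 32.0; // step size: how fast the rating reacts to results
        let score = if a_won { 1.0 } else { 0.0 };
        rating_a + k * (score - expected_score(rating_a, rating_b))
    }
    
    fn main() {
        // An upset win moves the rating more than beating an equal opponent.
        println!("{:.1}", update(1500.0, 1500.0, true)); // ~1516.0
        println!("{:.1}", update(1500.0, 1700.0, true)); // ~1524.3
    }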

    I have very little experience with competitive gaming, but I did play a bit of Overwatch when it came out. The vast majority of games were either unremarkable or actively un-fun. I guess that comes with the territory of being in the bottom 30% of the playerbase skill-wise, but even then there were a few gems of games as well; just not enough of them to bother continuing to play.

    The big question I always had was whether it is possible, in theory, to make more good games with smarter matchmaking. Interestingly, the article seems to oppose that idea and says that the less-rigorously-fair tournament is better, because even an underdog has a chance to win the tournament through luck or clutch performance above their level.

    The question is what smarter matchmaking would even look like. Maybe the ladder should be more random? One idea I like is to make matches based on probable fun instead of balanced skill.

    For example, Ghost Recon Online uses/used/planned to use (?) a neural network-based matchmaking system (or at least there's a research paper describing the approach). Instead of just wins and losses, the system takes into account the performance stats and proclivities (weapon/hero use stats, etc.) of all players before predicting a balance score. The authors planned to let players vote on whether or not a match was fun, and then train the NN to predict how much fun any given matchup would be and matchmake on that instead of skill rating. Microsoft is pursuing a similar approach with TrueSkill v2 (though they use win/loss instead of fun/unfun).

    Blizzard tried to let players rate the matches in Overwatch, but later removed that and (IIRC) said they didn't learn anything from it other than that winning is more fun than losing. I always wondered if there really is nothing more to learn from it than that -- for example, you could ignore win->good and loss->bad votes and work with the remainder to extract some signal, because in my experience the "gem" games were the ones where the losing team had fun despite losing, and the bad ones often had the winning team not having fun despite winning.

    2 votes
  11. Comment on What programming/technical projects have you been working on? in ~comp

    DataWraith
    Link Parent

    One problem I see is that a lot of the source images I'm going to be using are JPGs (high quality JPGs, but JPGs nonetheless), which will probably make me rethink a bunch of stuff.

    I'm not sure how well-known this is, but Dropbox's Lepton format can losslessly compress JPEGs by, as they claim, an average of 22%.

    JPEG XL also has a mode that converts old JPEGs into the new format losslessly, with similar 20-30% savings in my experience.

    Unfortunately neither of those formats is well-supported currently, but it might still make sense to use them internally.

    3 votes
  12. Comment on What programming/technical projects have you been working on? in ~comp

    DataWraith
    Link Parent

    There are various so-called sketch data structures that approximate larger data structures. They are also useful in many networking use-cases (traffic monitoring, etc.).

    The most famous sketch is probably the Bloom filter, but there's also the count-min sketch (which allows you to estimate frequencies of items in a data stream -- the linked paper cleverly uses that to store the classifier weights instead). Another fascinating application area is caching, where, for example, the Window TinyLFU sketch allows you to better utilize your cache memory by keeping items out of the cache that probably won't be requested again.
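
    A minimal count-min sketch looks something like this (a sketch of the idea; a real implementation would use proper pairwise-independent hash functions instead of re-seeding the standard hasher per row):

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    
    struct CountMin {
        rows: Vec<Vec<u32>>, // depth x width counters
    }
    
    impl CountMin {
        fn new(depth: usize, width: usize) -> Self {
            CountMin { rows: vec![vec![0; width]; depth] }
        }
    
        fn index<T: Hash>(&self, item: &T, row: usize) -> usize {
            let mut h = DefaultHasher::new();
            row.hash(&mut h); // seed each row differently
            item.hash(&mut h);
            (h.finish() as usize) % self.rows[row].len()
        }
    
        fn add<T: Hash>(&mut self, item: &T) {
            for row in 0..self.rows.len() {
                let i = self.index(item, row);
                self.rows[row][i] += 1;
            }
        }
    
        // The minimum over all rows can over-count (collisions), never under-count.
        fn estimate<T: Hash>(&self, item: &T) -> u32 {
            (0..self.rows.len())
                .map(|row| self.rows[row][self.index(item, row)])
                .min()
                .unwrap()
        }
    }
    
    fn main() {
        let mut cms = CountMin::new(4, 1024);
        for _ in 0..42 {
            cms.add(&"spam");
        }
        println!("estimate: {}", cms.estimate(&"spam")); // at least 42
    }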

    1 vote
  13. Comment on What programming/technical projects have you been working on? in ~comp

    DataWraith
    Link

    I've implemented, in Rust, a binary classifier based on the paper Sketching Linear Classifiers over Data Streams.

    The program is a command-line tool that allows you to discriminate between two classes using logistic regression. The interesting bit is that, due to the clever data structure described in the paper, the classifier uses very little memory (about 32 KiB) while still having reasonable accuracy (98.35% on the TREC2007 email spam/ham corpus when using Orthogonal Sparse Bigrams (PDF) as features). You can give the classifier more memory, which results in increased accuracy (e.g. 99.2% with 16 MiB); this isn't state-of-the-art, but it is pretty good, and I'm really happy that it works as well as it does with limited memory.

    However, to go beyond spam filtering as a use-case, I now need to figure out if I can somehow leverage the code for multi-label classification instead of binary classification (we need this at work). There are some methods that convert a set of binary classifiers into a multi-label one, but I haven't done enough research yet to settle on one. For now I'm using one classifier per label (presence vs. absence), but that probably won't scale well even with the small memory footprint, and it assumes that all labels are independent, which most definitely isn't true.

    Eventually I would like to write a microservice that allows you to do arbitrary text classification: you give it an API key, a piece of text and the possible labels, and it'll figure out which ones apply to your given input after it sees enough training examples. As a pie-in-the-sky thought, it could theoretically even work as a public service for spam filtering (or general text classification) -- you can do Feature hashing in a way that makes it hard for spammers to corrupt your classifier unless they are in the majority.
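
    The hashing trick itself fits in a few lines -- a minimal sketch, where the secret per-deployment seed is a made-up illustration of how bucket assignments could be kept unpredictable to spammers:

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    
    const DIM: usize = 1 << 16; // fixed feature-vector size
    const SEED: u64 = 0xDA7A_3417; // hypothetical secret, per-deployment seed
    
    fn bucket(token: &str) -> usize {
        let mut h = DefaultHasher::new();
        SEED.hash(&mut h);
        token.hash(&mut h);
        (h.finish() as usize) % DIM
    }
    
    // Tokens are hashed straight into a fixed-size vector; no vocabulary is stored.
    fn featurize(text: &str) -> Vec<f32> {
        let mut x = vec![0.0; DIM];
        for token in text.split_whitespace() {
            x[bucket(token)] += 1.0;
        }
        x
    }
    
    fn main() {
        let x = featurize("buy cheap pills buy now");
        println!("non-zero features: {}", x.iter().filter(|&&v| v != 0.0).count());
    }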

    5 votes
  14. Comment on What programming/technical projects have you been working on? in ~comp

    DataWraith
    (edited )
    Link Parent

    This is pretty cool. I used to use Audioscrobbler back in the day, before they turned it into last.fm, but it never occurred to me to make a local database. Nowadays I'm streaming most of the music I'm listening to though, so getting the artist and album information out might be tricky...

    As for a user interface: I haven't had the opportunity to play with it myself, but Datasette seems to be a nice way to slice-and-dice an SQLite database.

    Edit: I just tried loading the musyca-created database in Datasette and it works reasonably well. It doesn't like that your artist table is called "artists" while the "albums" table tries to reference "artist" (singular), but it still works. Without plugins it doesn't seem to display much in the way of statistics, but there might be a suitable plugin in the 50+ that the project supports.

    3 votes
  15. Comment on Day 19: Beacon Scanner in ~comp.advent_of_code

    DataWraith
    Link Parent

    I feel stupid posting this, because the ragequit wasn't pretty.
    I've cooled off a little, but this puzzle kept gnawing on me to the point that I couldn't sleep well, so I had to solve at least Part I in order to save my sanity.

    I won't do Part II or the remaining puzzles -- they just don't provide enough entertainment for the time they take up.
    I've always heard about how awesome they are, and admired people solving them in Haskell, but they're more tedious than interesting now that I've done them myself, at least up until day 19. Maybe the cool stuff starts on day 20, who knows, but I'm done with them for the year at least.

    As for the actual solution, ironically my problem on this puzzle was that I tried to do it in Ruby because I thought Rust wasn't great for exploratory programming.
    It turns out that the Set data structure is just completely and silently broken for custom classes if you don't implement both #eql? and #hash. I had implemented #hash and #==, but not #eql? for my custom Coordinate class, so set intersection was broken, and that led me to believe my initial algorithm idea would not work.

    I redid everything in Rust, and I hope with this post, I can finally put all my thoughts of the contest to rest.

    Data structures

    First, I needed a way to represent 3D coordinates.

    use std::{
        collections::{BTreeSet, HashSet, VecDeque},
        ops::{Add, Mul, Sub},
        rc::Rc,
    };
    
    use float_ord::FloatOrd;
    
    #[derive(Debug, Hash, Clone, Copy, Eq, PartialEq, Ord, PartialOrd)]
    pub struct Coordinate {
        x: i32,
        y: i32,
        z: i32,
    }
    
    impl Add for Coordinate {
        type Output = Self;
    
        fn add(self, rhs: Self) -> Self::Output {
            Coordinate {
                x: self.x + rhs.x,
                y: self.y + rhs.y,
                z: self.z + rhs.z,
            }
        }
    }
    
    impl Sub for Coordinate {
        type Output = Self;
    
        fn sub(self, rhs: Self) -> Self::Output {
            Coordinate {
                x: self.x - rhs.x,
                y: self.y - rhs.y,
                z: self.z - rhs.z,
            }
        }
    }
    
    impl Mul for Coordinate {
        type Output = Self;
    
        fn mul(self, rhs: Self) -> Self::Output {
            Coordinate {
                x: self.x * rhs.x,
                y: self.y * rhs.y,
                z: self.z * rhs.z,
            }
        }
    }
    

    We need a way to calculate the distance between two beacons, as well as a way to rotate a coordinate so it points 'up' (though we generally don't know where 'up' is). The puzzle description helpfully notes that there are 24 different orientations -- I had to Google this, but it is obvious in retrospect: 3 axes, times 2 for positive and negative orientation, times 4 for the possible rotations around that axis.

    impl Coordinate {
        pub fn distance(&self, other: &Self) -> f32 {
            f32::sqrt(
                ((self.x - other.x).pow(2) + (self.y - other.y).pow(2) + (self.z - other.z).pow(2))
                    as f32,
            )
        }
    
        pub fn rotate(&self, rotation: usize) -> Self {
            // This rotates the coordinate so that one of the positive/negative
            // directions of the axes is up
            let axis = match rotation % 6 {
                0 => *self,
                1 => Coordinate {
                    x: -self.x,
                    y: self.y,
                    z: -self.z,
                },
                2 => Coordinate {
                    x: self.y,
                    y: -self.x,
                    z: self.z,
                },
                3 => Coordinate {
                    x: -self.y,
                    y: self.x,
                    z: self.z,
                },
                4 => Coordinate {
                    x: self.z,
                    y: self.y,
                    z: -self.x,
                },
                5 => Coordinate {
                    x: -self.z,
                    y: self.y,
                    z: self.x,
                },
                _ => unreachable!(),
            };
    
            // With a given axis being up, we only need to check for the
            // four rotations around one of the axes
            match (rotation / 6) % 4 {
                0 => axis,
                1 => Coordinate {
                    x: axis.x,
                    y: -axis.z,
                    z: axis.y,
                },
                2 => Coordinate {
                    x: axis.x,
                    y: -axis.y,
                    z: -axis.z,
                },
                3 => Coordinate {
                    x: axis.x,
                    y: axis.z,
                    z: -axis.y,
                },
                _ => unreachable!(),
            }
        }
    }
    

    Then we have the Scanners themselves. They have an offset (their center) from Scanner 0 as well as a rotation. They keep the beacon coordinates in the local coordinate system, although the beacons() method transforms those into the global coordinate system.

    My initial insight (what I called trivial in the ragequit post), if you want to call it that, is that you can fingerprint the beacons by their distances to the other beacons of the same scanner. The same beacon seen by two different scanners will share exactly 11 distance values, because the puzzle guarantees at least 12 overlapping beacons, so each shared beacon has 11 shared neighbors. This gives us an easy way to match beacons.

    #[derive(Debug, Clone)]
    pub struct Scanner {
        id: u8,
        center: Coordinate,
        rotation: usize,
        local_beacons: Rc<Vec<Coordinate>>,
        fingerprints: Rc<Vec<Vec<FloatOrd<f32>>>>,
    }
    
    impl Scanner {
        fn rotate(&self) -> Self {
            Scanner {
                rotation: self.rotation + 1,
                fingerprints: self.fingerprints.clone(),
                local_beacons: self.local_beacons.clone(),
                ..*self
            }
        }
    
        fn translate(&self, offset: Coordinate) -> Self {
            Scanner {
                center: self.center + offset,
                local_beacons: self.local_beacons.clone(),
                fingerprints: self.fingerprints.clone(),
                ..*self
            }
        }
    
        fn transform(&self, b: Coordinate) -> Coordinate {
            b.rotate(self.rotation) + self.center
        }
    
        fn beacons(&self) -> Vec<Coordinate> {
            self.local_beacons
                .iter()
                .map(|c| self.transform(*c))
                .collect()
        }
    
        fn new(id: u8, beacons: Vec<Coordinate>) -> Self {
            let mut fingerprints = Vec::new();
    
            for _ in 0..beacons.len() {
                fingerprints.push(Vec::new());
            }
    
            for (i, a) in beacons.iter().enumerate() {
                for (j, b) in beacons.iter().enumerate() {
                    if i >= j {
                        continue;
                    }
    
                    let d = a.distance(b);
                    fingerprints[i].push(FloatOrd(d));
                    fingerprints[j].push(FloatOrd(d));
                }
            }
    
            for f in fingerprints.iter_mut() {
                f.sort_unstable();
            }
    
            Scanner {
                id,
                rotation: 0,
                center: Coordinate { x: 0, y: 0, z: 0 },
                fingerprints: Rc::new(fingerprints),
                local_beacons: Rc::new(beacons),
            }
        }
    }
    
    Parsing
    mod parse {
        use super::{Coordinate, Scanner};
    
        use nom::{
            bytes::complete::tag,
            character::complete::{i32, line_ending, u8},
            combinator::{eof, opt},
            multi::many1,
            IResult,
        };
    
        fn coordinate(input: &str) -> IResult<&str, Coordinate> {
            let (input, x) = i32(input)?;
            let (input, _) = tag(",")(input)?;
            let (input, y) = i32(input)?;
            let (input, _) = tag(",")(input)?;
            let (input, z) = i32(input)?;
            let (input, _) = opt(line_ending)(input)?;
    
            return Ok((input, Coordinate { x, y, z }));
        }
    
        fn scanner(input: &str) -> IResult<&str, Scanner> {
            let (input, _) = tag("--- scanner ")(input)?;
            let (input, id) = u8(input)?;
            let (input, _) = tag(" ---")(input)?;
            let (input, _) = line_ending(input)?;
            let (input, beacons) = many1(coordinate)(input)?;
            let (input, _) = opt(line_ending)(input)?;
    
            Ok((input, Scanner::new(id, beacons)))
        }
    
        pub fn parse(input: &str) -> IResult<&str, Vec<Scanner>> {
            let (input, scanners) = many1(scanner)(input)?;
            let (input, _) = eof(input)?;
    
            Ok((input, scanners))
        }
    }
    
    Helper functions
    fn common_count(a: &Vec<FloatOrd<f32>>, b: &Vec<FloatOrd<f32>>) -> usize {
        let mut i = 0;
        let mut j = 0;
        let mut common = 0;
    
        while i < a.len() && j < b.len() {
            if a[i] == b[j] {
                common += 1;
                i += 1;
                j += 1;
            } else if a[i] > b[j] {
                j += 1;
            } else {
                i += 1;
            }
    
            if common >= 11 {
                break;
            }
        }
    
        common
    }
    

    common_count counts the common elements of two sorted vectors. We never need to count above 11, so it aborts there.

    fn match_scanners(a: &Scanner, b: &Scanner) -> Option<Scanner> {
        let mut matches = Vec::new();
    
        let ba = a.beacons();
        let bb = b.beacons();
    
        'outer: for (i, fa) in a.fingerprints.iter().enumerate() {
            for (j, fb) in b.fingerprints.iter().enumerate() {
                if common_count(fa, fb) == 11 {
                    matches.push((ba[i], bb[j]));
    
                    if matches.len() == 12 {
                        break 'outer;
                    }
                }
            }
        }
    
        if matches.is_empty() {
            return None;
        }
    
        merge(&a, &b, matches)
    }
    

    match_scanners checks the fingerprints to see if there is a match between Scanner a and b, and returns the aligned scanner if successful.

    fn merge(a: &Scanner, b: &Scanner, matches: Vec<(Coordinate, Coordinate)>) -> Option<Scanner> {
        let mut ha = HashSet::new();
    
        for b in a.beacons().into_iter() {
            ha.insert(b);
        }
    
        for (ba, bb) in matches {
            let mut rotated_b = b.clone();
    
            for rot in 0..24 {
                rotated_b = rotated_b.rotate();
    
                let bb_rot = rotated_b.transform(bb);
                let aligned = rotated_b.translate(ba - bb_rot);
    
                let mut hb = HashSet::new();
    
                for b in aligned.beacons().into_iter() {
                    hb.insert(b);
                }
    
                if hb.intersection(&ha).count() >= 12 {
                    return Some(aligned);
                }
            }
        }
    
        None
    }
    

    merge assumes scanner a is correctly oriented and brute-forces the 24 possible rotations for scanner b. This should generally work on the first attempt, but it loops through all matches, just in case we have mistakenly identified a common beacon. The function will return a new Scanner struct that has the correct orientation and offset. The latter is calculated by moving from Scanner A in the positive direction towards a shared beacon (+ ba), and then in the negative direction towards the other scanner (ba - bb_rot). Try drawing a diagram of this -- in the one-dimensional case it becomes really obvious: if A sees the shared beacon at x = 5 and a correctly rotated B sees it locally at x = 2, then B's center must be at 5 - 2 = 3.

    This is where the Ruby Set data structure failed me.

    Solving
    fn solve(s: &Vec<Scanner>) -> Option<Vec<Scanner>> {
        let mut q = VecDeque::new();
        let mut solved_set = vec![s[0].clone()];
        let mut pending = s.clone();
    
        pending.remove(0);
        q.push_back(s[0].clone());
    
        while let Some(cur) = q.pop_front() {
            let mut to_remove = Vec::new();
    
            for (idx, p) in pending.iter().enumerate() {
                if let Some(solved) = match_scanners(&cur, &p) {
                    solved_set.push(solved.clone());
                    q.push_back(solved.clone());
                    to_remove.push(idx);
                }
            }
    
            for &r in to_remove.iter().rev() {
                pending.remove(r);
            }
        }
    
        if solved_set.len() == s.len() {
            return Some(solved_set);
        }
    
        None
    }
    

    This keeps a queue of freshly solved scanners and a Vec of still-pending ones, iterating until everything is matched.

    fn part1(scanners: &Vec<Scanner>) -> usize {
        let mut h = HashSet::new();
    
        for s in scanners.iter() {
            for b in s.beacons().into_iter() {
                h.insert(b);
            }
        }
    
        h.len()
    }
    
    fn main() {
        let input = include_str!("../../input-19.txt").trim();
        let scanners = parse::parse(input).unwrap().1;
    
        println!("Part I:  {}", part1(&solve(&scanners).unwrap()));
    }
    

    And then we just count how many unique beacons there are. Takes about 60ms.

    Tip

    The puzzle description says the scanners keep their position, but it is helpful to move them around (conceptually, at least).

    2 votes
  16. Comment on Email forwarding services in ~tech

    DataWraith
    Link Parent
    While MD5 is considered broken by now, it is still difficult to reverse an MD5 hash into the plain email address, so I think this can probably not be considered personally identifying information...

    While MD5 is considered broken by now, it is still difficult to reverse an MD5 hash into the plain email address, so I think this can probably not be considered personally identifying information under the GDPR, although, of course, IANAL.

    Note that services that use Gravatar won't automatically open a Gravatar account for you -- they merely use the image you associated with your Gravatar account, if you have one.
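
    For reference, the hash such services compute is just the MD5 of the trimmed, lowercased address -- a minimal sketch, assuming the md5 crate:

    fn gravatar_url(email: &str) -> String {
        // Gravatar hashes the normalized (trimmed, lowercased) address
        let digest = md5::compute(email.trim().to_lowercase());
        format!("https://www.gravatar.com/avatar/{:x}", digest)
    }
    
    fn main() {
        println!("{}", gravatar_url(" Someone@Example.com "));
    }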

    2 votes
  17. Comment on Day 19: Beacon Scanner in ~comp.advent_of_code

    DataWraith
    (edited )
    Link

    Well, I'm out.

    This one was too difficult to figure out without looking for hints online. Figuring out which beacons are shared between any two scanners is trivial by comparing their relative distances to the others, but that didn't give me any leverage for determining the scanner offsets.

    I drew a bunch of diagrams for triangulation, but that somehow never worked in 15+ hours of attempts; there always seemed to be a missing ingredient in the formulas I drafted.

    AoC was good for one thing: I learned a lot about Rust, especially parsing with Nom. But I don't feel like wasting any more time here. You could say I'm salty, but the puzzles so far were all either trivial or trivial and tedious. The only puzzle where the solution is not immediately obvious is Day 19, which then turned out to be straight up impossible for me.

    So yeah, fuck this.

    2 votes
  18. Comment on Day 18: Snailfish in ~comp.advent_of_code

    DataWraith
    Link Parent

    Fuck everything about this question and especially his shitty way of explaining what he wants you to do for a problem. Don't make what is actually quite an interesting task into a monumental brainfuck by using such a difficult, convoluted and tangled explanation.

    I didn't like today's puzzle, but I don't think that's fair. Re-reading the explanation after I cooled off from my own frustration, I think it is quite well done – it's just the puzzle itself that is convoluted and thus needs a long explanation.

    The explanation is very condensed; every single sentence must be read, related to the others, and understood, in order to solve the puzzle. Skipping around or otherwise letting a sentence slip by results in mistakes, but thankfully the examples were plentiful and exhaustive.

    That said, today 1/3rd of my code is unit tests, and writing those takes time you don't have if you're competing for the leaderboard...

    1 vote
  19. Comment on Day 18: Snailfish in ~comp.advent_of_code

    DataWraith
    (edited )
    Link

    Day 18 (Rust)

    This is definitely my least-favorite puzzle so far. On Day 16 it was at least possible to write a proper parser, but this one seems to depend on the order of inputs in a serialized string. I could probably have jury-rigged something with tree traversal (in-order? pre-order? post-order?), but it was simpler this way.

    Overall it felt like tedium for tedium's sake (well, it was homework after all...)

    Data structures
    use std::fmt::Display;
    
    #[derive(Debug, Copy, Clone, PartialEq, Eq)]
    pub enum SnailfishNumber {
        Open,
        Comma,
        Close,
        Value(usize),
    }
    
    impl Display for SnailfishNumber {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            match self {
                SnailfishNumber::Open => write!(f, "["),
                SnailfishNumber::Value(num) => write!(f, "{}", num),
                SnailfishNumber::Comma => write!(f, ","),
                SnailfishNumber::Close => write!(f, "]"),
            }
        }
    }
    
    Parsing
    fn parse(input: &str) -> Vec<SnailfishNumber> {
        let mut result = Vec::new();
    
        for c in input.trim().chars() {
            result.push(match c {
                '[' => SnailfishNumber::Open,
                ']' => SnailfishNumber::Close,
                ',' => SnailfishNumber::Comma,
                // Parsing with base 16 simplifies the test-cases with numbers > 9,
                // since you can just specify them in hexadecimal
                _ => SnailfishNumber::Value(c.to_digit(16).unwrap() as usize),
            })
        }
    
        result
    }
    
    Helper functions
    fn to_string(v: &Vec<SnailfishNumber>) -> String {
        v.iter()
            .map(|sn| format!("{}", sn))
            .collect::<Vec<_>>()
            .join("")
    }
    
    fn add(a: Vec<SnailfishNumber>, b: Vec<SnailfishNumber>) -> Vec<SnailfishNumber> {
        let mut result = Vec::new();
        result.push(SnailfishNumber::Open);
        result.extend(a.into_iter());
        result.push(SnailfishNumber::Comma);
        result.extend(b.into_iter());
        result.push(SnailfishNumber::Close);
    
        reduce(result)
    }
    
    fn reduce(s: Vec<SnailfishNumber>) -> Vec<SnailfishNumber> {
        let mut s = s;
    
        loop {
            let (exploded, has_exploded) = explode(s);
            s = exploded;
    
            if has_exploded {
                continue;
            }
    
            let (split, has_split) = split(s);
            s = split;
    
            if has_split {
                continue;
            }
    
            break;
        }
    
        s
    }
    
    fn explode(s: Vec<SnailfishNumber>) -> (Vec<SnailfishNumber>, bool) {
        // increase_by finds the first Value in the vector and increments it
        fn increase_by(s: &mut Vec<SnailfishNumber>, inc: usize) {
            for i in 0..s.len() {
                if let SnailfishNumber::Value(x) = s[i] {
                    s[i] = SnailfishNumber::Value(x + inc);
                    break;
                }
            }
        }
    
        // increase_by_rev finds the last Value in the vector and increments it
        fn increase_by_rev(s: &mut Vec<SnailfishNumber>, inc: usize) {
            for i in (0..s.len()).rev() {
                if let SnailfishNumber::Value(x) = s[i] {
                    s[i] = SnailfishNumber::Value(x + inc);
                    break;
                }
            }
        }
    
        let mut depth = 0;
    
        for (idx, &c) in s.iter().enumerate() {
            match c {
                SnailfishNumber::Open => depth += 1,
                SnailfishNumber::Close => depth -= 1,
                SnailfishNumber::Comma => (),
                SnailfishNumber::Value(_) => {
                    // If we are nested 4 deep...
                    if depth > 4 {
                        // ...and this is not a sub-pair, ...
                        if s[idx + 2] == SnailfishNumber::Open {
                            continue;
                        }
    
                        let (prev, s2) = s.split_at(idx);
                        let mut prev = prev.to_vec();
                        prev.pop();
    
                        // ... extract the left and right numbers of the pair and...
                        if let SnailfishNumber::Value(left) = s2[0] {
                            if let SnailfishNumber::Value(right) = s2[2] {
                                // Split after the pair
                                let (_, trail) = s2.split_at(4);
                                let mut trail = trail.to_vec();
    
                                // Increase the last number in the prefix
                                increase_by_rev(&mut prev, left);
    
                                // Increase the postfix
                                increase_by(&mut trail, right);
    
                                // Concatenate everything
                                prev.push(SnailfishNumber::Value(0));
                                prev.extend(trail.iter());
    
                                return (prev, true);
                            }
                        }
    
                        unreachable!()
                    }
                }
            }
        }
    
        (s, false)
    }
    
    fn split(s: Vec<SnailfishNumber>) -> (Vec<SnailfishNumber>, bool) {
        let mut result = Vec::new();
    
        for (idx, &c) in s.iter().enumerate() {
            if let SnailfishNumber::Value(num) = c {
                if num > 9 {
                    let left = num / 2;
                    let right = (num + 1) / 2;
                    let (_, trail) = s.split_at(idx + 1);
    
                    result.push(SnailfishNumber::Open);
                    result.push(SnailfishNumber::Value(left));
                    result.push(SnailfishNumber::Comma);
                    result.push(SnailfishNumber::Value(right));
                    result.push(SnailfishNumber::Close);
                    result.extend(trail.into_iter());
    
                    return (result, true);
                } else {
                    result.push(c);
                }
            } else {
                result.push(c);
            }
        }
    
        (s, false)
    }
    
    // use a stack to find the magnitude of an expression
    fn magnitude(s: Vec<SnailfishNumber>) -> usize {
        let mut stack = Vec::new();
    
        for c in s.iter() {
            match c {
                SnailfishNumber::Value(x) => stack.push(*x),
                SnailfishNumber::Comma => (),
                SnailfishNumber::Open => (),
                SnailfishNumber::Close => {
                    let right = stack.pop().unwrap();
                    let left = stack.pop().unwrap();
                    let magnitude = 3 * left + 2 * right;
                    stack.push(magnitude);
                }
            }
        }
    
        stack.pop().unwrap()
    }
    
    Solving
    fn part1(input: Vec<Vec<SnailfishNumber>>) -> usize {
        magnitude(input.into_iter().reduce(|a, b| add(a, b)).unwrap())
    }
    
    fn part2(input: Vec<Vec<SnailfishNumber>>) -> usize {
        let mut max_magnitude = 0;
    
        for i in 0..(input.len() - 1) {
            for j in (i + 1)..(input.len()) {
                let m1 = magnitude(add(input[i].clone(), input[j].clone()));
                let m2 = magnitude(add(input[j].clone(), input[i].clone()));
                max_magnitude = max_magnitude.max(m1);
                max_magnitude = max_magnitude.max(m2);
            }
        }
    
        max_magnitude
    }
    
    fn main() {
        let input = include_str!("../../input-18.txt").trim();
    
        let parsed = input
            .lines()
            .filter_map(|l| {
                if l.trim().is_empty() {
                    None
                } else {
                    Some(parse(l.trim()))
                }
            })
            .collect::<Vec<Vec<SnailfishNumber>>>();
    
        println!("Part I:  {}", part1(parsed.clone()));
        println!("Part II: {}", part2(parsed));
    }
    
    2 votes
  20. Comment on Day 17: Trick Shot in ~comp.advent_of_code

    DataWraith
    (edited )
    Link

    Day 17 (Rust)

    I spent a bit of time trying to come up with a formula, but all I found was a way to calculate the x coordinate at the point where drag reduces the velocity to zero -- the triangular number dx·(dx+1)/2, which the solving functions below use to skip shots that fall short. This isn't very significant in practice though; the whole thing still takes about 1 second to run.

    Edit: D'oh! I had a dbg!-statement still in there. Turns out, writing to the console is slow... actual runtime is more like 8ms.

    Data structures
    #[derive(Debug)]
    pub struct TargetArea {
        x_start: i64,
        x_end: i64,
        y_start: i64,
        y_end: i64,
    }
    
    #[derive(Clone, Copy, Debug)]
    pub struct Probe {
        x: i64,
        y: i64,
        dx: i64,
        dy: i64,
    }
    
    impl Probe {
        pub fn new(dx: i64, dy: i64) -> Probe {
            Probe { x: 0, y: 0, dx, dy }
        }
    }
    
    Parsing
    mod parse {
        use super::TargetArea;
    
        use nom::{bytes::complete::tag, character::complete::i64, combinator::eof, IResult};
    
        fn puzzle(input: &str) -> IResult<&str, TargetArea> {
            let (input, _) = tag("target area: x=")(input)?;
            let (input, x_start) = i64(input)?;
            let (input, _) = tag("..")(input)?;
            let (input, x_end) = i64(input)?;
            let (input, _) = tag(", y=")(input)?;
            let (input, y_start) = i64(input)?;
            let (input, _) = tag("..")(input)?;
            let (input, y_end) = i64(input)?;
            let (input, _) = eof(input)?;
    
            Ok((
                input,
                TargetArea {
                    x_start,
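                    // +1 turns the puzzle's inclusive bounds into the half-open ranges used below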
                    x_end: x_end + 1,
                    y_start,
                    y_end: y_end + 1,
                },
            ))
        }
    
        pub fn parse(input: &str) -> TargetArea {
            puzzle(input).unwrap().1
        }
    }
    
    Helper functions
    fn is_within_target(probe: &Probe, target: &TargetArea) -> bool {
        (target.x_start..target.x_end).contains(&probe.x)
            && (target.y_start..target.y_end).contains(&probe.y)
    }
    
    fn is_lost(probe: &Probe, target: &TargetArea) -> bool {
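        // Lost if the probe overshot the target to the right, stalled horizontally outside its x-range, or fell below it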
        probe.x > target.x_end
            || probe.dx == 0 && !(target.x_start..target.x_end).contains(&probe.x)
            || probe.y < target.y_start
    }
    
    fn step(probe: &Probe) -> Probe {
        Probe {
            x: probe.x + probe.dx,
            y: probe.y + probe.dy,
            dx: probe.dx - probe.dx.signum(),
            dy: probe.dy - 1,
        }
    }
    
    fn trajectory(p: &Probe, t: &TargetArea) -> Option<Vec<Probe>> {
        let mut result = vec![p.clone()];
        let mut next = *p;
    
        loop {
            next = step(&next);
    
            result.push(next);
    
            if is_within_target(&next, &t) {
                return Some(result);
            }
    
            if is_lost(&next, &t) {
                return None;
            }
        }
    }
    
    Solving
    fn part1(target: &TargetArea) -> i64 {
        let mut best_y = i64::MIN;
    
        for dx in 0..=target.x_end {
            // x coordinate we reach when drag has eaten up all our velocity.
            let final_x = dx * (dx + 1) / 2;
    
            // This falls short of the target area, so start next attempt
            if final_x < target.x_start {
                continue;
            }
    
            // Assuming the target is directly below us, we can't use a velocity that is larger than
            // y_start or we'll overshoot. This is the worst case. I just mirrored the shot velocity
            // towards the positive as well, but just because that happened to work, not because it's
            // mathematically rigorous.
            for dy in target.y_start..=(-target.y_start) {
                if let Some(trajectory) = trajectory(&Probe::new(dx, dy), &target) {
                    let highest = trajectory.iter().map(|probe| probe.y).max().unwrap();
                    best_y = std::cmp::max(best_y, highest);
                }
            }
        }
    
        best_y
    }
    
    fn part2(target: &TargetArea) -> i64 {
        let mut count = 0;
    
        for dx in 0..=target.x_end {
            let final_x = dx * (dx + 1) / 2;
    
            if final_x < target.x_start {
                continue;
            }
    
            for dy in target.y_start..=(-target.y_start) {
                if trajectory(&Probe::new(dx, dy), &target).is_some() {
                    count += 1;
                }
            }
        }
    
        count
    }
    
    fn main() {
        let input = include_str!("../../input-17.txt").trim();
        let target = parse::parse(input);
    
        println!("Part I:  {}", part1(&target));
        println!("Part II: {}", part2(&target));
    }
    
    2 votes