10
votes
Day 19: Monster Messages
Today's problem description: https://adventofcode.com/2020/day/19
Join the Tildes private leaderboard! You can do that on this page, by entering join code 730956-de85ce0c
.
Please post your solutions in your own top-level comment. Here's a template you can copy-paste into your comment to format it nicely, with the code collapsed by default inside an expandable section with syntax highlighting (you can replace python
with any of the "short names" listed in this page of supported languages):
<details>
<summary>Part 1</summary>
```python
Your code here.
```
</details>
Haskell
Yo dawg, I heard you like parsers, so I built a parser for your parser!
day19.hs
I first use a set of
ReadP
s to parse the rules themselves, then construct a newReadP
from these rules and run it against the input strings to see which match.Funnily enough, this approach was robust enough that I didn't have to change anything at all for part 2 (other than appending the new rules to the list of rules, overwriting the previous definitions).
AoC.hs (runner and utility functions)
To use a multi-return parser was definitely the way to do this. I didn't, and...
part 2 parser
runs in 22ms at least.
Python
I'm quite proud of today's. I kinda hate regex (cliche, I know!) and first considered just writing a bunch of simple manual tests. Eventually, I couldn't resist doing it the "proper" way, generating a regular expression to match against. Ultimately, I think it was actually easier that way.
Part 1+2
I have a recursive generate_regex function that spits out a regex string to use in matching. When I read about the loops in part 2, I first was getting nervous, expecting that I might need to redo it all without regex after all. But then I thought, how deep can it ever loop? I ended up just adding a "depth" value to the recursive generate_regex function and cutting it off at a depth of 30 (that number was the result of just a bit of trial and error). Works just fine for the task.
Today's one, used regexes for both parts, compiled the input into a tree and then executed it, had to move from rust regex to pcre2 for pt 2. I kinda enjoyed this, was interesting to learn about new regex features.
Runs in around 10ms for parsing and both parts
Repo link: https://github.com/OrangeBacon/adventofcode2020/blob/main/src/days/day19.rs
Rust code
Wow that one was hard! I was about to give up and go to bed when I finally got it working! I'm kinda proud of the regex-powered monstrosity I built. I was even able to incorporate some of the challenges from previous days to make parts of it work.
Part A was a solid challenge, but Part B added a huge level of complexity.
Day 19 Part A – Python
For part A I build a tree out of nodes, sort of like the bag problem on day 7. Then I built a recursive function that digs down from rule 0 all the way to the bottom of the tree, taking every branch possible. Andon the way back up through the tree I auto-generate a regex pattern that accounts for all the branches and possibilities.
This beautiful bastard in all it's horrific glory:
I just run that fucker on every line of my data set. And if they behave I count them. Now for the actual code
Day 19 Part B – Python
The rule change for Part B is simple, but add a great deal of brain-fuckery to the problem. It took several hours of wrestling regex into submission until I finally had a breakthrough.
Based on the new rules, I just count how many copies of rule 42 and rule 31 are at the beginning and end of the code. Based staring at the math for a bit, as long as these are all true:
Which is to say, I need at least 2 of rule 42, at least 1 of 31 AND I need rule 42 to occur more times than rule 31. That behavior should be standard for everyone, since it's part of the problem description
Knowing all that, I count and remove extra copies of pattern 42 from the beginning and pattern 31 from the end.
Then I do one final pass to see if the new, shortened message follows the original rule 0. For some reason I was having trouble dealing with messages that followed the beginning and end, but had other stuff in the middle
Tips and Commentary
This one is hard, like really hard. Don't be discouraged if you can't figure it out. But still, it's an interesting challenge. And if you find yourself screaming 'Fuck... FUCK... fuck... what the fuck? FFFFFFUUUUUUCCCCCCCCKKKKKK!' don't worry, that's a normal reaction to this problem.
This is a great opportunity to use some of what you've learned from previous days. In particular your solution to day 7 may be helpful.
Your program can auto-generate regex patterns that you can then use. But you really need to know your way around regex if you want to try that. Might be worth reading up on some of the documentation for what regex commands are available in your preferred language
For part B don't try to build a general usage solution to this problem unless you're truly a masochist. I would suggest studying the specific rule chances that occurred and seeing how you can work with them
I ended up having 3 different regex patterns that I used for each pass. One to deal with the beginning, one to deal with the end, and one to deal with the middle.
Racket
Part 1
Part 2
My notes
Part 1 has 3 parts:
1. A terminal rule results in "a" or "b"
2. A single production rule - recursively apply the builder to each element and concatenate the results
3. A multi production rule - concatenate each single rule result using the regex
or
operator"|"
Part 2 is
After making a very time consuming error and given the very strong hint in the problem I looked closer at my grammar and Rule0 is simply Rule8 followed by Rule11.
+
operator applied to Rule42.The bottom line is the whole grammar is summarised by Rule42^m Rule31^n where m > n.
Since I have a regex builder already I generated a regex for Rule42 and one for Rule31.
The only tricky part is that I had to count the matches to ensure there were more Rule42 matches. I did this by removing the match from the start of the string every time a rule matches. The last part is checking that Rule 42 > Rule 31 and there is the remaining part of the string is empty.
I lost a lot of time when I became convinced the key to part 2 was converting the grammar to Chomsky Normal Form - smh ... don't ask.
Well that failed spectacularly so back to the drawing board.
My rank for part 1 and 2 was 2525/3949. Once I realised my mistake, completing part 2 was super quick to write and instead of almost double the rank it would have been less then part 1.
Oof. Now we're into the extra-meaty problems
C#
Part 1
Part 2
Helpers
Commentary
I think I lucked out for this one - for part 1, I could tell it was context-free, but I didn't notice that it was also regular, so I didn't even _try_ to do it with a regex engine. Instead, I rolled my own recursive parser, and I was surprised at how quick it turned out - Part 1 was solved in only 96ms.Part 2 was definitely a problem-specific hack, as Eric recommended. My recursive parser will handle the recursion on Rule 11 correctly, which is where I was very glad that I hadn't gone with the pure regex approach. The problem was that it was greedy on Rule 8, and couldn't backtrack to a deeper Rule 8 match if Rule 11 failed to parse. So there's some pure hackery for those two cases, but indeed they work. I'm kinda curious as to how far this approach would generalize if I expanded my rule definitions slightly.
Python
Repo Link
Part 1
Oof, these are getting tough. Once again, I went for a recursive approach and basically tried to match the letters in the message to the corresponding state. This means that a depth-first search on the next states that correspond to the current state in the set of rules.
The trickiest part of this approach was figuring out what to return in order to allow for backtracking (if we go down a path and it ultimately doesn't match, then we need to backup and try an alternate path). Eventually, I settled on checking if the message generated by the recursive call was different than the original message, then that means the recursive call was successful because tokens were consumed.
At the end, if the
match_rule
returns an empty message then all the states in the initial rule were satified, otherwise, if there is a non-empty message, then that is the remaining unmatched sequence (which will be useful for Part 2).Part 2 (diff)
For Part 2, I spent a while trying to change my match_rule function to handle loops, but couldn't figure it out. After seeing the hint from @PapaNachos about counting matches for rules
42
and31
, I saw that I could use my existingmatch_rule
function to consume all the42
patterns at the beginning of the message and all the31
patterns at the end of the message. Once I've done that, I just needed to check if the whole message is consumed and if the number of42
patterns was at least2
, the number of31
matches was at least1
, and the difference between the two matches was at least1
I tried to do this one entirely with logical regex, a bold choice given that my familiarity with regex was low to start with. I learned later that it was an especially bold choice because part 2 is literally impossible to do with a single, pure regular expression, because it's asking for something that's not a regular language! Several disgusting hacks later, I had a solution inspired by some of the others in the thread, but I don't know how many more of these are in my wheelhouse.
Part 2, and mostly part 1 but I forgot to commit it
SOS! I took the approach of compiling the rules into a single giant regex for part 1, which produces the correct answer for the small sample input but gives a wrong answer for my real puzzle input. I'm totally stumped on what I'm doing wrong, as the regex pattern generation is mostly straightforward.
For those of you who did the giant-regex approach (@OrangeBacon @PapaNachos @nothis, looks like), could one of you share your input and the regex pattern that was generated from it? I just need one more good sample to compare with what my code produces. 🙏
I feel your pain, this happened so often for me!
I'm posting my input and regex (without loops, with loops it breaks tildes' 50000 character limit!). Note that I'm posting the raw output from my regex-text generator. It's an intimidating wall of regex and I put it through re.compile("^" + regex_string + "$"), which outputs a much shorter string. I assume there's a way to detect and compress redundancies during generation but this is already at the outer fringes of what I can wrap my head around.
If you can easily run python, you can just copy-paste my code above and generate the code yourself, maybe also for a much shorter rule set. I'm also posting my regex for the short test set in the instructions.
Input
Regex (without loops)
Testdata
Testdata-Regex
Aha! I had an assumption in my parsing logic that the "disjunction" rules always have exactly two numbers on either side of the pipe, but in my full input there's a single case of just one number on either side:
54 | 117
. I'm kind of ticked off that that wasn't made clear in the sample input or the instructions.Thanks so much for sharing this, you helped me make sure I was creating the regex pattern correctly so I could track down the issue elsewhere.
Well, that's plain evil! I think I remember this for the "42 | 42 8" replacement, which only has one rule to the left!
Generally, the only thing that bothers me about these is that such gotchas are often not covered by the test data. It turns half an hour challenges into much longer bug-tracking ordeals that I don't think are appropriate for what is supposed to be, ultimately, a fun Christmas puzzle thingy.