9
votes
Day 12: Hot Spring
Today's problem description: https://adventofcode.com/2023/day/12
Please post your solutions in your own top-level comment. Here's a template you can copy-paste into your comment to format it nicely, with the code collapsed by default inside an expandable section with syntax highlighting (you can replace python
with any of the "short names" listed in this page of supported languages):
<details>
<summary>Part 1</summary>
```python
Your code here.
```
</details>
This problem isn't really quality-checking your initial implementation so much as it is daring you to do the straightforward solution so it can pull the rug out from under you.
Part 1 Solution
Like yesterday if we go with the future-proof solution here we won't have anything to discuss for Part 2, so we'll spend a bit of time checking out the most obvious solution.
Namely, just try all the combinations and see what works!
I imagine you could have a hideous for loop do the manual checking since the array size is limited, but we can at least spare ourselves the dignity here of using recursion to deal with that instead. I would imagine such a solution could pass to itself the context of which part of the string you're building and where in the list you are, like something like:
This solution is pretty crude and elides some detail, but the general idea should work for a real solution.
If you program it well (and you don't need to program it well to get this part), then your code will only make a few calls per valid assignment, assuming there aren't many assignments, it should be pretty fast.
My answer was on the order of thousands, so it really doesn't take that long.
Part 2 Solution
Now it's time for the rug pull, because this expanded version absolutely takes that long (my answer was on the order of 1012).
Let's go back to the original problem and our original solution.
Did you notice an optimization for our
COUNT_SOLUTIONS
search routine? Namely, if thestring_candidate
is known to not match the reference string prefix, then we can cut it off early. This sounds simple, but it means that we can actually haveCOUNT_SOLUTIONS
be in terms ofstring_index
andarray_position
(i.e. we know that the string is correct up tostring_index
and we're trying to build the rest of the string afterwards).But now we realize something even more interesting: If we call
COUNT_SOLUTIONS
with the samestring_index
andarray_position
every time, it will always return the same answer, but our current search algorithm calls itself billions of times to count every solution one by one. Why process all those calls forCOUNT_SOLUTIONS
when we can just call eachstring_index
andarray_position
once and cache the answer to return any time it gets called again?This insight represents a ridiculous speedup (and a complete rethinking of the solution's complexity which we'll get back to later), because it saves not only the immediate call to
COUNT_SOLUTIONS
but also every recursive call that it would have made, and every call that that call makes, etc.So, after rewriting the
COUNT_SOLUTIONS
to be based off ofstring_index
andarray_position
we can stick an map/dict/2d array somewhere that, if we encounter a call where we've already calculated the answer, just returns it immediately.Now back to the complexity. It's much faster, but how much faster?
Well we can answer that with how our function is called and what it does.
There are at most (string length) values of
string_index
and at most (array length) values ofarray_position
, and most importantly every corresponding calculation withCOUNT_SOLUTIONS
is done at most once (because every call after that is just a map/dict/array lookup).Then, for each call,
COUNT_SOLUTIONS
does some string building/checking and makes 2 other calls toCOUNT_SOLUTIONS
.This gives means that we do at most
(string length)*(array length)*(string comparison cost)
operations, which is much faster than the(weird, could be really bad)
complexity that it was before and definitely enough to solve the expanded version of the problem they're throwing at us now.Just to note, the strategy we used here is has the wonderfully undescriptive computer science name of Dynamic Programming, which can be used to find solutions to some very weird and hard problems, though with the distinct downside of the solutions themselves being strange looking and difficult to describe.
C++ Part 1/2 Code
I have to preface this with the fact that my solution doesn't actually do
COUNT_SOLUTIONS
in the same way as above, but it uses the same dynamic programming idea of re-using existing counts.Ugh, this one kicked my ass. Ran out of steam last night and didn't have much time today, but I was barking up the completely wrong tree for a while. I was very tired when I started the problem, so that meant I got pretty far up the blind alley and had a hard time abandoning the whole strategy.
Golang solution
Discussion
I was treating it like a DFS problem (walking through trying each value for each bit), which worked okay Part 1, but didn't scale well. So I was hammering on pruning optimizations for a good while before I had to go looking for hints. In the end, I don't think I was ever going to get there because I was incrementally setting the unknowns, measuring the groups (up to the first unknown that hadn't been set), then seeing if the measured group was compatible. This meant I was also always working with the whole string, and the approach didn't make it easy to tell which parts of the input were related to which groups.
When I started the problem last night, the scalable solution (recursively peeling groups off the front of the row) seemed like it would require too much backtracking to get right. I was missing the insight that any good spring must separate a group. That greatly simplifies the peeling logic and the number of cases to consider. In the end I'm pretty happy with my solution. It has a simple flow and not too many edge cases in the logic.
One other good outcome, I learned what memoization is.
Wondering if anyone could point me in the right direction for this day's part 2? I tried for hours to get it working but hadn't ever done dynamic programming before and so wasn't able to figure out what to memoize to make my program fast enough.
My code is probably dynamic programming. If I have the input
.###..# 3,1
the algorithm works by#
,.
or?
.
, then it is not relevant. The result of.###..# 3,1
is the same as###..# 3,1
. So, we recurse with new input###..# 3,1
#
, then count all continuous#
or?
next to it.#.
#?.
or premature end of string)###
. The fourth must be either end of string,.
or?
(which can only be resolved as.
). Check that it is so..# 1
.So with that you can now validate whether a solution is valid or not, but that is only a single solution. Now the
?
comes into play. Let's say the input is?##?.? 3,1
?
. It can be either#
or.
#
, then, similar to step 3-5 previously we check that the first 3 characters are###
. And the 4th character is either end of string,.
or?
(which can only be.
).?##?
=###.
. The input is now.? 1
. Recurse and count the number of solutions.
, then as mentioned in step 2.
can be removed. The input is now##?.? 3,1
. Recurse and count the number of solutions.As for the caching part, if you have the same inputs (map & blocks) then there's only one correct answer. So you can use both of them as cache key and lookup the cache first before actually recursing.
I found that for part 1, actually building out the solutions help in debugging. It is not scalable enough for part 2, however. Bugs may cause your algorithm to run wild (like I added an extra
?
at the end of part 2 input), so good luck.Welp, my updated solution for part 2 is passing unit tests (including checking the count for each line individually) but giving too high an answer for the real input. It seems like I'm missing some edge case that exists only in the real input, but I couldn't begin to guess what that is.
Could someone who has a working solution share their puzzle input, with each individual line tagged with its individual arrangement count? Only if you have time to help a stranger on the internet, ofc.Maybe via pastebin / GH gist, considering how big the input is...
edit: I figured it out! I wasn't checking that the arrangement contains no '#'s after the last contiguous segment.
Parts 1 and 2, now working
After changing it to run on all lines concurrently, and tuning the cache table appropriately, it solves part 2 in about 1.8 sec on my machine. :)
Fun day, and continuing to step up the difficulty!
Overall I'm not especially thrilled with my solution; after the manipulation to make it memoized it's no longer very readable (and I'm too tired to bother cleaning it up).
I am, however, happy to have learned how Haskell can implement memoization with just a list. It took some mangling to convert my solution into taking a single
Int
parameter, but after that the conversion to the memoized form was neat.Haskell Solution
As a huge paint-by-numbers fan, I appreciate the setup of this problem quite a bit.
Part 2 notes
After an easy part 1, I tried to ~~ math ~~ this with a partially-recursive, partially-combinatorial method. Then I gave up after an hour or more and did it straightforward. Had a couple dumb mistakes (what a surprise) but eventually got there.Python solutions
Today's language is JavaScript. I wanted to use TypeScript, but I'm lazy to setup the compiler. A trailing separator in my join implementation make me rewrote part 2 into recursiveless, before realizing it would be very complicated. I spotted the error during the rewrite.
Part 1
My actual part 1 implementation use exception to signify "invalid solution". Later on the the catch block catches the recursion depth limit error and make it return the wrong results. So during part 2 rewrite I had to change it to returning empty list.Part 2
The part 2 version do not visually build the output (it was useful in part 1 to debug any mistakes). Otherwise it is mostly the same as part 1. Adding cache also help improve large running time.God, this one was hard. I didn't complete it within 24h. The algorithm I came up with was sound, but once again, I lost hours upon hours in implementation bugs. At one point I was exhausted enough to screw up my unit tests, which made further progress extremely difficult... but I'm proud that I finally got through it after two days, and without any spoilers, too.
Algorithm
For part 1 I wrote a RegEx template that validated the strings, it looks like this:
^(\.|\?)*(\#|\?){4}(\.|\?)+(\#|\?){1}(\.|\?)+(\#|\?){1}(\.|\?)*$
for a row with4,1,1
as the consecutive broken spring numbers. I knew this was highly sub-optimal, but I just threw that in a Breadth-First Search that successively replaced the question marks and it worked fairly quickly, because the RegEx pruned away the invalid states. Had I not spent quite a bit of time on a more elegant solution by that time (because I was fairly certain this was another one of those 'make the naive solution exponentially slower' part 2's), it would have been a quick win...Since the RegEx worked so well, I decided to try and use them for Part 2 as well. The idea was to generate a simple finite state automaton (a simplified version of what is used for regular expressions) for each row, that would
accept
if the row was 'proper' and then count the number of times we end up in theaccept
state, similar to Day 5. Yes, this is technically dynamic programming.Rust
Performance