8
votes

# Day 11: Cosmic Expansion

## Today's problem description: https://adventofcode.com/2023/day/11

Please post your solutions in your own top-level comment. Here's a template you can copy-paste into your comment to format it nicely, with the code collapsed by default inside an expandable section with syntax highlighting (you can replace `python`

with any of the "short names" listed in this page of supported languages):

```
<details>
<summary>Part 1</summary>
```python
Your code here.
```
</details>
```

Had an irritating error that prevented me from getting a leaderboard spot on part 2, but other than that, today was a nice break after yesterday. I did part 1 pretty naïvely, so had to rewrite my solution for part 2.

## Solution

This feels like one of those problems that quality-checks how well thought out your initial implementation was, because I only narrowly dodged having to rewrite the whole thing for Part 2.

## Part 1 Solution

We can start this one with the most obvious implementation, which is:

Going through all the rows & columns, mark off all the columns and rows that have a galaxy (or, equivalently, all the columns/rows that don't).

Create a new map doubling all those rows/columns and place all the galaxies appropriately in the new map.

Run a BFS from each galaxy to each other galaxy.

This is probably good enough to get you through this part, though it might already be chugging a bit. We'll start by noting that a BFS actually isn't necessary. If you play around a bit, you might be able to see that the shortest path between 2 galaxies is to start at one galaxy and then go vertically until you're on the same plane as the other one, and then just go horizontally until you're at the other galaxy.

To capture this with an algorithm, if we took the X and Y positions of the two galaxies then the distance between those two galaxies is just the absolute differences between the X and Y coordinates (in math terms this is how distance works in the Taxicab Geometry).

We could go further here because the mere mention of "X" and "Y" positions might already be telling you things about how to optimize this, but if we do too much then there's nothing to write for Part 2! So we'll leave it here for now.

## Part 2 Solution

This is where the solution above gets quality-checked out of existence. Even if you can check each galaxy in constant time, building a map that's millions by millions doesn't quite work.

But this is where some extra insight into what we're actually doing with our algorithm hints at a way to compress this map. Namely, we don't care about most of the map, we only care about the galaxies!

If we built compressed map that only contains the X and Y positions of the galaxies, then we can just use taxicab distance for all the pairs and solve everything right then and there.

The problem is getting those positions. But let's look more carefully how we're building that map. Namely, if on the un-expanded map a galaxy is at row R and column C, every empty row before R contributes a million* expanded empty rows to R in the expanded map, and every empty column before C

independentlycontributes a million* expanded empty rows to C in the expanded map.^{*Alright it's a million minus one, because the empty row itself is already counted in R (this is easier to see in Part 1 where it's just two rows).}That is, in the expanded map,

`(expanded-R, expanded-C) = (R + (1000000-1)*(# of empty rows before R), C + (1000000-1)*(# of empty columns before C))`

Which we can figure out just by counting the number of empty rows before each R and each C.

Now once we've collected the galaxies, we can mark off which rows/columns are empty and collect a record of all the rows/columns that are empty, and build the expanded map using the above algorithm to get the answer.

## Bonus Solution Discussion

We found a solution that is O(n^2) in terms of number of galaxies, which is definitely good enough for this problem. But what if I told you it isn't

asymptotically optimal? Alright this one isn't nearly as ridiculous as the other bonus that involved polynomial interpolation, but it's still pretty cool.The first thing to note is that the taxicab algorithm calculates the X and Y differences independently, and then just adds them together. So what if we uncoupled those and also calculated the X and Y parts of the taxicab distance independently as well?

That would look like:

If you precompute those sums in the last equation, the whole thing can be done in O(n).

Now in this case you could argue that it's O(n log n) because you need to sort to get the X/Y coords in order, but because of the way our input is formatted there's actually a hidden second factor to the complexity, which is just reading the input at all.

If we scan left to right then top to bottom (like usual) then we get all the Y coordinates in order for free.

Similarly, if we scan top to bottom first and then left to right then we can also get the X coordinates in order, since we don't need to associate these coordinates to any particular galaxy.

## Part 1/2 Code

A fun day! I'll admit my initial approach was to actually manipulate the array, which obviously needed reworking for part 2. After writing part 2 I went back and refactored as there was no need to have two different approaches.

## Haskell Solution

I've been noticing how much my approach to problems has changed using Haskell. Little things like the ease of defining new data types or having list tools like

`transpose`

at hand make me much more likely to work them into my solution. It's surprising how minor language difference will nudge you towards defaulting to certain tools.Hah! I predicted exactly what they were gonna do for Part 2.

## Algorithm

The idea is to store the x/y distances as a Prefix sum. Each row/column on the map is assigned the distance light travels when crossing that row/column, and thanks to the prefix sum I can lookup quickly how far apart two

`x`

or two`y`

coordinates are.And since the distance between two galaxies is the manhattan distance here (because we cannot move diagonally), knowing the distances along the

`x`

and`y`

axes is enough to compute the distance between galaxies.## Rust

## Performance

Golang solution

Not too bad today, but I found the problem presentation sneaky. Spent too much time on blind alleys in Part 1. I should plan more before I get into code.

## Discussion with spoilers

I am irritated with myself for not planning more for part 1. I just sort of jumped into implementation and wasted a lot of time expanding the grid array and dealing with the fact that once you

start expanding it, your row numbers are for the old grid, so you have to account for that.

It cost me a lot of time, so I felt pretty dumb when I got into part2 and it made me realize I didn't actually need to expand the grid, just track how many empty rows and columns are crossed on the shortest path.

For computing the distances, I at least noticed that you didn't actually have to stair step down, it is just the sum of the row difference plus the column difference.

I had an unbelievably dumb mistake that took me like 30 minutes to work out on part 2, really sad about that. Went from ~2500 on part 1 to ~5500 on part 2

Python solutions

## grid additions

I realized my VisitCells function was written with the assumption of a dense grid, so I added a VisitActive function that only visits populated cells, more useful for a sparse grid.

visiting all unique pairs seemed like a useful generic function too, so I added that to the library instead of putting it in the code just for today.

## part 1 discussion

a useful trick is that the distance between two galaxies is the Taxicab distance and you don't actually need to search for it, it's just the distance in rows plus the distance in columns.

## part 1

## part 2

the sparse implementation of my Grid ends up being a huge help here.

## performance

no meaningful difference in runtime between parts A and B, as expected. but running over the example input yields a noticeable difference:

140x140 instead of 10x10, so about 200 times larger input yielding about 1000 times slower performance.

## edit: performance take 2

I had a suspicion about where my bottleneck might live - I implemented

`VisitPairs`

in a very brute-force way. the algorithm is inherently O(N^2) in runtime, but my simple implementation also uses O(N^2) memory, which is suboptimal. here's a better version that only needs O(N) memory, and allocates it in one shot to avoid resizing the array:and I was expecting that to give me a speedup, but holy crap:

I've been writing Kotlin this long weekend. It's nice, but it can be janky if something is missing. Now to use Kotlin on AoC...

For second part I tried adding the factor variable in, but that only cause OOM. So, I added

`virtualExpand`

that recompute star positions based on the rows. While doing that, I totally forgot about the mutating mid-loop thing...ElixirThis one was pretty straightforward. One of the more esoteric

`Enum`

functions,`flat_map_reduce/3`

, came in handy here.## Parts 1 and 2

I had some extra time to tweak things and see how they affected performance, so parts of the code are a bit more complicated or weird than they need to be, but hopefully it still mostly makes sense.

To avoid having to deal with a sparsely-filled grid after expanding along one axis, I did the expansions separately (and concurrently), with each producing a map looking like

`original_galaxy_coordinates => new_x_or_y_coordinate`

. Then, I merged the two maps together to get the full new coordinates of each galaxy.Rust## Day 11

Easily, easily my slowest day in terms of runtime (~300 ms), probably due to my liberal use of hashsets and hashset lookups. Also definitely at the stage where I cant do every day in time anymore.

https://git.venberg.xyz/Gabe/advent_of_code_2023/src/branch/main/days/day11