Spoiler Alert! This post contains details of how I solved this problem. If you want to try this problem yourself, I strongly recommend you don’t read on. Coming up with your own method to do this efficiently is all the fun.
A few weeks ago my friend Albert nerd-sniped me with this coding problem from Quora, which they seem to use in their hiring process. I was immediately intrigued, and spent the next day or so on it.
It goes like this: A datacenter needs to be cooled using a duct. The duct must pass through every room of the datacenter exactly once, except for rooms that can't be crossed (marked by a 1). It must start at the entrance (2) and end at the exit (3). The task is not to find one way of doing this, but rather to count how many ways there are to do it. I won't reproduce the specifics, but here is a snippet:
Here is an example datacenter:
2 0 0 0
0 0 0 0
0 0 3 1
There are two possible ways to run the duct here:
2--0--0--0
         |
0--0--0--0
|
0--0--3  1
or
2  0--0--0
|  |     |
0  0  0--0
|  |  |
0--0  3  1
Write a program to compute the number of possible ways to run the duct. For the above example, the correct answer is 2.
Check their challenge page for specifics.
My first implementation was a simple forward search. Start at the entrance and iteratively grow the tubing one step at a time. I implemented this with a queue, which contained all the legal tubing variations for a given length N. From this I generated all the possibilities for length N+1 by growing by one in all possible directions.
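To make the idea concrete, here is a minimal sketch of such a queue-based forward search. It is not the code from this post: the names `Grid`, `Partial` and `countDuctsNaive`, the 64-bit bitmask for occupied rooms (which limits the grid to 64 cells) and the assumption of a rectangular grid are all choices made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

// Puzzle codes: 0 = free room, 1 = blocked, 2 = entrance, 3 = exit.
using Grid = std::vector<std::vector<int>>;

// One partial tubing (hypothetical layout): which rooms it occupies and
// where its growing end currently is.
struct Partial {
    std::uint64_t visited;  // bit (r * cols + c) is set if room (r, c) is used
    int r, c;               // current end of the tube
};

long long countDuctsNaive(const Grid& g) {
    assert(!g.empty() && g.size() * g[0].size() <= 64);  // must fit the bitmask
    const int rows = static_cast<int>(g.size());
    const int cols = static_cast<int>(g[0].size());
    auto bit = [cols](int r, int c) { return std::uint64_t(1) << (r * cols + c); };

    std::uint64_t all = 0;  // every room the tube has to pass through
    Partial start{0, -1, -1};
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            if (g[r][c] == 1) continue;
            all |= bit(r, c);
            if (g[r][c] == 2) start = Partial{bit(r, c), r, c};
        }
    assert(start.r >= 0);  // the grid needs an entrance

    const int dr[] = {1, -1, 0, 0};
    const int dc[] = {0, 0, 1, -1};
    std::queue<Partial> pending;  // FIFO order: all length-N tubes before N+1
    pending.push(start);
    long long solutions = 0;
    while (!pending.empty()) {
        const Partial p = pending.front();
        pending.pop();
        if (g[p.r][p.c] == 3) {
            // The tube ends at the exit; it only counts if every room is used.
            if (p.visited == all) ++solutions;
            continue;
        }
        for (int d = 0; d < 4; ++d) {  // grow by one room in each direction
            const int nr = p.r + dr[d], nc = p.c + dc[d];
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
            if (g[nr][nc] == 1 || (p.visited & bit(nr, nc))) continue;
            pending.push(Partial{p.visited | bit(nr, nc), nr, nc});
        }
    }
    return solutions;
}
```

Calling `countDuctsNaive` on the 3×4 example datacenter above should give 2. Every partial tube sits in the queue as its own entry, which is where the time and memory go.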
This proved to be disastrously slow, and memory consumption was high, too. Since the branches of the search tree grow exponentially, it is highly beneficial to prune them as early as possible. I did this in 3 ways.
- Merging equivalent branches: To find new solutions it really only matters whether a given room already contains a tube or not. The direction of the tubes is irrelevant. This means that leaf nodes in the tree that have the same end position and have passed through the same rooms can be merged into one. The number of solutions derived from this merged leaf needs to be multiplied by the number of original variants it contains.
- Prune branches that cut off areas: If a part of the datacenter is cut off and can't be reached from the current end of the tubing, the branch will not result in valid solutions, so it can be pruned.
- Dead ends: This was the last optimization I came up with, and it was surprisingly effective. Basically, to pass through a room, it must have 2 or more adjacent rooms free (the exit is the exception, since the tube ends there). If there is only one free, this means the tube can enter, but can't leave any more. As a result all the branches that contain this pattern are doomed and can be pruned.
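The three prunings above can be sketched as follows. This is again an illustration under assumptions rather than the original code: `State`, `countDucts` and the `doomed` helper are hypothetical names, the occupied rooms are packed into a 64-bit mask (so at most 64 cells), the grid is assumed rectangular, and a `std::map` keyed on (occupied rooms, tube end) does the merging. The dead-end test counts the current tube end as a possible way into a room, and skips the exit room.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 0 free, 1 blocked, 2 entrance, 3 exit
using Mask = std::uint64_t;                  // bit i set: room i holds a tube
using State = std::pair<Mask, int>;          // (occupied rooms, index of tube end)

long long countDucts(const Grid& g) {
    assert(!g.empty() && g.size() * g[0].size() <= 64);  // must fit the mask
    const int rows = static_cast<int>(g.size());
    const int cols = static_cast<int>(g[0].size());
    Mask all = 0;  // every room the tube has to pass through
    int start = -1, goal = -1;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            if (g[r][c] == 1) continue;
            all |= Mask(1) << (r * cols + c);
            if (g[r][c] == 2) start = r * cols + c;
            if (g[r][c] == 3) goal = r * cols + c;
        }
    assert(start >= 0 && goal >= 0);

    // Indices of the non-blocked rooms next to a given room.
    auto neighbours = [&](int cell) {
        const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
        std::vector<int> out;
        for (int d = 0; d < 4; ++d) {
            const int nr = cell / cols + dr[d], nc = cell % cols + dc[d];
            if (nr >= 0 && nr < rows && nc >= 0 && nc < cols && g[nr][nc] != 1)
                out.push_back(nr * cols + nc);
        }
        return out;
    };

    // True if a partial tube (rooms `used`, ending at `end`) cannot succeed.
    auto doomed = [&](Mask used, int end) {
        // Cut-off areas: flood fill from the tube end through free rooms.
        Mask seen = Mask(1) << end;
        std::vector<int> stack{end};
        while (!stack.empty()) {
            const int cell = stack.back();
            stack.pop_back();
            for (int n : neighbours(cell)) {
                const Mask bit = Mask(1) << n;
                if ((used | seen) & bit) continue;
                seen |= bit;
                stack.push_back(n);
            }
        }
        if ((used | seen) != all) return true;  // some free room is unreachable
        // Dead ends: a free room (other than the exit) needs a way in and out.
        for (int cell = 0; cell < rows * cols; ++cell) {
            const Mask bit = Mask(1) << cell;
            if (!(all & bit) || (used & bit) || cell == goal) continue;
            int open = 0;  // free neighbours, plus the tube end if adjacent
            for (int n : neighbours(cell))
                if (n == end || !(used & (Mask(1) << n))) ++open;
            if (open < 2) return true;
        }
        return false;
    };

    // Merging: one map entry per (rooms, end), carrying its number of variants.
    std::map<State, long long> level;
    level[State(Mask(1) << start, start)] = 1;
    long long solutions = 0;
    while (!level.empty()) {
        std::map<State, long long> next;
        for (const auto& entry : level) {
            const Mask used = entry.first.first;
            const int end = entry.first.second;
            const long long ways = entry.second;
            for (int n : neighbours(end)) {
                const Mask bit = Mask(1) << n;
                if (used & bit) continue;
                const Mask grown = used | bit;
                if (n == goal) {  // the tube stops at the exit
                    if (grown == all) solutions += ways;
                    continue;
                }
                if (!doomed(grown, n)) next[State(grown, n)] += ways;
            }
        }
        level.swap(next);
    }
    return solutions;
}
```

Both checks are only necessary conditions, so they never discard a branch that could still lead to a solution; for the 3×4 example datacenter `countDucts` should still return 2. Running them on every new state is the simplest choice, not necessarily the fastest one.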
The resulting algorithm is quite speedy. I wrote it in C++, using STL containers for all my data structures. I was only really interested in the algorithmic optimizations, and didn't worry too much about memory optimizations or other tricks to shave off milliseconds. I also applied all of these optimizations at every step. It might well be that their effect is beneficial in the beginning but outweighed by their cost in the final stages, when there is not much left to prune.
Another optimization I thought of but never implemented:
- Uneven number of exits: If an area can only be reached through a number of bottlenecks of width one, that number can only be even if the exit is on the same side as the current tube end, and can only be odd if they lie on different sides. Otherwise the configuration will eventually end up with a dead end as described in the third optimization.
I’m not sure how much time this would save, since the check is quite elaborate compared to the other tests I mentioned.
The reference problem with the 7×8 grid ran in about 15 seconds on my 2 GHz Core 2 Duo. It found 301716 solutions. Larger examples take significantly longer, so I guess the solution still has exponential scaling behavior, which is probably not surprising since it is very similar to a Hamiltonian path problem.
The code is not really fit for public consumption, but if someone asks for it I might clean it up and post it here.