Phonons

Photography & Music, Science & Code

My solution to the Quora challenge

with 7 comments

Spoiler Alert! This post contains details of how I solved this problem. If you want to try this problem yourself, I strongly recommend you don’t read on. Coming up with your own method to do this efficiently is all the fun.

A few weeks ago my friend Albert nerd-sniped me with this coding problem from Quora, which they seem to use in their hiring process. I was immediately intrigued, and spent the next day or so on it.

It goes like this: A datacenter needs to be cooled using a duct. It is required to pass through every room of the datacenter exactly once, and some can’t be crossed(marked by a 1). It must start at the entrance(2) and end at the exit(3). The task is not to find a way of doing this, but rather counting how many ways there are to do it. I won’t reproduce the specifics, but here is a snippet:

Here is an example datacenter:

2  0  0  0

0  0  0  0

0  0  3  1

There are two possible ways to run the duct here:

2--0--0--0
         |
0--0--0--0
|
0--0--3  1

or

2  0--0--0
|  |     |
0  0  0--0
|  |  |
0--0  3  1

Write a program to compute the number of possible ways to run the duct. For the above example, the correct answer is 2.

Check their challenge page for specifics.

My first implementation was a simple forward search. Start at the entrance and iteratively grow the tubing one step at a time. I implemented this with a queue, which contained all the legal tubing variations for a given length N. From this I generated all the possibilities for length N+1 by growing by one in all possible directions.

This proved to be disastrously slow, and memory consumption was high, too. Since the branches of the search tree grow exponentially, it is highly beneficial to prune them as early as possible. I did this in 3 ways.

  • Merging equivalent branches: To find new solutions it really only matters wether a given room already contains a tube or not. The direction of the tubes is irrelevant. This means that leaf nodes in a tree that have the same ending, and have passed through the same rooms can be merged into one. The number of solutions derived from this merged leaf needs to be multiplied by the number of original variants it contains.
  • Prune branches that cut off areas: If a part of the datacenter is cut off and can’t be reached from current the end of the tubing, the branch will not result in valid solutions. They can be pruned.
  • Dead ends: This was the last optimization I came up with, and it was surprisingly effective. Basically, to pass through a room, it must have 2 or more adjacent rooms free. If there is only one free, this means the tube can enter, but not leave any more. As a result all the branches that contain this pattern are doomed an can be pruned.

The resulting algorithm is quite speedy. I wrote it in C++, using STL containers for all my data structures. I was only really interested in the algorithmic optimizations, and didn’t worry too much about memory optimizations or other tricks to shave off milliseconds. Also, I performed all these optimizations all of the time. It might well be that their effect is beneficial in the beginning but outweighed by their cost in the final stages, when there is not much left to prune.

Another optimization I thought of but never implemented:

  • Uneven number of exits: If there is an area that can only be reached by a number of bottlenecks of width one, this can only be even if the exit is on the same side as the current tube end, and uneven if they lie on different sides. Otherwise the configuration will eventually end up with a dead end as described in the third optimization.

I’m not sure how much time this would save, since the check is quite elaborate compared to the other tests I mentioned.

The reference problem with the 7×8 grid ran in about 15 seconds on my 2Ghz Core2Duo. It found 301716 solutions. Larger examples take significantly longer, so I guess the solution still has exponential scaling behavior, which is probably not surprising since it is very similar to a hamiltonian path problem.

The code is not really fit for public consumption, but if someone asks for it I might clean it up and post it here.

Written by Thomas

30.05.2010 at 19:18

Posted in Coding

Tagged with , , ,

7 Responses

Subscribe to comments with RSS.

  1. Hi, Thomas, I agree with your first and second optimization; however, I am a little bit confused about the third one. For example, in the example you showed before, when you reach (x=0,y=2), it is a dead end, right? but you cannot prune it. Could you please explain more?

    Lixing

    29.12.2010 at 03:29

    • I’m not sure I understand your problem. Which route are you looking at, and what coordinate system? The duct exit and the current position get counted as free, too. does that clear it up for you?

      Thomas

      3.01.2011 at 12:54

  2. Hi there, I stepped on this challenge by accident, I thought it would be a great exercise to experiment with pointers and linked lists in C (not C++). I used a recursion algorithm with one data structure and only one condition: to avoid the loose ends which might be created in a new room passage attempt. My results were 301716 in 4 seconds in a DELL laptop 2.33Ghz 2-core CPU.
    Have fun..

    Andreas

    17.02.2011 at 09:33

    • Hi, Quite impressive! But could you elaborate on the “loose ends which might be created in a new room passage attempt?” Thanks

      Yong

      3.01.2012 at 02:20

      • Hi, here is the trick in words.

        While being in a room, you don’t just jump to the next available, you need to make some investigation first. You go and ask your neighbors, what do they offer? Then you evaluate your findings and pick the most promising one. That is what I call “passage attempt”.

        But the real secret, as in real life, is how you value your findings. I called it “loose ends” or “the most wicked link”.

        If you need more, I can help, but don’t miss the fun…

        Andreas

        11.01.2012 at 22:50

  3. Hi Thomas, from what I have understood from your post I believe that you are doing a customized bfs algorithm with above mentioned 3 optimizations. I would like to know how did you implemented “Merging equivalent branches”? I believe implementing this would lead to a very high space requirement – storing state info for each branch and comparing them for merging. Please elaborate on this topic.

    Jack

    29.10.2013 at 18:36

  4. My solution written in C++ takes 2.5 seconds on a 2GHz Core2Duo machine. The techniques include early pruning, and traversing from both start and destination with each traversal going N/2 steps, and match up all pairs of complementing pieces. Given the total cells to be traversed is set G, for a subset g_i of G with size = |G|/2 including the starting cell S, there are p_i ways to traverse from S. Similarly there are q_i ways to traverse G – g_i starting from the destination cell D. Then the total number of solutions is sum(p_i * q_i) over all possible g_i. Fast pairing of complementing subsets is achieved using binary AND operation with each bit representing one cell.
    I can provide more details on the algorithm and post my code somewhere when I get more time.

    Cheng

    12.04.2014 at 08:51


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s