Wilan: July 2009

Friday, July 31, 2009

TC SRM387 D2 1000 (MarblesRegroupingHard)

The problem statement can be accessed via:
http://www.topcoder.com/stat?c=problem_statement&pm=8538

There are two major approaches to solve this problem. But first, let's start with breaking down the problem, it asks us to arrange a set of jumbled up marbles into a correct box configuration. We can try a greedy strategy in which we heuristically move coloured marbles to boxes in which they have the most presence, aiming to reduce the number of moves. However it was rather easy to construct a counter-case towards this strategy. So we are faced with either doing a smart enumeration or try to reduce it towards a graph matching-like problem and hope that a standard graph algorithm solves it.

To try to reduce it to a matching algorithm we attempt to divide it into a bipartite graph. The obvious choice is colours on one side and boxes on the other side. Now we need to give a relationship between the two disjoint subsets - again, pretty obvious: the cost associated with moving the colour into a specific box. Hence we now have a weighted bipartite graph in which we need to match for minimum cost. Voila! The standard hungarian algorithm and/or minimum cost flow solves this perfectly. However, given the fact that we may not have an implementation (or even know of the algorithms) we can try and do some smart enumeration.

Enter dynamic programming (again), and it's pretty similar to SRM 444 D1 500. If we traverse the boxes and gradually expand our set of available boxes whilst keeping track of the colours we have used, we can effectively DP on this state (as it contains both the essential information to optimally decide options as well as being compact enough to be stored within the memory limit of 64MB). To keep track of the colours we simply keep a bitmask and since the maximum number of colours in the worst case is only 14, this means a worst-case space complexity of O( B * 2 ^ C ) = 50 * 16384 = 819200 (under 1MB).

As with all DP problems we then define the recurrence relation. To do this we simply consider a set of possible and valid decisions we can make at a specific DP state.

1. Skip the box (don't put any marbles into the box)
2. Assign colour i into the box (also ensuring we haven't used colour i yet due to the constraints of the problem - a simple bitwise-AND operation with the bitmask can check this)

Using these two decisions we get:

F(idx,mask) = max { F(idx+1,mask), F(idx+1,mask | (1 << colour)) + cost(idx,colour) }

We define cost(idx,colour) as the cost of moving "colour" coloured marbles into box idx. This is simple to calculate, we just need to keep track of the total number of marbles associated with that colour for all the boxes and move all those marbles except the ones that are already in box idx to that box. We define cost(idx,colour) as #Colour-Coloured Marbles - box[idx][colour].

Not forgetting base cases, an invalid configuration is when it reaches the end of the boxes and it still hasn't used up all the colours. Hackishly you can just return a value of (pseudo) "infinity" for this situation. Otherwise you return a value of 0 (no more additional costs/penalties associated - i.e. a valid configuration).

My implementation in the practice room follows (using the ideas above):


#define INF 1<<20

int memo[51][1<<15];
int sz; int numCol;
vector<vector<int> > vec;
vector<int> totalColors;

int func(int idx, int mask) {
  int res = 0;
  if (idx == sz) {
     return mask == (1 << numCol)-1 ? 0 : INF;
  }
  if (memo[idx][mask] != -1) return memo[idx][mask];
  res = func(idx+1, mask);
  // assign colour
  for (int i = 0; i < numCol; i++) {
     if (mask & (1 << i)) {
         // taken
         continue;
     }
     res <?= func(idx+1, mask | (1 << i)) +
          (totalColors[i] - vec[idx][i]);
  }
  return memo[idx][mask] = res;
}

int MarblesRegroupingHard::minMoves(vector <string> boxes) {
  int res = 0;
  totalColors = vector<int>(14,0);
  for (int i = 0; i < boxes.size(); i++) {
     istringstream iss(boxes[i]);
     vec.push_back(vector<int>());
     int val;
     int idxC = 0;
     while (iss >> val) {
         vec[i].push_back(val);
         totalColors[idxC++] += val;
     }
     numCol = idxC;
  }
  memset(memo,-1,sizeof(memo));
  sz = boxes.size();
  return res = func(0, 0);
}

TC SRM360 D1 500 (PrinceOfPersia)

The problem statement can be accessed via:
http://www.topcoder.com/stat?c=problem_statement&pm=7876&rd=10772

This problem lends itself to a graph problem as can be observed by the fact that, the input format is in a grid format and there are obstacles (forbidden cells) and its related to path accessibility between the prince and princess. One way to solve this problem is to use the Maximum Flow algorithm. Take note, although they are required to "meet" in any cell - it is sufficient enough for the prince (or princess) to go to princess (or prince) as it is not limited by time.

How does it relate to the Maximum Flow problem? We are basically asked to find the minimum cut and using the max-flow min-cut theorem we can calculate this by running a maximum flow algorithm through the graph. To see that what we really want is the minimum cut, let's revise the definition of a cut (in terms of graph theory of course): A cut is a partition of vertices of a graph into two disjoint subsets. So if we imagine the princess in one subset and the prince in another subset, the minimum cut between these two disjoint subsets is the number of accessible ways in which one can go to another.

So now we move on to applying the actual algorithm to the problem. If we let each empty cell be a node (or vertex) on the network and assign 1-capacity edges to adjacent empty nodes and apply flow from one "P" (source) to another "P" (sink) do we obtain the correct answer? Well, unfortunately no. The problem lies in which one "bottleneck" node (on the minimum cut set) can have an outgoing flow of more than 1, so it may over-calculate the answer. To see this, consider the example:

# . P
. . .
P . #

The central node in the snippet can have an incoming flow of 2 if both its adjacent nodes gives it a flow of 1. The problem with this is that the bottom-left P (assume it to be the sink) will have a maximum flow of 2 (and hence a minimum cut of 2). So our problem now is to ensure that each empty node will only give out a maximum of 1 out-going flow.

One wrong approach is to ensure that there is only an incoming flow of maximum 1 (i.e. 1 edge going into the node). The problem with this is that it totally breaks down the network layout. The solution is to further break each empty cell into 2 nodes: let's call them A and B. These nodes are used as "filters" or more precisely input and output filters. The edge from A->B will have a capacity of 1 which limits the maximum flow across the node to be 1 whereas the incoming and outgoing arcs of A and B respectively are infinite. This allows the network to retain the correct layout (and hence the correct answer) as well as fixing the original problem. We just need to ensure that all input arcs to the node use the sub-node A, and all output arcs from the node use the sub-node B.

Implementation-wise it's pretty standard if you follow the ideas discussed earlier. A sample implementation from the practice room is given below:


// the maximum number of vertices
#define NN 256

// adjacency matrix (fill this up)
int cap[NN][NN];

// flow network
int fnet[NN][NN];

// BFS
int q[NN], qf, qb, prev[NN];

int fordFulkerson( int n, int s, int t )
{
    memset( fnet, 0, sizeof( fnet ) );
    int flow = 0;
 // find an augmenting path of at least 1
    while( true )
    {
        memset( prev, -1, sizeof( prev ) );
        qf = qb = 0;
        prev[q[qb++] = s] = -2;
        while( qb > qf && prev[t] == -1 )
            for( int u = q[qf++], v = 0; v < n; v++ )
                if( prev[v] == -1 && fnet[u][v] - fnet[v][u] < cap[u][v] )
                    prev[q[qb++] = v] = u;
        if( prev[t] == -1 ) break;
        // get the bottleneck capacity
        int bot = 0x7FFFFFFF;
        for( int v = t, u = prev[v]; u >= 0; v = u, u = prev[v] )
            bot <?= cap[u][v] - fnet[u][v] + fnet[v][u];
        // update the flow network
        for( int v = t, u = prev[v]; u >= 0; v = u, u = prev[v] )
            fnet[u][v] += bot;
        flow += bot;
    }
    return flow;
}

int dx[] = {-1, 1, 0, 0};
int dy[] = {0, 0, -1, 1};

#define INF 999999

int PrinceOfPersia::minObstacles(vector <string> maze) {
 int res = 0;
 bool gotP = false;
 int source = 0, sink = 0;
 for (int i = 0; i < maze.size(); i++) {
  for (int j = 0; j < maze[i].size(); j++) {
      // assign the source and sink
   if (maze[i][j] == 'P' && !gotP) { 
     source = (i * maze[0].size() + j)*2; gotP = true; 
   }
   else if (maze[i][j] == 'P' && gotP) { 
     sink = (i * maze[0].size() + j)*2; 
   }
   // build connections
   if (maze[i][j] == '.') { 
     cap[(i * maze[0].size() + j)*2][(i * maze[0].size() + j)*2+1] = 1; 
   }
   else if (maze[i][j] == 'P') { 
     cap[(i * maze[0].size() + j)*2][(i * maze[0].size() + j)*2+1] = INF; 
   }
   // iterate through adjacent cells
   for (int k = 0; k < 4; k++) {
    int mi = i + dx[k];
    int mj = j + dy[k];
    if (mi < 0 || mj < 0 || mi >= maze.size() || mj >= maze[0].size()) 
      continue;
    // impossible case
    if (maze[mi][mj] == 'P' && maze[i][j] == 'P') return -1;
    if (maze[mi][mj] != '#') { 
     cap[(i * maze[0].size() + j)*2+1][(mi * maze[0].size() + mj)*2] = INF;
    }
   }
  }
 }
 int flow = fordFulkerson(maze[0].size()*maze.size()*2, source, sink);
 return res = flow;
}

Thursday, July 30, 2009

Google Code Jam and DP Tutorial

With Google Code Jam 2009 finally being announced last week - it is now a good time to start practicing. I've been warming up with some Topcoder problems which I've missed since the recent contests have been held early in the morning :)

Google Code Jam: http://code.google.com/codejam/

On a side note, I've written a dynamic programming tutorial early last year aimed towards newer competitors who are unfamiliar with the concept. Unfortunately I no longer have a copy of the TeX file which means I can't edit it for corrections.

DP Tutorial can be accessed here: http://sites.google.com/site/wilanw/DP1.pdf

TC SRM442 D2 1000 (EqualTowers)

The problem statement can be accessed via:
http://www.topcoder.com/stat?c=problem_statement&pm=10466&rd=13750

In summary, we are given a set of blocks and are asked to construct two equal height towers from using a subset of the given blocks. This problem is similar to the subset sum problem, since that problem is NP-complete we can guess at the time complexity of this problem. The subset sum problem is the following: given a sequence of integers (both negative and non-negative) find a non-empty subset which sum totals exactly to 0. How does this problem relate to the one given? We can treat the negative numbers as one side of the tower and the positive numbers as the other side. As the absolute values are the same (since the sum totals 0) this fits our problem. However, there is a major difference: we must find the largest height.

However from this we are given a slight hint on how to approach the problem. The subset sum problem is solved using dynamic programming and hence it is likely we can approach this problem in the same way.

Deciding the state space is slightly more tricky, when looking for sub-problems we can take note that we can gradually consider a growing set of items (in a very similar manner to the 0-1 knapsack problem). However, this is not enough we also need to know the heights of both towers to make an optimal decision (and ultimately to decide whether or not the block configuration is a valid one - i.e. the heights are equal). A naive approach would be to take the heights of both buildings and put them in the state, yielding:

F(i, size1, size2) = the highest building size we can construct using a set of blocks from up to index i which the buildings have a size of size1 and size2 respectively.

Looking at the constraints, this is state space is simply too large to fit into the memory constraints. So we must optimise or compress our state space. If we look closely we can eliminate having two different building sizes by using the difference between the taller and shorter building. We can store the "base" of the building in the actual DP state value. For example, if the buildings we had so far were of height 5 and 8. We can store 5 as the DP state value and 3 as a building parameter which depicts the height of the taller one to be 5 + 3 = 8. We don't actually need to worry about which building is which, as our final goal is only to determine whether or not the buildings are equal. This allows us to also determine valid configurations to have an allowable difference of 0.

Trying to construct another DP state now:

F(i, size) = the height of the shorter building we can construct using a set of blocks from up to index i in which the difference between the shorter and taller building is size.

We calculate whether or not this will fit in the memory constraints (which is 64MB). The maximum number of blocks is 50 and the maximum size of a building is 250000 (as if it's over 250000 we can never match a second building to that height because the maximum sum of all the blocks is 500000). This yields a space complexity of O(n * s) which fits perfectly.

Constructing the recurrence is rather simple, we have 3 decisions for every state.
1. Do not add block i into either towers
2. Add block i to the shorter tower
3. Add block i to the taller tower

If we add block i to the shorter tower, we must be careful to note whether there will be a swap between the current shorter tower and the current taller tower. However, as mentioned above we don't actually care which tower is shorter or taller so we only need to calculate the new height of the shorter tower and the new difference. The new difference is simply the absolute difference between the current difference and the block to be added: abs(size - block[i]). The height of the new shorter tower is simply the minimum of (size, block[i]) added to F(i,size). The first and third condition are relatively simple - you just need to write an extra guard to prune any additions to the taller tower which exceeds 250000.

We now define the recurrence as:
F(i,size) = max(F(i+1,size), F(i+1,abs(size-block[i])) + min(size,block[i]), F(i+1,size+block[i]))

This can then be solved using either memoization or bottom-up dynamic programming. Further optimisations can be made by noting that the recurrence only takes into account the previous index. So we only need at most O(2 * s) space. We then alternate and over-write the previous rows using simple modulus calculations. We can also discard any block size which is greater than 250000. We can also calculate the sum of all the blocks and only iterate the size from 0 to sum/2 which will generate faster execution for smaller cases. My practice room solution is as follows:


int dp[2][250011];

int EqualTowers::height(vector <int> bricksPre) {
 vector<int> bricks;
 for (int i = 0; i < bricksPre.size(); i++) 
  if (bricksPre[i] <= 250000) bricks.push_back(bricksPre[i]);
 dp[bricks.size()&1][0] = 0;
 for (int i = 1; i <= 250001; i++) dp[bricks.size()&1][i] = -1;
 for (int i = bricks.size()-1; i >= 0; i--) {
  for (int j = 0; j <= 250001; j++) {
   dp[i&1][j] = dp[!(i&1)][j];
   if (j + bricks[i] <= 250001) dp[i&1][j] >?= dp[!(i&1)][j + bricks[i]];
   if (dp[!(i&1)][abs(j-bricks[i])] >= 0) dp[i&1][j] >?= dp[!(i&1)][abs(j-bricks[i])] + min(j, bricks[i]);
  }
 }
 return dp[0][0] ? dp[0][0] : -1;
}

Wednesday, July 29, 2009

TC SRM444 D1 500 (FourBlocks)

The problem statement can be accessed via: http://www.topcoder.com/stat?c=problem_statement&pm=10487&rd=13898

To summarise the problem, it asks us to find the maximum number of 4 blocks we can place within a grid layout. It is always optimal to use as much 4 blocks as we can as their value per unit area is better than 1 blocks.

This problem lends itself to a dynamic programming algorithm due to the fact that there are significant amounts of overlapping subproblems. To see this, we isolate one column and see that the possible configurations of inserting in new 4 blocks are only dependent on this column and the previous one (as the maximum block sizes we are allowed are 2x2). Looking at the constraints we see that we can bitmask based on the grid height (i.e. number of elements) as this only has a maximum of 10 entries and hence 2^10 = 1024 combinations.

Let's start by defining the DP state:
F(idx, mask) = the maximum number of points using idx columns and with the previous column having a bitmask of "mask"

Naturally, we next consider base cases and the recurrence relation. If the current index of the recurrence is equal to the number of columns (i.e. the size of each grid element) then we need to check whether or not the last column made any new 4 blocks. Why? This is because if the algorithm inserted a 4 block in the last column it's going to "overflow" into the last non-existent column on the board. Implementation wise there are several ways to do this, one is to not let new 4 blocks be made in the last column - another way is to let it overflow and make a condition which checks for this to prevent it from such an occurrence being used as an optimal figure (i.e. by setting the final result of this configuration to a negative value).

We define x as a possible 4 block configuration candidate.
F(idx, mask) = foreach x max { F(idx+1, x) + numBlocksInserted(x) * 16 + (numRows - numBlocksNotUsed(mask | x)) }

Next thing is implementing the actual algorithm which is fairly straightforward given the recurrence relations. We just need to make sure that we don't insert blocks to which:

1. Already has an associated 4 block inserted in the last column (use the bits in the mask parameter to determine this)
2. Conflicts with any 1 blocks initially present with the board configuration (this is calculated and kept before the dynamic programming begins)
3. Does not create a 4 block in two consecutive rows (i.e. row x and row x+1 for a column y)

An easy and quick way to implement these are to use bit operations. Here is my practice room implementation which uses the ideas discussed above:


#define INF 9999999

int memo[26][1<<11];
vector<int> cols;
int sz;

int numBits(int n) {
 int k = 0;
 while (n > 0) { if (n & 1) k++; n >>= 1; }
 return k;
}

int func(int idx, int mask) {
 if (idx == cols.size()) {
  if (mask == 0) return 0;
  return -INF;
 }
 if (mask & cols[idx]) return -INF;
 if (memo[idx][mask] != -1) return memo[idx][mask];
 int res = 0;
 for (int i = 0; i < (1 << (sz-1)); i++) {
  if ((i & (i << 1)) || (mask & (i | (i << 1))) || (cols[idx] & 
       (i | (i << 1)))) continue;
  res >?= (numBits(i) * 16) + (sz - numBits(mask | i | (i << 1))) 
       + (func(idx+1, i | (i << 1))); 
 }
 return memo[idx][mask] = res;
}

int FourBlocks::maxScore(vector <string> grid) {
 int res = 0;
 memset(memo,-1,sizeof(memo));
 cols = vector<int>(grid[0].size(),0);
 for (int i = 0; i < grid.size(); i++) {
  for (int j = 0; j < grid[0].size(); j++) {
   if (grid[i][j] == '1') cols[j] += (1 << i);
  }
 }
 sz = grid.size();
 return res = func(0, 0);
}

Pages