Pages

Thursday, May 13, 2010

TC TCO10 Qualification Round 2

Overview:

This was the second qualification round for TCO10 with around 1300 competitors. As 600 competitors advance through to the next round (provided that at least 600 score a positive score), it provided ample opportunity to qualify for Round 1. The problems were rather simple with two greedy problems and a simple DP problem as the 1000 pointer. Having said that, the system test claimed a decent portion of the 500 pointer solutions due to a corner case which wasn't covered in the examples.

Problem 1 (250 Points): JingleRingle
Category: Greedy
URL: http://www.topcoder.com/stat?c=problem_statement&pm=10896

This problem required us to calculate the maximum profit we can make by buying jingles and selling them to various buyers. We are assumed to have infinite wealth so we can essentially buy all the jingles but we also want to maximise our profits.

The first key observation is to note that we can only use each buyer or seller once. The second key observation is that we only buy jingles from sellers if we can find a suitable buyer in which we can make a net profit (after tax). Now the question that remains is which sellers do we buy from and whom do we sell it to. Since each seller only gives us 1 jingle and we can only sell back 1 jingle at a time, this constraint essentially means that we can match a seller to a buyer (i.e. we are buying at a lower price and re-selling it at a higher price for a net profit). A simple greedy strategy would be then to buy at the lowest price and selling it to the highest price for maximum profit.

Does this greedy strategy work? Let's informally prove it. As we need to match a seller and buyer, if we buy from N people then we also have to sell it to N people. To maximise our profit we need to maximise our revenue, so the optimal choice is to choose the N highest buyers. Then to minimise our expenses we need to choose the N lowest sellers of jingles. It turns out it doesn't matter what order we pair the buyers and sellers as the total profit is the same if we choose the N highest buyers and N lowest sellers. Now all we need is to determine the value of N. Our greedy algorithm keeps pairing the highest available buyer to the lowest available seller until we reach a situation where we no longer profit, this is the value of N. If we assume this is not the value of N, then we have to somehow match a lower buyer to a higher seller or vice versa. But our value of N is at least as good as this on all cases hence our deduced value of N is correct.

A straightforward implementation:

class JingleRingle {
public:
   int profit(vector <int>, vector <int>, int);
};

int calcTax(int p, int tax) {
   return (p * tax) / 100;
}

int JingleRingle::profit(vector <int> buyOffers, 
      vector <int> sellOffers, int tax) {
   int res = 0;
   int buyJ = buyOffers.size()-1;
   sort(buyOffers.begin(),buyOffers.end());
   sort(sellOffers.begin(),sellOffers.end());
   for (int i = 0; i < sellOffers.size(); i++) {
      if (buyJ >= 0 && buyOffers[buyJ] - calcTax(buyOffers[buyJ],tax) 
         - sellOffers[i] >= 0) {
         res += buyOffers[buyJ] - calcTax(buyOffers[buyJ],tax) - sellOffers[i];
      }
      buyJ--;
   }
   return res;
}

Problem 2 (500 Points): FuzzyLife
Category: Greedy, Brute Force
URL: http://www.topcoder.com/stat?c=problem_statement&pm=10911

This problem has us calculate the maximum number of live cells on an infinite 2D plane after one iteration of time. The 2D plane is consisted of cells which are considered either dead or alive at a given slice in time. However, there are unknown cells denoted as '?' and our job is to determine whether each of these cells should be considered alive or dead for the next evolution in time to have the maximum number of alive cells.

The rules are:
- Each live cell with less than 2 or more than 3 live neighbours are dead
- Each dead cell with exactly 3 live neighbours becomes alive
- All other cell remain the same

Each cell has exactly 8 neighbours (even the ones on the "border" of the input because of the infinite plane) - 2 horizontal, 2 vertical and 4 diagonal.

The main observation required for this problem is the constraints of the problem allow us to adopt a greedy strategy and consider the neighbouring cells of a '?' as its own problem subinstance. As no cell will have more than 1 neighbour of type '?' this means that every cell is at most affected by 1 unknown cell. All we need to do is try each '?' cell by placing a '1' and a '0' in its place and simulate how many alive cells we are left with on the next iteration. We then simply choose the maximum of these two choices.

One tricky case is that we also need to take into account the '?' cell, if we place a '1' in it's position does it become alive or dead in the next iteration - vice versa if we place a '0'. We need to add this to our alive count before we compare which one is better. The other tricky case was to add a padding of dead cells around the grid as some of these can turn into alive cells (which need to be included in the answer) - this case however, was covered in the example cases. After we have filled in each of the '?' cells we can then do a single iteration pass and work out the number of alive cells.

The implementation using the ideas above:

class FuzzyLife {
public:
   int survivingCells(vector <string>);
};

// movement vectors
int dx[] = {-1, -1, -1, 0, 0, 1, 1, 1};
int dy[] = {-1, 0, 1, -1, 1, -1, 0, 1};

int FuzzyLife::survivingCells(vector <string> grid) {
   int res = 0;
   // pad the grid with extra 0's around the perimeter
   string str(grid[0].size()+2,'0');
   for (int i = 0; i < grid.size(); i++) {
      grid[i] = '0' + grid[i] + '0';
   }
   grid.insert(grid.begin(),str);
   grid.push_back(str);
   for (int i = 0; i < grid.size(); i++) {
      for (int j = 0; j < grid[i].size(); j++) {
         if (grid[i][j] == '?') {
            // compute options ('?' = 1)
            grid[i][j] = '1';
            int c1 = 0;
            for (int m = 0; m < 8; m++) {
               int neighX = i + dx[m];
               int neighY = j + dy[m];
               if (neighX < 0 || neighY < 0 || neighX >= grid.size() 
               || neighY >= grid[0].size()) continue;
               int alive = 0;
               for (int n = 0; n < 8; n++) {
                  int mX = neighX + dx[n];
                  int mY = neighY + dy[n];
                  if (mX < 0 || mY < 0 || mX >= grid.size() 
                  || mY >= grid[0].size()) continue;
                  if (grid[mX][mY] == '1') alive++;
               }
               if (grid[neighX][neighY] == '1' && (alive == 2 || alive == 3)) c1++;
               else if (grid[neighX][neighY] == '0' && (alive == 3)) c1++;
            }
            // consider the square itself under '?' = 1
            int selfAlive = 0;
            for (int n = 0; n < 8; n++) {
               int selfX = i + dx[n];
               int selfY = j + dy[n];
               if (selfX < 0 || selfY < 0 || selfX >= grid.size() 
               || selfY >= grid[0].size()) continue;
               if (grid[selfX][selfY] == '1') selfAlive++; 
            }
            if (selfAlive == 2 || selfAlive == 3) c1++;
            // compute options ('?' = 0)
            grid[i][j] = '0';
            int c2 = 0;
            for (int m = 0; m < 8; m++) {
               int neighX = i + dx[m];
               int neighY = j + dy[m];
               if (neighX < 0 || neighY < 0 || neighX >= grid.size() 
               || neighY >= grid[0].size()) continue;
               int alive = 0;
               for (int n = 0; n < 8; n++) {
                  int mX = neighX + dx[n];
                  int mY = neighY + dy[n];
                  if (mX < 0 || mY < 0 || mX >= grid.size() 
                  || mY >= grid[0].size()) continue;
                  if (grid[mX][mY] == '1') alive++;
               }
               if (grid[neighX][neighY] == '1' && (alive == 2 || alive == 3)) c2++;
               else if (grid[neighX][neighY] == '0' && (alive == 3)) c2++;
            }
            // consider the square itself under '?' = 0
            selfAlive = 0;
            for (int n = 0; n < 8; n++) {
               int selfX = i + dx[n];
               int selfY = j + dy[n];
               if (selfX < 0 || selfY < 0 || selfX >= grid.size() 
               || selfY >= grid[0].size()) continue;
               if (grid[selfX][selfY] == '1') selfAlive++; 
            }
            if (selfAlive == 3) c2++;
            // choose the one with better live cells
            if (c1 >= c2) grid[i][j] = '1';
         }
      }
   }   
   // iterate one period and determine number of alive
   for (int i = 0; i < grid.size(); i++) {
      for (int j = 0; j < grid[i].size(); j++) {
         int alive = 0;
         for (int n = 0; n < 8; n++) {
            int mX = i + dx[n];
            int mY = j + dy[n];
            if (mX < 0 || mY < 0 || mX >= grid.size() 
            || mY >= grid[0].size()) continue;
            if (grid[mX][mY] == '1') alive++;
         }
         if (grid[i][j] == '1' && (alive == 2 || alive == 3)) res++;
         else if (grid[i][j] == '0' && alive == 3) res++;
      }
   }
   return res;
}

Problem 3 (1000 Points): HandlesSpelling
Category: Dynamic Programming, String Manipulation
URL: http://www.topcoder.com/stat?c=problem_statement&pm=10864

This problem required us to play a game and work out the maximum score based on a set of choices we are forced to make. The game is that we have a string of letters in which we can "stamp" a set of pre-determined strings onto it. We need to maximise the score based on A^2 - B where A is the longest contiguous sequence of letters covered by the stampings and B is the number of uncovered letters.

For example, if the string we are given is HELLO and we are given 3 badges namely: E, HE, L. We can stamp them as follows: {HE}{L}{L}O. We stamped HE with the second badge and the 2 L's with the third badge. The value of A is 4 because we have a contiguous sequence of stampings (HELL) and B is 1 because we didn't have a way to stamp the letter O. Hence the score given this set of operations is 4^2 - 1 = 15, which turns out to be the maximum score we can get.

One strategy is to determine the maximum value of A. Why? A is where our score gets maximised, no other measure increases our score except for the value of A so it's only natural that we want to get A as high as possible. We also note that the maximum contiguous sequence (i.e. maximum value A) is in the optimal solution. Why? Consider two configurations:

xxxxx....----n----....xxxxxxxx
xxxxx--m--.......--m--xxxxxxxx

Here the x's denote similar configurations, so we only need to focus on maximising the centre part of m's, n's and dots. We let n be the maximum contiguous length possible in our string, let that be denoted as length x. Let's assign m to be length x-1. Then our goal is to prove that it is always optimal to choose n over the 2 m's.

Then the score for our first configuration is:
F = (x * x) - (x-1) - (x-1) = x^2 - 2x + 2 (length x ^ 2 minus two sections of x-1 uncovered)

The score for our second configuration is:
G = (x-1) * (x-1) - x = x^2 - 3x + 1 (length (x-1) ^ 2 minus one section of x uncovered)

Hence for all positive values of length x, the score for the first is better than the second. In fact, this is a stricter argument than what is required, as if the sections of n and m were disjoint they'll both be covered, hence there must be some conflict between n and m which forces us to choose between one or the other. Our argument will hold for such a scenario.

Our first task is then to find the maximum contiguous sequence of our string - this can be done easily through basic dynamic programming. Then we are tasked with trying all configurations which have the same maximum contiguous sequence to determine the minimum number of uncovered letters. Again, this can be done via a similar dynamic programming algorithm where we divide our main string into a left and right sub-string and count the maximum number of covered cells we can get. Then we simply subtract this by the size of the left and right sub-string respectively to get the number of uncovered letters.

The implementation which uses the strategy above can be seen as:

class HandlesSpelling {
public:
   int spellIt(vector <string>, vector <string>);
};

#define UNI -12345678
#define INF 12345678

vector<string> _badges;
string _S;

// returns true if str is a prefix to S starting at idx
bool isPrefix(int idx, const string& str, string S) {
   if (idx + str.size() > S.size()) return false;
   for (int i = idx; i < idx + str.size(); i++) {
      if (S[i] != str[i-idx]) return false;
   }
   return true;
}

int prefix[1001][51];   // true if badge j is a prefix at index i
int A[1001];   // keeps track of the longest sequence ending with index i
int B[1001];   // left hand side
int C[1001];   // right hand side
int maxSize;

// maximise coverage using ptr as the memoisation table
int func(int idx, int* ptr) {
   int res = 0;
   if (idx >= maxSize) return 0;
   if (ptr[idx] != -1) return ptr[idx];
   for (int i = 0; i < _badges.size(); i++) {
      if (prefix[idx][i] && idx + _badges[i].size() <= maxSize) {
         res >?= func(idx + _badges[i].size(), ptr) + _badges[i].size();
      }
   }
   res >?= func(idx+1, ptr);
   return ptr[idx] = res;
}

int HandlesSpelling::spellIt(vector <string> parts, vector <string> badges) {
   string data;
   for (int i = 0; i < parts.size(); i++) data += parts[i];
   _badges = badges;
   _S = data;
   // pre-compute the prefixes for speed
   for (int i = 0; i < data.size(); i++) {
      for (int j = 0; j < badges.size(); j++) {
         if (isPrefix(i, badges[j], _S)) { prefix[i][j] = 1; }
      }
   }
   // determine the longest contiguous sequence in our string
   for (int i = _S.size()-1; i >= 0; i--) {
      for (int j = 0; j < _badges.size(); j++) {
         if (isPrefix(i, _badges[j], _S)) {
            A[i] >?= A[i+_badges[j].size()] + _badges[j].size();
         }
      }
   }
   int longest = 0;
   for (int i = 0; i < _S.size(); i++) {
      longest >?= A[i];
   }
   // choose the best score out of all combinations which use A[i] == longest
   int res = UNI;
   for (int i = 0; i < _S.size(); i++) {
      if (A[i] == longest) {
         // branch off to left and right
         string left = _S.substr(0, i);
         string right = _S.substr(i+A[i]);
         // reset memo table for each instance
         memset(B,-1,sizeof(B));
         memset(C,-1,sizeof(C));
         int lOffset = 0, rOffset = 0;
         maxSize = i;
         if (left.size() > 0) lOffset = left.size() - func(0,B);
         maxSize = _S.size();
         if (right.size() > 0) rOffset = right.size() - func(i+A[i],C);
         res >?= (A[i] * A[i]) - (lOffset) - (rOffset);
      }
   }
   return res;
}