Monday, September 14, 2009

Google Code Jam Round 1C

Overview:
This was the last opportunity for qualifiers to advance into Round 2. Congratulations to everyone who has made it to Round 2! Bad luck to those that didn't, there's always next year!

This round was greeted with a set of interesting problems. Problem A was the easy problem in the set, and most competitors were able to solve the small input with some incorrect submissions for the large case. Problem B proved rather interesting and had multiple correct approaches. One of which required a little mathematical manipulation to derive a short and safe implementation, the other had the danger of precision issues but was based on a simpler concept without the use of much mathematics. Problem C was a fairly subtle DP problem in disguise but this did not deter the top competitors to solve all three problems in under 40 minutes.

Links:
Google Code Jam Statistics: http://www.go-hero.net/jam/
Google Code Jam Scoreboard: http://code.google.com/codejam/contest/scoreboard?c=189252
Google Code Jam Unofficial Scoreboard/Online Source Viewer (Thanks to Zibada on TC): http://zibada.ru/gcj/2009/round1a.html, http://zibada.ru/gcj/2009/round1b.html, http://zibada.ru/gcj/2009/round1c.html

Problem A: All Your Base
Category: Greedy
URL: http://code.google.com/codejam/contest/dashboard?c=189252#s=p0

The problem asks us to determine the minimum possible number we can derive from an unknown number representation system. To do this we must assign values to each of the characters present in the symbols. For example let's say we are given "cats" as an input we need to determine what the 'c' means numerically, what the 'a' means numerically etc. Since we want the smallest possible number, it suggests a greedy strategy of assigning what each letter means. In our case, we first see 'c' and therefore wish for this to be as small as possible. However, there is an additional constraint that no number starts with a 0 and since 'c' is the first symbol it cannot be 0. Therefore, the next obvious choice is to give it a value of 1. Next, we process 'a' and assign it a value of 0 since it hasn't been used yet and it's not the first symbol. Following this, 't' gets assigned 2 and 's' gets assigned 3.

Now the next task is to determine the base system the aliens work on, it can be easily shown we want the base as small as possible - yet large enough to allow us to use all the symbols. Therefore, the base system is simply the number of unique characters/letters in the alien number. We are then tasked with converting this to "human" form (i.e. decimal representation) which can be done iteratively like any base conversion algorithm. Implementation wise, we can keep track of our character assignments in a map data structure. When we encounter a letter, we first check whether or not we have seen the letter - if we haven't then we greedily assign it the next available (and smallest) number. If we have, then we simply skip over the character.

#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <algorithm>
#include <stack>
#include <queue>
#include <map>

using namespace std;

map<char,int> memo;

int main() {
   int nT; cin >> nT;
   string str;
   int k = 1;
   while (nT-- && cin >> str) {
      memo.clear();
      int next = 0;
      memo[str[0]] = 1;
      for (int i = 1; i < str.size(); i++) {
         if (memo.count(str[i]) == 0) {
            memo[str[i]] = next;
            next++;
            if (next == 1) next++;
         }
      }
      long long val = 0;
      int base = next == 0 ? 2 : next;
      for (int i = 0; i < str.size(); i++) {
         val = val * base + memo[str[i]];
      }
      cout << "Case #" << k << ": " << val << "\n";
      k++;
   }
   return 0;
}

Problem B: Centre of Mass
Category: Maths, Ternary Search
URL: http://code.google.com/codejam/contest/dashboard?c=189252#s=p1

This is a tricky problem to tackle due to precision issues, so it's safer to go for a closed-form solution. Basically, we are given N fireflies each starting off in different positions and each going their own directions with a constant velocity. We need to calculate the distance and time in which the centre of mass of the N fireflies are closest to the origin (0,0,0). We need to make one key observation to heavily simplify this problem. If we look closely, we can collapse all of the N fireflies into one single entity (this is heavily suggested by the notes section in the actual problem statement). If we only have a single point (i.e. the centre of mass) we simply need to find the point in time/distance in which it comes closest to the origin. With some visualisation we can see that the centre of mass movement is divided into two cases:

- There is no movement hence the earliest time and closest distance it is to the origin is at time = 0.
- There is movement (i.e. velocity) then the earliest time and closest distance is unique (as there would be a point in which it gets closer and closer and after the minimum distance it gets larger and larger).

With these two conditions, we have multiple approaches to solve the problem. The first approach is to use a ternary search over the set of possible times, as there is only one minima for the movement case - we are guaranteed that the algorithm will converge to the spot eventually. Ternary search works on functions which have only one extrema that is either strictly increasing then strictly decreasing or the other way around. For more information on this search technique look at: http://en.wikipedia.org/wiki/Ternary_search. The second approach is to differentiate the distance equation - as we know there is only 1 solution we can confidently differentiate and solve for d/dt = 0 to obtain the unique solution. We will now proceed to demonstrate this approach.

We first need to collapse the N fireflies into the centre of mass vector with respect to time. This can be done by a straight summation over the coefficients of the starting positions and velocities. Let S = (Sx, Sy, Sz) be the summation of the starting positions of the N fireflies (this is a straight application of the formula given in the notes section). Let D = (Dx, Dy, Dz) be the summation of the velocities of the N fireflies (similar argument applies to the velocity). Then we yield a distance (with respect to time T) equation of:

D = sqrt((Sx + Dx * T)^2 + (Sy + Dy * T)^2 + (Sz + Dz * T)^2)

We can ignore the sqrt for differentiation purposes as it will get cancelled out later when it's multiplied by 0 (another reason is the function is strictly increasing, and as a result has no effect on the location of the extrema):

D = (Sx + Dx * T)^2 + (Sy + Dy * T)^2 + (Sz + Dz * T)^2

Now let's differentiate (using the Chain Rule):

dD/dT = 2*(Sx + Dx * T)*(Dx) + 2*(Sy + Dy * T)*(Dy) + 2*(Sz + Dz * T)*(Dz)

Let dD/dT = 0,

0 = 2*(Sx + Dx * T)*(Dx) + 2*(Sy + Dy * T)*(Dy) + 2*(Sz + Dz * T)*(Dz)
0 = 2*((Sx + Dx * T)*Dx + (Sy + Dy * T)*Dy + (Sz + Dz * T)*Dz)
0 = (Sx + Dx * T)*Dx + (Sy + Dy * T)*Dy + (Sz + Dz * T)*Dz

Expanding:

0 = Sx*Dx + Dx*Dx*T + Sy*Dy + Dy*Dy*T + Sz*Dz + Dz*Dz*T

Pushing all the T terms into the LHS:

-Dx*Dx*T - Dy*Dy*T - Dz*Dz*T = Sx*Dx + Sy*Dy + Sz*Dz
T(-Dx*Dx - Dy*Dy - Dz*Dz) = Sx*Dx + Sy*Dy + Sz*Dz

Therefore T is equal to:

T = Sx*Dx + Sy*Dy + Sz*Dz / (-Dx*Dx - Dy*Dy - Dz*Dz)

Now we need to be careful with dividing by 0, if (-Dx*Dx - Dy*Dy - Dz*Dz) is 0 this means that there is no movement (i.e. our first condition). We can safely make a separate case for this as the closest time is at t=0. Using these concepts we just need to do it which is pretty simple:

#include <iostream>
#include <string>
#include <algorithm>
#include <cmath>

using namespace std;

int main() {
   int T; cin >> T;
   for (int tt = 0; tt < T; tt++) {
      int N; cin >> N;
      double dX = 0.0, dY = 0.0, dZ = 0.0;
      double sX = 0.0, sY = 0.0, sZ = 0.0;
      // summation of the different components
      for (int i = 0; i < N; i++) {
         double x, y, z, tx, ty, tz;
         cin >> x >> y >> z >> tx >> ty >> tz;
         dX += tx; dY += ty; dZ += tz;
         sX += x; sY += y; sZ += z;
      }
      dX /= N; dY /= N; dZ /= N;
      sX /= N; sY /= N; sZ /= N;
      int before = 0;
      double t;
      // check for division by zero case
      if (dX * dX + dY * dY + dZ * dZ < 1e-09) {
         before = 1;
      } else {
         t = -(sX*dX + sY*dY + sZ*dZ) / (dX*dX + dY*dY + dZ*dZ);
      }
      double dist = 0;
      // time can't be negative, if it is - then the optimal time is on t = 0
      if (t < 0 || before) {
         t = 0;
         dist = sqrt(sX*sX + sY*sY + sZ*sZ);
      } else {
         dist = sqrt((sX+dX*t)*(sX+dX*t)+(sY+dY*t)*(sY+dY*t)+(sZ+dZ*t)*(sZ+dZ*t));
      }
      cout << "Case #" << (tt+1) << ": ";
      cout << dist << " " << t << "\n";
   }
   return 0;
}

Problem C: Bribe the Prisoners
Category: Dynamic Programming
URL: http://code.google.com/codejam/contest/dashboard?c=189252#s=p2

For the final problem, we are given a set of prison cells (from 1 to P) and inside each of these prison cells is one prisoner. Prisoners communicates with its neighbours if the cell is adjacent and there is at least one prisoner inside. As a prisoner is released, the neighbours find out (if any) and this effect gets echoed in both directions until it hits the end of the prison (i.e. left or right wall) or when there is no one in the neighbouring prison (because they were released earlier). We are then tasked to bribe each prisoner who hears about a prisoner getting released 1 gold coin to prevent them from destroying their cell. We need to identify the optimal sequence of prisoners to release as to minimise the number of gold coins spent.

The obvious way is to brute force all possible ordering of prisoners and compute the minimum gold coins spent. This works easily for the small case as there are only 5 prisoners awaiting to be released at the worst case. This obviously does not work for the large case as there can be as many as 100 prisoners to be released. To optimise this we turn to dynamic programming. Seeing there can be as many as 100 prisoners to be released we can't keep a bitmask of which prisoners we have released as it will certainly TLE and/or exceed memory limit (on all current systems). The key observation is to divide the prison into sub-prisons. This can be seen by noting that the ends of the prison cells act as barriers/walls in which the word does not go through. If we release prisoners from cells - the cells we have just freed implicitly acts like the ends of our original prison.

In addition, we need to make a small optimisation as the prisons can be as large as 10,000 in the large case. We need to note that we will never free from a cell which is not in the list of prisoners to be freed. Hence instead of keeping track of the index of the prison cells itself, we keep track of the index of the prisoners to be freed. A simple sort suffices to keep the implementation neat. Hence the space complexity is only O(Q*Q) and since Q is at most only 100 - this is pretty much nothing.

So we define the DP state as:

D(n,m) = the minimum number of gold spent to free prisoners between index n and m of the prisoners to be freed.

Our recurrence relation is then simply defined as:

D(n,m) = min { Cost(i) + D(n,i) + D(i,m) } where n < i < m. (free prisoner i)
D(n,m) = 0 if m - n < 2 (base case)

The base case is for situations where there are no prisoners to be freed between n and m. Cost(i) is defined as the cost to free prisoner index i given that the empty prison to the left is n and the empty prison to the right is m. This can be calculated in constant time as we know which locations index n and m corresponds to. There is a small trick for implementing this - we add explicit left and right barriers to our original prison to make it conform to our recursion. This way we don't need to handle for the original prison special case which would otherwise be unbounded.

Implementing this should be simple with either memoisation (top-down) or a bottom-up approach:

#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <algorithm>
#include <stack>
#include <queue>
#include <map>

using namespace std;

int memo[111][111];

vector<int> data;

int func(int leftIdx, int rightIdx) {
   // base case
   if (rightIdx - leftIdx < 2) {
      return 0;
   }
   if (memo[leftIdx][rightIdx] != -1) return memo[leftIdx][rightIdx];
   int res = 1<<30;
   for (int i = leftIdx+1; i < rightIdx; i++) {
      // split on prisoner index i
      int calc = (data[i] - data[leftIdx] - 1) + (data[rightIdx] - data[i] - 1);
      calc = calc + func(leftIdx, i) + func(i, rightIdx);
      res = min(res,calc);
   }
   return memo[leftIdx][rightIdx] = res;
}

int main() {
   int N;
   cin >> N;
   for (int t = 0; t < N; t++) {
      int P, Q; cin >> P >> Q;
      vector<int> pris;
      for (int i = 0; i < Q; i++) {
         int val; cin >> val; pris.push_back(val);
      }
      // add explicit barriers to the ends of the prisons
      pris.push_back(0);
      pris.push_back(P+1);
      // ensure that the prisoner indexes are strictly increasing
      sort(pris.begin(),pris.end());
      data = pris;
      memset(memo,-1,sizeof(memo));
      int best = func(0, data.size()-1);
      cout << "Case #" << t+1 << ": " << best << "\n";
   }
   return 0;
}

No comments:

Post a Comment