Pages

Thursday, August 20, 2009

TC Member Pilot Round 1 Division 2

This is the first member-driven "Single Round Match" for Topcoder (which also isn't rated). Division 2 had an interesting set of problems which proved to be slightly more difficult than the norm.

Division 2 250-pointer (TriFibonacci)
This problem can be viewed:
http://www.topcoder.com/stat?c=problem_statement&pm=10587&rd=13935

Basically this problem asks us to "fill in the blank" of a "Tri-Fibonacci" sequence which is defined as F(a) = F(a-1) + F(a-2) + F(a-3). The small catch here is that there can be sequences where this isn't well-defined (i.e. invalid). All the integers in the sequence must be strictly positive (i.e. zeroes are not allowed).

To solve this problem there are several options:
1. Brute force the blank and check whether or not it leads to a valid Tri-Fibonacci sequence. Due to the constraints this would take around 20 million operations which should run in time. If we can't find a valid sequence then we return -1.

2. Compute the blank based on the sequence given. Roughly speaking, we can divide this into two cases. One is where the number is within the first three terms and hence need to deduce the value by taking the sum of the other 2 and subtracting it from the next Tri-Fibonacci number.

For example, given {2, -1, 3, 7}. We compute 2+3=5, and then we can deduce the blank by noting 5+x=7 (by definition of the recurrence).

The other case is when the missing term is on the fourth term or beyond. Then it's easy to just use the given recurrence and compute the missing number from the last 3 numbers in the sequence before the blank. We then proceed to do a check on whether or not the unique number we obtained from the sequence is a valid Tri-Fibonacci. If it is, we accept and return the value. Otherwise we return -1.

The solution below uses approach 2:


class TriFibonacci {
public:
int complete(vector <int>);
};

bool checkFib(const vector<int>& A) {
for (int i = 3; i < A.size(); i++) {
if (A[i] != A[i-1] + A[i-2] + A[i-3]) return 0;
}
return 1;
}

int TriFibonacci::complete(vector <int> A) {
int pos = 0;
for (int i = 0; i < A.size(); i++) {
if (A[i] < 0) { pos = i; break; }
}
if (pos < 3) {
int s = 0;
for (int i = 0; i < 3; i++) if (A[i] >= 0) s += A[i];
if (A[3] - s <= 0) return -1;
A[pos] = A[3] - s;
return checkFib(A) ? A[pos] : -1;
}
A[pos] = A[pos-1] + A[pos-2] + A[pos-3];
return checkFib(A) ? A[pos] : -1;
}


Division 2 500-pointer (ErdosNumber)
This problem can be viewed:
http://www.topcoder.com/stat?c=problem_statement&pm=10377&rd=13935

This problem asks us to compute the Erdos Number based on a given set of publications. If you are interested in this number there are some facts on Wikipedia: http://en.wikipedia.org/wiki/Erd%C5%91s_number.

Again like most problems there are several ways to solve this. The bulk of the problem is really just parsing the string into a suitable data structure. This can be done using a map which maps a string (author names) and converts it into an integer index (for the adjacency matrix). We simply read each publication at a time and if it's a new author name we insert it into the map. Once we have read all the authors for a given publication we iterate in pairs and assign a weight of 1 to each edge between the authors. Then we simply need to run a shortest path algorithm from Erdos to all the other authors. There are several options available:

1. Most simplest is to use a Floyd-Warshall algorithm. This runs in O(n^3) time and since there are at most 100 vertices, it will easily run in time.
2. Use a breadth-first search from Erdos keeping track of the depth. This works as all the edges have a weight of 1. Reasonably simple to code and should be of choice if you don't know the Floyd-Warshall's All-Pairs shortest path algorithm. It's also more efficient.
3. Alternative methods include using Dijkstra's from Erdos to all vertices (Warshall's is more efficient and easier to code). You can also use Bellman-Ford's algorithm which will get the answer from the Erdos vertex to all other vertices.

A small implementation note is to keep track if a vertex is not reachable from the Erdos vertex.

See the below implementation which uses Floyd-Warshall's algorithm:


class ErdosNumber {
public:
vector <string> calculateNumbers(vector <string>);
};

#define INF (1<<20)
int adjmat[101][101];

vector <string> ErdosNumber::calculateNumbers(vector <string>
publications) {
vector<string> res;
map<string,int> authorMapping;
int authorIdx = 0;
for (int i = 0; i < 101; i++) for (int j = 0; j < 101; j++)
adjmat[i][j] = i == j ? 0 : INF;
for (int i = 0; i < publications.size(); i++) {
istringstream iss(publications[i]);
string author;
vector<string> assocs;
while (iss >> author) {
if (authorMapping.count(author) == 0) {
authorMapping.insert(make_pair(author,authorIdx++));
}
assocs.push_back(author);
}
for (int j = 0; j < assocs.size(); j++) {
for (int k = j+1; k < assocs.size(); k++) {
int first = authorMapping[assocs[j]];
int second = authorMapping[assocs[k]];
adjmat[first][second] = adjmat[second][first] = 1;
}
}
}
int erdosIdx = authorMapping["ERDOS"];
for (int k = 0; k < authorMapping.size(); k++) {
for (int i = 0; i < authorMapping.size(); i++) {
for (int j = 0; j < authorMapping.size(); j++) {
adjmat[i][j] <?= adjmat[i][k] + adjmat[k][j];
}
}
}
for (map<string,int>::iterator it = authorMapping.begin();
it != authorMapping.end(); it++) {
ostringstream oss;
oss << it->first;
int dist = adjmat[erdosIdx][it->second];
if (dist < INF) oss << " " << dist;
res.push_back(oss.str());
}
return res;
}


Division 2 1000-pointer (CantorDustEasy)
This problem can be viewed:
http://www.topcoder.com/stat?c=problem_statement&pm=10588&rd=13935

Finally the Division 2 Hard problem! This is a somewhat tricky problem and takes some hand-drawn diagrams to really see the picture. The problem is as given: given a recursive fractal algorithm determine the number of occurrences of a specified pattern. Given the low constraints of the pattern (50x50 at most) we can try and calculate the fractal up to 3^4 = 81 > 50 and attempt to recursively build on the answer. Why do we only need up to a time of 4? As each black square "sub-section" is divided into the four corners there is an empty space between them made up of 3^(t-1) blocks. This means that no pattern can possible "cross" over and given the constraint there must be at least 1 black square in the pattern - it makes this approach feasible. This is somewhat difficult to implement though so we attempt to make it easier.

On a side note, you can't naively enumerate all the patterns in a given fractal. The size at 9 is 3^18 which is nearly 400 million. We can't store this much let alone compute pattern matching on each cell which would certainly TLE.

So there is an easier approach but requires some effort to discover. If we draw a few diagrams we can see that all rows are either the same or they are empty (made up entirely of white cells). This is due to the symmetric property of the fractal, likewise - the columns are either the same or they are empty.

But how does this help us count the number of occurrences of a pattern? Firstly it helps eliminate infeasible patterns - if the non-empty rows/columns don't match then we can immediately discard it as a feasible pattern. The next observation we need to note is that our pattern now has either rows and columns that are the same or they are empty. Let's take a non-empty row R, since we know that all of the non-empty rows in the actual fractal F are the same. We can just count the number of R's we see in F. How does this help?

To illustrate the concept, take "X.X...X.X" as a sample row from F. Our pattern's row is: ".X.", using a string finding algorithm we find that it occurs twice in F. Hence if our original (non-collapsed) pattern only had 1 row of ".X." then we immediately retrieve the answer we have. Note this is a 1 dimensional problem - we need to convert this to our 2 dimensional problem. So we do the same thing with columns and determine how many occurrences. Let's say the count we have for the rows is "x" and the count we have for columns is "y". Then by the multiplication principle, the total number of occurrences in the 2D space is simply x * y (draw some diagrams to convince yourself - if we have non-empty rows that have x occurrences each then the total number is just how many times these x occurrences occur in a vertical fashion from the column pattern).

Combining this together we yield a simple implementation:


class CantorDustEasy {
public:
int occurrencesNumber(vector <string>, int);
};

string tab[10];
int visited[10];

string build(int num) {
if (num == 0) return "X";
if (visited[num]) return tab[num];
visited[num] = 1;
return tab[num] = build(num-1) + string(pow(3.0,(double)num-1),'.') + build(num-1);
}

int CantorDustEasy::occurrencesNumber(vector <string> pattern, int t) {
int res = 0;
string str = build(t);
string emptyRow = string(pattern[0].size(),'.');
string emptyCol = string(pattern.size(),'.');
string rows = "";
for (int i = 0; i < pattern.size(); i++) {
if (pattern[i] == emptyRow) continue;
if (rows != "" && rows != pattern[i]) return 0;
rows = pattern[i];
}
string cols = "";
for (int i = 0; i < pattern[0].size(); i++) {
string col;
for (int j = 0; j < pattern.size(); j++) {
col += pattern[j][i];
}
if (col == emptyCol) continue;
if (cols != "" && cols != col) return 0;
cols = col;
}
int numMatchRow = 0, numMatchCol = 0;
string::size_type pos = 0;
while ((pos = str.find(rows,pos)) != string::npos) {
numMatchRow++;
pos++;
}
pos = 0;
while ((pos = str.find(cols,pos)) != string::npos) {
numMatchCol++;
pos++;
}
return res = numMatchRow * numMatchCol;
}

No comments:

Post a Comment