Wilan: Bipartite Matching Algorithm

The problem:

Given a graph G=(V,E) where G is a bipartite graph. Then G can be separated into two disjoint vertex sets A and B where there exists no edges (u,v) where both u and v belong to the same vertex set. We define a matching M to be a set of edges from A to B where each vertex V belongs to at most one such edge. Then the maximum matching refers to the highest possible cardinality of M following the rules we defined above.

Applications:

We can define X to represent a set of workers and Y to be a set of jobs. There is an edge (u,v) in E iff worker u in X is sufficiently skilled enough to work in job v in Y. If we were tasked to assign the maximum number of workers to exactly 1 job then the answer is simply the maximum matching of the graph G.

Algorithm:

Define an alternating path P as a path which alternates between the two sets of vertices A and B. More specifically the alternating P follows the form of: (u0, v0), (v0, u1), (u1, v1), (v1, u2) etc. where each uX is in A and vY is in B. An augmenting path is an alternating path with each (uX, vX) edge being matched (let's define this to be colour blue) and each (vX, u(X+1)) edge being unmatched (define this to be colour red). Recall that one vertex can only be matched to one edge. One special note about an augmenting path is that it ends with an unmatched (uX, vX) edge and we can simply alternate the matchings along the path to obtain exactly 1 more matching in the graph.

For example, take:

A --- B ___ C --- D ___ E --- F ___ G --- H

Where --- denotes an unmatched edge and ___ denotes a matched edge. If H isn't in the matching M then we can alternate:

A ___ B --- C ___ D --- E ___ F --- G ___ H

to obtain 1 more matching (3 vs 4 matchings).

Then our algorithm is to keep finding augmenting paths and improving the number of matchings iteratively. When there are no more augmenting paths we are done.

Theorem: If there exists no augmenting paths then the number of matchings is optimal.

(Informal) Proof: Assume there are no augmenting paths in M. To increase the number of matchings we need to match a new edge (that is currently coloured red). As there are no augmenting paths there exists no red edges from a vertex x in A to a vertex y in B where vertex y is currently unmatched (otherwise there would be an augmenting path). Now if we introduce a new matching into M, we will revert the changes done by an augmenting path:

... C ___ D --- E ___ F --- G ___ H

Matching F and G would force:

... C ___ D --- E --- F ___ G --- H

The subset is the same as our previous alternating path which wasn't augmented. Therefore, introducing a new matching with no augmenting path present does not increase the total number of matchings. Therefore, the absence of augmenting paths means the number of matchings is optimal.

Other Notes:

To find the actual augmenting path, a simple DFS will suffice. A simple way to implement this is to keep track of the colours of each edge (inside the adjacency matrix where red colour = 1, blue colour = 2) as well as keeping track of which side we are on to know which colour edges we need to traverse. We only traverse red edges when we are on the left side and we only traverse blue edges when we are on the right side. Note that if there are no blue edges then we have found an augmenting path (as the right side vertex is unmatched).

Another note is that the number of matchings never goes down so once a vertex is matched in iteration i, then it is also matched in iteration i+1 - although the actual vertex it is matched to can change. Furthermore, we can derive which vertices are matched together simply by looking at the matching array. For vertex i (on the left side) it is matched to matching[i] on the right side, -1 if it isn't matched. For more details, see the implementation below (in the class used for SPOJ TAXI).

The maximum flow algorithm can be reduced to the bipartite matching problem, assign a super source to the set of vertices in A and assign a super sink to the set of vertices in B. Let each edge between A and B have a capacity of 1, and each edge between the source and A as well as B and the sink to have a capacity of infinity. Then the maximum matching is simply the maximum flow from the source to the sink.

Applying the Algorithm (SPOJ TAXI):

To apply the matching algorithm, let's take an easy example. This problem statement can be accessed via:
http://spoj.pl/problems/TAXI

The problem asks us to assign a taxi to exactly 1 passenger. There's a time limit where each taxi must reach the passenger by, this criterion is the basis on whether there is an edge between taxi i and passenger j. We asked to find the maximum number of matchings we can make between the taxi to the passengers. All we need to do is construct the bipartite graph and run the matching algorithm to print out the results.

The implementation is shown below:


#include <iostream>
#include <string>
#include <algorithm>
#include <sstream>
#include <vector>

using namespace std;

class BipartiteMatching {
private:
 vector<vector<int> > adjmat;
 vector<int> matched;
 vector<int> visited;

 int N;
 int M;

 bool augment(int node, int side) {
   visited[side*N+node] = 1;
   if (!side) {
     // left side
     for (int i = 0; i < M; i++) {
       if (adjmat[node][i] == 1 && !visited[i+N]) {
         // check for right-side unmatched node
         if (matched[i+N] == -1) {
           adjmat[node][i] = 2;
           matched[i+N] = matched[node] = i;
           return true;
         }
         // otherwise dfs more
         bool f = augment(i, 1-side);
         if (f) {
           // augmenting path found
           adjmat[node][i] = 2;
           matched[i+N] = matched[node] = i;
           return true;
         }
       }
     }
     return false;
   } else {
     // right side
     for (int i = 0; i < N; i++) {
       if (adjmat[i][node] == 2 && !visited[i]) {
         bool f = augment(i, 1-side);
         if (f) {
           // augmenting path found
           adjmat[i][node] = 1;
           return true;
         }
       }
     }
     return false;
   }
   return false;
 }

public:
 BipartiteMatching(int _N, int _M) : N(_N), M(_M) {
   // initialise the data structures
   adjmat = vector<vector<int> >(N, vector<int>(M, 0));
   visited = vector<int>(N+M,0);
   matched = vector<int>(N+M,0);
 }

 void Reset() {
   for (int i = 0; i < adjmat.size(); i++) {
     fill(adjmat[i].begin(),adjmat[i].end(),0);
   }
 }

 void AddEdge(int i, int j) {
   adjmat[i][j] = 1;
 }

 int maxMatching() {
   int res = 0;
   bool valid = true;
   // all vertices are unmatched at the start
   for (int i = 0; i < N + M; i++) matched[i] = -1;
   // while there is at least 1 valid augmenting path keep
   // iterating
   while (valid) {
     valid = false;
     for (int i = 0; i < N; i++) {
       if (matched[i] == -1) {
         fill(visited.begin(),visited.end(),0);
         bool f = augment(i,0);
         if (f) valid = true;
       }
     }
   }
   // count the number of matches
   for (int i = 0; i < N; i++) {
     if (matched[i] != -1) { res++; }
   }
   return res;
 }
};

class point {
public:
 int x;
 int y;
 point() { }
 point(int _x, int _y) : x(_x), y(_y) { }
};

bool operator<(const point& lhs, const point& rhs) {
 if (lhs.x != rhs.x) return lhs.x < rhs.x;
 if (lhs.y != rhs.y) return lhs.y < rhs.y;
 return false;
}

int main(int argc, char** argv) {
 ios_base::sync_with_stdio(false);
 int numTest = 0;
 cin >> numTest;
 int p, t, s, c;
 BipartiteMatching* bpm = new BipartiteMatching(406,206);
 while (numTest-- && cin >> p >> t >> s >> c) {
   vector<point> people;
   vector<point> taxi;
   // people
   int x, y;
   while (p-- && cin >> x >> y) {
     people.push_back(point(x,y));
   }
   while (t-- && cin >> x >> y) {
     taxi.push_back(point(x,y));
   }
   // make graph
   bpm->Reset();
   for (int i = 0; i < people.size(); i++) {
     for (int j = 0; j < taxi.size(); j++) {
       long long dist = abs(people[i].x-taxi[j].x)+
         abs(people[i].y-taxi[j].y);
       if (dist*200 <= c*s) {
         // can reach
         bpm->AddEdge(i,j);
       }
     }
   }
   // calculate maximum bipartite matching
   cout << bpm->maxMatching() << "\n";
 }
 return 0;
}

Wilan

Pages

Wednesday, August 26, 2009

Bipartite Matching Algorithm

1 comment: