USACO OPEN11 llang Analysis

USACO OPEN11 Problem 'llang' Analysis

by Fatih Gelgi and Albert Gu

To solve the problem, you need to notice that the communication network can be represented as a graph: vertices correspond to cows and there is an edge between two cows if they speak a common language. The crucial point is the transitive property of communication, i.e. passing messages. That means all cows in a connected component can communicate each other. Hence, the problem turns out to be finding the connected components. In the next step, we need to establish communication between connected components. Choosing two arbitrary cows each from different components establishes a communication between these two components. Thus, if the number of connected components in the graph is C, C-1 books will be sufficient. To create a working solution, the language of cow 1 can be taught to one cow in each connected component.

Consider the example below:

1 ----- 2                        8 ------ 9
|       |      5 --- 6 --- 7     |
3 ----- 4                       10
   
   C1                C2             C3

The graph above has 3 connected components: C1={1,2,3,4}, C2={5,6,7}, C3={8,9,10}. Teaching the language of cow 1 to cow 5 in C2 and cow 8 in C3 will yield a valid solution.

The biggest trick is constructing the graph fast enough. A naive method of directly comparing every pair of cows to see if they have a common language takes time at least O(n^2), which is too slow. One improvement is to look at language 1 and connect all cows that speak language 1, and repeat for all the languages. However, it is too slow in extreme cases when many cows speak the same language.

There are many ways to speed this up to speed this up to linear time; the method presented below just creates a new set of vertices for languages, and draws edges between cows and languages (this is known as a bipartite graph). The number of edges then equals the total number of languages spoken by the cows, which in the problem is given to be at most 100,000.

For example, for the sample input, the graph would look as follows:

Cows   Languages
1 ------ 3 
  \_ _ _
        \
2 ------ 2
   
   
3 ------ 1

In this case, there are 2 connected components: one contains cows 1 and 2 and languages 2 and 3, and the other contains cow 3 and language 1.

Connected components can be identified using DFS that requires O(V + E) memory and time on an adjacency list. The final step, establishing communication between connected components can be done in linear time.

Below is Albert's code:

#include <cstdio>
#include <vector>
using namespace std;
const int MAXN = 10005, MAXM = 30005;

int n, m, ans;
vector<int> adj[MAXN+MAXM], sol;

bool seen[MAXN+MAXM];
void dfs(int v) {
	seen[v] = true;
	for (int i = 0; i < adj[v].size(); i++) {
		if (!seen[adj[v][i]])
			dfs(adj[v][i]);
	}
}

int main() {
	FILE *fin = fopen("llang.in", "r");
	FILE *fout = fopen("llang.out", "w");

	fscanf(fin, "%d %d", &n, &m);
	for (int i = 1; i <= n; i++) {
		int a, b;
		fscanf(fin, "%d", &a);
		for (int j = 0; j < a; j++) {
			fscanf(fin, "%d", &b);
			// adjacency list
			//   cow i is connected to language b
			adj[i].push_back(b+n);
			adj[b+n].push_back(i);
		}
	}

	// identify connected components
	for (int i = 1; i <= n; i++) {
		if (!seen[i]) {
			dfs(i);
			sol.push_back(i);
			ans++;
		}
	}

	// solution: # of connected components - 1
	fprintf(fout, "%d\n", ans-1);
	for (int i = 1; i < sol.size(); i++) {
		// first cow in each component learns 
                //   the first language of the cow 1 
		fprintf(fout, "%d %d\n", adj[sol[0]][0]-n, sol[i]);
	}

	return 0;
}