I bumped into an interesting problem again during an interview. Unfortunately I am also not sure that I found an optimal (or good enough) algorithm. I thought about sharing to see if anyone knows a classic solution or a more efficient variation of my implementation.
Context
One way to represent a source code repository (eg, Git and Mercurial) is through a commits graph:
ExamplesGraph with linear commits
C1-C2-C3-C4
One-twig graph
C5-C6
/
C1-C2-C3-C4
Graph with a branch that has been merged again (some better translation for merged ?) on the main line ( C7
has two direct predecessors: C4
and C6
)
C5-C6
/ \
C1-C2-C3-C4-C7
The problem
Implement a function that finds the most recent common predecessor of any two commits of the graph. The Commits graph is represented by a String[]
called commits containing all the ids of the commits sorted by date (from the most recent to the most old) and a String[][]
called relatives .
The predecessors' ids for a commit at position i
can be found in parentes[i]
. The last position in the relatives array is always null
since it represents the predecessors of the commit root. For example, the graph of Example 3 is represented by:
String[] commits = {"C7", "C6", "C5", "C4", "C3", "C2", "C1"};
String[][] parentes ={{"C6","C4"}, {"C5"}, {"C2"}, {"C3"}, {"C2"}, {"C1"}, null};
If one of the commits is the predecessor of the other, this commit is the common predecessor.
Signature of method
String predecessorComum(String[] commits, String[][] parentes, String commit1,
String commit2);
Example:
The call:
String predecessor = predecessorComum(commits, parentes, "C4", "C6");
Returns C2
Complexity
You can write a O (n) solution for this problem.
A first solution
My attempt to solve this problem began with the classic LCA algorithms (see this issue of the utluiz ), unfortunately the solution in time linear were complex and not very easy to implement using predecessor list.
I then implemented the following (in my view, not optimal) steps.
Map of index IDs (complexity O (n) ) so you can navigate efficiently in arrays:
final Map<String, Integer> idParaPosicao = new HashMap<String, Integer>();
for (int i = 0; i < commits.length; i++) {
final String id = commitHashes[i];
idParaPosicao .put(id, i);
}
Recursive auxiliary method to stack all predecessors of a commit
to root, including itself (complexity O (p + n) ):
private Stack<Integer> predecessores(Stack<Integer> visitados, int id,
String[][] parentes, Map<String, Integer> idParaPosicao) {
visitados.push(id);
final String[] parentesD = parentes[id];
if (parentesD != null) {
for (String parente : parentesD) {
predecessores(visitados, idParaPosicao.get(parente), parentes, hashToIndex);
}
}
return visitados;
}
Mount the stack for the two commits :
final Stack<Integer> predsC1 = predecessores(new Stack<Integer>(), indiceC1, parentes,
idParaPosicao);
final Stack<Integer> predsC2 = predecessores(new Stack<Integer>(), indiceC2, parentes,
idParaPosicao);
Unfold the predecessors from the root while they are the same. The last commit is the most recent predecessor (complexity O (n) ):
int lca = -1;
while (!predsC1.isEmpty() && !predsC2.isEmpty()) {
final int a1 = predsC1.pop();
final int a2 = predsC2.pop();
if (a1 == a2) {
lca = a1;
} else {
break;
}
}
Return the commit :
return commits[lca];
Self-criticism:
O(n + p)
time, it has a strong constant up front, most likely better algorithms already exist.