How can I optimize a recursive method to find ancestors?

10

I have a Person class that holds references to the person's parents, father and mother (either can be null at any time).

In one part of my code I need to find out whether one person is an ancestor of another.

I came up with a recursive solution along these lines:

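public boolean isAncestor(Person p) {
    if (p == null)
        return false;
    // p is a direct parent (equals(null) is false, so a missing parent is safe)
    if (p.equals(this.father) || p.equals(this.mother))
        return true;
    // otherwise ask each known parent whether p is among its ancestors
    if (this.father != null && this.father.isAncestor(p))
        return true;
    return this.mother != null && this.mother.isAncestor(p);
}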

The problem is that these family trees can become reasonably large. On top of that, this implementation looks a lot like the trivial recursive implementation of Fibonacci. It may just be my impression, but I believe this method is O(2^n).

Can you think of a more efficient way to check for ancestors? Perhaps there is a solution using loops and stacks, or a non-exponential recursion, that solves the problem?

    
asked by anonymous 13.12.2013 / 14:09

4 answers

9

The problem is not exponential. It is linear, but linear in the number of ancestors of the person: there is no way to guarantee that John is not an ancestor of Mary without examining all of Mary's ancestors.

You can even reduce the work by skipping ancestors that were already examined (the same person may appear more than once as an ancestor, through a marriage between cousins, for example), at the cost of space linear in the number of ancestors.

The easiest thing to do is use a queue to resolve this:

import java.util.*;

public boolean isAncestor(Person p) {
  if (p == null) return false;

  final Queue<Person> notChecked = new LinkedList<Person>();
  final Set<Person> checked = new HashSet<Person>();

  if (father != null) notChecked.add(father);
  if (mother != null) notChecked.add(mother);

  while (notChecked.peek() != null) {
    final Person nextPerson = notChecked.remove();

    if (!checked.contains(nextPerson)) {
      if (nextPerson == p) return true;

      checked.add(nextPerson);
      if (nextPerson.father != null) notChecked.add(nextPerson.father);
      if (nextPerson.mother != null) notChecked.add(nextPerson.mother);
    }
  }

  return false;
}

Another possibility is to use a stack backed by an ArrayList. ArrayList is generally more efficient than LinkedList in both space and speed, but the implementation is slightly more complicated:

public boolean isAncestor(Person p) {
  if (p == null) return false;

  final List<Person> notChecked = new ArrayList<Person>();
  final Set<Person> checked = new HashSet<Person>();

  if (father != null) notChecked.add(father);
  if (mother != null) notChecked.add(mother);

  while (notChecked.size() > 0) {
    final int index = notChecked.size() - 1;
    final Person nextPerson = notChecked.remove(index);

    if (!checked.contains(nextPerson)) {
      if (nextPerson == p) return true;

      checked.add(nextPerson);
      if (nextPerson.father != null) notChecked.add(nextPerson.father);
      if (nextPerson.mother != null) notChecked.add(nextPerson.mother);
    }
  }

  return false;
}

The logic is nearly identical, but this version is usually faster.

The first version (queue, breadth-first) finds nearby ancestors sooner. The second version (stack, depth-first) will, for example, search all of the mother's ancestors before looking at the father. If the person is not an ancestor, both versions examine the same set of people, but the second one will be faster and consume less memory.
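To make the difference in visiting order concrete, here is a minimal sketch. It assumes a hypothetical `Node` class (not the asker's Person class) and uses one `ArrayDeque` for both strategies, taking from the front for queue behaviour and from the back for stack behaviour. It only illustrates the order of traversal; it is not a replacement for either version above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical minimal node type used only for this illustration.
class Node {
    final String name;
    final Node father, mother;
    Node(String name, Node father, Node mother) {
        this.name = name; this.father = father; this.mother = mother;
    }
}

class TraversalOrder {
    // Returns ancestor names in the order they are examined.
    // fifo = true behaves like the queue version, fifo = false like the stack version.
    static List<String> visitOrder(Node start, boolean fifo) {
        Deque<Node> pending = new ArrayDeque<>();
        Set<Node> seen = new HashSet<>();
        List<String> order = new ArrayList<>();
        if (start.father != null) pending.addLast(start.father);
        if (start.mother != null) pending.addLast(start.mother);
        while (!pending.isEmpty()) {
            Node next = fifo ? pending.pollFirst() : pending.pollLast();
            if (!seen.add(next)) continue; // already examined
            order.add(next.name);
            if (next.father != null) pending.addLast(next.father);
            if (next.mother != null) pending.addLast(next.mother);
        }
        return order;
    }

    // Builds a child with two parents and four grandparents.
    static Node sample() {
        Node father = new Node("father",
                new Node("paternalGrandfather", null, null),
                new Node("paternalGrandmother", null, null));
        Node mother = new Node("mother",
                new Node("maternalGrandfather", null, null),
                new Node("maternalGrandmother", null, null));
        return new Node("child", father, mother);
    }

    public static void main(String[] args) {
        System.out.println("queue: " + visitOrder(sample(), true));
        System.out.println("stack: " + visitOrder(sample(), false));
    }
}
```

With the queue, both parents are examined before any grandparent. With the stack, the mother and her whole line are examined before the father.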

    
13.12.2013 / 19:35
5

You are unlikely to find a much more efficient solution, since the number of candidates grows exponentially with each generation (one person has 2 parents, 4 grandparents, 8 great-grandparents, and so on). The reverse search (starting from the candidate ancestor and walking down through the descendants) is potentially even more costly, since a person can have more than 2 children. Dynamic programming may pay off (for example, when the same ancestor appears on both sides of a person's family), but depending on your data set it may or may not make a significant difference.

You can eliminate the recursion by using an explicit stack, but that stack will grow just as much as the implicit call stack of your recursive solution. So unless you are running into memory exhaustion problems, there is no reason to change your current strategy (depth-first search through recursion).
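How much repeated ancestors matter can be measured. The sketch below is an illustration under stated assumptions: `Individual`, `VisitCounter` and `collapsedPedigree` are hypothetical names, and the pedigree is deliberately extreme (both parents at every level share the same two parents), so that every ancestor is reachable along many paths. It compares plain recursion with recursion that remembers who was already examined.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical minimal person type used only for this illustration.
class Individual {
    final Individual father, mother;
    Individual(Individual father, Individual mother) {
        this.father = father; this.mother = mother;
    }
}

class VisitCounter {
    static long naiveCalls; // number of non-null people examined by the plain recursion

    // Plain recursion: follows every upward path, re-examining shared ancestors.
    static boolean naive(Individual current, Individual target) {
        if (current == null) return false;
        naiveCalls++;
        if (current == target) return true;
        return naive(current.father, target) || naive(current.mother, target);
    }

    // Same recursion, but each person is examined at most once.
    static boolean withVisited(Individual current, Individual target, Set<Individual> visited) {
        if (current == null || !visited.add(current)) return false;
        if (current == target) return true;
        return withVisited(current.father, target, visited)
            || withVisited(current.mother, target, visited);
    }

    // Builds `generations` levels of couples above the returned person;
    // both members of each couple have the same two parents (worst case for sharing).
    static Individual collapsedPedigree(int generations) {
        Individual father = new Individual(null, null);
        Individual mother = new Individual(null, null);
        for (int i = 1; i < generations; i++) {
            Individual nextFather = new Individual(father, mother);
            Individual nextMother = new Individual(father, mother);
            father = nextFather;
            mother = nextMother;
        }
        return new Individual(father, mother);
    }

    public static void main(String[] args) {
        Individual root = collapsedPedigree(20);
        Individual stranger = new Individual(null, null); // not in the pedigree
        naive(root, stranger);
        Set<Individual> visited = new HashSet<>();
        withVisited(root, stranger, visited);
        System.out.println("naive calls:    " + naiveCalls);     // 2097151 (2^21 - 1)
        System.out.println("people visited: " + visited.size()); // 41 (2 * 20 + 1)
    }
}
```

In a pedigree with no repeated ancestors the two counts are equal, which is why the benefit depends entirely on the data set.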

    
13.12.2013 / 14:17
4

One way around this problem is to use dynamic programming (memoization).

The idea is to cache the function's results, so that you do not have to walk the whole hierarchy again once the question has been answered for a given object. You can do this by adding a HashMap field to the class and consulting it at the start of the method.

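// For each person already asked about: is that person an ancestor of this one?
private final Map<Person, Boolean> memoAncestors = new HashMap<Person, Boolean>();

// ....

public boolean isAncestor(Person p) {
    if (p == null)
        return false;
    // Reuse the answer if this question was already asked of this object.
    Boolean cached = memoAncestors.get(p);
    if (cached != null)
        return cached;
    // p is a direct parent, or an ancestor of one of the known parents.
    boolean result = p.equals(this.father) || p.equals(this.mother)
            || (this.father != null && this.father.isAncestor(p))
            || (this.mother != null && this.mother.isAncestor(p));
    // Cache negative answers too; they are the ones that cost a full traversal.
    memoAncestors.put(p, result);
    return result;
}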

This solution is somewhat more efficient, since it avoids many repeated climbs up the tree, and it barely hurts the readability of the code: a good tradeoff. It is also a well-known optimization for the Fibonacci problem, and it fits what you are calling non-exponential recursion.

13.12.2013 / 14:13
2

This is a classic graph problem, and there are several techniques you can use depending on the size of the problem. If your trees are only a few levels deep, traversing the whole tree may not be an issue and your solution is good enough.

Again, it is best to look at graph algorithms to see what fits your situation, but suppose you have thousands of families with no connection to each other. In that case it is worth merging each family into a single set via "tree flattening" and checking whether the two objects are in the same set. Because that check is fast, you can quickly rule out people who are not from the same family, since then one cannot be an ancestor of the other. If they do turn out to belong to the same family, you can fall back to your algorithm, since the tree to search will no longer be as large.

The Wikipedia page on the disjoint-set data structure can help you find a solution for your case.
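As a rough sketch of that idea, the class below implements a small disjoint-set (union-find) structure with path compression. The names `FamilySets`, `union` and `sameFamily` are assumptions made for this example, and it omits union by rank for brevity, so treat it as a starting point rather than a tuned implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Disjoint-set keyed by arbitrary objects (hypothetical helper for this example).
class FamilySets<T> {
    // Maps each element to its parent in the set forest; a root maps to itself.
    private final Map<T, T> parent = new HashMap<>();

    // Returns the representative of x's set, flattening the path on the way up.
    T find(T x) {
        T p = parent.get(x);
        if (p == null) {          // first time we see x: it forms its own set
            parent.put(x, x);
            return x;
        }
        if (p.equals(x)) return x;
        T root = find(p);
        parent.put(x, root);      // path compression ("tree flattening")
        return root;
    }

    // Merges the sets containing a and b.
    void union(T a, T b) {
        T ra = find(a);
        T rb = find(b);
        if (!ra.equals(rb)) parent.put(ra, rb);
    }

    // True when a and b were linked, directly or indirectly, by union calls.
    boolean sameFamily(T a, T b) {
        return find(a).equals(find(b));
    }
}

class FamilySetsDemo {
    public static void main(String[] args) {
        FamilySets<String> sets = new FamilySets<>();
        // Link every person to each known parent while loading the data.
        sets.union("child", "father");
        sets.union("child", "mother");
        sets.union("father", "grandfather");
        sets.union("other", "otherFather");
        System.out.println(sets.sameFamily("grandfather", "mother")); // true
        System.out.println(sets.sameFamily("grandfather", "other"));  // false
    }
}
```

Being in the same set does not prove ancestry (the mother and the paternal grandfather above share a set only through the child). It is only a cheap filter: when `sameFamily` returns false the answer is definitely "not an ancestor", and otherwise the full traversal is still needed.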

    
13.12.2013 / 14:42