How to implement the algorithm of the Theory of the six degrees of separation?

4

What is it?

Theory that, in the world, it takes at most six bonds of friendship so that two people are connected.

Question:

I have been trying to implement this algorithm for some time, but I can not imagine how I would make the links, I want to do as a form of PoC , to better understand how one of the algorithms social networks use.

    
asked by anonymous 24.04.2014 / 06:20

1 answer

5

This problem can not only be solved with classic algorithms of the graph theory , as there are already implemented implementations in several languages , including C #.

First, consider that your data mass contains people and relationships between people. Regardless of how this is represented ( your comment suggests a adjacency list ), this forms a graph where each person is a vertex and the relation "so-and-so is a friend of beltran" constitutes a edge . As we simply want the "distance" between two people, it can be considered that this graph has no "weights" (ie you are or are friends of so and so) and - although this is not relevant in this case - the friendship relationship is (if A is a friend of B, then B is friends with A). The result is then an non-directed graph with no weights .

The proposition to be tested:

  

A maximum of six friendships are required for any two people to be connected.

in graph language is equivalent to:

  

For every vertex A and every vertex B, the smallest path between A and B is less than or equal to 6.

In other words, testing the proposition comes down to solving the smallest path problem for every vertex V, and test if all the resulting paths are less than or equal to 6. The simplest and most efficient algorithm for this type of graph is the search in width .

In your particular case:

  • Create a Person x Person matrix where every element is "not visited";
  • For each person P :
  • Create a queue (FIFO) - in C #, Queue - and put P in it, with distance zero;
  • Consider the number of people not visited NV as the total number of people;
  • While NV > 0 :
  • If the queue is empty, stop - there are people who are not connected in any way.
  • Remove the first person X from the queue, and check its distance to P ;
  • If X has already been visited before, return to the beginning of the loop (there is a shorter path between P and X , already discovered);
  • Otherwise, assign the distance from P to X (and from X to P), and decrement NV ;
    • Here you can test whether the distance is greater than 6 .
  • For each friend of X , add it in the queue, with the distance incremented by 1 .
  • A search for "c sharp breadth first" returns several results (such as this answer in SOen ), but for the particularity I try to implement it according to the algorithm described above (perhaps using the linked code as reference) - since some optimizations can be made in relation to the more general problem (stop when all the vertices have already been visited, stop when the distance is greater than 6 , etc.).

        
    24.04.2014 / 08:09