In distributed computing how does the election of a leader work?

14
  

Leader election is a distributed systems problem that   consensually select a process in a set of   processes with the objective of selecting a leader for a given   assignment   

In distributed computing how does the election of a leader work? When is choosing a leader important?

    
asked by anonymous 02.11.2016 / 09:54

1 answer

9

In your question, the answer already exists, perhaps to better understand it, one must initially conceptualize distributed computing.

Distributed System

A distributed or parallel processing system is a system that interconnects multiple processing nodes (individual computers, not necessarily homogeneous) so that a high-power process runs on the "most available" node, or even subdivided by multiple nodes . Obvious gains are therefore made in these solutions: any task, if divisible into several subtasks, can be performed in parallel.

When is the choice of a leader important?

The election becomes necessary when the distributed system is being started for the first time or the previous leader can not communicate with the other processes due to a failure.

In distributed computing how does a leader's election work?

There are several algorithms that make the election of the leader, each specific to some situation.

Ring Algorithm

The ring or CSF algorithm, Le Lann, Chang and Roberts initials, serves to elect a leader if the processes are arranged in a ring. Each process must know its neighbor on the right and left and must have a unique, fixed, and assigned numeric identifier before the election commences. Originally this algorithm aimed at recovering a lost token in a network with ring-shaped topology, choosing a node of the network that would serve as a starting point for the new token .

Running the algorithm seeks to elect the highest identifier process and make all ring members recognize the new leader.

If one of the nodes identifies the loss of the token, it initiates the election by sending an election message with its node number to the right-hand neighbor. If the node receiving the election message has an identifier greater than that entered in the message it received, it passes an election message to its right-hand neighbor with its own identifier. Otherwise it accepts that the node that has the identifier contained in the message will be the leader and pass it to its right neighbor. If the node receives a message with the identifier identical to yours, it declares itself a leader. This event only occurs when the message containing the largest identifier has circulated throughout the ring making all its members aware of the result.

Bully algorithm

The bully algorithm serves to elect a leader between processes identified by a unique, fixed, and assigned numeric identifier before the start of the election. However in this case the topology is not limited to a ring and each process can communicate with any other in the system. Again the execution of the algorithm seeks to elect the highest identifier process and make everyone recognize the new leader.

If one of the processes identifies the loss of contact with the leader, he initiates a new election by sending everyone else a message containing his identifier. All nodes respond to the process that initiated the election with their own identifiers. If the process that initiated the election has the greatest identifier among all the others, it proclaims itself leader and warns all the others, otherwise it waits for the process of greater identifier to start an election and become leader. This algorithm has this name precisely because of its behavior of bully (bully / bully). The process of greater identifier prevails over those of smaller number and even if one of these wins an election, quickly takes the elected post by proposing a new election.

Wiki-pt

Example

#!/usr/bin/python
import socket
import os
import sys

## Variaveis globais
mySock = 0
## Path onde estao os sockets
SPATH='/tmp/bully'

## Funcao q manda mensagem
def sendMsg(dest,msg):
        global mySock
        try:
                ns = socket.socket(socket.AF_UNIX,socket.SOCK_STREAM)
                ns.connect((dest))
                ns.send(msg+';'+mySock)
                ns.close()
                return 1
        except:
                return 0
## Funcao q fica se comunicando com o coordenador para ver se esta ativo
def verificaCoord(coord):
        global SPATH
        global mySock

        while True:
                ## Verifica se o coordenador esta ativo
                status = sendMsg(coord,'1')
                if status == 0:
                        print 'Coodenador parado '+str(coord)+'. Iniciando Eleicao'
                        list = os.listdir(SPATH)
                        for i in list:
                                if int(i) > int(mySock):
                                        status = sendMsg(i,'E')
                                        if status == 0:
                                                print 'Erro enviando E ao processo '+i
                        break
                else:
                        print 'Conectou no coordenador '+coord
        return 0

def main():
        ## Variaveis
        fCord = 1
        BUF=1024

        global mySock
        global SPATH
        try:
                ## Se recebeu um argumento he um processo recussitado
                mySock = sys.argv[1]
                ## Testa se existem processos com prioridade maior
                list = os.listdir(SPATH)
                for i in list:
                        if int(i) > int(mySock):
                                status = sendMsg(i,'1')
                                if status == 1:
                                        fCord = 0 ## Se houver processos maiores, nao he coordenador
                                        continue
                ## Se for o maior anuncia a todos
                if fCord == 1:
                        for i in list:
                                if int(i) < int(mySock):
                                        status = sendMsg(i,'C')
        except:
                ## Nome do socket = pid do processo
                mySock=str(os.getpid())
                fCord = 0
        print 'Iniciando. PID='+str(mySock)
        ## Verifica se existe o diretorio para os sockets
        try:
                os.chdir(SPATH)
        except:
                ## Se nao existe cria o diretorio
                os.mkdir(SPATH)
        try:
                ## Cria o socket
                s = socket.socket(socket.AF_UNIX,socket.SOCK_STREAM)
                s.bind((mySock)) ## Atribui um nome ao socket
                s.listen(1) ## Estabelece fila para conexoes
                ## Inicialmente o menor PID he o coordenador
                list = os.listdir(SPATH)
                ## Ordena os pids e pega o primeiro
                list.sort(cmp)
                if list:
                        COORD=list[0]
                else:
                        COORD=mySock
                ## Nao he o coordenador
                if COORD != mySock and fCord == 0:
                        ## Divide o processo
                        childPID = os.fork()
                        if childPID == 0:
                                ## O filho permanece testando o coordenador
                                verificaCoord(COORD)
                                #sys.exit(1)
                else:
                        COORD = mySock
                        print 'Eu sou o coordenador '+str(mySock)
                ## Todos os processos pais permanecem em wait, aguardando conexao
                conn, addr = s.accept()
                data = conn.recv(BUF)
                while True:
                        if data:
                                msg = data.split(';')
                                ## Se a mensagem recebida for de eleicao
                                if msg[0] == 'E':
                                        print 'Recebido mensagem '+msg[0]+' de PID='+str(msg[1])
                                        ## Enviar msg para pid menor parar eleicao
                                        sendMsg(msg[1],'PE')
                                        #procura se tem alguem maior q ele
                                        list = os.listdir(SPATH)
                                        maior = 1
                                        for i in list:
                                                if int(i) > int(mySock):
                                                        #print 'Enviar para PID '+i
                                                        status = sendMsg(i,'E')
                                                        if status == 1:
                                                                maior = 0
                                        #se for o maior envia msg para todos como novo coord.
                                        if maior == 1:
                                                print 'Eu sou o novo Coordenador '+str(mySock)
                                                os.kill(childPID,9)
                                                for i in list:
                                                        if int(i) < int(mySock):
                                                                print 'enviando C para '+str(i)
                                                                send = sendMsg(i,'C')

                                ## Recebeu msg para parar eleicao
                                if msg[0] == 'PE':
                                        print 'Recebida mensagem '+msg[0]+' de PID='+str(msg[1])
                                ## Recebeu msg de verificacao de conexao
                                #if msg[0] == '1':
                                #       print 'Recebida mensagem '+msg[0]+' de PID='+str(msg[1])
                                ## Recebeu msg de coordenador
                                if msg[0] == 'C':
                                        print 'Novo coordenador '+msg[1]
                                        os.kill(childPID,9)
                                        childPID = os.fork()
                                        if childPID == 0:
                                                ## Novo filho gerado monitora o novo coordenador
                                                verificaCoord(msg[1])
                                                #sys.exit(1)
                                data = conn.recv(BUF)
                        else:
                                conn, addr = s.accept()
                                data = conn.recv(BUF)
        except:
                print "Unexpected error:", sys.exc_info()[0]
                s.close()
                os.remove(SPATH+'/'+str(mySock))
                print 'Processo '+str(mySock)+' parando'


## Invoca a funcao main na inicializacao do programa
if __name__ == '__main__':
        main()

More information in English Wiki-en

    
07.11.2016 / 00:49