In your question, the answer already exists, perhaps to better understand it, one must initially conceptualize distributed computing.
Distributed System
A distributed or parallel processing system is a system that interconnects multiple processing nodes (individual computers, not necessarily homogeneous) so that a high-power process runs on the "most available" node, or even subdivided by multiple nodes . Obvious gains are therefore made in these solutions: any task, if divisible into several subtasks, can be performed in parallel.
When is the choice of a leader important?
The election becomes necessary when the distributed system is being started for the first time or the previous leader can not communicate with the other processes due to a failure.
In distributed computing how does a leader's election work?
There are several algorithms that make the election of the leader, each specific to some situation.
Ring Algorithm
The ring or CSF algorithm, Le Lann, Chang and Roberts initials, serves to elect a leader if the processes are arranged in a ring. Each process must know its neighbor on the right and left and must have a unique, fixed, and assigned numeric identifier before the election commences. Originally this algorithm aimed at recovering a lost token in a network with ring-shaped topology, choosing a node of the network that would serve as a starting point for the new token .
Running the algorithm seeks to elect the highest identifier process and make all ring members recognize the new leader.
If one of the nodes identifies the loss of the token, it initiates the election by sending an election message with its node number to the right-hand neighbor.
If the node receiving the election message has an identifier greater than that entered in the message it received, it passes an election message to its right-hand neighbor with its own identifier. Otherwise it accepts that the node that has the identifier contained in the message will be the leader and pass it to its right neighbor.
If the node receives a message with the identifier identical to yours, it declares itself a leader. This event only occurs when the message containing the largest identifier has circulated throughout the ring making all its members aware of the result.
Bully algorithm
The bully algorithm serves to elect a leader between processes identified by a unique, fixed, and assigned numeric identifier before the start of the election. However in this case the topology is not limited to a ring and each process can communicate with any other in the system. Again the execution of the algorithm seeks to elect the highest identifier process and make everyone recognize the new leader.
If one of the processes identifies the loss of contact with the leader, he initiates a new election by sending everyone else a message containing his identifier.
All nodes respond to the process that initiated the election with their own identifiers.
If the process that initiated the election has the greatest identifier among all the others, it proclaims itself leader and warns all the others, otherwise it waits for the process of greater identifier to start an election and become leader.
This algorithm has this name precisely because of its behavior of bully (bully / bully). The process of greater identifier prevails over those of smaller number and even if one of these wins an election, quickly takes the elected post by proposing a new election.
Wiki-pt
Example
#!/usr/bin/python
import socket
import os
import sys
## Variaveis globais
mySock = 0
## Path onde estao os sockets
SPATH='/tmp/bully'
## Funcao q manda mensagem
def sendMsg(dest,msg):
global mySock
try:
ns = socket.socket(socket.AF_UNIX,socket.SOCK_STREAM)
ns.connect((dest))
ns.send(msg+';'+mySock)
ns.close()
return 1
except:
return 0
## Funcao q fica se comunicando com o coordenador para ver se esta ativo
def verificaCoord(coord):
global SPATH
global mySock
while True:
## Verifica se o coordenador esta ativo
status = sendMsg(coord,'1')
if status == 0:
print 'Coodenador parado '+str(coord)+'. Iniciando Eleicao'
list = os.listdir(SPATH)
for i in list:
if int(i) > int(mySock):
status = sendMsg(i,'E')
if status == 0:
print 'Erro enviando E ao processo '+i
break
else:
print 'Conectou no coordenador '+coord
return 0
def main():
## Variaveis
fCord = 1
BUF=1024
global mySock
global SPATH
try:
## Se recebeu um argumento he um processo recussitado
mySock = sys.argv[1]
## Testa se existem processos com prioridade maior
list = os.listdir(SPATH)
for i in list:
if int(i) > int(mySock):
status = sendMsg(i,'1')
if status == 1:
fCord = 0 ## Se houver processos maiores, nao he coordenador
continue
## Se for o maior anuncia a todos
if fCord == 1:
for i in list:
if int(i) < int(mySock):
status = sendMsg(i,'C')
except:
## Nome do socket = pid do processo
mySock=str(os.getpid())
fCord = 0
print 'Iniciando. PID='+str(mySock)
## Verifica se existe o diretorio para os sockets
try:
os.chdir(SPATH)
except:
## Se nao existe cria o diretorio
os.mkdir(SPATH)
try:
## Cria o socket
s = socket.socket(socket.AF_UNIX,socket.SOCK_STREAM)
s.bind((mySock)) ## Atribui um nome ao socket
s.listen(1) ## Estabelece fila para conexoes
## Inicialmente o menor PID he o coordenador
list = os.listdir(SPATH)
## Ordena os pids e pega o primeiro
list.sort(cmp)
if list:
COORD=list[0]
else:
COORD=mySock
## Nao he o coordenador
if COORD != mySock and fCord == 0:
## Divide o processo
childPID = os.fork()
if childPID == 0:
## O filho permanece testando o coordenador
verificaCoord(COORD)
#sys.exit(1)
else:
COORD = mySock
print 'Eu sou o coordenador '+str(mySock)
## Todos os processos pais permanecem em wait, aguardando conexao
conn, addr = s.accept()
data = conn.recv(BUF)
while True:
if data:
msg = data.split(';')
## Se a mensagem recebida for de eleicao
if msg[0] == 'E':
print 'Recebido mensagem '+msg[0]+' de PID='+str(msg[1])
## Enviar msg para pid menor parar eleicao
sendMsg(msg[1],'PE')
#procura se tem alguem maior q ele
list = os.listdir(SPATH)
maior = 1
for i in list:
if int(i) > int(mySock):
#print 'Enviar para PID '+i
status = sendMsg(i,'E')
if status == 1:
maior = 0
#se for o maior envia msg para todos como novo coord.
if maior == 1:
print 'Eu sou o novo Coordenador '+str(mySock)
os.kill(childPID,9)
for i in list:
if int(i) < int(mySock):
print 'enviando C para '+str(i)
send = sendMsg(i,'C')
## Recebeu msg para parar eleicao
if msg[0] == 'PE':
print 'Recebida mensagem '+msg[0]+' de PID='+str(msg[1])
## Recebeu msg de verificacao de conexao
#if msg[0] == '1':
# print 'Recebida mensagem '+msg[0]+' de PID='+str(msg[1])
## Recebeu msg de coordenador
if msg[0] == 'C':
print 'Novo coordenador '+msg[1]
os.kill(childPID,9)
childPID = os.fork()
if childPID == 0:
## Novo filho gerado monitora o novo coordenador
verificaCoord(msg[1])
#sys.exit(1)
data = conn.recv(BUF)
else:
conn, addr = s.accept()
data = conn.recv(BUF)
except:
print "Unexpected error:", sys.exc_info()[0]
s.close()
os.remove(SPATH+'/'+str(mySock))
print 'Processo '+str(mySock)+' parando'
## Invoca a funcao main na inicializacao do programa
if __name__ == '__main__':
main()
More information in English Wiki-en