You want to build the following set C:
C = { e | e ∈ A and e ∉ B }
To do this, you need to answer two questions:
How do you identify an e that belongs to A?
How do you identify an e that does not belong to B?
In our case A is a file, so e belongs to A if it is the result of reading from that file. That answers the first question.
Now, how do you know whether e does not belong to B? The non-membership operation implies the following:
If B is a plain, unordered set, I will always need to compare e against all of its elements to make sure that e does not belong to B. But we do not have to stick to plain sets: I could use ordered sets, hash maps, or trees of names, and all of these alternatives allow the number of comparisons to be reduced. An efficient search data structure cuts the number of operations performed.
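As an illustration of the kind of optimization mentioned above (it is not used in the rest of this answer), here is a minimal sketch in C that keeps B in a sorted array and looks elements up with the standard bsearch. The layout of nome (a 64-byte nomePessoa field) and the function names compara_nome, ordena_conjunto and pertence_ordenado are assumptions made for the example, not something taken from your code.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed layout: all we know is that `nome` has a `nomePessoa` string
   field; the 64-byte size is a placeholder. */
typedef struct {
    char nomePessoa[64];
} nome;

/* Comparator shared by qsort and bsearch: orders records by nomePessoa. */
int compara_nome(const void *a, const void *b) {
    const nome *x = a;
    const nome *y = b;
    return strcmp(x->nomePessoa, y->nomePessoa);
}

/* Sort B once, up front, so that every later lookup can use binary search. */
void ordena_conjunto(nome *b, size_t n) {
    qsort(b, n, sizeof *b, compara_nome);
}

/* Returns 1 if e is in the sorted array b, 0 otherwise.
   Each lookup costs O(log n) comparisons instead of n. */
int pertence_ordenado(const nome *e, const nome *b, size_t n) {
    return bsearch(e, b, n, sizeof *b, compara_nome) != NULL;
}
```

The array has to be sorted with the same comparator that bsearch uses, otherwise the lookup result is meaningless.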
For the scope of this answer, I am not optimizing the number of comparisons. I am also assuming that the file containing the B set is relatively small, a few megabytes at most.
Since the B set is always queried as a whole, and the A set only matters for producing the next element, I preload B completely at the start of my algorithm. Initializing the A set, in this case, simply means opening the file.
The general idea is more or less as follows:
    initialize set A
    initialize set B
    do:
        take _e_, the next element of A
        if _e_ does not belong to B:
            add _e_ to C
    while the end of set A has not been reached
Initializing A only opens the file. If we were trying to optimize the comparisons between A and B, we could use some data structure for A, such as a sorted array.
    initialize set A:
        open file "listaCompleta"
Initializing the B set here means reading it completely. Since we do not know its total size a priori, we can use a linked list whose node contains a nome structure and a pointer to the next element of the list:
    initialize set B:
        nodo_lista *conjuntoB = NULL
        open file "listaPresenca"
        do:
            nodo_lista *novoElemento = malloc(sizeof(nodo_lista))
            novoElemento->next = conjuntoB
            read from file "listaPresenca" into address &(novoElemento->valor)
            if the read succeeded:
                conjuntoB = novoElemento
            otherwise:
                free(novoElemento)
        while the end of file "listaPresenca" has not been reached
        close file "listaPresenca"
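The loading step for B could look roughly like this in C. This is a sketch under assumptions: the 64-byte nomePessoa field, the function names carrega_conjunto and libera_conjunto, and the choice to return NULL when the file cannot be opened are all mine; only nodo_lista, valor, next, conjuntoB and novoElemento come from the pseudocode.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed record layout; the real field size comes from your own code. */
typedef struct {
    char nomePessoa[64];
} nome;

typedef struct nodo_lista {
    nome valor;
    struct nodo_lista *next;
} nodo_lista;

/* Reads every `nome` record from the binary file at `caminho` into a
   linked list. New nodes are pushed at the front, so the list ends up in
   reverse file order. Returns NULL if the file is empty or cannot be opened. */
nodo_lista *carrega_conjunto(const char *caminho) {
    nodo_lista *conjuntoB = NULL;
    FILE *arquivo = fopen(caminho, "rb");
    if (arquivo == NULL)
        return NULL;

    for (;;) {
        nodo_lista *novoElemento = malloc(sizeof *novoElemento);
        if (novoElemento == NULL)
            break;                       /* out of memory: keep what we have */
        if (fread(&novoElemento->valor, sizeof(nome), 1, arquivo) != 1) {
            free(novoElemento);          /* end of file or read error */
            break;
        }
        novoElemento->next = conjuntoB;  /* push onto the front of the list */
        conjuntoB = novoElemento;
    }
    fclose(arquivo);
    return conjuntoB;
}

/* Releases every node of a list built by carrega_conjunto. */
void libera_conjunto(nodo_lista *lista) {
    while (lista != NULL) {
        nodo_lista *proximo = lista->next;
        free(lista);
        lista = proximo;
    }
}
```

Checking the return value of fread, rather than looping on feof, avoids linking a garbage node after the last record.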
Getting the next element is just a matter of calling fread the way you already did:
    take _e_, the next element of A:
        nome _e_
        read from file "listaCompleta" into address &_e_
Membership of e in B is decided by looking through the entire set B, so we need a complete iteration:
    does _e_ belong to B?
        nodo_lista *elementoB = conjuntoB
        while elementoB != NULL:
            if elementoB->valor is equal to _e_:
                return "_e_ belongs to B"
            elementoB = elementoB->next
        return "_e_ does not belong to B"
Comparing two elements of type nome means comparing the nomePessoa field of the two objects using strcmp:
    is _a_ equal to _b_?
        return strcmp(_a_.nomePessoa, _b_.nomePessoa) == 0
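The membership test and the equality check above translate almost line by line into C. In this sketch the function names nomes_iguais and pertence and the 64-byte field size are assumptions, and nomePessoa is assumed to always hold a NUL-terminated string (strcmp requires that).

```c
#include <stddef.h>
#include <string.h>

/* Assumed record layout; nomePessoa must be NUL-terminated. */
typedef struct {
    char nomePessoa[64];
} nome;

typedef struct nodo_lista {
    nome valor;
    struct nodo_lista *next;
} nodo_lista;

/* Two records are equal when their nomePessoa strings are equal. */
int nomes_iguais(const nome *a, const nome *b) {
    return strcmp(a->nomePessoa, b->nomePessoa) == 0;
}

/* Linear scan over the whole list: returns 1 if e is in B, 0 otherwise. */
int pertence(const nome *e, const nodo_lista *conjuntoB) {
    const nodo_lista *elementoB;
    for (elementoB = conjuntoB; elementoB != NULL; elementoB = elementoB->next) {
        if (nomes_iguais(&elementoB->valor, e))
            return 1;
    }
    return 0;
}
```

An empty list (conjuntoB == NULL) falls straight through the loop, so every element is reported as not belonging to B.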
Adding element e to set C can either put the new element in a list or write it directly to the file with fwrite. If you go with the list, then after analyzing all the items of A you still have to write these items to the file.
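Putting the pieces together, the variant that writes C directly with fwrite might look like the sketch below. It assumes B has already been loaded into a linked list and that both files are open in binary mode; the function name escreve_diferenca, the 64-byte field size and the -1 error convention are assumptions for the example.

```c
#include <stdio.h>
#include <string.h>

/* Assumed record layout; nomePessoa must be NUL-terminated. */
typedef struct {
    char nomePessoa[64];
} nome;

typedef struct nodo_lista {
    nome valor;
    struct nodo_lista *next;
} nodo_lista;

/* Linear scan: returns 1 if e is in the list conjuntoB, 0 otherwise. */
int pertence(const nome *e, const nodo_lista *conjuntoB) {
    for (; conjuntoB != NULL; conjuntoB = conjuntoB->next) {
        if (strcmp(conjuntoB->valor.nomePessoa, e->nomePessoa) == 0)
            return 1;
    }
    return 0;
}

/* Streams A record by record and writes every record that is not in B
   to arquivoC. Returns the number of records written, or -1 if a write
   fails. A is read exactly once and is never held in memory as a whole. */
long escreve_diferenca(FILE *arquivoA, const nodo_lista *conjuntoB,
                       FILE *arquivoC) {
    nome e;
    long escritos = 0;
    while (fread(&e, sizeof e, 1, arquivoA) == 1) {
        if (!pertence(&e, conjuntoB)) {
            if (fwrite(&e, sizeof e, 1, arquivoC) != 1)
                return -1;
            escritos++;
        }
    }
    return escritos;
}
```

Writing as you go means C never has to fit in memory; the list-based alternative only pays off if you need to post-process C before saving it.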
Differences between our approaches
Basically, I load the B set into working memory, while you try to keep using external memory (such as a hard disk). The problem with your approach is that before you examine each new item of the A set, you have to reposition the file that represents the B set back to its start. A simple fseek right after each read from the first file, forcing the second file back to the beginning, would do the job of restarting the iteration over the B set.
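For comparison, a membership test that keeps B on disk, in the spirit of your approach, could be sketched as follows. The function name pertence_em_arquivo and the 64-byte field size are assumptions; rewind is used here as the standard shorthand for seeking back to the start of the file.

```c
#include <stdio.h>
#include <string.h>

/* Assumed record layout; nomePessoa must be NUL-terminated. */
typedef struct {
    char nomePessoa[64];
} nome;

/* Membership test that leaves B in the file: every call goes back to the
   start of arquivoB and scans it until a match or the end of the file. */
int pertence_em_arquivo(const nome *e, FILE *arquivoB) {
    nome atual;
    /* Equivalent to fseek(arquivoB, 0L, SEEK_SET), and also clears the
       stream's error indicator. */
    rewind(arquivoB);
    while (fread(&atual, sizeof atual, 1, arquivoB) == 1) {
        if (strcmp(atual.nomePessoa, e->nomePessoa) == 0)
            return 1;
    }
    return 0;
}
```

It works, but each element of A that is not in B triggers a full pass over the file for B, which is where the cost of this variant comes from.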
Since your alternative involves a quadratic number of reads from external memory (roughly |A| × |B| record reads), I did not consider it a practical solution. Instead, I load the whole B set up front, which needs only a linear number of reads from external memory (|A| + |B|).