Value comparison within a data frame

5

Hello, I have a data set with about 50,000 observations, structured as follows (the values are just illustrative):

nome<-c("joão","pedro", "joãoo")
identificador<-c(123456,124578,123456)
valor<-c(2145,350,23)
dados=data.frame(nome,identificador,valor)

I would like to identify individuals with the same identifier and create a new variable as follows:

nome=c("joão","pedro", "joãoo","maria","mariaa","carla","felipe","vitor","pedro","vitorr")
identificador=c(123456,124578,123456,000,000,123,156,2222,3232,2222)
valor=c(2145,350,23,32,12,32,1,2,54,4)
validor=c(1,0,1,2,2,0,0,3,0,3)
dados=data.frame(nome,identificador,valor,validor)

I wrote the following to find matching identifiers, but I cannot get it to build this variable.

x<-dados$identificador
length(x)
i=1
k=1
validor=0
validor[1:50000]=0
for(i in 1:50000){
  for(j  in 1:50000){
    if(x[j]==x[i] & i!= j ){
      validor[j]=k
    }
  }
}

I would like to create a function that produces the variable validor in the form shown above. I hope I have been clear, and thank you very much for the help.

    
asked by anonymous 14.03.2017 / 20:38

1 answer

4

I think this is very close to what you want. The difference is that the groups of equal identifiers will not necessarily be numbered in that order (1, 2, 3, ...).

library(uniqueAtomMat)
library(tuple)
identificador<-c(123456,124578,123456,000,000,123,156,2222,3232,2222)
validor<-grpDuplicated(identificador) # groups equal identifiers into the same category
validor[match(orphan(validor),validor)]<-0  # assigns zero to the orphan identifiers
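
If the groups need to be numbered 1, 2, 3, ... in order of first appearance, as in the question's expected output, one possible alternative is a base R sketch with no extra packages. This is not the approach used above, only an illustration; the helper name repetido is introduced here and does not appear in the question.

```r
identificador <- c(123456, 124578, 123456, 000, 000, 123, 156, 2222, 3232, 2222)

# TRUE for every identifier that occurs more than once
# (repetido is a hypothetical helper name used only in this sketch)
repetido <- duplicated(identificador) | duplicated(identificador, fromLast = TRUE)

# Start with zero everywhere, then number the repeated identifiers
# in order of first appearance
validor <- integer(length(identificador))
validor[repetido] <- match(identificador[repetido],
                           unique(identificador[repetido]))

validor
# 1 0 1 2 2 0 0 3 0 3
```

Because duplicated() and match() are vectorised, this avoids the nested loop over all 50,000 rows from the question.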
    
15.03.2017 / 02:34