I want to exclude rows that have a certain value in a column.
Suppose I have a data frame whose first column holds letters of the alphabet, and I do not know in which positions each letter appears. I want to remove the rows whose value in that column is a vowel. How do I do this?
To search for alphanumeric patterns, it is best to use grep or grepl.
set.seed(6323) # Makes the results reproducible
n <- 100
DF <- data.frame(A = sample(LETTERS, n, TRUE), X = rnorm(n))
inx <- grepl("[AEIOU]", toupper(DF[[1]]))
DF2 <- DF[!inx, ] # Using a logical index
DF3 <- subset(DF, !inx) # Using non-standard evaluation
identical(DF2, DF3)
#[1] TRUE
head(DF2)
# A X
#2 F -0.54113708
#3 D -0.72646708
#4 V 0.02213349
#6 X -0.64141533
#7 F -1.06416864
#8 Y -0.90681239
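The code above uses grepl, which returns a logical vector. grep returns the integer positions of the matches instead, and dropping rows with negative positions behaves differently when nothing matches. A minimal sketch of that difference, using a small made-up data frame (toy and its values are hypothetical, not the DF above):

```r
# Hypothetical toy data, not the DF from the answer
toy <- data.frame(A = c("A", "B", "C", "E"), X = 1:4,
                  stringsAsFactors = FALSE)

pos <- grep("[AEIOU]", toy$A)    # integer positions of the matches
lgl <- grepl("[AEIOU]", toy$A)   # logical vector, one element per row

by_pos <- toy[-pos, ]            # drop rows by negative position
by_lgl <- toy[!lgl, ]            # drop rows by negated logical index
identical(by_pos, by_lgl)

# Pitfall: when nothing matches, grep() returns integer(0), and
# toy[-integer(0), ] selects zero rows instead of keeping them all.
none <- grep("[XYZ]", toy$A)
nrow(toy[-none, ])                    # no rows survive
nrow(toy[!grepl("[XYZ]", toy$A), ])   # all rows survive
```

For this reason the negated logical index from grepl is usually the safer choice when the goal is to remove rows.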
An equivalent alternative is to use %in%.
inx2 <- toupper(DF[[1]]) %in% c("A", "E", "I", "O", "U")
DF4 <- DF[!inx2, ] # Using a logical index once again
identical(DF2, DF4)
#[1] TRUE
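The two approaches agree here because every value is a single letter. With longer strings, the unanchored pattern "[AEIOU]" matches a vowel anywhere in the value, while %in% requires the whole value to match. A small sketch with made-up labels (the labels vector is hypothetical) shows that anchoring the pattern restores the equivalence:

```r
# Hypothetical labels, some longer than one letter
labels <- c("A", "B", "AB", "e", "XY")

# TRUE if a vowel appears anywhere in the string
unanchored <- grepl("[AEIOU]", toupper(labels))
# TRUE only if the whole string is exactly one vowel
anchored <- grepl("^[AEIOU]$", toupper(labels))
# TRUE only if the whole string is one of the listed values
by_in <- toupper(labels) %in% c("A", "E", "I", "O", "U")

identical(anchored, by_in)     # exact-match semantics agree
identical(unanchored, by_in)   # differs because of "AB"
```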
I later realized that I could simply reassign the variable holding my data frame, keeping only the rows that satisfy the condition I want.
In this case, perfil is a variable holding the value I want to retrieve.
df <- df[df$Perfil.do.entrevistado == perfil, ]
This keeps every row whose Perfil.do.entrevistado equals the desired value; all other rows are discarded.
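The line above keeps the matching rows. To exclude them instead, as the question asks, the condition can be negated. Note also that a comparison with == yields NA for missing values, which produces all-NA rows in the result. A minimal sketch, assuming a made-up data frame (the valor column and all values are hypothetical; only Perfil.do.entrevistado and perfil come from the answer):

```r
# Hypothetical data; only the first column name comes from the answer
df <- data.frame(Perfil.do.entrevistado = c("a", "b", NA, "a"),
                 valor = 1:4,
                 stringsAsFactors = FALSE)
perfil <- "a"

# Keep matching rows: the NA comparison adds an all-NA row
kept <- df[df$Perfil.do.entrevistado == perfil, ]

# which() discards NA, so only true matches are kept
kept_ok <- df[which(df$Perfil.do.entrevistado == perfil), ]

# Exclude matching rows; %in% never returns NA, so the row with a
# missing profile is retained rather than turned into an NA row
dropped <- df[!(df$Perfil.do.entrevistado %in% perfil), ]
```

Whether rows with a missing profile should be kept or dropped depends on the analysis; the sketch only makes the behavior explicit.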