How do I exclude rows from a data frame in R based on the values in one of its columns?

1

I want to exclude rows that have a certain value in a column.

Suppose I have a data frame whose first column is an index made up of letters of the alphabet, and I do not know in advance which positions they are in. I want to remove the rows whose index is a vowel. How do I do that?

    
asked by anonymous 08.03.2018 / 22:30

2 answers

5

To search for text patterns, it is best to use grep or grepl.

set.seed(6323)    # Makes the results reproducible

n <- 100
DF <- data.frame(A = sample(LETTERS, n, TRUE), X = rnorm(n))

inx <- grepl("[AEIOU]", toupper(DF[[1]]))

DF2 <- DF[!inx, ]        # Using a logical index
DF3 <- subset(DF, !inx)  # Using non-standard evaluation

identical(DF2, DF3)
#[1] TRUE

head(DF2)
#  A           X
#2 F -0.54113708
#3 D -0.72646708
#4 V  0.02213349
#6 X -0.64141533
#7 F -1.06416864
#8 Y -0.90681239

An equivalent alternative is to use %in%.

inx2 <- toupper(DF[[1]]) %in% c("A", "E", "I", "O", "U")
DF4 <- DF[!inx2, ]        # Using a logical index once again

identical(DF2, DF4)
#[1] TRUE
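
If the index column can hold values longer than a single letter, the unanchored pattern "[AEIOU]" flags any value that merely contains a vowel. The following is a minimal sketch, assuming a hypothetical data frame with mixed-case and multi-character values, that anchors the pattern and uses the ignore.case argument of grepl instead of toupper. The data and the names is_vowel and DF_no_vowels are illustrative only.

```r
# Hypothetical data: the index column may hold more than one character
DF <- data.frame(A = c("a", "B", "AB", "e", "xy", "U"),
                 X = 1:6,
                 stringsAsFactors = FALSE)

# Anchored pattern: TRUE only when the whole value is a single vowel,
# in either upper or lower case
is_vowel <- grepl("^[AEIOU]$", DF$A, ignore.case = TRUE)

# Keep the rows that are NOT a lone vowel
DF_no_vowels <- DF[!is_vowel, ]
DF_no_vowels$A
```

With the unanchored pattern, the row holding "AB" would also be dropped, which may or may not be what is wanted.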
    
09.03.2018 / 11:52
0

I later realized that I could simply reassign the variable holding my data frame, keeping only the rows that satisfy the condition I want.

In this case, perfil is a variable holding the value I want to keep.

df <- df[df$Perfil.do.entrevistado == perfil, ]

The assignment keeps every row that has the desired value, and all other rows are discarded automatically.
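
The line above keeps the matching rows. To exclude them instead, as the question asks, the condition can be negated. The following is a rough sketch on hypothetical data (the column valor and the sample values are made up for illustration). It uses %in% rather than != because a comparison with NA returns NA, and indexing with NA produces a row filled with NA.

```r
# Hypothetical data standing in for the real data frame
df <- data.frame(Perfil.do.entrevistado = c("A", "B", NA, "A", "C"),
                 valor = 1:5,
                 stringsAsFactors = FALSE)
perfil <- "A"   # hypothetical value to exclude

# %in% never returns NA, so negating it gives a clean logical index;
# the row with a missing profile is kept, since it is not equal to perfil
df_excluded <- df[!(df$Perfil.do.entrevistado %in% perfil), ]
df_excluded$valor
```

Whether rows with a missing value should be kept or dropped depends on the data, so that choice is worth making explicitly.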

    
15.03.2018 / 22:49