How to remove a column from a data.frame in R?

5

Suppose a generic data.frame, such as:

set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))
head(dados)
           y           x          z          w
1 -0.6264538 -0.62036668  0.4094018  0.8936737
2  0.1836433  0.04211587  1.6888733 -1.0472981
3 -0.8356286 -0.91092165  1.5865884  1.9713374
4  1.5952808  0.15802877 -0.3309078 -0.3836321
5  0.3295078 -0.65458464 -2.2852355  1.6541453
6 -0.8204684  1.76728727  2.4976616  1.5122127

How do I remove columns from this data.frame?

    
asked by anonymous 17.02.2014 / 18:10

2 answers

5

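There are several ways to do this. Each example below starts from the original dados with its four columns, so recreate it before running the next one.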

The simplest way is to assign NULL to the column. For example, to remove the x column:

dados$x <- NULL
head(dados)
           y          z          w
1 -0.6264538  0.4094018  0.8936737
2  0.1836433  1.6888733 -1.0472981
3 -0.8356286  1.5865884  1.9713374
4  1.5952808 -0.3309078 -0.3836321
5  0.3295078 -2.2852355  1.6541453
6 -0.8204684  2.4976616  1.5122127
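
The same NULL assignment also works with [[ ]] when the column name is stored in a variable, and with [ ] for several columns at once. A minimal sketch, rebuilding dados as in the question; the variable name alvo is only an illustration:

```r
set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))

# Column name held in a variable (hypothetical name "alvo"): use [[ ]] instead of $
alvo <- "x"
dados[[alvo]] <- NULL
names(dados)

# Several columns at once: single brackets with a character vector
dados[c("y", "z")] <- NULL
names(dados)
```

After both assignments only the w column should remain, and dados is still a data.frame.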

You can also remove several columns at once by using negative indices for the columns you do not want to keep. For example, to remove the first and third columns (y and z in the original dados):

dados <- dados[, -c(1, 3)]
head(dados)
            x          w
1 -0.62036668  0.8936737
2  0.04211587 -1.0472981
3 -0.91092165  1.9713374
4  0.15802877 -0.3836321
5 -0.65458464  1.6541453
6  1.76728727  1.5122127
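
Negative indices only work with positions, not with names, so a common variant is to look the positions up with which(). One thing to watch, shown in this sketch under the assumption that dados is the original four-column data.frame: if no name matches, which() returns an empty vector, and the negative index then selects zero columns instead of keeping everything. The object names below are illustrative.

```r
set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))

# Positions of the columns to drop, looked up by name
pos <- which(names(dados) %in% c("y", "z"))
menos_yz <- dados[, -pos]
names(menos_yz)

# Pitfall: no match gives integer(0), and -integer(0) is still integer(0)
pos_vazio <- which(names(dados) %in% c("nao_existe"))
resultado <- dados[, -pos_vazio]
ncol(resultado)  # zero columns, not four

# Guarding against the empty case before using the negative index
seguro <- if (length(pos_vazio) > 0) dados[, -pos_vazio, drop = FALSE] else dados
ncol(seguro)
```

Selecting by name with a logical vector avoids this problem, because a vector of all TRUE simply keeps every column.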

Another way is to refer to the columns by name: create a vector with the names of the columns to remove and keep only the columns whose names are not in that vector:

excluir <- c("x", "y")
dados <- dados[, !(names(dados) %in% excluir)]
head(dados)
           z          w
1  0.4094018  0.8936737
2  1.6888733 -1.0472981
3  1.5865884  1.9713374
4 -0.3309078 -0.3836321
5 -2.2852355  1.6541453
6  2.4976616  1.5122127
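
An equivalent way to express the same idea is setdiff(), which returns the column names that are not in the exclusion vector. A short sketch, again assuming the original dados; the names manter and dados_zw are illustrative, and drop = FALSE is added so the result stays a data.frame even if only one column remains:

```r
set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))

excluir <- c("x", "y")

# Names that are in the data.frame but not in the exclusion vector
manter <- setdiff(names(dados), excluir)

# Select by name; drop = FALSE keeps the data.frame structure
dados_zw <- dados[, manter, drop = FALSE]
names(dados_zw)
```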
    
17.02.2014 / 18:17
2

To remove a single column, I prefer the approach used by @carloscinelli:

dados$x <- NULL
head(dados)
           y          z          w
1 -0.6264538  0.4094018  0.8936737
2  0.1836433  1.6888733 -1.0472981
3 -0.8356286  1.5865884  1.9713374
4  1.5952808 -0.3309078 -0.3836321
5  0.3295078 -2.2852355  1.6541453
6 -0.8204684  2.4976616  1.5122127

For other cases, I prefer the subset function. As before, each example starts from the original dados.

To keep only columns x and w, use:

dados <- subset(dados, select = c(x, w))
head(dados)
            x          w
1 -0.62036668  0.8936737
2  0.04211587 -1.0472981
3 -0.91092165  1.9713374
4  0.15802877 -0.3836321
5 -0.65458464  1.6541453
6  1.76728727  1.5122127

To remove columns x and y, put a - sign before the vector of column names:

dados <- subset(dados, select = -c(x, y))
head(dados)
           z          w
1  0.4094018  0.8936737
2  1.6888733 -1.0472981
3  1.5865884  1.9713374
4 -0.3309078 -0.3836321
5 -2.2852355  1.6541453
6  2.4976616  1.5122127
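
Inside select, subset() treats column names as if they were positions, so ranges written with : also work. A small sketch assuming the original dados (columns in the order y, x, z, w); the object names are illustrative. The help page for subset() describes it as a convenience function intended for interactive use, so inside functions the [ forms may be the safer choice.

```r
set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))

# Keep the consecutive columns from x to w
xzw <- subset(dados, select = x:w)
names(xzw)

# Remove the consecutive columns from y to z
so_w <- subset(dados, select = -(y:z))
names(so_w)
```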

The behavior of the [ operator deserves attention. To keep only the w column we could use:

excluir <- c("x", "y", "z")
dados <- dados[,!(names(dados) %in% excluir)]
head(dados)
[1]  0.8936737 -1.0472981  1.9713374 -0.3836321  1.6541453  1.5122127

but in this case dados is simplified to a plain vector, because only one column is left. To avoid this, pass drop = FALSE:

dados <- dados[,!(names(dados) %in% excluir), drop = FALSE]
head(dados)
           w
1  0.8936737
2 -1.0472981
3  1.9713374
4 -0.3836321
5  1.6541453
6  1.5122127
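
The difference is easy to check with is.data.frame(). A minimal sketch assuming the original dados; the names manter, vetor, tabela and lista are illustrative:

```r
set.seed(1)
dados <- data.frame(y = rnorm(100), x = rnorm(100), z = rnorm(100), w = rnorm(100))

excluir <- c("x", "y", "z")
manter <- !(names(dados) %in% excluir)

# Matrix-style indexing with one column left: simplified to a numeric vector
vetor <- dados[, manter]
is.data.frame(vetor)

# Same selection with drop = FALSE: still a data.frame
tabela <- dados[, manter, drop = FALSE]
is.data.frame(tabela)

# List-style indexing (no comma) also returns a data.frame
lista <- dados[manter]
is.data.frame(lista)
```

Only the first call should return FALSE; the other two keep the one-column data.frame.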

or, if you prefer subset, which keeps the result as a data.frame by default:

dados <- subset(dados, select = c(w))
head(dados)
           w
1  0.8936737
2 -1.0472981
3  1.9713374
4 -0.3836321
5  1.6541453
6  1.5122127
    
17.02.2014 / 19:12