Selecting columns of a data.frame - splitting a dataset in R

3

I imported a table into R as a dataset. However, I need to do some calculations with only a few columns of this table.

How do I select only these columns for the calculations?

    
asked by anonymous 13.07.2017 / 20:24

2 answers

3

There are several ways to select columns from a data.frame in R. Let's use the mtcars data.frame as an example. To find out which columns exist, you can ask for the names or colnames of the data.frame:

names(mtcars)
 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

To select one of these columns, for example the mpg column, you can use $mpg, single brackets [, "mpg"] as you would with a matrix, or double brackets [["mpg"]] as you would with a list:

mtcars$mpg
mtcars[, "mpg"]
mtcars[["mpg"]]
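As a quick check, the sketch below confirms that the three forms give the same plain vector. The variable names a, b and c3 are only there for illustration.

```r
# Extract the mpg column in three equivalent ways
a  <- mtcars$mpg
b  <- mtcars[, "mpg"]
c3 <- mtcars[["mpg"]]

# Each result is a numeric vector with one value per row of mtcars
is.numeric(a)
length(a) == nrow(mtcars)

# The three results are the same object
identical(a, b) && identical(b, c3)
```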

All three of these return a vector. You can also select a data.frame containing only the mpg column (note the difference: you get a data.frame and not a vector). For this you use the single bracket as if the data.frame were a list:

mtcars["mpg"]

Or use the matrix form with the drop = FALSE argument:

mtcars[ ,"mpg", drop = FALSE]
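To see the difference between the two kinds of result, a minimal sketch (the names v and df are illustrative only) could compare their classes:

```r
v  <- mtcars[, "mpg"]                # drops to a vector
df <- mtcars[, "mpg", drop = FALSE]  # stays a one-column data.frame

class(v)   # class of the vector result
class(df)  # class of the data.frame result
dim(df)    # rows and columns of the one-column data.frame

# The list-style single bracket also keeps the data.frame structure
is.data.frame(mtcars["mpg"])
```

This matters for later calculations: functions such as mean() expect a vector, while functions such as colMeans() expect a data.frame or matrix.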

If you want to select more than one column, you can use the single bracket either in its list form or in its matrix form:

mtcars[, c("mpg", "cyl")] # selects two columns (matrix form)
mtcars[c("mpg", "cyl")]   # selects two columns (list form)

Note that the matrix form now returns a data.frame, since you are selecting more than one column. There are also convenience functions for this, like the subset function that Rafael mentioned. It always returns a data.frame and not a vector, even when you select a single column:

subset(mtcars, select = c("mpg","cyl"))
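A few variations on subset, as a sketch: the select argument also accepts unquoted column names and a minus sign to exclude columns. The names one, two and no_mpg are made up for this example.

```r
one    <- subset(mtcars, select = mpg)          # single column, still a data.frame
two    <- subset(mtcars, select = c(mpg, cyl))  # unquoted column names
no_mpg <- subset(mtcars, select = -mpg)         # every column except mpg

class(one)
names(two)
names(no_mpg)
```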

Each data manipulation package also has its own way of selecting columns. For example, dplyr has the select function, which is very similar to the subset function mentioned above:

library(dplyr) # provides select() and the %>% pipe
mtcars %>% select(mpg, cyl)
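Since the question was about running calculations on a few columns, here is a minimal base R sketch. It assumes the calculations wanted are column summaries and a derived column; the choice of mpg and wt and the name mpg_per_wt are hypothetical.

```r
# Keep only the columns needed for the calculation
sel <- mtcars[, c("mpg", "wt")]

# Column-wise summaries on the selected columns
colMeans(sel)
sapply(sel, sd)

# A new column computed from the selected ones (hypothetical ratio)
sel$mpg_per_wt <- sel$mpg / sel$wt
head(sel)
```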
    
16.07.2017 / 23:23
1

It would be easier if you included your data set (or part of it) so we could work with it. Take a look at the dput function for this purpose. It would also help if you included the code you have written or tried to write.
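For instance, a sketch of how dput could be used to share a small sample, with mtcars standing in for your own table:

```r
# Print a copy-pasteable definition of the first rows of two columns
small <- head(mtcars[, c("mpg", "cyl")], 3)
dput(small)

# The printed text is R code that rebuilds the object; here the same
# text is produced with deparse() and evaluated to show the round trip
txt     <- paste(deparse(small), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, small)
```

Pasting the dput output into the question lets others recreate exactly the same data.frame.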

As for your question, the subset function from base R or the select function from the dplyr package should help you.

    
13.07.2017 / 20:50