Error converting numbers. How to convert factors to numbers?

12

In the following example:

dados <- data.frame(x=c("11", "10", "20", "15"), y=c("25", "30", "35", "40"))
dados
   x  y
1 11 25
2 10 30
3 20 35
4 15 40

When I try to convert the variable x to numeric, I get the following instead of 11, 10, 20, 15:

as.numeric(dados$x)
[1] 2 1 4 3

How do I convert x to numbers?

    
asked by anonymous 14.03.2014 / 06:32

2 answers

7

If you analyze the structure of the object you will see where the problem occurs:

str(unclass(dados$x))
atomic [1:4] 2 1 4 3
- attr(*, "levels")= chr [1:4] "10" "11" "15" "20"

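The object dados$x is stored as the integer vector 2, 1, 4, 3 together with a levels attribute. Each integer is the position of the element's value in that attribute, and as.numeric() returns these integer codes rather than the labels. The levels are what the console shows when dados$x is printed.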

Besides the solution already mentioned, you can also solve the problem like this:

as.numeric(levels(dados$x))[dados$x]

In the first part of the solution, the levels of dados$x are extracted and converted to numbers. R stores the levels in sorted order, so at this point the values are ascending. Indexing with [dados$x] then uses the factor's integer codes to pick the value for each element, which restores the original order.

This solution is slightly more efficient than as.numeric(as.character(dados$x)), because only the distinct levels are converted rather than every element, though it may be harder to remember.
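As a minimal sketch of the steps described above, the following builds the factor explicitly with factor() (so it does not depend on how data.frame treats strings) and compares both conversions. The names x, level_values, via_levels and via_character are illustrative only.

```r
# Build the factor explicitly so the example behaves the same in any R version.
x <- factor(c("11", "10", "20", "15"))

levels(x)       # "10" "11" "15" "20" (the sorted unique values)
as.integer(x)   # 2 1 4 3 (position of each element within levels(x))

# Convert each level once, then index by the factor's integer codes.
level_values <- as.numeric(levels(x))
via_levels <- level_values[x]

# Alternative: convert every element to text, then to number.
via_character <- as.numeric(as.character(x))

via_levels                            # 11 10 20 15
identical(via_levels, via_character)  # TRUE
```

Both approaches return the same vector; the difference is only in how many strings have to be parsed.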

    
14.03.2014 / 15:40
10

In R versions before 4.0.0, the default behavior of data.frame is to convert text columns into factors (since 4.0.0 the default is stringsAsFactors = FALSE). This can lead to unexpected results when numbers are mistakenly read as text during data import or manipulation and then turned into factors.

In general, when creating or importing data.frames, it is worth setting stringsAsFactors = FALSE so that text variables are not converted into factors unintentionally.
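A short sketch of that option, reusing the data from the question: with stringsAsFactors = FALSE the columns stay as character vectors, so as.numeric() reads the digits directly.

```r
dados <- data.frame(
  x = c("11", "10", "20", "15"),
  y = c("25", "30", "35", "40"),
  stringsAsFactors = FALSE  # keep text columns as character vectors
)

class(dados$x)        # "character"
as.numeric(dados$x)   # 11 10 20 15
```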

However, once a variable has already been turned into a factor by mistake, one possible solution is to convert it to character first and then to numeric:

as.numeric(as.character(dados$x))
[1] 11 10 20 15
    
14.03.2014 / 06:39