Using the sub function in R on strings with special characters

3

I'm working with a dataset that has the following values:

dados$Col_Nova

asked by anonymous 27.10.2017 / 14:33

3 answers

5

You can convert the encoding of the entire column at once:

dados$Col_Nova <- iconv(dados$Col_Velha, to = "latin1//TRANSLIT", from = "UTF-8")
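As a minimal sketch of that conversion, assuming a small made-up data frame: the sample values are hypothetical, and the plain "latin1" target is used here without //TRANSLIT so that the round trip can be checked.

```r
# Hypothetical sample data; \u00e9 is "é", stored as UTF-8
dados <- data.frame(
  Col_Velha = c("T\u00e9cnico de enfermagem", "Enfermeiro"),
  stringsAsFactors = FALSE
)

# Convert the whole column in one vectorised call
dados$Col_Nova <- iconv(dados$Col_Velha, from = "UTF-8", to = "latin1")

# iconv() returns NA for elements it cannot convert, so count them
sum(is.na(dados$Col_Nova))

# Converting back should reproduce the original text
de_volta <- iconv(dados$Col_Nova, from = "latin1", to = "UTF-8")
all(de_volta == dados$Col_Velha)
```

Checking for NA after the call is a cheap way to spot values that were not valid in the declared source encoding.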
    
27.10.2017 / 15:41
1

You have to escape the backslash: inside an R string literal, \s must be written as \\s.

dados$Col_Nova <- sub(pattern = "[A-z].{2}cnic[A-z]\\s.*", "Técnico de enfermagem", dados$Col_Nova)
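To illustrate the escaping, here is a small sketch on made-up values. The garbled input string is only an assumption about what the column might contain (UTF-8 bytes displayed as Latin-1).

```r
# Hypothetical values: "T\u00c3\u00a9cnico" is "Técnico" shown with the wrong encoding
x <- c("T\u00c3\u00a9cnico de enfermagem", "Enfermeiro")

# "\\s" in the R source becomes the regex \s (whitespace)
corrigido <- sub(
  pattern = "[A-z].{2}cnic[A-z]\\s.*",
  replacement = "T\u00e9cnico de enfermagem",
  x = x
)

print(corrigido)
```

Elements that do not match the pattern, such as the second one, are returned unchanged.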
    
27.10.2017 / 15:08
1

Another alternative, which I consider more elegant, is to declare the file's encoding when loading it, instead of fixing the errors caused by a misconfigured load afterwards.

You can set the encoding in the file read, as follows:

csvFile <- file("arquivo.csv", encoding="UTF-8")
data <- read.csv(csvFile)
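A self-contained sketch of the same idea, assuming a temporary file with made-up content and a session whose locale can represent accented characters (for example a UTF-8 locale). The column name cargo and the path variable are hypothetical.

```r
# Write a small UTF-8 file to a temporary path (hypothetical content)
caminho <- tempfile(fileext = ".csv")
linhas <- enc2utf8(c("cargo", "T\u00e9cnico de enfermagem", "Enfermeiro"))
writeLines(linhas, caminho, useBytes = TRUE)

# Declare the encoding on the connection; read.csv() opens and closes it
csvFile <- file(caminho, encoding = "UTF-8")
dados <- read.csv(csvFile, stringsAsFactors = FALSE)

str(dados)
unlink(caminho)
```

Declaring the encoding on the connection means the text is converted while it is read, so no clean-up with sub or iconv should be needed afterwards.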

Based on what you described in the other comments, the call in your case probably looks like this:

dados <- read.csv(file(arquivos[indice_arquivos], encoding = "UTF-8"), header = TRUE)

Since you did not post the full code, I cannot guarantee this is exactly right. But if it is not, it should be close to it.

    
28.10.2017 / 02:22