Problem with encoding in R


I've been asked to do a text analysis and am having trouble with encoding. Does anyone know how I can convert these strings directly?

Example of how the file is appearing:

vocês dizerem que não!!! Até quando

Another example:

â¤ï¸(...) Comilanças é amigo secreto na casa clean!ðŸŽ

I've tried using this function:


and I got this output:

    [1] "UTF-8"        "windows-1252" "windows-1250" "UTF-16BE"     "UTF-16LE"     "Shift_JIS"    "windows-1254"
    [8] "IBM420_ltr"  

    [1] ""   "pt" "cs" ""   ""   "ja" "tr" "ar"

    [1] 1.00 0.63 0.28 0.10 0.10 0.10 0.02 0.01

If anyone can help, I'll be grateful!

asked by anonymous 08.12.2017 / 16:06

1 answer


The problem may already arise at import time. How are you importing the files? When the default does not work correctly, I usually try setting encoding = "UTF-8".

Try: read.csv("filename.csv", encoding = "UTF-8")
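To make the suggestion concrete, here is a minimal, self-contained sketch. The temporary file and the column name `text` are made up for illustration; only `read.csv` and its `encoding` argument come from the answer above. Note that `encoding = "UTF-8"` only marks the strings as UTF-8 without converting them, whereas the related `fileEncoding` argument re-encodes the file while reading, so it may be worth trying both.

```r
# Write a small UTF-8 CSV to a temporary file (hypothetical sample data).
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeLines(enc2utf8(c("text", "voc\u00eas dizerem que n\u00e3o")),
           con, useBytes = TRUE)
close(con)

# Read it back, declaring that the strings in the file are UTF-8.
df <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)

# The accented characters should now be intact and marked as UTF-8.
print(df$text[1])
print(Encoding(df$text[1]))
```

If the accents still look wrong after this, the file is probably not UTF-8 after all, and a different value (for example `fileEncoding = "windows-1252"`) may be needed.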

If that doesn't help, try reading this: link
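If the text has already been imported and looks like "nÃ£o" instead of "não", one possible repair is to reverse the wrong decoding. The sketch below assumes the original bytes were UTF-8 and were misread as windows-1252; `fix_mojibake` is a hypothetical helper name, not a function from any package. It will not recover every string: some bytes used by emoji have no windows-1252 equivalent, in which case `iconv` returns `NA`.

```r
# Hypothetical helper: undo UTF-8 text that was wrongly decoded as windows-1252.
fix_mojibake <- function(x) {
  # Convert the characters back to the single bytes they came from ...
  bytes <- iconv(x, from = "UTF-8", to = "windows-1252")
  # ... and declare that those bytes are really UTF-8.
  Encoding(bytes) <- "UTF-8"
  bytes
}

# "n\u00c3\u00a3o" is what "não" looks like after the wrong decoding.
broken <- "n\u00c3\u00a3o"
fixed <- fix_mojibake(broken)
print(fixed)
```

Fixing the import itself is still preferable, since the repair only works when every character survives the round trip.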

23.04.2018 / 15:10