I am writing a script, which I will make public, to open the RAIS microdata (de-identified, available here ) in R using MonetDB. However, the database does not accept a comma (,) as a decimal separator. Each UFano.txt file (state code plus year) is quite large (up to 7GB), so the solution cannot require loading the whole file into RAM. Two alternatives:
a) import everything into the database as strings, then run an SQL UPDATE that creates new columns for the numeric variables, replacing "," with ".".
b) preprocess the .txt file, replacing each comma with a period.
The question is about alternative "b".
Is there an efficient way to do this replacement? AjDamico shows a slow approach, replacing line by line, here .
As an example, we can start with the 2012 Acre file (AC2012.txt), which I have made available at this link
Since this is to be packaged as an R command, the solution cannot depend on the OS or require installing anything outside of R.
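
To make alternative "b" concrete, here is a minimal sketch of the kind of thing I have in mind, in base R only. It reads the file in chunks of lines rather than one line at a time, so memory use is bounded by the chunk size. The function name `replace_decimal_commas` and the `chunk_lines` parameter are hypothetical, made up for illustration. It also assumes the field delimiter is not a comma (for example ";"), so that every comma in the file is a decimal mark. I have not measured whether this is fast enough for a 7GB file.

```r
# Hypothetical helper: copy `infile` to `outfile`, replacing every "," with ".".
# ASSUMPTION: the field delimiter is not a comma, so every comma is a decimal mark.
replace_decimal_commas <- function(infile, outfile, chunk_lines = 100000L) {
  incon <- file(infile, open = "r")
  on.exit(close(incon), add = TRUE)
  outcon <- file(outfile, open = "w")
  on.exit(close(outcon), add = TRUE)

  repeat {
    # Read at most `chunk_lines` lines; an empty result means end of file.
    chunk <- readLines(incon, n = chunk_lines, warn = FALSE)
    if (length(chunk) == 0L) break

    # fixed = TRUE does a literal match (no regex); useBytes = TRUE works
    # byte-wise, so the text is not re-encoded or validated.
    fixed_chunk <- gsub(",", ".", chunk, fixed = TRUE, useBytes = TRUE)

    # useBytes = TRUE writes the strings as-is, without translation.
    writeLines(fixed_chunk, outcon, useBytes = TRUE)
  }
  invisible(outfile)
}

# Small usage example on a temporary file (made-up data, not RAIS).
tmp_in  <- tempfile(fileext = ".txt")
tmp_out <- tempfile(fileext = ".txt")
writeLines(c("a;1,5", "b;2,25"), tmp_in)
replace_decimal_commas(tmp_in, tmp_out, chunk_lines = 1L)
result <- readLines(tmp_out)
```

A larger `chunk_lines` means fewer calls to `readLines` and `gsub` at the cost of more memory per chunk, so the value would need tuning for the real files.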