I'm trying to use the following code to read, in chunks, a giant CSV file that does not fit into memory.
library(dplyr)
arq_grande <- file("dados2014.csv", "r")
tam_chunk <- 1e2
df1 <- read.csv(arq_grande, nrows = 10, header = T, sep = ",", dec = ".")
df_filtrado <- df1 %>% filter(TP_SEXO == 'M')
write.table(df_filtrado, "sexoM.csv", row.names = F, sep = ",", dec = ".")
nrow <- 1
repeat {
  df <- read.csv(arq_grande, header = FALSE, col.names = names(df1), nrows = tam_chunk)
  cat("Read", nrow(df), "rows\n")
  if (nrow(df) == 0) break
  df_filtrado <- df1 %>% filter(TP_SEXO == 'M')
  write.table(df_filtrado, "sexoM.csv", append = T, col.names = F, row.names = F, sep = ",", dec = ".")
}
close(arq_grande)
The problem is that it does not advance through the file: it just keeps repeating the first 10 rows without stopping. The writing part looks like it's working.
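For context, this is the general chunked-read pattern I was trying to adapt (a minimal sketch, not tested on the full file; con, primeiro and chunk are placeholder names I'm using here, and I'm assuming the same dados2014.csv layout, TP_SEXO column, and comma/period as separator/decimal as above):

library(dplyr)

con <- file("dados2014.csv", "r")
tam_chunk <- 1e2

# First chunk: read the header row and keep the column names for later chunks
primeiro <- read.csv(con, nrows = tam_chunk, header = TRUE, sep = ",", dec = ".")
write.table(primeiro %>% filter(TP_SEXO == 'M'), "sexoM.csv",
            row.names = FALSE, sep = ",", dec = ".")

repeat {
  # read.csv on an already-open connection continues from the current position;
  # it raises an error when no lines are left, so wrap the call in tryCatch
  chunk <- tryCatch(
    read.csv(con, nrows = tam_chunk, header = FALSE, col.names = names(primeiro),
             sep = ",", dec = "."),
    error = function(e) NULL
  )
  if (is.null(chunk) || nrow(chunk) == 0) break
  write.table(chunk %>% filter(TP_SEXO == 'M'), "sexoM.csv",
              append = TRUE, col.names = FALSE, row.names = FALSE,
              sep = ",", dec = ".")
}
close(con)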