SAS input files for reading the 2010 Demographic Census microdata

2

Does anyone know where to download the ASCII (.sas) files for reading the microdata of the 2010 IBGE Demographic Census?

I know that Anthony Damico keeps some of these files on his site (see below for how to download them), but I am looking for the files made available by IBGE itself. Damico does not provide, for example, the input file for reading the mortality data set.

   # download the SAS input file for the persons data set

     download.file( "https://raw.github.com/ajdamico/asdfree/master/Censo%20Demografico/SASinputPes.txt" , "LEPESSOAS.sas" )

P.S. The IBGE 2010 census site offers the microdata and documentation for download, but it has no information about SAS input files.

UPDATE (02 Oct 2015)

I confirmed @Rcoster's response with two IBGE researchers. IBGE does not make the SAS read files available on the site. I followed the suggestion of @Rcoster and created a script that:

  • downloads the 2010 census data and documentation
  • uses the variable dictionary in Excel to build the layout for reading the .txt files and converts the data to data.table
  • saves the data sets as .csv

The script is very fast and is available here. Suggestions are welcome.

    
asked by anonymous 28.09.2015 / 17:20

2 answers

0

Complementing @Rcoster's answer, here is an alternative solution in which you build the syntax for reading the .txt files yourself from the .xls dictionary file.

# Load libraries
  library(data.table)
  library(readxl)


# Open the Excel file with the variable dictionary (one sheet per data set)
  dic_dom <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 1, skip = 1)  # households
  dic_pes <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 2, skip = 1)  # persons
  dic_mor <- read_excel("./Documentacao/Layout_microdados_Amostra.xls", sheet = 4, skip = 1)  # mortality

# Convert to data.table (by reference)
  setDT(dic_dom)
  setDT(dic_pes)
  setDT(dic_mor)

# Function that computes each variable's width and renames the
# start and end position columns (3rd and 4th columns of the sheet)
  computeWidth <- function(dataset){
    dataset[is.na(DEC), DEC := 0]    # a missing DEC means no decimal digits
    dataset[, width := INT + DEC]    # total width = integer digits + decimal digits
    setnames(dataset, colnames(dataset)[3:4], c("pos.ini", "pos.fin"))
  }

# Apply the function to each dictionary; invisible() suppresses the printed result
  invisible(lapply(list(dic_dom, dic_pes, dic_mor), computeWidth))

This code is taken from this script here, which downloads and reads the data sets of the 2010 Demographic Census.
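To show how a dictionary prepared this way can drive the actual reading, here is a minimal sketch in base R. It is not the linked script: the helper `read_fixed`, the toy variable names, positions and records are all made up for illustration, and it assumes that DEC counts implied decimal digits (no decimal point stored in the file) and that every variable can be treated as numeric.

```r
# Toy dictionary in the shape produced above: name, start, end, decimals.
# Variable names and positions are hypothetical, not the real census layout.
dic <- data.frame(
  VAR     = c("V0001", "V0010", "V6531"),
  pos.ini = c(1, 3, 8),
  pos.fin = c(2, 7, 13),
  DEC     = c(0, 2, 2),
  stringsAsFactors = FALSE
)

# Two fake fixed-width records (13 characters each), written to a temp file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("3512345001050", "3300250123456"), tmp)

# Hypothetical helper: cut each line at the dictionary positions and
# shift the decimal point by DEC digits (assumption: implied decimals).
read_fixed <- function(path, dic) {
  lines <- readLines(path)
  cols <- lapply(seq_len(nrow(dic)), function(i) {
    raw <- substring(lines, dic$pos.ini[i], dic$pos.fin[i])
    as.numeric(raw) / 10^dic$DEC[i]
  })
  names(cols) <- dic$VAR
  as.data.frame(cols)
}

toy <- read_fixed(tmp, dic)
print(toy)
```

For the real microdata files, which are large, a dedicated fixed-width reader such as `readr::read_fwf()` with `fwf_positions()` is probably a better fit than `readLines()` plus `substring()`; the sketch only illustrates how the start and end positions and the decimal count are used.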

    
02.10.2015 / 22:40
2

These files are not available from IBGE. What IBGE offers is a file with the layout of each data set (Layout/Layout_microdados_Amostra.xls), which lets you write your own read-in syntax.
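As a rough illustration of writing your own syntax from the layout, the sketch below generates a SAS `INPUT` statement from a small layout table. The function `make_sas_input`, the data set name `census`, the file name `microdata.txt` and the layout rows are all hypothetical, and the generated SAS code has not been run in SAS. It assumes the layout gives a start position plus integer and decimal digit counts, and uses the `w.d` informat so that decimals are implied.

```r
# Hypothetical layout rows: variable, start position, integer and decimal digits.
layout <- data.frame(
  VAR     = c("V0001", "V0010"),
  pos.ini = c(1, 3),
  INT     = c(2, 3),
  DEC     = c(0, 2),
  stringsAsFactors = FALSE
)

# Build the lines of a SAS DATA step with formatted input (@start name w.d).
make_sas_input <- function(layout, infile = "microdata.txt") {
  width <- as.integer(layout$INT + layout$DEC)   # total field width
  dec   <- as.integer(layout$DEC)
  # "5.2" reads 5 characters with 2 implied decimals; "2." has none.
  informat <- ifelse(dec > 0,
                     sprintf("%d.%d", width, dec),
                     sprintf("%d.", width))
  fields <- sprintf("    @%d %s %s",
                    as.integer(layout$pos.ini), layout$VAR, informat)
  c("DATA census;",
    sprintf("  INFILE \"%s\";", infile),
    "  INPUT",
    fields,
    "  ;",
    "RUN;")
}

sas <- make_sas_input(layout)
writeLines(sas)
```

Character variables would need a `$` informat instead, which this sketch leaves out.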

    
28.09.2015 / 18:12