How do I group person-level census microdata by household?


I am trying to answer the following question: in how many couples with children under 18 do both parents work outside the home?

Given the sample of persons from the 2010 census (such as the one for Acre), the first thing I did was filter the table to couples with children.

  censo <- read.csv("AC.csv", sep = "\t")

  # V5090 -- TYPE OF FAMILY COMPOSITION OF SINGLE AND MAIN COHABITING FAMILIES
  #  1 - Couple without children
  #  2 - Couple without children, with relative(s)
  #  3 - Couple with child(ren)   <----------------------
  #  4 - Couple with child(ren) and relative(s)
  #  5 - Woman without spouse, with child(ren)
  #  6 - Woman without spouse, with child(ren) and relative(s)
  #  7 - Man without spouse, with child(ren)
  #  8 - Man without spouse, with child(ren) and relative(s)
  #  9 - Other
  #  Blank

  censo_cf <- censo[which(censo$V5090 == 3), ]

Then I filtered so that at least one of the children was under 18 years old:

# V6660: AGE OF THE LAST LIVE-BORN CHILD AS OF JULY 31, 2010
censo_cf18 <- censo_cf[which(censo_cf$V6660  < 18),]

My next step would be to group the respondents by household (to later check in which households both partners work). Although I did not see this documented anywhere for the 2010 census, according to the 2000 census documentation (page 83) the variable would be V0300, described as:

Identification of the household

So I would expect every household in my subset (couples with children) to have at least three respondents (husband, wife and child). However, only three households do:

# V0300: CONTROLE (control number, i.e. the household identifier)
table_V0300 <- table(censo_cf18$V0300)
pessoas_por_domicilio  <- table(table_V0300)
pessoas_por_domicilio

   1    2    3 
9340   57    3

What is my error?

    
asked by anonymous 13.11.2015 / 15:54

2 answers


Your error is in this part:

censo_cf18 <- censo_cf[which(censo_cf$V6660  < 18),]

The moment you do this, you drop (1) the men, since this variable exists only for women, and (2) the children. So the count of how many times each V0300 value appears (V0300 is also the control variable in the 2010 census) is computed on the wrong rows, hence the unexpected result.
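To make the effect concrete, here is a minimal sketch on a made-up table. The data and the `papel` helper column are invented for illustration; only the variable names V0300 and V6660 come from the census. Because V6660 is filled in only on the mother's row, a row-level filter can keep at most the mothers, while a household-level filter keeps everyone:

```r
# Toy data: 2 households, each with a father, a mother and one child.
# V6660 (age of last live-born child) is only filled in for women.
toy <- data.frame(
  V0300 = c(1, 1, 1, 2, 2, 2),
  papel = c("father", "mother", "child", "father", "mother", "child"),  # hypothetical helper column
  V6660 = c(NA, 5, NA, NA, 25, NA)
)

# Row-level filter, as in the question: only mothers can pass it.
filtrado <- toy[which(toy$V6660 < 18), ]
table(table(filtrado$V0300))   # every remaining household has exactly 1 row

# Household-level filter: collect the ids first, then keep all rows of those households.
ids <- unique(toy$V0300[which(toy$V6660 < 18)])
completo <- toy[toy$V0300 %in% ids, ]
table(completo$V0300)          # household 1 keeps all 3 members
```

This also suggests why a few households in the question still show 2 or 3 rows: more than one woman in the same household can have a last-born child under 18.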

What you should do instead is store the V0300 values of the cases you want (households made up of a couple and child(ren), with at least one child under 18, and where both members of the couple work) and then select those households.

Here is some code (it uses the data.table package, and my copy of the sample already has the value labels applied, but it is easy to adapt to a data.frame without labels):

# First filter: get the household codes of couples with children

Filtro1 <- dados[V5090 == 'Casal com filho(s)', V0300]

# Second filter: among couples with children, get the mothers whose last child is under 18

Filtro2 <- dados[V0300 %in% Filtro1 & V6660 < 18, V0300]

# Now keep only the person responsible for the household or their spouse/partner:

temp <- dados[V0300 %in% Filtro2 & V0502 %in% c('Pessoa responsável pelo domicílio', 'Cônjuge ou companheiro(a) de sexo diferente', 'Cônjuge ou companheiro(a) do mesmo sexo'), .(V0300, PessoaTrabalhando = V0641 == 'Sim')] # Only the variables of interest are kept here.

# Third filter: among couples with children and at least 1 child under 18, get those where both work.

Filtro3 <- temp[, .(PessoasTrabalhando = sum(PessoaTrabalhando)), by = V0300][PessoasTrabalhando == 2, V0300]

# Now the analyses can be run

novosdados <- dados[V0300 %in% Filtro3, ]
novosdados[, .(N = .N), by = V0300][, table(N)]

# Result for Porto Alegre:
# N
#    3    4    5    6    7    8    9   10   11   12   13 
# 1570 1139  397  141   52   17   13    6    1    1    1
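For readers working with a plain data.frame and numeric codes, the same three filters might look roughly like the sketch below. The toy data are invented, and the numeric codes are assumptions that should be checked against the 2010 dictionary: V5090 == 3 for "Casal com filho(s)" (as listed in the question), V0502 in 1, 2, 3 for the responsible person and spouse/partner, and V0641 == 1 for "Sim".

```r
# Toy person-level table with numeric codes instead of labels (all values invented).
dados <- data.frame(
  V0300 = c(10, 10, 10, 20, 20, 20, 30, 30),
  V5090 = c(3, 3, 3, 3, 3, 3, 1, 1),
  V0502 = c(1, 2, 4, 1, 2, 4, 1, 2),       # assumed: 1 = responsible, 2/3 = spouse, 4 = child
  V6660 = c(NA, 4, NA, NA, 10, NA, NA, NA),
  V0641 = c(1, 1, NA, 1, 2, NA, 1, 1)      # assumed: 1 = worked, 2 = did not work
)

# Filter 1: household ids of couples with children
filtro1 <- unique(dados$V0300[which(dados$V5090 == 3)])

# Filter 2: among those, households where a mother's last child is under 18
filtro2 <- unique(dados$V0300[which(dados$V0300 %in% filtro1 & dados$V6660 < 18)])

# Keep only the responsible person and spouse, then count workers per household
casal <- dados[dados$V0300 %in% filtro2 & dados$V0502 %in% c(1, 2, 3), ]
trabalhando <- tapply(casal$V0641 == 1, casal$V0300, sum)

# Filter 3: households where both members of the couple work
filtro3 <- as.numeric(names(trabalhando)[which(trabalhando == 2)])

# All members of the selected households, and the household-size distribution
novosdados <- dados[dados$V0300 %in% filtro3, ]
table(table(novosdados$V0300))
```

In this toy table only household 10 survives all three filters: household 20 has one non-working partner and household 30 has no children.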

Just remember that the sample data should be weighted by the V0010 variable. If I'm not mistaken, the household weight is the same as that of the person responsible for the household. While I'm at it: you can download the documentation of the 2010 Census from this link (IBGE's own FTP).
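A minimal sketch of what that weighting could look like, assuming V0010 has already been read as the actual numeric weight and that V0502 == 1 marks the person responsible for the household (both are assumptions, and the values below are invented):

```r
# Toy data: one row per person; V0010 is the sample weight (values invented).
pessoas <- data.frame(
  V0300 = c(10, 10, 10, 40, 40, 40, 40),
  V0502 = c(1, 2, 4, 1, 2, 4, 4),          # assumed: 1 = person responsible
  V0010 = c(12.5, 12.5, 12.5, 8.0, 8.0, 8.0, 8.0)
)

# One row per household: take the responsible person's weight
responsaveis <- pessoas[pessoas$V0502 == 1, c("V0300", "V0010")]

n_amostra  <- nrow(responsaveis)        # unweighted number of households in the sample
n_estimado <- sum(responsaveis$V0010)   # weighted estimate of the number of households
```

Summing the weights rather than counting rows is what turns the sample count into a population estimate.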

    
13.11.2015 / 18:11

Good afternoon, rcoster and coelacanth. Thank you for the discussion; it resolved one of my questions.

I would like to add that your solutions ignore secondary families, e.g. a daughter's family living with her parents' family. In this case, the primary family has V5040 = 1 and the secondary family has V5040 = 2. V5090 = 3, "couple with children", only applies to the primary family; for the others you need V5100 = 2. In other words, your solutions should also include the secondary families (most likely to have children
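A rough sketch of how both cases could be flagged together. The toy data are invented, the V5040 and V5100 codes are taken from the comment above, and V5090 == 3 follows the codebook excerpt in the question:

```r
# Toy data (invented): household 1 has a primary and a secondary family,
# household 2 is a couple without children.
fam <- data.frame(
  V0300 = c(1, 1, 1, 1, 1, 1, 2, 2),
  V5040 = c(1, 1, 1, 2, 2, 2, 1, 1),
  V5090 = c(3, 3, 3, NA, NA, NA, 1, 1),
  V5100 = c(NA, NA, NA, 2, 2, 2, NA, NA)
)

# %in% is used instead of == so that blank (NA) codes count as FALSE, not NA.
casal_principal  <- fam$V5040 == 1 & fam$V5090 %in% 3
casal_secundario <- fam$V5040 == 2 & fam$V5100 %in% 2

fam$casal_com_filhos <- casal_principal | casal_secundario
table(fam$V5040, fam$casal_com_filhos)
```

Once secondary families are included, grouping by V0300 alone would mix two couples from the same household, so the later steps would presumably need to group by household plus a family identifier as well.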

01.05.2016 / 19:19