Merge two series (zoo) of the same variable by intersecting and filling


I have two zoo series like this:

a:

data        valor
01-02-2010   2
01-03-2010   0
01-04-2010   9

b:

data        valor
01-06-2010   3
01-07-2010   6
01-08-2010   2

I want a set c like this:

c:

data        valor
01-02-2010   2
01-03-2010   0
01-04-2010   9
01-05-2010   NA
01-06-2010   3
01-07-2010   6
01-08-2010   2

How to proceed?

    
asked by anonymous 15.06.2017 / 06:01

2 answers


The idea is basically to write an adaptation of the rbind.zoo() function. The function below joins a and b, checks which dates in the regular sequence between the first and last index are missing from the result, creates a new zoo object holding only those dates with NA values, and binds it to the joined series to fill the gaps.

library(zoo)

# Create a reproducible example
seq_mes <- function(inicio, tamanho) {
  seq(from = as.Date(inicio), by = '1 month', length.out = tamanho)
}

a <- zoo(c(2, 0, 9), seq_mes('2010-02-01', 3))  # values in the order given in the question
b <- zoo(c(3, 6, 2), seq_mes('2010-06-01', 3))

# Define and apply the function
rbind_preenche <- function(..., periodo) {
  unido <- rbind(...)
  periodo_total <- seq(from = min(index(unido)),
                       to = max(index(unido)), by = periodo)

  a_criar <- periodo_total[! periodo_total %in% index(unido)]
  novo <- zoo(rep(NA, length(a_criar)), a_criar)
  rbind(unido, novo)
}

rbind_preenche(a, b, periodo = '1 month')
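
As a possible shortcut, the same gap filling can be sketched by merging the joined series with a zero-width zoo object built on the complete index. This is only a sketch under the same assumption as above (monthly data, dates read as day-month-year); the names unido, grade and c_zoo are made up for the illustration.

```r
library(zoo)

# Same data as in the question, read as monthly observations
a <- zoo(c(2, 0, 9), seq(as.Date("2010-02-01"), by = "1 month", length.out = 3))
b <- zoo(c(3, 6, 2), seq(as.Date("2010-06-01"), by = "1 month", length.out = 3))

# Stack the two series, then build the complete monthly index
# from the first to the last observed date
unido <- rbind(a, b)
grade <- seq(from = start(unido), to = end(unido), by = "1 month")

# zoo(, grade) has an index but no data; merging with it keeps the
# original values and adds NA for every date missing from unido
c_zoo <- merge(unido, zoo(, grade))
c_zoo
```

For daily data, by = "day" would take the place of "1 month" when building grade.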
    
16.06.2017 / 16:55

I solved it with a bit of a hack. I imagine there is a simpler way, but the approach below works as it should, at least for this example.

First, I created the data frames a and b, as in the example from the question (here the dates are read as month-day-year, so the data are daily):

a <- data.frame(data=seq(from=as.Date("01-02-2010", format="%m-%d-%Y"), 
  to=as.Date("01-04-2010", format="%m-%d-%Y"), "days"), valor=c(2, 0, 9))

b <- data.frame(data=seq(from=as.Date("01-06-2010", format="%m-%d-%Y"), 
  to=as.Date("01-08-2010", format="%m-%d-%Y"), "days"), valor=c(3, 6, 2))

Next, I created all the dates that should appear in the final data frame, called c. I called these dates data_final. It is a daily sequence that starts at the earliest date and ends at the latest one:

data_final <- seq(from=min(a$data, b$data), to=max(a$data, b$data), "days")

Then I create the data frame c. Its first version holds the dates stored in data_final and only NA in the valor column:

c <- data.frame(data=data_final, valor=NA)

With that done, all that is left is to update the positions of the valor column whose dates also appear in a and b:

c$valor[data_final %in% a$data] <- a$valor[a$data %in% data_final]
c$valor[data_final %in% b$data] <- b$valor[b$data %in% data_final]
c
        data valor
1 2010-01-02     2
2 2010-01-03     0
3 2010-01-04     9
4 2010-01-05    NA
5 2010-01-06     3
6 2010-01-07     6
7 2010-01-08     2
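
The steps above could probably be shortened with base R's merge(). The sketch below assumes the same daily reading of the dates; todas and resultado are names made up for the illustration, and the result is stored in resultado rather than c to avoid masking the base function c().

```r
# Same data frames as above, built with explicit ISO dates
a <- data.frame(data = as.Date("2010-01-02") + 0:2, valor = c(2, 0, 9))
b <- data.frame(data = as.Date("2010-01-06") + 0:2, valor = c(3, 6, 2))

# Every day between the earliest and the latest date
todas <- data.frame(data = seq(min(a$data, b$data), max(a$data, b$data), by = "day"))

# Left join: keep every date in todas; dates without a match get NA in valor
resultado <- merge(todas, rbind(a, b), by = "data", all.x = TRUE)
resultado
```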
    
15.06.2017 / 20:01