R: Insert date difference into a function for time difference


I found a function written by J. Ahumada that is very relevant to my work. It groups the photographic records of a species within a given sampling unit (ua) according to a chosen independence interval.

I created an object with the records of one species in a single sampling unit. The object is called "paca".

library(dplyr)
paca <- select(filter(meus.dados, ua == " GF", especie == "Cuniculus paca"),
               ua, data, hora, especie)
paca

     ua       data     hora        especie
   (chr)     (time)    (chr)          (chr)
1     GF 2012-06-02 01:12:00 Cuniculus paca
2     GF 2012-06-11 23:50:00 Cuniculus paca
3     GF 2012-06-12 00:06:00 Cuniculus paca
4     GF 2012-06-12 01:16:00 Cuniculus paca
5     GF 2012-07-11 20:35:00 Cuniculus paca
6     GF 2012-07-24 23:52:00 Cuniculus paca
7     GF 2012-08-01 21:39:00 Cuniculus paca
8     GF 2012-08-09 02:37:00 Cuniculus paca
9     GF 2012-08-11 00:24:00 Cuniculus paca
10    GF 2012-08-13 00:55:00 Cuniculus paca
11    GF 2012-08-13 19:47:00 Cuniculus paca
12    GF 2012-08-15 19:16:00 Cuniculus paca
13    GF 2012-08-18 02:35:00 Cuniculus paca
14    GF 2012-08-18 22:28:00 Cuniculus paca
15    GF 2012-08-24 02:27:00 Cuniculus paca

When I run the function, it returns a sequence of numbers corresponding to the row numbers created by R (1 to 15). When a record does not respect the 60-minute interval, the function repeats the row number of that record.

reg.independentes <- function(dados, independencia) {
  l <- length(dados$data)
  intervalo <- diff(dados$data)
  intervalo <- intervalo / 60  # independence interval is given in minutes
  intervalo <- as.numeric(intervalo)
  ev <- 1
  res <- numeric()
  cond <- intervalo > independencia
  for (i in 1:(l - 1)) {
    if (!cond[i]) ev <- ev
    else ev <- ev + 1
    res <- c(res, ev)
  }
  c(1, res)
}

 reg.independentes(paca, 60)
 [1]  1  2  3  3  4  5  6  7  8  9  9 10 11 11 12 12 13 14 15 

The function ignores the fact that records fall on different dates; it only considers the time. This leads to two problems:

First: it repeats rows whose records are on the same date but more than 60 minutes apart. For example, it repeats row 3, yet the records involved are more than 60 minutes apart, as required (same date, different times: 00:06 and 01:16). I do not understand why this record was flagged.

Second: it repeats rows whose records are on different dates but at similar times of day. The function does not check whether the date differs. For example, it flagged rows 9, 11, and 12, but those records are on different dates and are therefore independent.

A record is considered Non-Independent if it occurs on the same day and within an interval of less than 60 minutes. Records at similar times but on different dates are considered Independent (this is what I need the function to do).

I tried to change the function but could not get it to work. I would also like the function to return a table containing only the independent records. Can anyone help me?

    
asked by anonymous 08.05.2016 / 17:32

1 answer


Working with functions written by other people is not easy (especially when the algorithm is not explained), so I found it simpler to start from scratch.

First of all, it is important to convert your data to a date-time format that R understands, which simplifies measuring intervals. Your printed data shows that the date is stored as a time class, but the hour is not. Since I started with everything as text, the conversion looks like this:

paca <- read.table(text = "ua       data     hora        especie
     GF 2012-06-02 01:12:00 Cuniculus_paca
     GF 2012-06-11 23:50:00 Cuniculus_paca
     GF 2012-06-12 00:06:00 Cuniculus_paca
     GF 2012-06-12 01:16:00 Cuniculus_paca
     GF 2012-07-11 20:35:00 Cuniculus_paca
     GF 2012-07-24 23:52:00 Cuniculus_paca
     GF 2012-08-01 21:39:00 Cuniculus_paca
     GF 2012-08-09 02:37:00 Cuniculus_paca
     GF 2012-08-11 00:24:00 Cuniculus_paca
    GF 2012-08-13 00:55:00 Cuniculus_paca
    GF 2012-08-13 19:47:00 Cuniculus_paca
    GF 2012-08-15 19:16:00 Cuniculus_paca
    GF 2012-08-18 02:35:00 Cuniculus_paca
    GF 2012-08-18 22:28:00 Cuniculus_paca
    GF 2012-08-24 02:27:00 Cuniculus_paca", 
                   stringsAsFactors = FALSE, header = TRUE)

paca$data_completa <- strptime(paste(paca$data, paca$hora),
                              format = "%Y-%m-%d %H:%M:%S")

I pasted the date and time into a single string and used the strptime function to convert it to a date-time class.
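To see why the combined timestamp matters, here is a minimal sketch using two records from the question, one just before midnight and one just after. The variable names `so_data` and `completa` and the choice of `tz = "UTC"` are assumptions made for this illustration only (a fixed time zone keeps daylight saving changes out of the arithmetic).

```r
# Two records from the question, on either side of midnight.
data <- c("2012-06-11", "2012-06-12")
hora <- c("23:50:00", "00:06:00")

# Date only: the time of day is ignored, so the gap looks like a full day.
so_data <- as.POSIXct(data, format = "%Y-%m-%d", tz = "UTC")
difftime(so_data[2], so_data[1], units = "mins")    # 1440 minutes

# Date and time combined: the real gap between the two photos.
completa <- as.POSIXct(paste(data, hora),
                       format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
difftime(completa[2], completa[1], units = "mins")  # 16 minutes
```

A function that takes differences of the date column alone cannot see the time of day, which is one plausible reason the original function flags unexpected rows.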

To duplicate the index of the records that meet your criterion, we only need to find which intervals are shorter than the limit (here, 60 minutes) and repeat those positions. The final function looks like this:

reg_independentes <- function(dados, independencia) {
  # Only the time information is needed. diff() computes the interval between each value and the previous one.
  intervalo <- diff(dados)
  # Force the units to minutes so the comparison is always made in minutes.
  units(intervalo) <- "mins"
  # Find which intervals are shorter than `independencia`.
  repetir <- which(intervalo < independencia)
  # Combine the indices in ascending order. The 0 and the + 1 are needed
  # because there is always one interval fewer than there are values.
  sort(c(0, seq_along(intervalo), repetir) + 1)
}

Using the function:

reg_independentes(paca$data_completa, 60)
# [1]  1  2  3  3  4  5  6  7  8  9 10 11 12 13 14 15
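The intermediate values may make the indexing easier to follow. This is a small walk-through on three of the records from the question; the vector name `tempos` and `tz = "UTC"` are assumptions for the illustration.

```r
# Three consecutive records from the question.
tempos <- as.POSIXct(c("2012-06-11 23:50:00",
                       "2012-06-12 00:06:00",
                       "2012-06-12 01:16:00"), tz = "UTC")

intervalo <- diff(tempos)
units(intervalo) <- "mins"
as.numeric(intervalo)                    # 16 70

# Interval 1 (between records 1 and 2) is shorter than 60 minutes.
repetir <- which(intervalo < 60)         # 1

# Interval i ends at record i + 1, so adding 1 turns interval
# positions into record positions; record 2 is therefore repeated.
sort(c(0, seq_along(intervalo), repetir) + 1)   # 1 2 2 3
```

A repeated index therefore marks a record that came less than 60 minutes after the one before it.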

I think the result is now correct, but if it is not you should be able to make the necessary adjustment.
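The question also asks for a table containing only the independent records. A minimal sketch of one way to do that follows, assuming the data are already sorted by time and that independence is judged against the immediately preceding record, as in the function above. The helper name `filtrar_independentes`, its `coluna_tempo` argument, and the small `exemplo` data frame are hypothetical and exist only for this illustration.

```r
# Hypothetical helper (not part of the function above): keeps the first
# record and every record that is at least `independencia` minutes after
# the record immediately before it. Assumes rows are sorted by time.
filtrar_independentes <- function(dados, coluna_tempo, independencia) {
  tempos <- as.POSIXct(dados[[coluna_tempo]])
  intervalo <- as.numeric(diff(tempos), units = "mins")
  dados[c(TRUE, intervalo >= independencia), , drop = FALSE]
}

# Small example built from three records in the question.
exemplo <- data.frame(
  ua = "GF",
  data_completa = as.POSIXct(c("2012-06-11 23:50:00",
                               "2012-06-12 00:06:00",
                               "2012-06-12 01:16:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Keeps rows 1 and 3; the 00:06 record is only 16 minutes after 23:50.
filtrar_independentes(exemplo, "data_completa", 60)
```

With the data from the answer, the call would be filtrar_independentes(paca, "data_completa", 60). If independence should instead be measured from the last record that was kept, the comparison would need a loop rather than diff.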

    
09.05.2016 / 19:42