I get NA when I convert character to time (POSIXlt)

Why do I get NA when I do this character conversion to POSIXlt?

    library(bReeze)
    data(winddata)

    tempo <- winddata[,1]
    tempo[1:6] # Preview 
    # [1] "06.05.2009 11:20" "06.05.2009 11:30" "06.05.2009 11:40"

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M")
    sum(is.na(tempo_POSIX))
    # [1] 6

    valores_NA <- which(is.na(tempo_POSIX))
    tempo[valores_NA]
    # [1] "18.10.2009 00:00" "18.10.2009 00:10" "18.10.2009 00:20" 
    # [4] "18.10.2009 00:30" "18.10.2009 00:40" "18.10.2009 00:50"

As you can see, the values that were converted to NA look perfectly normal: they follow the same format as all the others.

Oddly enough, the problem does NOT occur if you pass a value to the tz argument:

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M", tz = "GMT")
    sum(is.na(tempo_POSIX))
    # [1] 0

My system information is:

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
    [5] LC_TIME=Portuguese_Brazil.1252    

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    other attached packages:
    [1] bReeze_0.4-0

    loaded via a namespace (and not attached):
    [1] tools_3.0.2
    
asked by anonymous 26.09.2014 / 16:12

1 answer

The as.POSIXlt help page contains the following passage. It points out that converting between the date-time classes requires a time zone, that times are validated in that zone, and that this can cause problems around daylight saving time (DST) transitions:

  

> Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct". Any conversion that needs to go between the two date-time classes requires a time zone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected time zone. One issue is what happens at transitions to and from DST

When you call strptime(tempo, format = "%d.%m.%Y %H:%M"), you are converting the object to the POSIXlt class:

    class(tempo_POSIX)
    # [1] "POSIXlt" "POSIXt"

But when you call is.na(), the object is converted to POSIXct. Note that the is.na.POSIXlt method uses the as.POSIXct function:

    is.na.POSIXlt
    function (x) 
    is.na(as.POSIXct(x))
    <bytecode: 0x26519a14>
    <environment: namespace:base>
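
To see that the NA only shows up at the conversion step, you can inspect the POSIXlt fields directly. This is a minimal sketch, assuming that "America/Sao_Paulo" stands in for the asker's system time zone (the question does not state it) and that your time zone database provides that name:

```r
# Assumption: "America/Sao_Paulo" approximates the asker's system time zone.
x <- strptime("18.10.2009 00:00", format = "%d.%m.%Y %H:%M",
              tz = "America/Sao_Paulo")

class(x)   # "POSIXlt" "POSIXt"
x$mday     # 18: the parsed day is stored as written
x$hour     # 0:  the parsed hour is stored as written

# The validation happens only here, when the broken-down time is turned
# into seconds since the epoch. What a non-existent local time yields is
# platform- and version-dependent (it was NA on the asker's setup).
as.POSIXct(x)
```

So the string itself parses fine; whether it counts as missing depends on what as.POSIXct makes of it in the chosen time zone.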

Daylight saving time in Brazil in 2009 began on October 18 at 00:00. In other words, with daylight saving time in effect there is no 00:00 on October 18, 2009 in Brazil: when the clock reached midnight at the end of October 17, it jumped straight to 01:00. The whole hour from 00:00 to 00:59 was skipped, which is exactly where your six 10-minute timestamps fall.
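
The skipped hour can be made visible by stepping across the transition in 10-minute increments. This is only a sketch, again assuming "America/Sao_Paulo" is the relevant zone and that it is available in your time zone database:

```r
# Assumption: "America/Sao_Paulo" is the zone in question.
start <- as.POSIXct("2009-10-17 23:40", tz = "America/Sao_Paulo")

# Four consecutive instants, 10 minutes apart in real elapsed time.
steps <- seq(start, by = "10 min", length.out = 4)

format(steps, "%d.%m.%Y %H:%M")
# [1] "17.10.2009 23:40" "17.10.2009 23:50" "18.10.2009 01:00" "18.10.2009 01:10"

format(steps, "%z")   # UTC offset before and after the jump
# [1] "-0300" "-0300" "-0200" "-0200"
```

The wall clock goes from 23:50 directly to 01:00, so no instant in that zone is labelled 00:00 to 00:50 on that day.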

So when you call is.na(), the date is converted to POSIXct, and that conversion validates it against your time zone (probably Brazil/São Paulo: since you did not specify tz, the system's time zone is used). Because there is no 00:00 on October 18 in this time zone, the result is (correctly, but unexpectedly) NA. When you specify GMT or another time zone (such as London), the conversion goes through normally, which is why it worked with tz = "GMT" (and also why it worked for Djongs, who is presumably in another time zone).
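
A short sketch of the check and the workaround, using the six timestamps from the question. The variable names tempo_gap and tempo_gmt are made up for this example:

```r
# The six values that came back as NA in the question.
tempo_gap <- c("18.10.2009 00:00", "18.10.2009 00:10", "18.10.2009 00:20",
               "18.10.2009 00:30", "18.10.2009 00:40", "18.10.2009 00:50")

# One way to see which zone is used when tz is not given.
Sys.timezone()

# GMT has no DST transitions, so every clock time exists there.
tempo_gmt <- as.POSIXct(tempo_gap, format = "%d.%m.%Y %H:%M", tz = "GMT")
sum(is.na(tempo_gmt))
# [1] 0
```

Whether GMT is the right label for the measurements depends on how the clock that recorded them was set; the point here is only that a zone without DST cannot produce this gap.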

    
26.09.2014 / 18:29