Character conversion to Date - 4 digits

5

I'm having trouble using as.Date to convert a character column of dates with two-digit years into Date. as.Date fills in the century incorrectly, producing results like 2066 (instead of 1966):

    > head(data_de_nascimento)
    [1] "16/03/66" "11/06/87" "21/11/75" "05/09/70" "15/08/70" "15/08/70"
    > str(data_de_nascimento)
    chr [1:4245] "16/03/66" "11/06/87" "21/11/75" "05/09/70" "15/08/70" ...
    > data_de_nascimento_format <- as.Date(data_de_nascimento,
    + format = "%d/%m/%y" )
    > head(data_de_nascimento_format)
    [1] "2066-03-16" "1987-06-11" "1975-11-21" "1970-09-05" "1970-08-15"    
    
asked by anonymous 28.07.2016 / 21:38

1 answer

5

This comes from an odd century-inference rule in the %y year format.

The documentation for strptime says the following about converting dates with two-digit years:

%y: Year without century (00-99). On input, values 00 to 68 are prefixed by 20 and values 69 to 99 by 19. This is the behavior specified by the POSIX 2004 and 2008 standards, but they also say that "it is expected that in a future version the default century inferred from a 2-digit year will change".
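A quick check of that rule at the 68/69 boundary can make it concrete. This is only a minimal sketch; the variable name `pivot` is illustrative and not part of the question:

```r
# Two dates on either side of the documented 68/69 pivot
pivot <- as.Date(c("01/01/68", "01/01/69"), format = "%d/%m/%y")
format(pivot)
# "2068-01-01" "1969-01-01"
```

So any two-digit year up to 68, including 66 from the question, lands in the 2000s.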

So it is up to the application to handle the century according to its own rules. For example, you can decide that any date that falls in the future should be moved back to the 20th century:

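    # Sample of the two-digit-year birth dates from the question
    data_de_nascimento <- c("16/03/66", "11/06/87", "21/11/75", "05/09/70",
                            "15/08/70", "15/08/70")
    # First pass: %y maps 00-68 to 20xx and 69-99 to 19xx
    d <- as.Date(data_de_nascimento, format = "%d/%m/%y")
    # Any date still in the future gets its century rewritten to 19xx
    data_de_nascimento_format <- as.Date(
      ifelse(d > Sys.Date(), format(d, "19%y-%m-%d"), format(d)))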

You can replace Sys.Date() with a specific cut-off date. For example, if you want years 00 to 10 to be placed in the 21st century, use as.Date("2010-12-31"). In that case, 16/03/10 is interpreted as "2010-03-16", while 16/03/11 is interpreted as "1911-03-16".
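The cut-off idea can be wrapped in a small helper. This is a sketch under assumptions: the function name `fix_century` and its arguments `fmt` and `cutoff` are hypothetical, not part of base R, and it assumes every date after the cut-off belongs to the 1900s:

```r
# Hypothetical helper: parse two-digit-year dates, then move anything
# later than `cutoff` back to the 20th century.
fix_century <- function(x, fmt = "%d/%m/%y", cutoff = Sys.Date()) {
  d <- as.Date(x, format = fmt)
  cutoff <- as.Date(cutoff)
  # Dates that failed to parse (NA) are left untouched
  too_late <- !is.na(d) & d > cutoff
  d[too_late] <- as.Date(format(d[too_late], "19%y-%m-%d"),
                         format = "%Y-%m-%d")
  d
}

res <- fix_century(c("16/03/10", "16/03/11", "16/03/66"),
                   cutoff = "2010-12-31")
format(res)
# "2010-03-16" "1911-03-16" "1966-03-16"
```

Indexing with a logical vector, rather than ifelse(), keeps the result a Date throughout, so no round trip through character is needed for the unchanged values.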

See it working on Ideone

Reference: SOen - Add correct century to dates with year provided as "Year without century", %y

    
28.07.2016 / 22:51