Following your example of the original data set and the new data set:
First, I would like to point out that it is always best to ask questions with a reproducible example. In your case you should have provided the data as a data.frame, which I ended up having to type by hand ;-). To better understand how to ask a question with a reproducible example, please read this help page: How to create a Minimal, Complete, and Verifiable example
In the first part I simply create a data.frame identical to the one you provided in the image.
## Creating the example as a data.frame
dados <- data.frame(
  Processo = c(201701, 201701, 201702, 201702, 201702, 201703, 201703, 201704, 201704, 201704),
  Grupo = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'A', 'A'),
  Data = c('01/02/2017', '15/02/2017', '20/03/2017', '18/04/2017', '01/07/2017', '15/02/2017', '20/02/2017', '01/03/2017', NA, '05/06/2017')
)
One important thing to know about R is that when it reads a data set containing dates, it initially treats those dates as text, not as dates. You need to convert them to R's Date class before you can add or subtract dates:
## Converting to Date
dados$Data <- as.Date(dados$Data, format = '%d/%m/%Y')
Note that I passed a format argument that tells R how the day, month and year are laid out in the text. I used the uppercase %Y because the year is written with 4 digits (the lowercase %y is for 2-digit years).
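To make the role of the format argument more concrete, here is a small sketch using only base R. The variable names (d1, d2, diferenca) are just illustrative and are not part of your data:

```r
# Four-digit year: uppercase %Y
d1 <- as.Date("01/02/2017", format = "%d/%m/%Y")

# Two-digit year: lowercase %y
d2 <- as.Date("15/02/17", format = "%d/%m/%y")

# Both are now Date objects, so arithmetic works
class(d1)

# Subtracting two dates gives a difftime measured in days
diferenca <- d2 - d1
as.numeric(diferenca)  # 14

# A format that does not match the text gives NA instead of an error
as.Date("2017-02-01", format = "%d/%m/%Y")
```

If the conversion returns NA for values that are not missing in the raw data, the format string is the first thing I would check.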
Finally, use dplyr to group the rows and then calculate the difference between the latest and the earliest date in each group. Note that I used the na.rm = TRUE option so that NA values are ignored.
## Loading the dplyr package
library(dplyr)
## Grouping and calculating the difference between the dates with dplyr
dados %>%
  group_by(Processo, Grupo) %>%
  arrange(desc(Data)) %>%
  summarise(Total_Dias = max(Data, na.rm = TRUE) - min(Data, na.rm = TRUE))
The result is exactly the final table you posted:
# A tibble: 4 x 3
# Groups:   Processo [?]
  Processo Grupo Total_Dias
     <dbl> <fct> <time>
1  201701. A     14
2  201702. B     103
3  201703. C     5
4  201704. A     96
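As a side note on why the NA handling matters, here is a minimal base R sketch using only the three dates of Processo 201704 from the example. The names datas, sem_na_rm and com_na_rm are illustrative:

```r
# The three dates of Processo 201704, including the missing one
datas <- as.Date(c("01/03/2017", NA, "05/06/2017"), format = "%d/%m/%Y")

# Without na.rm the missing value propagates and the result is NA
sem_na_rm <- max(datas) - min(datas)
is.na(sem_na_rm)  # TRUE

# With na.rm = TRUE the NA is dropped before max() and min() run
com_na_rm <- max(datas, na.rm = TRUE) - min(datas, na.rm = TRUE)

# The result is a difftime; as.numeric() turns it into a plain number of days
as.numeric(com_na_rm)  # 96
```

If you prefer Total_Dias as an ordinary numeric column rather than a time difference, wrapping the subtraction in as.numeric() inside summarise() should work the same way.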