How to calculate the time from point A to point B?

0

I need to make a time control that a truck takes from a factory X to a Y reseller, in addition to calculating how long it has been in that resale.

I have GPS data and I've extracted information that was relevant because it's a lot of data and would be very slow to use. My table looks like this:

+----+------------------+-----------+---------------+
| ID |   DATA E HORA    |   LOCAL   | ENTRADA/SAIDA |
+----+------------------+-----------+---------------+
|  1 | 28/01/2017 07:27 | FABRICA   | Entrada       |
|  2 | 28/01/2017 09:00 | FABRICA   | Saida         |
|  3 | 29/01/2017 00:02 | REVENDA 1 | Entrada       |
|  4 | 29/01/2017 04:00 | REVENDA 1 | Saida         |
|  5 | 29/01/2017 08:00 | REVENDA 2 | Entrada       |
|  6 | 29/01/2017 10:00 | REVENDA 2 | Saida         |
|  7 | 29/01/2017 20:00 | FABRICA   | Entrada       |
|  8 | 29/01/2017 23:00 | FABRICA   | Saida         |
+----+------------------+-----------+---------------+

I have to take a day / hour interval where he was in a resale and do these calculations (know how long he stayed in the resale and how long it took to get there from the factory X).

Estimating resale time seems to be easier (Exit - Entry) The biggest problem is time from factory to resale. I would like to know how I can do this (it can be by QUERY SQL itself if possible)

PS: The table template can be changed, I did it this way because I found it easier to use; I am using SQLite but I can use any BD that is like it

Thank you.

    
asked by anonymous 30.01.2017 / 02:21

2 answers

5
  

Note 1 : Your question is tagged / questions / tagged / python "class=" post-tag "title=" show questions tagged 'python' " > python , so I'm going to respond with a suggestion on how to do this using Python. But if   your data is kept in a SQL database, it is certainly   much better you already bring the filtered results (that is, only the   sub-table with the route) or, better yet, already calculate this   directly there (with a stored procedure , for example). If no   know how to mount a query so open another question and   SQL.

     

Note 2 : To facilitate I put your data in a local CSV file, and read it with the function read_csv of Pandas. At the   your case, just switch to read_sql . This is the file   CSV used:

ID,DATA E HORA,LOCAL,ENTRADA/SAIDA
1,28/01/2017 07:27,FABRICA,Entrada
2,28/01/2017 09:00,FABRICA,Saida
3,29/01/2017 00:02,REVENDA 1,Entrada
4,29/01/2017 04:00,REVENDA 1,Saida
5,29/01/2017 08:00,REVENDA 2,Entrada
6,29/01/2017 10:00,REVENDA 2,Saida
7,29/01/2017 20:00,FABRICA,Entrada
8,29/01/2017 23:00,FABRICA,Saida

One way to get what you want quite easily is to use the Pandas library. This is a fantastic library for data manipulation and analysis. It has a considerable learning curve, but it's worth it. Note that it depends on another very useful library called Numpy .

Using this library, I have prepared a fairly simple example that does the following:

  • It reads your complete data (from a CSV in my case, but will be from the SQL in your) in a table (which in Pandas is a DataFrame ).
  • Filter the rows of the table to get the route (leaving the factory and arriving at the desired reseller).
  • Calculates the time elapsed along this route, simply subtracting the start time (value of the "DATE AND TIME" column of the last row of the filtered sub-table) from the time of arrival (value of the "DATE AND TIME" column of the first line of the filtered sub-table).
  • Repeat steps 2 and 3, but this time for the stay (the time remaining at the resale, considering the arrival and departure of it).
  • Note that there are several simplifications in this code, as it is only illustrative. For example, it assumes that there are only two records of the resale (one entry and one exit). In more complex cases, you will have to work with filters to ensure that you get the records in the correct order (use tabela.index to get the indexes and tabela.iloc to access the row of an index).

    Finally, the result is given in a Pandas framework used to handle elapsed time (called Timedelta ). It stores the elapsed time as the sum of days + seconds. You have the seconds, so to have the partial times in hours or minutes you just have to do the divisions by 3600 or 60 properly.

    Here is the code:

    import sys
    import pandas as pd
    import numpy as np
    
    # ---------------------------------------------------------------------------
    def calculaTempos(tabela, nomeRevenda):
    
        ######################################
        # Cálculo do tempo de percurso
        ######################################
    
        # Debug!!!!!!
        print('Tabela completa com todos os todos os dados: \n{}\n\n'.format(tabela))
        # Debug!!!!!!
    
        # Busca TODAS as ocorrências de FABRICA + Saida
        filtro = (tabela['LOCAL'] == 'FABRICA') & (tabela['ENTRADA/SAIDA'] == 'Saida')
        resp = tabela[filtro]
        if resp.empty:
            return None, None
    
        # Por simplificação, considera como partida a primeira delas
        partida = resp.index[0]
    
        # Busca TODAS as ocorrências de nomeRevenda + Entrada
        filtro = (tabela['LOCAL'] == nomeRevenda) & (tabela['ENTRADA/SAIDA'] == 'Entrada')
        resp = tabela[filtro]
        if resp.empty:
            return None, None
    
        # Por simplificação, considera como chegada a última delas
        chegada = resp.index[len(resp)-1]
    
        # Filtra da tabela original os dados entre a partida e a chegada calculados
        tTrajeto = tabela[partida:chegada+1]
    
        # Debug!!!!!!
        print('Tabela parcial com o trajeto Fabrica->Revenda: \n{}\n\n'.format(tTrajeto))
        # Debug!!!!!!
    
        # O tempo de percurso é o horário de chegada (a última linha da tabela)
        # menos o horário de partida (a primeira linha da tabela)
        tempoPercurso = tTrajeto.iloc[len(tTrajeto)-1]['DATA E HORA'] - tTrajeto.iloc[0]['DATA E HORA']
    
        ######################################
        # Cálculo do tempo de parada
        ######################################
    
        # Busca TODAS as ocorrências de nomeRevenda + Saida
        filtro = (tabela['LOCAL'] == nomeRevenda) & (tabela['ENTRADA/SAIDA'] == 'Saida')
        resp = tabela[filtro]
        if resp.empty:
            return tempoPercurso, None
    
        # Por simplificação, considera como saída a última delas
        saida = resp.index[len(resp)-1]
    
        # Filtra da tabela original os dados entre a chegada e a saída calculados
        tParada = tabela[chegada:saida+1]
    
        # Debug!!!!!!
        print('Tabela parcial com a parada na Revenda: \n{}\n\n'.format(tParada))
        # Debug!!!!!!
    
        # O tempo de percurso é o horário de chegada (a última linha da tabela)
        # menos o horário de partida (a primeira linha da tabela)
        tempoParada = tParada.iloc[len(tParada)-1]['DATA E HORA'] - tParada.iloc[0]['DATA E HORA']
    
        return tempoPercurso, tempoParada
    
    # ---------------------------------------------------------------------------
    def main(args):
        tabela = pd.read_csv('teste.csv')
        tabela['DATA E HORA'] = pd.to_datetime(tabela['DATA E HORA'])
    
        tempoPercurso, tempoParada = calculaTempos(tabela, 'REVENDA 1')
    
        # A estrutura Timedelta do pandas armazena tempo decorrido em dias + segundos,
        # então precisa fazer uma conversãozinha pra ter os valores separadamente.
        dias, segundos = tempoPercurso.days, tempoPercurso.seconds
        horas = segundos // 3600
        minutos = (segundos % 3600) // 60
        segundos = segundos % 60
        print('Tempo de percurso entre fábrica e revenda: {:02d} dias e {:02d}:{:02d}:{:02d} horas'.format(dias, horas, minutos, segundos))
    
        dias, segundos = tempoParada.days, tempoParada.seconds
        horas = segundos // 3600
        minutos = (segundos % 3600) // 60
        segundos = segundos % 60
        print('Tempo parado na revenda: {:02d} dias e {:02d}:{:02d}:{:02d} horas'.format(dias, horas, minutos, segundos))
    
        return 0
    
    # ---------------------------------------------------------------------------
    if __name__ == '__main__':
        sys.exit(main(sys.argv[1:]))
    

    And the output from it:

    Tabela completa com todos os todos os dados:
       ID         DATA E HORA      LOCAL ENTRADA/SAIDA
    0   1 2017-01-28 07:27:00    FABRICA       Entrada
    1   2 2017-01-28 09:00:00    FABRICA         Saida
    2   3 2017-01-29 00:02:00  REVENDA 1       Entrada
    3   4 2017-01-29 04:00:00  REVENDA 1         Saida
    4   5 2017-01-29 08:00:00  REVENDA 2       Entrada
    5   6 2017-01-29 10:00:00  REVENDA 2         Saida
    6   7 2017-01-29 20:00:00    FABRICA       Entrada
    7   8 2017-01-29 23:00:00    FABRICA         Saida
    
    
    Tabela parcial com o trajeto Fabrica->Revenda:
       ID         DATA E HORA      LOCAL ENTRADA/SAIDA
    1   2 2017-01-28 09:00:00    FABRICA         Saida
    2   3 2017-01-29 00:02:00  REVENDA 1       Entrada
    
    
    Tabela parcial com a parada na Revenda:
       ID         DATA E HORA      LOCAL ENTRADA/SAIDA
    2   3 2017-01-29 00:02:00  REVENDA 1       Entrada
    3   4 2017-01-29 04:00:00  REVENDA 1         Saida
    
    
    Tempo de percurso entre fábrica e revenda: 00 dias e 15:02:00 horas
    Tempo parado na revenda: 00 dias e 03:58:00 horas
    
        
    30.01.2017 / 21:08
    0

    With MySQL you can use the TIMESTAMPDIFF

    With this command SELECT below you can know the amount of minutes between the two points. To select the ID of each point, in your interface you put a filter for the user to select the Location and the Point of Entry / Exit. Each option you put to the user will have an associated ID that you will use from input for the query SQL .

    In the example, I am assuming the factory output as the point of origin and the destination point for the reseller 2.

    SELECT TIMESTAMPDIFF ( 
        MINUTE,
        (select DATA from Tablela WHERE ID=2),
        (select DATA from Tablela WHERE ID=5)
        )
    

    Result

    1380 minutos
    

    You can change the parameter MINUTE to HOUR , for example, but with minutes you will have a greater precision that you can convert to decimal times if you want.

        
    30.01.2017 / 17:16