Below is a small sample of my data, which consists of lists of numbers:
a = [(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15),(6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 20, 23, 24, 25),(1, 2, 3, 4, 10, 11, 12, 13, 14, 16, 17, 20, 22, 24, 25),(3, 4, 7, 8, 9, 10, 13, 14, 15, 16, 18, 19, 20, 22, 24)]
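I arrived at the analysis below, where sequencia holds the length of each run of consecutive numbers in a list and seqMax holds the longest run in each list: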
sequencia = [(15,), (10, 2, 3), (4, 5, 2, 1, 1, 2), (2, 4, 4, 3, 1, 1)]
seqMax = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[10, 11, 12, 13, 14],
[7, 8, 9, 10]]
That is:
- The first list of numbers, (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), has 15 sequential numbers: (15,), and the largest sequence found is the list itself;
- The second list of numbers, (6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 20, 23, 24, 25), has 3 sequences: (10, 2, 3), where 10 = 6, 7, 8, 9, 10, 11, 12, 13, 14, 15; 2 = 19, 20; and 3 = 23, 24, 25; and the largest sequence is the one of length 10: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15.
I've written a function (complete code below), but I would like to know whether there is a simpler way to code this:
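    import pandas as pd  # not used in this snippet

    a = [(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15),
         (6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 20, 23, 24, 25),
         (1, 2, 3, 4, 10, 11, 12, 13, 14, 16, 17, 20, 22, 24, 25),
         (3, 4, 7, 8, 9, 10, 13, 14, 15, 16, 18, 19, 20, 22, 24)]

    def sequencia_linha(x):
        # Returns the run lengths of x; also appends the longest run to the global seqMax.
        a = list(x)
        ab = []              # current run of consecutive numbers
        sequencia = []       # lengths of the runs found so far
        seqMaxInterno = []   # longest run found so far in this row
        cont = 0             # length of the current run (avoids a NameError on empty input)
        for n in a:
            if len(ab) > 0:
                if max(ab) + 1 == n:
                    # n extends the current run
                    ab.append(n)
                    cont += 1
                else:
                    # the run ended: record it and start a new one at n
                    if len(ab) > len(seqMaxInterno):
                        seqMaxInterno = ab
                    sequencia.append(cont)
                    ab = []
                    ab.append(n)
                    cont = 1
            else:
                # first number of the row starts the first run
                ab.append(n)
                cont = 1
        # record the final run
        if len(ab) > len(seqMaxInterno):
            seqMaxInterno = ab
        sequencia.append(cont)
        seqMax.append(seqMaxInterno)
        return tuple(sequencia)

    seqMax = []
    sequencia = []
    for x in a:
        sequencia.append(sequencia_linha(x))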
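
For comparison, here is a minimal sketch of a possibly shorter variant. It assumes every row is sorted in ascending order with no duplicates, as in the sample data. Under that assumption, value minus position stays constant within a run of consecutive numbers, so itertools.groupby can split each row into runs. The helper name runs is hypothetical, not part of the code above.

```python
from itertools import groupby

a = [(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15),
     (6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 20, 23, 24, 25),
     (1, 2, 3, 4, 10, 11, 12, 13, 14, 16, 17, 20, 22, 24, 25),
     (3, 4, 7, 8, 9, 10, 13, 14, 15, 16, 18, 19, 20, 22, 24)]

def runs(row):
    """Split an ascending, duplicate-free row into runs of consecutive numbers.

    Hypothetical helper: within a run, value - index is constant,
    so that difference serves as the grouping key.
    """
    return [
        [value for _, value in group]
        for _, group in groupby(enumerate(row), key=lambda pair: pair[1] - pair[0])
    ]

# Length of every run, one tuple per row.
sequencia = [tuple(len(r) for r in runs(row)) for row in a]

# Longest run per row; on ties max() keeps the first one it meets.
# default=[] covers an empty row.
seqMax = [max(runs(row), key=len, default=[]) for row in a]
```

This version avoids the global list and the manual counter, at the cost of calling runs twice per row; storing its result in a variable first would remove the repeated work.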