Pandas iterrows, how to do the second looping using the index

Question

for index, row in candles.iterrows():
    if (row['Twintower'] == 1):

I would like to start a second loop from the moment this condition is met, that is, from this index (row) downward. I tried several options, and this is the error I get:

TypeError                                 Traceback (most recent call last)
<ipython-input-161-e34b05ba34a8> in <module>
      1 for index, row in candles.iterrows():
      2     if (row['Twintower'] == 1):
----> 3         for row in range(index, candles):
      4             print(1)

TypeError: 'Timestamp' object cannot be interpreted as an integer

python numpy pandas

asked by anonymous 22.12.2018 / 22:55

1 answer

score 1 · Answer 1

You do not seem to need a second loop there: you still only need to go through the rows of the dataframe once, though for two different purposes. You want to go through the first rows until your condition is true for the first time, and from there go through the remaining rows, performing some other action.

You could even do it in two steps: a first loop just to record the value of "index" when your condition becomes true, and a second one that, given that value, runs from there down. That is what you are trying to do, and the efficiency of the program would be the same, since each row would still be visited only once. However, iterrows does not accept a starting row. (That is why you tried to use range with a row index number. This is wrong on several levels: the second argument you pass is the dataframe itself rather than a number, and the level at which the error is raised is that the index of your dataframe is not an integer but a pandas Timestamp object, hence the TypeError when calling range.)
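A minimal sketch of that two-step idea, assuming a small made-up candles dataframe with a daily DatetimeIndex (the data and the names first_label, start and visited are illustrative, not from the question). It reproduces the TypeError and then turns the label into an integer position with Index.get_loc, which only returns a plain integer when the index labels are unique:

```python
import pandas as pd

# Made-up sample data; only the column name Twintower comes from the question.
candles = pd.DataFrame(
    {"Twintower": [0, 0, 1, 0, 1]},
    index=pd.date_range("2018-12-17", periods=5, freq="D"),
)

# Step 1: record the index label of the first row that meets the condition.
first_label = None
for index, row in candles.iterrows():
    if row["Twintower"] == 1:
        first_label = index
        break

# range() only accepts integers, so passing the Timestamp label fails.
try:
    range(first_label, len(candles))
except TypeError as exc:
    error_message = str(exc)

# Step 2: convert the label to a position (assumes unique index labels)
# and iterate over the rows from that position onward.
start = candles.index.get_loc(first_label)
visited = []
for index, row in candles.iloc[start:].iterrows():
    visited.append(index)
```

Here the first loop stops at the third row, so the second loop visits that row and the two after it.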

So, since iterrows does not allow a start index, a neat way to work there is to have another variable that indicates whether you have reached your point of interest or not, and only then perform the actions that would have run in your second loop. The key to this is to skip part of the loop body with the continue statement: it simply jumps to the next iteration of the loop.

So what you're trying to do can be written as:

region_of_interest = False
for index, row in candles.iterrows():
    if (row['Twintower'] == 1):
        region_of_interest = True
    if not region_of_interest:
        # Until the condition above is true for the first time,
        # go back to the start of the loop from here
        continue

    # Here goes the code you wanted to put
    # in the "second loop".
    ...
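To see the flag in action on concrete data, here is a self-contained sketch. The dataframe, its close column and the closes_seen list are made up for illustration; only the Twintower column name comes from the question.

```python
import pandas as pd

# Made-up sample data with a Timestamp index, as in the question.
candles = pd.DataFrame(
    {"Twintower": [0, 0, 1, 0, 1], "close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.date_range("2018-12-17", periods=5, freq="D"),
)

region_of_interest = False
closes_seen = []  # stands in for the work of the "second loop"
for index, row in candles.iterrows():
    if row["Twintower"] == 1:
        region_of_interest = True
    if not region_of_interest:
        # Still before the first match: skip the rest of the loop body.
        continue
    closes_seen.append(row["close"])
```

With this data the loop collects the close values of the last three rows, the first matching row included.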

If you really want to "crop" the dataframe from the row where the condition is true, this is also possible. In this case, it is perhaps best to create a new dataframe with only the rows of interest, and then repeat iterrows:

for row_number, (index, row) in enumerate(candles.iterrows()):
    if (row['Twintower'] == 1):
        # End this loop at this point
        break
else:
    # The else clause of the for statement: this block only runs if the
    # break above never happens.
    raise ValueError("The dataframe has no row where Twintower == 1")

# row_number is a position (an integer), while the dataframe index holds
# Timestamps, so slice by position with ".iloc" rather than by label
# with ".loc". The slice is a new dataframe with only the rows from the
# first match onward.
candles2 = candles.iloc[row_number:]

for index, row in candles2.iterrows():
    # this is your second loop, over the region of interest only.
    ...
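As an alternative to the search loop, the first matching position can also be found with a boolean mask. This is a different technique from the one above, sketched here on made-up data; the names mask and start are illustrative.

```python
import pandas as pd

# Made-up sample data; only the column name Twintower comes from the question.
candles = pd.DataFrame(
    {"Twintower": [0, 0, 1, 0, 1]},
    index=pd.date_range("2018-12-17", periods=5, freq="D"),
)

# True for every row that meets the condition.
mask = candles["Twintower"] == 1
if not mask.any():
    raise ValueError("The dataframe has no row where Twintower == 1")

# argmax on a boolean array gives the position of the first True value.
start = int(mask.to_numpy().argmax())

# Slice by position, so the Timestamp index does not get in the way.
candles2 = candles.iloc[start:]

for index, row in candles2.iterrows():
    # second loop, over the region of interest only
    ...
```

The any() check matters because argmax also returns 0 when no value is True, which would otherwise look like a match on the first row.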