
I have a dataset with 20 rows and 2500 columns. Each column is a unique product, and the rows form a time series of measurement results. In other words, there are 2500 products and each one is measured 20 times.

My data is stored as a DataFrame. For every column (product), I want to record the row number (index) at which a specific condition (such as x > 3) is met for the first time, so that I end up with an array of 2500 values.

I tried using loops and iterrows() but could not get it to work.

P.S.: I used idxmax() to get the row index of the maximum value, but this time I want the index of the first cell where a condition is met, and then stop searching.

Cœur
meliksahturker

1 Answer


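Simply use .gt + .idxmax. The comparison produces a Boolean DataFrame, and idxmax returns, for each column, the index of the first True, i.e. the first time your condition is met. Note that if the condition is never met in a column, idxmax still returns the first index label, so check such columns with .any() if that case can occur.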

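import pandas as pd
import numpy as np

# Example data: 20 measurements (rows) x 2500 products (columns), values 1 to 4
np.random.seed(12)
df = pd.DataFrame(np.random.randint(1,5,(20,2500)))

# Row index of the first value greater than 3 in each column
df.gt(3).idxmax()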
#0        0
#1        0
#2        4
#3        4
#4        1
#...
#2496     8
#2497     0
#2498     5
#2499     1
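
If some products might never satisfy the condition, the result above cannot distinguish "first met at row 0" from "never met". A minimal sketch of one way to handle that, assuming a sentinel value of -1 is acceptable for "never met" (the helper name `first_index_where` and the small example frame are made up for illustration, not part of the original question):

```python
import pandas as pd


def first_index_where(mask: pd.DataFrame, missing=-1) -> pd.Series:
    """For each column of a Boolean frame, return the label of the first
    True row, or `missing` if the column contains no True at all."""
    first = mask.idxmax()                    # first True per column (or first label if none)
    return first.where(mask.any(), missing)  # overwrite columns that have no True


# Tiny illustrative frame: "b" never exceeds 3
df = pd.DataFrame({"a": [1, 4, 5], "b": [1, 2, 3], "c": [4, 1, 4]})

result = first_index_where(df.gt(3))
# a -> 1, b -> -1 (never met), c -> 0
first_rows = result.to_numpy()  # plain NumPy array, one entry per column
```

Calling `.to_numpy()` at the end gives the plain array the question asks for. The sentinel is a design choice: pass `missing=float("nan")` instead if a missing-value marker is preferable to -1.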
ALollz