How can I optimize this value assignation using pandas

Question

I have a pandas DataFrame with a column 'register' whose values are either 0 or some positive number. I want to create a new column 'working' that is 1 if that row of 'register', or any of the 7 previous rows, is not 0. I tried iterating over the rows, but because the DataFrame is big, this is extremely slow. This is my code:

df['working'] = 0
for i in range(len(df['register'])):
    if df['register'][i] != 0 or \
        (i>1 and df['register'][i-1] != 0) or\
        (i>2 and df['register'][i-2] != 0) or\
        (i>3 and df['register'][i-3] != 0) or\
        (i>4 and df['register'][i-4] != 0) or\
        (i>5 and df['register'][i-5] != 0) or\
        (i>6 and df['register'][i-6] != 0):
        df['working'][i] = 1
    else:
        df['working'][i] = 0

I also tried using apply together with shift, which looked like this:

df['working']=df['register'].apply(lambda x: 1 if x!=0 or x.shift(1)!=0 or x.shift(2)!=0 or x.shift(3)!=0 or x.shift(4)!=0 or x.shift(5)!=0 or x.shift(6)!=0 else 0)

But I got:

AttributeError: 'float' object has no attribute 'shift'

Is there a better way to do this using pandas?

Thanks in advance.

@QuangHoang Thanks for the reply! I executed `df['working'] = df['register'].rolling(6).any()` but I got `AttributeError: 'Rolling' object has no attribute 'any'` — Gamopo, Mar 04 '20 at 16:10

score 1 · Accepted Answer · answered Mar 04 '20 at 16:12

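This should work. You may want to pass min_periods=1 to rolling; otherwise the first rows, which do not yet have a full window behind them, get a NaN sum and therefore come out as False. Set the window size to the number of rows you want to cover, counting the current row.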

df['working'] = df['register'].ne(0).rolling(6).sum().gt(0)
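As a minimal sketch of the line above on made-up data: the sample values are invented, and the window of 8 (the current row plus the 7 previous ones) is an assumption taken from the wording of the question rather than from this answer. It passes min_periods=1 so the first rows are evaluated on whatever history exists, and adds astype(int) to get 1/0 instead of True/False.

```python
import pandas as pd

# Made-up sample data; only the column name 'register' comes from the question.
df = pd.DataFrame({'register': [0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]})

# Assumed window: the current row plus the 7 previous rows.
window = 8

df['working'] = (
    df['register'].ne(0)              # True where register is non-zero
    .rolling(window, min_periods=1)   # allow partial windows at the start
    .sum()                            # number of non-zero rows in each window
    .gt(0)                            # True if at least one row was non-zero
    .astype(int)                      # 1/0 instead of True/False
)

print(df)
```

The whole chain runs as vectorized column operations, so there is no Python-level loop over the rows.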

Quang Hoang

Surbhi Gupta · Answer 2 · 2020-03-04T17:28:37.383

Try:

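import numpy as np

conditional_values = [1]
# window of 8 = the current row plus the 7 previous ones
condition = [df['register'].rolling(8, min_periods=1).sum() > 0]
df['working'] = np.select(condition, conditional_values, default=0)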

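You can provide additional conditions and their corresponding values. For each row, np.select uses the value of the first condition that is true and falls back to default when none is: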

condition = [condition 1, condition 2, ......, condition n]
conditional_values = [value 1, value 2, ........, value n]
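A minimal sketch of the multi-condition form, assuming made-up data and a hypothetical 'status' column with invented labels (2 and 1); none of these names or values come from the question. It only illustrates that the conditions are checked in order.

```python
import numpy as np
import pandas as pd

# Made-up sample data; only the column name 'register' comes from the question.
df = pd.DataFrame({'register': [0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]})

# Sum over the current row and the 7 previous ones (partial windows at the start).
recent = df['register'].rolling(8, min_periods=1).sum()

# Conditions are checked in order; the first one that is True decides the value.
conditions = [df['register'] > 0, recent > 0]
values = [2, 1]  # hypothetical labels: 2 = non-zero in this row, 1 = non-zero recently

# 'status' is a hypothetical column name used only for this illustration.
df['status'] = np.select(conditions, values, default=0)

print(df)
```

Rows where 'register' itself is non-zero get 2 even though the second condition is also true for them, because the first matching condition wins.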

edited Mar 04 '20 at 17:28

answered Mar 04 '20 at 16:32

While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value. – Ari Cooper-Davis Mar 04 '20 at 16:56
