0

I have a dataframe in pandas. One of its columns is called 'upperband', and this column is full of values (type: numpy.float64). I have the following line inside of an if statement:

dataframe['upperband'][current] = dataframe['upperband'][current-1] 

Here current goes from 1 to the length of the dataframe. The code inside the if statement runs without errors, but dataframe['upperband'][current] never changes to the new value; it keeps its old value.

What is more, I have two different scripts containing this same line: in one of them it works, and in the other it does not. How can I fix this? It doesn't make any sense to me.

Minimal reproducible example. Initial dataframe:

    upperband
0   1330
1   1350
2   1380
3   1360
4   1300
5   1290

current goes from 0 to 5, and when dataframe['upperband'][current] > dataframe['upperband'][current-1], I want the dataframe['upperband'][current] value to be the same as dataframe['upperband'][current-1]. In this case, when current = 3, I want dataframe['upperband'][current] to be 1380 (the previous value, since the current value > previous value).

Expected result:

      upperband
  0   1330
  1   1350
  2   1380
  3   1380
  4   1380
  5   1380

The result I get: the same as the initial dataframe.

3 Answers

1

You can check whether the current row is less than or equal to the previous row and, if so, replace the current row with the running maximum of the column up to that point.

You can use np.where, along with shift() and cummax():

Sample DF:

>>> df
   upperband
0       1330
1       1350
2       1380
3       1360
4       1300
5       1290
6       1400
7       1500
8       1400

Code and result:

import pandas as pd
import numpy as np
df['upperband2'] = np.where(df['upperband'].shift(1) >= df['upperband'],
                           df['upperband'].cummax(),
                           df['upperband'])


   upperband  upperband2
0       1330        1330
1       1350        1350
2       1380        1380
3       1360        1380
4       1300        1380
5       1290        1380
6       1400        1400
7       1500        1500
8       1400        1500

I added a few extra rows to the sample to illustrate the behavior better.
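If the goal is what the loop in the question describes (each value is compared with the previous, already updated value), the result appears to reduce to a plain running maximum. The sketch below compares `cummax()` with a reference loop on the sample data above; the column names `upperband_cummax` and `upperband_loop` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"upperband": [1330, 1350, 1380, 1360, 1300, 1290, 1400, 1500, 1400]}
)

# Running maximum: each row holds the largest value seen so far.
df["upperband_cummax"] = df["upperband"].cummax()

# Reference loop on a plain list: carry the previous (already updated)
# value forward whenever the current value is lower.
values = df["upperband"].tolist()
for i in range(1, len(values)):
    if values[i] < values[i - 1]:
        values[i] = values[i - 1]
df["upperband_loop"] = values

print(df)
```

The two columns agree on this sample. The `np.where` version differs from the loop when a value rises relative to the previous row but is still below an earlier peak: for `[1380, 1300, 1350]` it leaves the last value at 1350, while the loop and `cummax()` give 1380.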

sophocles
  • The output is the same as the sample df – chronovirus Sep 08 '21 at 10:49
  • This would be great, but I can't use it. It has to be done while iterating, because there are some conditions to be verified – chronovirus Sep 08 '21 at 11:21
  • You can still add more conditions here. Can you outline the extra conditions? I'll try and adjust the code – sophocles Sep 08 '21 at 11:31
  • `if df['in_uptrend']==False:` then do that – chronovirus Sep 08 '21 at 11:38
  • Can you add a sample of your DataFrame, using `df.to_dict()`, so that I can replicate it? If that's the only extra condition, I can adjust the code above. – sophocles Sep 08 '21 at 11:38
  • The dataframe is a bit too large for the Stack Overflow comment limit, and the code doesn't put a value in the 'upperband' column yet, because it needs more data to process. But here is the df.to_dict output: {'time': {0: Timestamp('2021-05-31 00:00:00'), 1: Timestamp('2021-06-01 00:00:00'), 2: Timestamp('2021-06-02 00:00:00')}, 'open': {0: 2385.82, 1: 2706.15, 2: 2634.31}, 'close': {0: 2706.15, 1: 2634.57, 2: 2706.22}, 'upperband': {0: nan, 1: nan, 2: nan}, 'lowerband': {0: nan, 1: nan, 2: nan}, 'in_uptrend': {0: True, 1: True, 2: True}} – chronovirus Sep 08 '21 at 11:48
1

I just tried to reproduce your approach and it seems that what I did provides the output you expect. Note that you should check that df['upperband'][current] < df['upperband'][current-1].

import pandas as pd
import numpy as np

upper_bands = [1330, 1350, 1380, 1360, 1300, 1290]
df = pd.DataFrame(upper_bands, columns=['upperband'], index=range(6))
df
# output: 
#     upperband
# 0   1330
# 1   1350
# 2   1380
# 3   1360
# 4   1300
# 5   1290

for current in range(1, len(df)):
    if df['upperband'][current] < df['upperband'][current-1]:
        df['upperband'][current] = df['upperband'][current-1]
        
df
# output:
#   upperband
# 0 1330
# 1 1350
# 2 1380
# 3 1380
# 4 1380
# 5 1380
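The loop above assigns through chained indexing (`df['upperband'][current] = ...`). pandas documents this pattern as unreliable for assignment: depending on the dataframe, the write can land on a temporary copy, usually with a `SettingWithCopyWarning`, and with Copy-on-Write enabled it never updates the original. That may explain why the same line works in one script and not in another. A minimal sketch of the same loop with a single `.loc` call, assuming the default integer index:

```python
import pandas as pd

df = pd.DataFrame({"upperband": [1330, 1350, 1380, 1360, 1300, 1290]})

for current in range(1, len(df)):
    prev = df.loc[current - 1, "upperband"]
    if df.loc[current, "upperband"] < prev:
        # One indexing call with row label and column name:
        # this writes into df itself, not into a temporary copy.
        df.loc[current, "upperband"] = prev

print(df)
```

If the index is not a default range (for example, timestamps), positional access such as `df.iloc[current, df.columns.get_loc("upperband")]` is one way to keep the same loop.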

EDIT:

Based on @sophocles' answer and the resulting comments, you can simply add an extra term to the np.where() condition, as below:

upper_bands = [1330, 1350, 1380, 1360, 1300, 1290]
df = pd.DataFrame(upper_bands, columns = ['upperband'], index = range(6))
df['in_uptrend'] = False
df.loc[3,'in_uptrend'] = True
df
#    upperband  in_uptrend
# 0       1330       False
# 1       1350       False
# 2       1380       False
# 3       1360        True
# 4       1300       False
# 5       1290       False

df['upperband'] = np.where((df['upperband'].shift(1) >= df['upperband']) & (~df['in_uptrend']),
                           df['upperband'].cummax(),
                           df['upperband'])
df
# output:
#    upperband  in_uptrend
# 0       1330       False
# 1       1350       False
# 2       1380       False
# 3       1360        True
# 4       1380       False
# 5       1380       False
  • Yes, this example seems to work for me as well, but there's something weird with my code. How does this example work, while my code (the one I asked about) doesn't, when it is literally the same thing?! – chronovirus Sep 08 '21 at 10:57
  • what exactly is the code that is causing problems? – Alexandre B-k Sep 08 '21 at 11:17
  • please could you provide the code snippet that is not outputting what you expect, along with a couple of lines of testable input data + your expectation of the output based on this input? – Alexandre B-k Sep 08 '21 at 11:42
  • I get the input data from the Binance API (crypto prices and so on). I add more columns in order to make a buy or a sell order, but at one point I have to check if the current 'upperband' is greater than the previous 'upperband', and store the previous value in the current value. The code runs fine and enters that if statement, but it simply won't store the previous value in the current 'upperband'. If I store the previous 'upperband' value in another variable, it works just fine, but if I try to store it in df['upperband'][current], it simply will not store it... – chronovirus Sep 08 '21 at 11:55
  • I'd like to provide more info, but I get the data from the Binance API. It simply doesn't let me store new data over old data – chronovirus Sep 08 '21 at 11:55
  • try setting the value using `df.loc[current,'upperband']` instead of `df['upperband'][current]` – Alexandre B-k Sep 08 '21 at 11:59
  • About the edit... thank you very much, but it still doesn't work. Thank you for all the help; I think I'm giving this project up. It simply doesn't want to work, and it doesn't make any sense. I had another project with slightly different data, and it worked just fine. This one just says "no" to me :) – chronovirus Sep 08 '21 at 12:16
  • Also, I realized that this will not work, because 'in_uptrend' is initially true but changes during the iteration – chronovirus Sep 08 '21 at 12:47
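For the case raised in the last comments, where `in_uptrend` has to be checked while iterating, a loop with label-based scalar access via `df.at` is one option. This is only a sketch: the sample values are the illustrative ones from the EDIT above, and the logic that changes `in_uptrend` during the iteration is not shown in the thread, so it is left out here.

```python
import pandas as pd

df = pd.DataFrame({
    "upperband": [1330, 1350, 1380, 1360, 1300, 1290],
    "in_uptrend": [False, False, False, True, False, False],
})

for current in range(1, len(df)):
    previous = current - 1
    # in_uptrend is read at every step, so changes made to it in
    # earlier iterations (not shown here) would be taken into account.
    if (not df.at[current, "in_uptrend"]
            and df.at[current, "upperband"] < df.at[previous, "upperband"]):
        # df.at reads and writes a single cell of df directly.
        df.at[current, "upperband"] = df.at[previous, "upperband"]

print(df)
```

Because row 3 is skipped, rows 4 and 5 inherit 1360 rather than the earlier peak of 1380, which differs from the `cummax()`-based `np.where` result.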
0

Interesting behavior indeed. This question's most upvoted answer was a good solution to my problem, in case someone faces this again.