0

This is how my data looks like:

Day Price A Price B Price C
1     0        0      0
2     0        0      0
3     0        0      0
4     0        0      0
5   64503   43692   79982
6   86664   69990   53468
7   77924   62998   68911
8   66600   68830   94396
9   82664   89972   49614
10  59741   48904   49528
11  34030   98074   72993
12  74400   85547   37715
13  51031   50031   85345
14  74700   59932   73935
15  62290   98130   88818

I have a small python script that outputs a sum for each column. I need to input an n value (for number of days) and the summing will run and output the values.

However, for example, given n=5 (for days), I want to output only Price A/B/C rows starting from the next day (which is day 6). Hence, the row for Day 5 should be '0'.

How can I produce this logic on Pandas ?

The idea I have is to use the n input value to then, truncate values on the rows corresponding to that particular (n day value). But how can I do this on code ?

if dataframe['Day'] == n:
    dataframe['Price A'] == 0 & dataframe['Price B'] == 0 & dataframe['Price C'] == 0
Dinesh
  • 654
  • 2
  • 9
  • 31
  • Possible duplicate of [Pandas: vectorize local range operations (max & sum for \[i:i+2\] rows)](https://stackoverflow.com/questions/56305799/pandas-vectorize-local-range-operations-max-sum-for-ii2-rows) – Simon May 25 '19 at 15:01

3 Answers3

1

Pseudo Code

  • Make sure to sort by day
  • shift columns 'A', 'B' and 'C' by n and fill in with 0
  • Sum accordingly

All of that can be done on one line as well

Buckeye14Guy
  • 831
  • 6
  • 12
1

It is simply

dataframe.iloc[:n+1] = 0

This sets the values of all columns for the first n days to 0

# Sample output

dataframe
   a  b
0  1  2
1  2  3
2  3  4
3  4  2
4  5  3

n = 1
dataframe.iloc[:n+1] = 0

dataframe
   a  b
0  0  0
1  0  0
2  3  4
3  4  2
4  5  3

This truncates all for all the previous days. If you want to truncate only for the nth day.

dataframe.iloc[n] = 0
skillsmuggler
  • 1,862
  • 1
  • 11
  • 16
1

You can filter rows by condition and set all columns without first by iloc[mask, 1:], for next row add Series.shift:

n = 5
df.iloc[(df['Day'].shift() <= n).values, 1:] = 0
print (df)
    Day  Price A  Price B  Price C
0     1        0        0        0
1     2        0        0        0
2     3        0        0        0
3     4        0        0        0
4     5        0        0        0
5     6        0        0        0
6     7    77924    62998    68911
7     8    66600    68830    94396
8     9    82664    89972    49614
9    10    59741    48904    49528
10   11    34030    98074    72993
11   12    74400    85547    37715
12   13    51031    50031    85345
13   14    74700    59932    73935
14   15    62290    98130    88818
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252