I have a dataframe like this:

 Type:  Volume: Date:     Price:....
 Q     10      2016.6.1   10
 Q     20      2016.6.1   20
 T     10      2016.6.2 
 Q     10      2016.6.3
 T     20      2016.6.4
 T     20      2016.6.5
 Q     10      2016.6.6

here is the full dataframe

I want to add up the values of 'Volume' only where two (or more) T rows are consecutive, and delete the extra rows,

i.e. turn it into:

 Q     10      2016.6.1
 Q     20      2016.6.1 
 T     10      2016.6.2 
 Q     10      2016.6.3
 T     20+20=40 2016.6.4
 Q     10      2016.6.6

Right now I'm using a for loop with an 'if' check:

l = len(df)
Volume = df['Volume']
Type = df['Type']

for i in range(2,l-1):
    if Type[i] == 'Trade':
        if Type[i] == 'Trade' and Type[i+1] == 'Trade' :     
            Volume[i] = Volume[i]+Volume[i+1]
            df = np.delete(fd, (i), axis=0)

However, I am getting an error:

ValueError: Shape of passed values is (8, 303540), indices imply (8, 303541)

Also, I would like to replace the 'if' check with a 'while' loop, so that runs of more than two consecutive 'Trade' rows are easier to handle.

Martin Evans
bing

1 Answer

If you want to edit an iterable while looping over it, it is generally safer to build an updated copy of the data inside the loop and replace the original with that copy afterwards. That keeps the positions you are iterating over from shifting underneath you, which seems to be the problem hinted at by your error, since it complains about indices.
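As a rough illustration of that idea, here is a minimal sketch in plain Python. It assumes the rows are available as a list of dicts and that the type label is 'T', as in the sample data. The function name `merge_consecutive` and its parameters are made up for this example. The loop only reads from the original rows and appends to a new list, so nothing is deleted while iterating, and runs of any length are handled without a separate 'while' loop.

```python
def merge_consecutive(rows, type_key="Type", volume_key="Volume", target="T"):
    """Return a new list in which each run of consecutive `target` rows is
    collapsed into the first row of the run, with the volumes summed."""
    merged = []  # the updated copy; `rows` itself is never modified
    for row in rows:
        prev = merged[-1] if merged else None
        if prev is not None and row[type_key] == target and prev[type_key] == target:
            # Same run as the previous kept row: add the volume, skip the row.
            prev[volume_key] += row[volume_key]
        else:
            # Start of a new run (or a non-target row): keep a copy of it.
            merged.append(dict(row))
    return merged


# Sample data from the question (Price column left out for brevity).
rows = [
    {"Type": "Q", "Volume": 10, "Date": "2016.6.1"},
    {"Type": "Q", "Volume": 20, "Date": "2016.6.1"},
    {"Type": "T", "Volume": 10, "Date": "2016.6.2"},
    {"Type": "Q", "Volume": 10, "Date": "2016.6.3"},
    {"Type": "T", "Volume": 20, "Date": "2016.6.4"},
    {"Type": "T", "Volume": 20, "Date": "2016.6.5"},
    {"Type": "Q", "Volume": 10, "Date": "2016.6.6"},
]

result = merge_consecutive(rows)
for row in result:
    print(row["Type"], row["Volume"], row["Date"])
```

With a pandas DataFrame, the input could come from `df.to_dict("records")` and the output could be turned back into a frame with `pd.DataFrame(result)`. Assigning that back to `df` is the "replace the original" step.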

Bart Van Loon
  • Thanks for the answer. May I ask how to achieve this? How do I work on a copy of the data inside the loop and replace the original with that updated copy afterwards? – bing Sep 08 '17 at 09:19
  • Look at https://stackoverflow.com/questions/6022764/python-removing-list-element-while-iterating-over-list for example. – Bart Van Loon Sep 08 '17 at 09:23
  • So should I use df.remove(i) at the end instead? – bing Sep 08 '17 at 09:31
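
A note on the last comment: a pandas DataFrame has no `remove` method. Rows are dropped by index label with `DataFrame.drop`. For this particular task, though, it may be simpler to avoid row-by-row deletion altogether. The sketch below is one possible approach, not something proposed in the thread. It assumes the type label is 'T' and that the first row of each run should supply the Date. It gives every run of consecutive 'T' rows a shared id and then aggregates per id. The names `is_t` and `run_id` are made up for this example.

```python
import pandas as pd

# Sample data from the question (Price column left out for brevity).
df = pd.DataFrame({
    "Type": ["Q", "Q", "T", "Q", "T", "T", "Q"],
    "Volume": [10, 20, 10, 10, 20, 20, 10],
    "Date": ["2016.6.1", "2016.6.1", "2016.6.2", "2016.6.3",
             "2016.6.4", "2016.6.5", "2016.6.6"],
})

is_t = df["Type"].eq("T")
# A row continues the current run only if it is a T directly after another T.
continues_run = is_t & is_t.shift(fill_value=False)
# Every row that does not continue a run starts a new group.
run_id = (~continues_run).cumsum().rename("run")

result = (
    df.groupby(run_id)
      .agg({"Type": "first", "Volume": "sum", "Date": "first"})
      .reset_index(drop=True)
)
print(result)
```

Because the grouping is built from the whole column at once, runs of three or more consecutive 'T' rows collapse the same way as runs of two.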