I have a pandas DataFrame as shown below:

sr_no  amount_credit_debit
  1     1000
  2     1234
  3    -2378
  4    -1290
  5     3000
  6    -4567
  8     5678
  9     1390
  10   -2346
  11   -2876
  12   -9065
  13   -6743

I need to count the groups of consecutive negative numbers in the DataFrame above.

(-2378 and -1290) = First negative instance
(-4567) = Second negative instance
(-2346,-2876,-9065,-6743) = Third negative instance

The expected output is 3.
I have tried a lot but cannot get the right answer.
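
For reproducibility, the table above can be built as a DataFrame roughly like this. It is a minimal sketch: the name `df` is assumed because the answer further down uses it, and the default integer index (0 to 11) is an assumption that matches the index shown in that answer's output.

```python
import pandas as pd

# Values copied from the table in the question; note that sr_no skips 7.
df = pd.DataFrame({
    "sr_no": [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13],
    "amount_credit_debit": [1000, 1234, -2378, -1290, 3000, -4567,
                            5678, 1390, -2346, -2876, -9065, -6743],
})

# Seven individual negative values, which form three consecutive runs.
negative_count = int((df["amount_credit_debit"] < 0).sum())
print(negative_count)
```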

Mykola Zotko
Shyam
  • The logic that led you to the number 3 is not clear. I see 7 negative instances. – Gulzar Oct 12 '20 at 11:48
  • I can see the logic, each contiguous block is a single negative instance. I would take the indices of all negatives and then count the number of times the index increments by more than 1, i.e. 3, 4, 6, 10 would give me 2 and then add one for the first instance – Andrew Oct 12 '20 at 11:50
  • @jezrael The question is about a pandas `DataFrame`, but the duplicate's answer is about a numpy `array`. I voted to reopen this question. – Mykola Zotko Oct 12 '20 at 12:18
  • @MykolaZotko - There is also a numpy tag, so the OP also needs numpy solutions, which are better for performance here. – jezrael Oct 12 '20 at 12:26
  • @MykolaZotko - If no numpy tag then I agree, should be reopened. – jezrael Oct 12 '20 at 12:46
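
The index-gap idea from Andrew's comment above can be sketched with numpy as follows. This is only one possible reading of that comment, not code posted in the thread; the names `amounts`, `neg_pos`, and `n_groups` are illustrative.

```python
import numpy as np
import pandas as pd

# The amount column from the question, as a standalone Series.
amounts = pd.Series([1000, 1234, -2378, -1290, 3000, -4567,
                     5678, 1390, -2346, -2876, -9065, -6743])

# Positions of all negative values: [2, 3, 5, 8, 9, 10, 11]
neg_pos = np.flatnonzero(amounts.to_numpy() < 0)

# A jump of more than 1 between neighbouring positions starts a new run.
# Add one for the first run, but only if there is at least one negative.
n_groups = int((np.diff(neg_pos) > 1).sum()) + (1 if neg_pos.size else 0)
print(n_groups)  # 3
```

The guard on `neg_pos.size` keeps the result at 0 when the column has no negative values.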

1 Answer

First, you can find group numbers for negative values:

m = df['amount_credit_debit'].lt(0)
(m != m.shift())[m].cumsum()

Output:

2     1
3     1
5     2
8     3
9     3
10    3
11    3

And then find the max group number:

(m != m.shift())[m].cumsum().max()

Output:

3
Mykola Zotko
  • Hi Mykola Zotko, this solution was great and helped me, but can you elaborate on how it works? – Shyam Oct 13 '20 at 06:59
  • There are three steps: in the first step, `(m != m.shift())`, you get the group boundaries; in the second step, `[m]`, you select only the negative values; and in the last step you use `cumsum()` to enumerate those groups. Try to play around and understand each step. – Mykola Zotko Oct 14 '20 at 09:30
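
The three steps described in the comment above can be inspected one at a time. The sketch below assumes the data from the question with a default integer index; the intermediate names `boundaries`, `neg_starts`, and `group_ids` are illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series([1000, 1234, -2378, -1290, 3000, -4567,
               5678, 1390, -2346, -2876, -9065, -6743])

m = s.lt(0)                      # True where the amount is negative

# Step 1: True wherever a row's sign flag differs from the previous row's.
# The first row compares against NaN, so it is always marked True.
boundaries = m != m.shift()

# Step 2: keep only the negative rows; True now marks the start of a run.
neg_starts = boundaries[m]

# Step 3: a running sum over the run starts numbers each run 1, 2, 3, ...
group_ids = neg_starts.cumsum()

print(group_ids.tolist())        # [1, 1, 2, 3, 3, 3, 3]
n_groups = int(group_ids.max())  # 3
```

If the column might contain no negative values, `int(neg_starts.sum())` gives the same count and returns 0 in that case, whereas `max()` of an empty Series is NaN.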