
I would like to be able to select a subset of values that may include a few zeros from a set that includes many zeros. For example:

Input:

item
 0
 0
 0
 0
 0
 0
 0
 1
 2
 3
 0
 0
 0
 6
 8
 8
 9
 0
 0 
 0
 0
 0
 0
 0
 0

I would like to select the subset below, which satisfies the following condition: between two non-zero values, the total number of zeros is less than 10 (e.g. between 3 and 6).

Any help with this would be much appreciated.

Thanks in advance. Best Regards, Carlo

Output:

item
 1
 2
 3
 0
 0
 0
 6
 8
 8
 9
Carlo Allocca
  • This question makes no sense because the items in a set are not ordered (see here: https://stackoverflow.com/questions/9792664/set-changes-element-order). If you want to solve this issue, you should use a `list` or a `numpy` array. – GLR Sep 06 '17 at 11:45
  • @GLR, I am using a dataframe. Thanks for your help! – Carlo Allocca Sep 06 '17 at 12:20

1 Answer


As GLR pointed out, you cannot use a set for this. If you have a pandas Series, you can label each run of consecutive equal values by combining shift and cumsum, use that label as the grouper, and count the length of each run. With that you can filter out zeros whose run length reaches a threshold.

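# zeros are kept only when they sit in a run shorter than this
threshold = 4
# a new run starts wherever a value differs from the previous one; cumsum numbers the runs
consecutives = series.groupby((series!=series.shift(1)).cumsum()).transform('count')
# keep every non-zero value, plus zeros from short runs
series = series[(series!=0)|(consecutives<threshold)]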
Out[18]: 
7     1
8     2
9     3
10    0
11    0
12    0
13    6
14    8
15    8
16    9
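
For a DataFrame, the same idea can be sketched as below. This is a minimal example rather than a definitive solution: it assumes the values live in a column named `item` (taken from the question's sample), and the names `df`, `run_id`, `run_length` and `result` are introduced here purely for illustration.

```python
import pandas as pd

# Sample data from the question: 7 zeros, 1 2 3, 3 zeros, 6 8 8 9, 8 zeros.
# The column name "item" is an assumption based on the question's header.
items = [0] * 7 + [1, 2, 3] + [0] * 3 + [6, 8, 8, 9] + [0] * 8
df = pd.DataFrame({"item": items})

threshold = 4  # zeros in runs of this length or longer are dropped

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those change flags gives each run its own id.
run_id = (df["item"] != df["item"].shift(1)).cumsum()

# Length of the run each row belongs to, aligned with the original index.
run_length = df["item"].groupby(run_id).transform("count")

# Keep every non-zero row, plus zero rows that sit in a short run.
result = df[(df["item"] != 0) | (run_length < threshold)]

print(result)
```

On the sample data this keeps rows 7 through 16. Note that the mask only looks at run length, so a short run of zeros at the very start or end of the column would also be kept, even though it is not between two non-zero values.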
P.Tillmann
  • Thank you very much, Tillmann. It was my mistake to use the word set rather than dataframe. Sorry for the confusion. Would you mind reformulating your solution for a dataframe? Thanks in advance. – Carlo Allocca Sep 06 '17 at 12:25
  • I have just tried your solution on my dataframe and got the following result: if there are two non-zero runs, it always takes both together. I am going to write a new question about it and make it clearer. Thanks for your help. – Carlo Allocca Sep 06 '17 at 12:46