I am a beginner, and I really need help with the following:

I need to do something similar to the following, but on a two-dimensional dataframe: Identifying consecutive occurrences of a value

I need to use that answer, but for a two-dimensional dataframe. Within each column, I need to count the ones that occur in runs of at least 2 consecutive ones. Here is a sample dataframe, my_df:

       0  1  2
    0  1  0  1
    1  0  1  0
    2  1  1  1
    3  0  0  1
    4  0  1  0
    5  1  1  0
    6  1  1  1
    7  1  0  1

The output I am looking for is:

       0  1  2
    0  3  5  4

Instead of the 'consecutive' column from that answer, I need a new output called "out_1_df", produced by the two-dimensional equivalent of this line:

    df.Count.groupby((df.Count != df.Count.shift()).cumsum()).transform('size') * df.Count

So that later I can do

    threshold = 2; 
    out_2_df= (out_1_df > threshold).astype(int)

I tried the following:

    out_1_df = my_df.groupby((my_df != my_df.shift(axis=0)).cumsum(axis=0))
    out_2_df = (out_1_df > threshold).astype(int)

How can I modify this?

Chane
  • I see someone downvoted this question, but there are no comments. When possible, provide clarification on how a question might be improved. – Galen Apr 19 '20 at 20:19
  • Please share some input data and also your expected output, something like 5-10 rows and 5-10 columns. See [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) if you need some guidance on how to do it :) – Ben.T Apr 19 '20 at 21:21
  • @Ben.T I have added sample input and output. Thank you for the feedback and for helping me with this. – Chane Apr 19 '20 at 22:41
  • `My dataframe has 0 and 1 values. I need to count at least 2 consecutive greater than one values along the rows dimension` I suspect the answer is `0` then? You say you have `0` and `1` in your dataframe, and your example also shows that there are no numbers `greater than one`. – Grzegorz Skibinski Apr 19 '20 at 23:07
  • @GrzegorzSkibinski, I need to count the ones if they occur at least 2 consecutive times along the rows dimension. For example, in the first column of the sample data, I have three consecutive ones from the 5th to the 7th row. Therefore, I am only looking for ones that occur at least two consecutive times. – Chane Apr 19 '20 at 23:23
  • @GrzegorzSkibinski, I have corrected the sample output. I tried the code with axis=0, but it did not work. – Chane Apr 19 '20 at 23:40
  • @Chane according to your output, you are looking for the maximum of consecutive 1 in a column? – Ben.T Apr 20 '20 at 01:25
  • @Ben.T I am looking for at least two or more ones in one column. – Chane Apr 20 '20 at 06:04
  • @Ben.T I have edited the input and output. – Chane Apr 20 '20 at 06:12

1 Answer

Try:

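    import pandas as pd

    df = pd.DataFrame({0: [1, 0, 1, 0, 0, 1, 1, 1],
                       1: [0, 1, 1, 0, 1, 1, 1, 0],
                       2: [1, 0, 1, 1, 0, 0, 1, 1]})

    # A 1 is counted if it equals the value directly above or directly below it,
    # i.e. it belongs to a run of at least two consecutive ones in its column.
    same_as_prev = df.diff(axis=0).eq(0)
    same_as_next = df.diff(periods=-1, axis=0).eq(0)
    out_2_df = ((same_as_prev | same_as_next) & df.eq(1)).sum(axis=0)

    >>> out_2_df
    0    3
    1    5
    2    4
    dtype: int64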
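
If the intermediate "out_1_df" from the question is also needed (the run length at each 1), one possible sketch is to apply the linked one-liner column by column. This assumes the goal is runs of at least 2 ones within each column. The helper name `run_lengths` and the variable `min_run` are hypothetical names chosen for illustration. Note that it compares with `>=` rather than the `>` used in the question, since `> 2` would drop runs of exactly two ones and would not give the expected 3, 5, 4.

```python
import pandas as pd

my_df = pd.DataFrame({
    0: [1, 0, 1, 0, 0, 1, 1, 1],
    1: [0, 1, 1, 0, 1, 1, 1, 0],
    2: [1, 0, 1, 1, 0, 0, 1, 1],
})


def run_lengths(col: pd.Series) -> pd.Series:
    """Length of the run each 1 belongs to; 0 wherever the value is 0."""
    # A new run id starts every time the value differs from the row above.
    run_id = (col != col.shift()).cumsum()
    # Size of each run, zeroed out for runs of zeros by multiplying with col.
    return col.groupby(run_id).transform("size") * col


# Same shape as my_df: each 1 is replaced by the length of its run.
out_1_df = my_df.apply(run_lengths)

min_run = 2  # hypothetical name: minimum run length that should be counted
out_2_df = (out_1_df >= min_run).astype(int)

# Number of ones per column that sit in a run of at least min_run ones.
counts = out_2_df.sum(axis=0)
print(counts.tolist())
```

On the sample data this should agree with the `diff`-based result, while also keeping the per-cell run lengths available in `out_1_df`.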
Grzegorz Skibinski
  • I have corrected the sample output. Please check it now. I tried the code above with axis=0, but it did not work. – Chane Apr 19 '20 at 23:43
  • I understand. Please check my answer; it should do the trick for you now. – Grzegorz Skibinski Apr 20 '20 at 06:59
  • Yes, this does the trick. However, I am getting an error when I try to reshape it. Let's say the length of `out_2_df` is 60, and I want to reshape it to (3, 4, 5). I tried `out_3_df = np.reshape(out_2_df, (3,4,5))`. The error I am getting is: Length of passed values is 3, Index implies 60. How do I reshape the output? – Chane Apr 20 '20 at 10:25
  • Why do you want to reshape? The output is a list of `len`=3, so why would you want to reshape it to `(3,4,5)`? What exactly do you want to get at the end? – Grzegorz Skibinski Apr 20 '20 at 11:37
  • @Chane If I understand correctly, you need to convert out_2_df to a numpy array before reshaping, so if the length is 60, try: `np.reshape(out_2_df.to_numpy(), (3,4,5))`. Also, if this answer does what you are looking for, it would be good to [accept it](https://stackoverflow.com/help/someone-answers); it rewards the person who has helped you and helps to manage the site. It is not mandatory, but it is a good way to say thank you :) – Ben.T Apr 20 '20 at 12:03
  • This is what I am looking for. Thank you guys! – Chane Apr 20 '20 at 12:20
  • @Ben.T you're obviously right, it's a Series. My mistake, thanks! – Grzegorz Skibinski Apr 20 '20 at 12:45
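
The reshape discussed in the comments above can be sketched as follows. The 60-element Series here is a made-up stand-in (the real result would come from a 60-column dataframe), and `out_3` is a hypothetical name. The idea is that converting the Series to a plain NumPy array first drops the index, so the reshape no longer tries to rebuild a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a per-column result of length 60.
out_2_df = pd.Series(np.arange(60))

# Convert to a NumPy array, then reshape; 3 * 4 * 5 must equal the length.
out_3 = out_2_df.to_numpy().reshape(3, 4, 5)

print(out_3.shape)
```

The product of the target dimensions has to match the number of elements, so this only works when the Series really has 60 entries.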