
''' The dataframe looks like

1 2 4 ....n

0 2 0 ....n

1 0 4 ....n

0 0 4 ....n

Now I want to count the 1s in column 1, the 2s in column 2, the 4s in column 3, and so on.

But I also want to count a few values obtained by adding columns, like

1+2 , 1+4 , 2+4 , 1+2+4

0+2 , 0+0 , 2+0 , 0+2+0

1+0 , 1+4 , 0+4 , 1+0+4

0+0 , 0+4 , 0+4 , 0+0+4

Count 3, 5, 6, and 7 in the columns above, respectively.

That is: count 1 in column a, 2 in column b, 3 in column a+b, 4 in column c, 5 in column a+c, 6 in column b+c, and 7 in column a+b+c.

Store all these values in a list, array, or dataframe, like

Values/Number , Title , Frequency

1 , a , 2

2 , b , 2

3 , a+b , 1

4 , c , 3

5 , a+c , 2

6 , b+c , 1

7 , a+b+c , 1

'''

    Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Provide a reproducible dataframe (preferably the `pd.DataFrame` code for the dataframe) and the expected output according to that – anky Feb 16 '20 at 04:59

1 Answer

First, use the previous solution to add a summed column for every combination of two or more columns:

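import pandas as pd
from itertools import chain, combinations

# sample data reconstructed from the printed output below
df = pd.DataFrame({'a': [1, 0, 1, 0], 'b': [2, 2, 0, 0], 'c': [4, 0, 4, 4]})

#https://stackoverflow.com/a/5898031
# all combinations of 2 or more columns
comb = chain(*map(lambda x: combinations(df.columns, x), range(2, len(df.columns)+1)))

for c in comb:
    # select the combination's columns and sum them row-wise
    df['+'.join(c)] = df[list(c)].sum(axis=1)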
print (df)
   a  b  c  a+b  a+c  b+c  a+b+c
0  1  2  4    3    5    6      7
1  0  2  0    2    0    2      2
2  1  0  4    1    5    4      5
3  0  0  4    0    4    4      4
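
To see what the `chain`/`combinations` expression iterates over, here is a minimal sketch that uses a plain list of column names instead of a dataframe. The list `columns` is an assumption chosen to match the printed frame; `chain.from_iterable` is used in place of `chain(*map(...))` but enumerates the same tuples.

```python
from itertools import chain, combinations

columns = ["a", "b", "c"]  # assumed column names, matching the printed frame

# Every combination of 2 or more columns, shortest combinations first
combos = list(chain.from_iterable(
    combinations(columns, r) for r in range(2, len(columns) + 1)
))

# The labels used as names for the summed columns
labels = ["+".join(c) for c in combos]
print(labels)  # ['a+b', 'a+c', 'b+c', 'a+b+c']
```

Starting the range at 1 instead of 2 would also include the single columns themselves.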

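# count how often each value occurs in every column
# (pd.Series.value_counts replaces the deprecated top-level pd.value_counts)
df1 = df.apply(pd.Series.value_counts)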
print (df1)
     a    b    c  a+b  a+c  b+c  a+b+c
0  2.0  2.0  1.0  1.0  1.0  NaN    NaN
1  2.0  NaN  NaN  1.0  NaN  NaN    NaN
2  NaN  2.0  NaN  1.0  NaN  1.0    1.0
3  NaN  NaN  NaN  1.0  NaN  NaN    NaN
4  NaN  NaN  3.0  NaN  1.0  2.0    1.0
5  NaN  NaN  NaN  NaN  2.0  NaN    1.0
6  NaN  NaN  NaN  NaN  NaN  1.0    NaN
7  NaN  NaN  NaN  NaN  NaN  NaN    1.0

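Then use DataFrame.agg with idxmax and max along axis=1 to get, for each value, the column in which it occurs most often and that count (on ties, idxmax returns the first such column). DataFrame.reset_index turns the index into a column, and rename sets the final column names: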

c = {'index':'Values/Number','idxmax':'Title','max':'Frequency'}
df2 = df1.agg(['idxmax','max'], axis=1).reset_index().rename(columns=c)
print (df2)
   Values/Number  Title Frequency
0              0      a         2
1              1      a         2
2              2      b         2
3              3    a+b         1
4              4      c         3
5              5    a+c         2
6              6    b+c         1
7              7  a+b+c         1
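
As an alternative sketch, not part of the approach above, the requested table can also be built by counting directly. It rests on an assumption taken from the sample data: each column holds either 0 or one fixed nonzero value, so the value to look for in a combination is the sum of the column maxima. The names `target`, `wanted`, and `result` are illustrative only.

```python
from itertools import chain, combinations

import pandas as pd

# Sample frame reconstructed from the rows in the question
df = pd.DataFrame({"a": [1, 0, 1, 0], "b": [2, 2, 0, 0], "c": [4, 0, 4, 4]})

# Assumption: each column holds 0 or one fixed nonzero value (its maximum)
target = df.max()

rows = []
for combo in chain.from_iterable(
    combinations(df.columns, r) for r in range(1, len(df.columns) + 1)
):
    cols = list(combo)
    wanted = int(target[cols].sum())  # e.g. 1 + 2 = 3 for a+b
    # number of rows whose sum over these columns equals the wanted value
    freq = int((df[cols].sum(axis=1) == wanted).sum())
    rows.append((wanted, "+".join(cols), freq))

result = (
    pd.DataFrame(rows, columns=["Values/Number", "Title", "Frequency"])
    .sort_values("Values/Number")
    .reset_index(drop=True)
)
print(result)
```

For the sample data this gives the seven rows requested in the question, with frequencies 2, 2, 1, 3, 2, 1, 1 for the values 1 through 7, and no row for 0.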
jezrael