0

Give a example:

data:

    Group1  Group2 date value1  value2
0   A   01/20/20    0   1
1   A   01/25/20    0   3
2   A   02/28/20    0   2
3   B   01/25/20    0   1
4   B   01/27/20    2   2
5   C   01/29/20    0   5
6   C   01/30/20    2   6

I want get a pivot table to count the frequences of different values in Group2 and make Group2 column the index of the Final table. It's very easy when the index and value of the pivot table are not the same using pandas in python. While when they are same Python will raise a error, I can't solve this problem.

The output I want get is a table like the following table to get the frequency of different values in column 'Group 1' of the data:

Group 1 Frequency
A 3
B 2
C 2
mozway
  • 194,879
  • 13
  • 39
  • 75
Kebu
  • 1
  • 1

1 Answers1

0

I'm not sure how your desired output looks like, but I sounds like you are looking for something like this:

import pandas as pd
from io import StringIO

data = StringIO("""
Group1,Group2,date,value1,value2
0,A,01/20/20,0,1
1,A,01/25/20,0,3
2,A,02/28/20,0,2
3,B,01/25/20,0,1
4,B,01/27/20,2,2
5,C,01/29/20,0,5
6,C,01/30/20,2,6
""")

df = pd.read_csv(data, sep = ",")

pd.crosstab(index=df['Group2'], columns=[df['value1'], df['value2']], 
                normalize='index')

Output

value1         0                             2     
value2         1         2         3    5    2    6
Group2                                             
A       0.333333  0.333333  0.333333  0.0  0.0  0.0
B       0.500000  0.000000  0.000000  0.0  0.5  0.0
C       0.000000  0.000000  0.000000  0.5  0.0  0.5

Or are you just inerested in one value column?

pd.crosstab(index=df['Group2'], columns=df['value2'], 
                normalize='index')

Output

value2         1         2         3    5    6
Group2                                        
A       0.333333  0.333333  0.333333  0.0  0.0
B       0.500000  0.500000  0.000000  0.0  0.0
C       0.000000  0.000000  0.000000  0.5  0.5
TimTeaFan
  • 17,549
  • 4
  • 18
  • 39
  • I want to make a pivot table to get the frequency of different values in column 'Group 1' of the data. The output is just like this: |Group 1 | Frequency | |---|---| | A | 3 | | B | 2 | | C | 2 | `pd.pivot_table`seems to raise an error when the `index` parameter and the `values` parameter are the same. I don't know how to solve this problem in the beginning. I succeeded in handling with it using `pd.groupby().count()`. – Kebu Jul 31 '22 at 10:13