0

i recently looked at Get statistics for each group (such as count, mean, etc) using pandas GroupBy?. It does not address my issue.

Is there a way to get separate count of values for multiple df columns?

See my code:

Import pandas as pd
Import numpy as np
fle1 = r’k:\file1.xlsx’

df1 = pd.read_excel(fle1, sheet_name=”Sheet1”)
df2 = df1.select_dtypes(exclude=np.number)
col_names = df2.columns.values.tolist()

col_names contains ['Column1', 'Column2', 'Column3']

My dataframe looks like.

Column1     Column2          Column3  
Checking    Car              House  
Checking    Car              House  
Checking    Car              House  
House       Checking         Car 
House       Checking         Car 

I'm Looking for output like:

Column1   Count  Column2    Count   Column3 Count
Checking    3    Car        3       House    3
House       2    Checking   2       Car      2

Not sure what next steps are, any help would be greatly appreciated.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459

1 Answers1

1

You can use pd.concat and groupby:

(
    pd.concat([df[x].groupby(df[x]).size().to_frame('Count').reset_index() 
               for x in df.columns],1)
)

    Column1     Count   Column2     Count   Column3 Count
0   Checking    3       Car         3       Car     2
1   House       2       Checking    2       House   3
Allen Qin
  • 19,507
  • 8
  • 51
  • 67