import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Raw string so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r"G:\learning python\medical-data visualizer/medical_examination.csv")

df["overweight"] = (df["weight"]/pow(df["height"]/100, 2) > 25).astype(int)

df["cholesterol"] = (df["cholesterol"] > 1).astype(int)
df["gluc"] = (df["gluc"] > 1).astype(int)

df_cat = pd.melt(df, id_vars=["cardio"], value_vars=["cholesterol", "gluc", "smoke", "alco", "active", "overweight"])
df_cat = df_cat.groupby(['cardio','variable','value']).size()
print(df_cat)

This is my series:

cardio  variable     value
0       active       0         6378
                     1        28643
        alco         0        33080
                     1         1941
        cholesterol  0        29330
                     1         5691
        gluc         0        30894
                     1         4127
        overweight   0        15915
                     1        19106
        smoke        0        31781
                     1         3240
1       active       0         7361
                     1        27618
        alco         0        33156
                     1         1823
        cholesterol  0        23055
                     1        11924
        gluc         0        28585
                     1         6394
        overweight   0        10539
                     1        24440
        smoke        0        32050
                     1         2929

I'd like to convert it into a DataFrame with the columns cardio, variable, value and total, where total holds the last, unnamed column of the series (the counts). I tried .to_frame(), but it accepts only a single column name, so I can't name all four columns correctly. How can I do this? Thanks in advance!
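
For illustration, here is a minimal sketch of one approach that may fit: `Series.reset_index(name=...)` turns the three index levels into ordinary columns and gives the unnamed values the name passed in. The toy frame below is made up and only stands in for the melted data from medical_examination.csv; the names `counts` and `result` are likewise just placeholders.

```python
import pandas as pd

# Hypothetical toy data standing in for the melted frame (not the real dataset)
df_cat = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1, 1],
    "variable": ["smoke", "smoke", "alco", "smoke", "alco", "alco"],
    "value":    [0, 0, 1, 1, 0, 0],
})

# Same grouping as in the question: a Series with a three-level MultiIndex
counts = df_cat.groupby(["cardio", "variable", "value"]).size()

# Move the index levels into columns and name the count column "total"
result = counts.reset_index(name="total")
print(result)
```

Applied to the series in the question, the same call would presumably be `df_cat = df_cat.reset_index(name="total")` right after the `groupby(...).size()` line.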

tryingtobeastoic