I have gone through the thread "Pandas: filling missing values by mean in each group", which imputes the group mean using the pandas transform function. However, the column I would group by is a string that differs from row to row, e.g. a car model name, and the missing values are in mileage, engine power, etc. Sample data is attached; please let me know how to impute the values.

user16755

1 Answer

EDIT:

import pandas as pd

df = pd.read_csv('../input/vehicle-dataset-from-cardekho/Car details v3.csv')

# mileage: strip the unit, then fill missing values with the column mean

df['mileage'] = df['mileage'].str.split().str[0].astype(float)
df['mileage'] = df['mileage'].fillna(df['mileage'].mean())

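# engine: strip the unit (assuming values are stored as strings such as "1248 CC"),
# then fill missing values with the column mean

df['engine'] = df['engine'].str.split().str[0].astype(float)
df['engine'] = df['engine'].fillna(df['engine'].mean())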

# max_power, torque, seats: fill missing values with the most frequent value

df['max_power'] = df['max_power'].fillna(df['max_power'].value_counts().index[0])
df['torque'] = df['torque'].fillna(df['torque'].value_counts().index[0])
df['seats'] = df['seats'].fillna(df['seats'].value_counts().index[0])
BoomBoxBoy
  • I am asking how to impute the missing mileage/engine/bhp of a BMW 520d series car based on other BMW 520 series cars. – user16755 Dec 09 '20 at 16:45
  • If you can post the specific pandas dataframe and problem, I could help you out; without it I am making assumptions about the data. The more specific the better! – BoomBoxBoy Dec 09 '20 at 18:09
  • Hi, this is the dataset I am using: https://www.kaggle.com/nehalbirla/vehicle-dataset-from-cardekho?select=Car+details+v3.csv There are missing values in the mileage, engine, and max_power variables. How do I impute them? – user16755 Dec 10 '20 at 14:48
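
The per-model imputation asked about in the comments could be sketched roughly as below. This is only an illustration under stated assumptions: the toy rows are invented, the `name` column is assumed to hold strings like "BMW 520d Luxury", and the grouping key `model` (the first two words of `name`) is a hypothetical choice, not something the dataset defines. Groups with no known value fall back to the overall column mean.

```python
import numpy as np
import pandas as pd

# Toy rows invented for illustration; they are not taken from the Kaggle CSV.
df = pd.DataFrame({
    "name": ["BMW 520d Luxury", "BMW 520d Sport", "BMW 520d Base",
             "Maruti Swift VDI", "Maruti Swift ZXI", "Tata Nano"],
    "mileage": ["18.0 kmpl", "20.0 kmpl", np.nan, "22.0 kmpl", np.nan, np.nan],
})

def to_number(col):
    # Keep the leading token ("18.0 kmpl" -> 18.0); anything unparseable becomes NaN.
    return pd.to_numeric(col.astype(str).str.split().str[0], errors="coerce")

df["mileage"] = to_number(df["mileage"])

# Hypothetical grouping key: the first two words of the name (make + model).
df["model"] = df["name"].str.extract(r"^(\S+\s+\S+)", expand=False)

# Mean mileage of each model, aligned row by row with the original frame.
group_mean = df.groupby("model")["mileage"].transform("mean")

# Fill from the model's mean first, then from the overall mean for models
# that have no known mileage at all.
overall_mean = df["mileage"].mean()
df["mileage"] = df["mileage"].fillna(group_mean).fillna(overall_mean)

print(df[["name", "model", "mileage"]])
```

The same pattern should carry over to engine and max_power once they are converted to numbers; how coarse the grouping key should be (make only, make + model, or the full variant) is a judgment call that depends on how many rows each group ends up with.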