I have gone through this thread, "Pandas: filling missing values by mean in each group", which imputes the group mean using the pandas transform function. However, the column I would need to group on is a string that differs from row to row (e.g. the car model), and the missing values are in mileage, engine power, etc. Please find the sample data attached and let me know how to impute the values.
Can you post an example of your code? – Dr. Mantis Tobbogan Dec 09 '20 at 16:28
-
I don't have any code yet; I am trying to find out how to apply transform when the values in the grouping column are all different. I have attached a snapshot of the data. – user16755 Dec 09 '20 at 16:53
1 Answer
0
EDIT:
import pandas as pd

df = pd.read_csv('../input/vehicle-dataset-from-cardekho/Car details v3.csv')
# mileage is text with a unit: keep the leading number, then fill gaps with the column mean
df['mileage'] = df['mileage'].str.split().str[0].astype(float)
df['mileage'] = df['mileage'].fillna(df['mileage'].mean())
# engine filled with mean engine (assuming it is also text with a unit, so it is parsed first)
df['engine'] = df['engine'].str.split().str[0].astype(float)
df['engine'] = df['engine'].fillna(df['engine'].mean())
# max_power, torque, seats filled with the most frequent value
df['max_power'] = df['max_power'].fillna(df['max_power'].value_counts().index[0])
df['torque'] = df['torque'].fillna(df['torque'].value_counts().index[0])
df['seats'] = df['seats'].fillna(df['seats'].value_counts().index[0])
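The code above fills each gap with a single column-wide value. For the per-model imputation the question asks about, one possible approach is to derive a coarser grouping key from the name string and then use groupby with transform. The following is only a minimal sketch on made-up toy data: the `name` values, the `model_key` column, and the "first two words" rule are all assumptions for illustration, not something taken from the real dataset, and the rule would need adjusting to how the names are actually written.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data; names and numbers are made up for illustration.
df = pd.DataFrame({
    "name": [
        "BMW 520d Luxury Line",
        "BMW 520d Sport Line",
        "BMW 520d M Sport",
        "Maruti Swift VDI",
        "Maruti Swift ZDI",
        "Tata Nano Cx",
    ],
    "mileage": [18.0, 20.0, np.nan, 23.0, np.nan, np.nan],
})

# Hypothetical grouping key: the first two words of the name (make + model).
df["model_key"] = df["name"].str.split().str[:2].str.join(" ")

# transform("mean") returns a Series aligned with df, holding each row's group mean.
group_mean = df.groupby("model_key")["mileage"].transform("mean")
df["mileage"] = df["mileage"].fillna(group_mean)

# A group with no observed value at all still has NaN; fall back to the overall mean.
df["mileage"] = df["mileage"].fillna(df["mileage"].mean())

print(df[["name", "model_key", "mileage"]])
```

The same pattern should carry over to engine or max_power once those columns have been converted to numbers. Whether two words is the right granularity is a judgment call: a finer key gives more specific means but leaves more groups with nothing to average.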

BoomBoxBoy
-
I am asking how to impute the missing mileage/engine/bhp of a BMW 520d car based on the other BMW 520 series cars. – user16755 Dec 09 '20 at 16:45
-
If you can post the specific pandas dataframe and the problem, I can help you out; without them I am making assumptions about the data. The more specific the better! – BoomBoxBoy Dec 09 '20 at 18:09
-
Hi, this is the dataset I am using: https://www.kaggle.com/nehalbirla/vehicle-dataset-from-cardekho?select=Car+details+v3.csv There are missing values in the mileage, engine, and max_power variables. How do I impute the missing values? – user16755 Dec 10 '20 at 14:48