When to use imputer1.fit(X.iloc[:, 0:3]) vs imputer.fit(X[:, 1:3])

Question

I'm new to coding, so this might be a silly question. I'm using the data preprocessing tools approach to practice imputing missing data on multiple files. However, I'm not clear on when to use X.iloc[] versus X[].

Both examples below work, but I have no idea why.

Ex 1:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
print(X)

Ex 2:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('datasets_596958_1073629_Placement_Data_Full_Class_edited3.csv')
X = dataset.iloc[:, :-1]
Y = dataset.iloc[:, -1]

from sklearn.impute import SimpleImputer
# np.nan is the canonical spelling; the np.NaN alias was removed in NumPy 2.0
imputer1 = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
imputer1.fit(X.iloc[:, 0:3])
X.iloc[:, 0:3] = imputer1.transform(X.iloc[:, 0:3])

imputer2 = SimpleImputer(missing_values=np.nan, strategy="mean")
imputer2.fit(X.iloc[:, 3:5])
X.iloc[:, 3:5] = imputer2.transform(X.iloc[:, 3:5])

This could be helpful, https://stackoverflow.com/questions/31593201/how-are-iloc-ix-and-loc-different#:~:text=iloc%20gets%20rows%20(or%20columns,not%20present%20in%20the%20index. — sushanth, Jun 03 '20 at 06:28

score 0 · Answer 1 · answered Jun 03 '20 at 05:45

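Use X[] when X is a NumPy array and X.iloc[] when X is a pandas DataFrame. In Ex 1, the trailing .values converts the DataFrame slice to a NumPy array, so X[:, 1:3] works. In Ex 2 there is no .values, so X is still a DataFrame and positional slicing has to go through X.iloc[:, 0:3]. You can confirm which one you have with Python's type() function.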
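As a rough illustration of the difference, here is a minimal sketch. The toy DataFrame and its column names (age, salary, purchased) are made up for this example and stand in for the CSV files in the question; the names X_arr and X_df are also just illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the CSV files in the question.
dataset = pd.DataFrame({
    "age": [25.0, np.nan, 35.0],
    "salary": [50.0, 60.0, np.nan],
    "purchased": [0, 1, 0],
})

X_arr = dataset.iloc[:, :-1].values  # .values gives a NumPy array (as in Ex 1)
X_df = dataset.iloc[:, :-1]          # no .values, still a DataFrame (as in Ex 2)

print(type(X_arr))
print(type(X_df))

# NumPy arrays take [rows, columns] directly.
arr_slice = X_arr[:, 0:2]

# DataFrames need .iloc for the same positional slice.
df_slice = X_df.iloc[:, 0:2]

# Plain brackets on a DataFrame look up column labels, so this fails.
# The exact exception type varies across pandas and Python versions.
try:
    X_df[:, 0:2]
    plain_brackets_worked = True
except Exception:
    plain_brackets_worked = False

print(plain_brackets_worked)
```

So the choice depends only on what type X is at that point, not on the imputer: SimpleImputer accepts either a NumPy array or a DataFrame as input.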

Zakariah Siyaji
