I'm new to coding, so this might be a silly question. I'm following the data preprocessing tools approach to practice imputing missing data on multiple files. However, I'm not clear on when to use X.iloc[] versus X[].
Both of the examples below work, but I have no idea why.
Ex 1:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
print(X)
Ex 2:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('datasets_596958_1073629_Placement_Data_Full_Class_edited3.csv')
X = dataset.iloc[:, :-1]
Y = dataset.iloc[:, -1]
from sklearn.impute import SimpleImputer
imputer1 = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
imputer1.fit(X.iloc[:, 0:3])
X.iloc[:, 0:3] = imputer1.transform(X.iloc[:, 0:3])
imputer2 = SimpleImputer(missing_values=np.nan, strategy="mean")
imputer2.fit(X.iloc[:, 3:5])
X.iloc[:, 3:5] = imputer2.transform(X.iloc[:, 3:5])
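
To narrow the question down, here is a minimal sketch of what seems to differ between the two examples. It uses a small made-up DataFrame (the column names and values are hypothetical, not taken from either CSV). My assumption is that `.values` in Ex 1 turns X into a NumPy array, while in Ex 2 X stays a pandas DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the CSV files above
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, np.nan],
    "label": ["x", "y", "z"],
})

X_array = df.iloc[:, :-1].values  # with .values, as in Ex 1
X_frame = df.iloc[:, :-1]         # without .values, as in Ex 2

print(type(X_array))  # NumPy array
print(type(X_frame))  # pandas DataFrame

# A NumPy array is indexed by position with plain brackets
first_col_array = X_array[:, 0]

# A DataFrame uses .iloc for positional indexing;
# plain brackets on a DataFrame select by column label instead
first_col_frame = X_frame.iloc[:, 0]
same_col_by_label = X_frame["a"]

# Positional slicing with plain brackets fails on a DataFrame
plain_brackets_failed = False
try:
    X_frame[:, 0]
except Exception as exc:
    plain_brackets_failed = True
    print("X_frame[:, 0] raised", type(exc).__name__)
```

If that assumption is right, Ex 1 can use X[:, 1:3] because X is an array, and Ex 2 needs X.iloc[:, 0:3] because X is still a DataFrame.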