I am trying to detect outliers in my dataset and came across scikit-learn's Isolation Forest, but I can't work out how to use it. I fit it on my training data, and predict gives me back a vector of -1 and 1 values.
Can anyone explain how it works and provide an example?
How can I know whether the flagged points are 'real' outliers?
Which parameters should I tune?
Here is my code:
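from sklearn.ensemble import IsolationForest

# fit the model on the training set
clf = IsolationForest(max_samples=10000, random_state=10)
clf.fit(x_train)
# predict a label for every sample in both sets
y_pred_train = clf.predict(x_train)
y_pred_test = clf.predict(x_test)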
[1 1 1 ..., -1 1 1]
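
For reference, here is a minimal self-contained sketch of the same workflow on synthetic data, showing how the -1/1 labels relate to the underlying anomaly scores. The data, the variable names (X, far_points, labels, scores) and the contamination value of 0.05 are assumptions made up for illustration, not taken from my real dataset.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(10)

# Synthetic data, assumed purely for illustration:
# 200 points clustered near the origin and 10 points on a circle of radius 5.
inliers = 0.3 * rng.randn(200, 2)
angles = np.linspace(0, 2 * np.pi, 10, endpoint=False)
far_points = 5 * np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([inliers, far_points])

# contamination is the expected fraction of outliers; 0.05 is a guess for this toy data.
clf = IsolationForest(n_estimators=100, contamination=0.05, random_state=10)
clf.fit(X)

labels = clf.predict(X)            # 1 = inlier, -1 = outlier
scores = clf.decision_function(X)  # negative = outlier, lower = more anomalous
raw_scores = clf.score_samples(X)  # same ranking, before the threshold offset_ is subtracted

print("flagged fraction:", np.mean(labels == -1))
print("mean score, clustered points:", scores[:200].mean())
print("mean score, far points:", scores[200:].mean())
```

In this sketch the labels are just a thresholded view of decision_function: a sample is labelled -1 when its score is below zero, and contamination moves that threshold. The model only ranks samples by how easily they are isolated, so whether a flagged point is a 'real' outlier would still have to be checked against known labels or by inspecting the lowest-scoring samples. Note also that when max_samples is larger than the number of training samples, scikit-learn uses all samples for every tree.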