I can't get the Isolation Forest algorithm to work inside a scikit-learn pipeline. I am trying to predict credit card fraud using the Kaggle Credit Card Fraud Detection dataset. To avoid data leakage, I want every preprocessing step to be fitted only after the data has been partitioned, so I use a pipeline in each cross-validation run. (Without pipelines, I get an almost 100% F1-score using Logistic Regression in K-fold cross-validation.) Most machine learning algorithms work this way (Logistic Regression, Random Forest Classifier, etc.), but some anomaly detection algorithms, such as IsolationForest, do not. How can I fit these anomaly detection algorithms inside a pipeline? Thanks.
Some details on X and Y (Y: 0 = normal transaction, 1 = fraudulent transaction)
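# Imports assumed; SMOTE needs imblearn's Pipeline rather than sklearn's
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ('sc', StandardScaler()),
    ('smote', SMOTE()),
    ('IF', IsolationForest()),
])
print(cross_val_score(pipe, X, Y, scoring='f1_weighted', cv=5))
# Result: [3.01179163e-06 3.53204982e-06 6.55363495e-06 3.51940600e-06 4.52981524e-06]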
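One possible reason for the near-zero scores is a label mismatch: `IsolationForest.predict` returns +1 for inliers and -1 for outliers, while Y uses 0 and 1. Below is a minimal sketch of one way to reconcile the two inside a pipeline. The wrapper class `IsolationForestClassifier`, the synthetic data, and the `contamination=0.02` setting are assumptions made for illustration, not part of the original setup. SMOTE is left out so the sketch depends only on scikit-learn; oversampling the fraud class may also work against an algorithm that relies on anomalies being rare.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class IsolationForestClassifier(IsolationForest):
    """Hypothetical wrapper: report outliers as 1 and inliers as 0."""

    def predict(self, X):
        raw = super().predict(X)  # +1 = inlier, -1 = outlier
        return np.where(raw == -1, 1, 0)


# Synthetic stand-in for the real data (assumption): 980 normal rows, 20 "fraud" rows
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(980, 4)),
    rng.normal(6.0, 1.0, size=(20, 4)),
])
Y = np.array([0] * 980 + [1] * 20)

pipe = Pipeline([
    ("sc", StandardScaler()),
    # contamination is an assumed guess at the fraud rate, not a tuned value
    ("IF", IsolationForestClassifier(contamination=0.02, random_state=0)),
])

# Stratified folds keep a few fraud rows in every test split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipe, X, Y, scoring="f1", cv=cv)
print(scores)

# Predictions now use the same 0/1 convention as Y
preds = pipe.fit(X, Y).predict(X)
print(np.unique(preds))
```

The labels passed to `fit` are ignored by the forest itself; they are only used by the scorer. With the mapping in place, `scoring="f1"` measures the fraud class (label 1) directly, which may be more informative than `f1_weighted` on data this imbalanced.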