I want to resample my dataset. It consists of categorically transformed data with labels for 3 classes. The number of samples per class is:
- counts of class A: 6945
- counts of class B: 650
- counts of class C: 9066
- total samples: 16661
The data shape without labels is (16661, 1000, 256), that is, 16661 samples of shape (1000, 256). I would like to up-sample the data to the number of samples in the majority class, that is, class A (6945).
However, when calling:
from imblearn.over_sampling import SMOTE
print(categorical_vector.shape)
sm = SMOTE(random_state=2)
X_train_res, y_labels_res = sm.fit_sample(categorical_vector, labels.ravel())
It keeps saying ValueError: Found array with dim 3. Estimator expected <= 2.
How can I flatten the data so that the estimator can fit it in a way that still makes sense? Furthermore, how can I restore the 3D shape after getting X_train_res?
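
For reference, here is a minimal sketch of the flatten/unflatten round trip being asked about. It is not a confirmed solution. The helper names (flatten_samples, unflatten_samples, resample_3d, demo_with_smote) are made up for illustration. The sketch assumes a recent imbalanced-learn release where the method is called fit_resample (older releases used fit_sample, as in the call above). Whether interpolating between flattened (1000, 256) samples "makes sense" depends on the encoding: if the last axis is one-hot, SMOTE's synthetic samples will generally not be one-hot any more.

```python
import numpy as np


def flatten_samples(X):
    """Collapse every axis after the first: (n, 1000, 256) -> (n, 256000)."""
    return X.reshape(X.shape[0], -1)


def unflatten_samples(X_flat, sample_shape):
    """Restore the per-sample shape: (m, 256000) -> (m, 1000, 256)."""
    return X_flat.reshape((X_flat.shape[0],) + tuple(sample_shape))


def resample_3d(X, y, sampler):
    """Run a 2D-only sampler on 3D data and return 3D data again.

    `sampler` is assumed to be any object exposing fit_resample(X, y),
    for example imblearn's SMOTE.
    """
    sample_shape = X.shape[1:]  # remember (1000, 256) before flattening
    X_res, y_res = sampler.fit_resample(flatten_samples(X), np.ravel(y))
    return unflatten_samples(X_res, sample_shape), y_res


def demo_with_smote(categorical_vector, labels):
    # Assumes imbalanced-learn is installed; the import is kept local so
    # the helpers above work without it.
    from imblearn.over_sampling import SMOTE

    sm = SMOTE(random_state=2)
    return resample_3d(categorical_vector, labels, sm)
```

Calling demo_with_smote(categorical_vector, labels) would return X_train_res with shape (n_resampled, 1000, 256) and the matching y_labels_res. Note that the flattened array has 16661 x 256000 (about 4.27 billion) values, so memory may become the limiting factor before SMOTE itself does.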