I have an imbalanced dataset: 2 classes with few examples and 3 with a high number of examples. Is there a standard method of calculating the weights for the loss function so as to build a system with high precision?
1 Answer
Taking a step back, let me point you in 2 possible directions:
Undersampling and oversampling: This takes place at the dataset level. The goal is either to generate new samples for the underrepresented classes (oversampling) or to drop samples from the overrepresented classes (undersampling). See the following package: imbalanced-learn. A short sketch follows this paragraph.
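A minimal sketch of both resampling strategies with imbalanced-learn; the synthetic `X` and `y` arrays below are placeholders standing in for your actual data (2 small classes, 3 large ones):

```python
import numpy as np
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

# Placeholder data: classes 0 and 1 are rare, classes 2-4 are frequent.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2, 3, 4], [50, 50, 300, 300, 300])
X = rng.normal(size=(len(y), 5))

# Oversampling: duplicate minority-class samples up to the largest class size.
ros = RandomOverSampler(random_state=42)
X_over, y_over = ros.fit_resample(X, y)

# Undersampling: drop majority-class samples down to the smallest class size.
rus = RandomUnderSampler(random_state=42)
X_under, y_under = rus.fit_resample(X, y)

print(np.bincount(y_over))   # all classes now equal to the largest count
print(np.bincount(y_under))  # all classes now equal to the smallest count
```

The same `fit_resample` interface works for smarter oversamplers such as SMOTE if plain duplication is not enough.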
Adjusting the loss function: This technique is applied to the classifier's loss function, ensuring that samples from underrepresented classes contribute more to the overall loss (in relative terms). See this discussion: How does the class_weight parameter in scikit-learn work? A sketch is given below.
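A minimal sketch of the "balanced" weighting heuristic in scikit-learn, again with placeholder data; the computed weights are inversely proportional to class frequency, and most scikit-learn classifiers accept the same idea via their `class_weight` parameter:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight
from sklearn.linear_model import LogisticRegression

# Placeholder labels: two rare classes (0, 1) and three frequent ones (2-4).
y = np.repeat([0, 1, 2, 3, 4], [50, 50, 300, 300, 300])
X = np.random.default_rng(0).normal(size=(len(y), 5))

# "balanced" assigns each class the weight n_samples / (n_classes * class_count),
# so rare classes contribute proportionally more to the loss.
classes = np.unique(y)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
print(dict(zip(classes, weights)))

# Equivalent shortcut built into the estimator itself.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```

If you care specifically about precision on the minority classes, it is worth tuning these weights (or the decision threshold) against a precision-oriented metric rather than accepting the balanced heuristic as-is.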
Also, this article gives a general overview 8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset.
