1

I am trying to use the ROSE package in R to rebalance the target variable in my dataset. Here is some information about the dataset.

  • My original dataset has 132056 records in total.
  • There are 279 cases (0.21%) of the minority class in the target variable.
  • There are 131777 cases (99.79%) of the majority class in the target variable.

I would like to undersample the dataset so that the share of the minority class increases to 5%.
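For reference, the sample size N follows from simple arithmetic, assuming undersampling keeps all 279 minority cases and only drops majority cases. A minimal sketch (the variable names are just for illustration):

```r
# Number of minority cases and the desired minority share
n_minority   <- 279
target_share <- 0.05

# Total sample size at which 279 cases make up 5% of the data
N <- round(n_minority / target_share)
N

# Majority cases that would remain after undersampling
N - n_minority
```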

Here is my code:

df_Under <- ovun.sample(Target ~ ., data = df, method = "under", N = 5580, seed = 1)

However, after running the code above, I got the following error message:

"Error in (function (formula, data, method, subset, na.action, N, p = 0.5,  :Too few observations." 

I also tried the other ROSE methods, "over" and "both", but the same error occurs.

How can I fix this problem?

Kind regards,

Hattori

3 Answers

2

I faced the same problem. The cause was in the dataset itself: some columns (variables) contained NA/NaN values.

Please try running the code again after removing the NAs.

Let me know if this helps.
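A minimal sketch of that check and cleanup in base R, using a hypothetical toy data frame in place of the real df. As far as I can tell, the "Too few observations" error is raised when fewer than two rows survive the NA handling, which is what happens if any column is entirely NA:

```r
# Toy stand-in for df (hypothetical data, not the asker's dataset)
set.seed(1)
df <- data.frame(
  Target = factor(c(rep(0, 95), rep(1, 5))),
  x1 = rnorm(100),
  x2 = NA_real_            # a column that is entirely NA
)

# Count missing values per column to find the culprits
colSums(is.na(df))

# With an all-NA column, dropping incomplete rows leaves nothing
nrow(na.omit(df))

# Drop columns that are entirely NA, then drop any remaining incomplete rows
df_clean <- df[, colSums(is.na(df)) < nrow(df), drop = FALSE]
df_clean <- df_clean[complete.cases(df_clean), , drop = FALSE]
nrow(df_clean)

# Then, assuming the ROSE package is installed:
# library(ROSE)
# df_under <- ovun.sample(Target ~ ., data = df_clean, method = "under", N = 50, seed = 1)$data
```

Whether to drop columns, drop rows, or impute instead depends on how much data each option discards, so it is worth looking at the colSums output first.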

0

I believe you want your code to use p = 0.05 (5%) rather than p = 0.5 (50%), which is the function's default, and to oversample to increase the size of the minority class, as you mentioned in your post:

df_Under <- ovun.sample(Target ~ ., data = df, method = "over", N = 5580, seed = 1, p = 0.05)
MHammer
0

data.balanced.under <- ovun.sample(Target ~ ., data = df, method = "under", p = 0.5)$data

This should solve your problem.
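Note that p = 0.5 aims for a roughly balanced sample; for the 5% share asked about, p = 0.05 would be the corresponding setting. To check the class balance of any resampled data frame, or to see what undersampling does, here is a hand-rolled base R sketch on hypothetical toy data (this is not ROSE itself, only an illustration of the idea):

```r
# Hypothetical toy data: 1000 rows, 20 of them in the minority class
set.seed(1)
df <- data.frame(
  Target = factor(c(rep("major", 980), rep("minor", 20))),
  x = rnorm(1000)
)

# Manual undersampling: keep all minority rows,
# sample majority rows without replacement
minor_idx <- which(df$Target == "minor")
major_idx <- which(df$Target == "major")
N <- round(length(minor_idx) / 0.05)   # total size for a 5% minority share
keep_major <- sample(major_idx, N - length(minor_idx))
df_under <- df[c(minor_idx, keep_major), ]

# Check the class balance of the result
prop.table(table(df_under$Target))
```

The same prop.table(table(...)) call can be applied to the $data element returned by ovun.sample to confirm the proportions it produced.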

maira khan