I am new here, so please let me know how I can make my question clearer.
I would like to predict employee absenteeism, so I need to convert this numerical variable into a factor. The data are right-skewed, so I would like to distribute the observations evenly across the categories. Ideally I would have a new variable "Group" that assigns every observation to 1, 2 or 3, with the same number of observations in each group.
The problem is that I cannot create this factor with an equal n per group. I tried several suggestions from this topic: splitting a continuous variable into equal sized groups, such as cut and cut2 (from Hmisc). All of the options seem straightforward, but when I apply them, the categories are not equally sized.
I hope someone can help; I am really curious why the above methods are not working for me. I would prefer an answer that uses only base R. Below is a sample of my data:
structure(list(ID = c(11, 36, 3, 7, 11, 3), Reason_absence = c(26,
0, 23, 7, 23, 23), Age = c(33, 50, 38, 39, 33, 38), BMI2 = c(30,
31, 31, 24, 30, 31), Absenteeism_time = c(4, 0, 2, 4, 2, 2)), class =
c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))
The full dataset consists of 700 entries and 21 columns.
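To make the goal concrete, here is a minimal base R sketch of the kind of result being asked for. Quantile-based breaks (as used by cut with quantile, or by cut2) cannot separate tied values, and absenteeism hours repeat a lot, which is a likely reason the groups come out unequal. The sketch instead ranks the observations and splits the ranks. It assumes it is acceptable for identical values to end up in different groups; the names `df`, `r` and `n_groups` are illustrative, and only the six sample values of `Absenteeism_time` are used.

```r
# Toy data: the six sample values of Absenteeism_time shown above
df <- data.frame(Absenteeism_time = c(4, 0, 2, 4, 2, 2))

# Rank the observations; ties.method = "first" breaks ties by order of
# appearance, so every row gets a distinct rank from 1 to n
r <- rank(df$Absenteeism_time, ties.method = "first")

# Map the ranks onto 3 groups of (nearly) equal size
# ASSUMPTION: tied values may be assigned to different groups
n_groups <- 3
df$Group <- factor(ceiling(r * n_groups / nrow(df)))

# Count the observations per group
table(df$Group)
```

If tied values must stay together in one group, exactly equal group sizes are generally not achievable, and some imbalance has to be accepted.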
Thanks in advance!