I have around 3000 objects, each with an associated count. I want to randomly divide these objects into training and testing sets with a 70% training / 30% testing split. However, the split should be based on the sum of the counts in each set, not on the number of objects.
For example, assume my dataset contains 5 objects:
Obj 1 => 200
Obj 2 => 30
Obj 3 => 40
Obj 4 => 20
Obj 5 => 110
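If I split them at a nearly 70%-30% ratio by count (the total count is 400), my training set could be
Obj 1 => 200
Obj 2 => 30
Obj 3 => 40
Obj 4 => 20
which has a total count of 290 (72.5%), and my testing set would be
Obj 5 => 110
which has a total count of 110 (27.5%).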
If I split them again, I should get a different training and testing set that is again close to the 70-30 ratio. I understand that the split above is not an exact 70-30 split, but as long as it comes close, that's acceptable.
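To make the requirement concrete, here is a rough sketch of the behaviour I'm after. It is not a known library routine: the function name `split_by_count`, its parameters, and the dict layout are my own assumptions. It shuffles the objects and then greedily adds each one to the training set only if doing so moves the training total closer to 70% of the overall count.

```python
import random


def split_by_count(counts, train_frac=0.7, seed=None):
    """Randomly split the keys of `counts` (name -> positive count) so that
    the training set holds roughly `train_frac` of the total count.

    Greedy heuristic only; it does not search for the best possible split.
    """
    rng = random.Random(seed)
    names = list(counts)
    rng.shuffle(names)  # a different order gives a different split

    target = train_frac * sum(counts.values())
    train, test = [], []
    running = 0  # total count currently in the training set
    for name in names:
        c = counts[name]
        # Keep the object for training only if that gets us closer to the target.
        if abs(running + c - target) < abs(running - target):
            train.append(name)
            running += c
        else:
            test.append(name)
    return train, test


# Hypothetical usage with the 5 objects from the example above.
counts = {"Obj 1": 200, "Obj 2": 30, "Obj 3": 40, "Obj 4": 20, "Obj 5": 110}
train, test = split_by_count(counts)
print("train:", train, sum(counts[n] for n in train))
print("test: ", test, sum(counts[n] for n in test))
```

As far as I can tell, with positive counts this keeps the training total within one object's count (the largest one) of the target. For 5 objects that can still be far from 70-30, but for around 3000 objects with modest counts it should land close.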
Are there any predefined methods/packages to do this in Python?