I am currently working with the following data:
Data_A: 10,000 samples, each with 170 features
Data_B: 1,000 samples, each with the same 170 features
If we plot Data_A in a 170-dimensional space, it will occupy some region of that space. I want to know what percentage of the samples in Data_B fall inside that region. I do not need to visualize anything; I just need that subset of Data_B.
(In fact, I built Data_B from 800 samples that are similar to samples in Data_A and 200 samples that are quite different from samples in Data_A.)
I have tried OneClassSVM, but it is not giving good results. Moreover, its results depend entirely on its parameters (nu, gamma, kernel, etc.), so I would have to tune the model every time I get a new set of training and testing data, which I don't want to do.
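For context, here is a minimal sketch of the kind of OneClassSVM setup I mean. The arrays are synthetic stand-ins generated with NumPy (and smaller than my real Data_A so it runs quickly). The variable names and parameter values (`nu=0.05`, `gamma="scale"`) are placeholders for illustration, not the settings I actually tuned.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-ins, NOT my real data (assumption for illustration only).
# data_a plays the role of Data_A; data_b mixes 800 "similar" rows with
# 200 rows drawn from a shifted distribution, mimicking how Data_B was built.
data_a = rng.normal(loc=0.0, scale=1.0, size=(2_000, 170))
data_b = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(800, 170)),
    rng.normal(loc=5.0, scale=1.0, size=(200, 170)),
])

# Scale using statistics from Data_A only, then fit the one-class model on it.
scaler = StandardScaler().fit(data_a)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)  # placeholder values
model.fit(scaler.transform(data_a))

# predict() returns +1 for samples judged to lie inside the learned region
# and -1 for samples judged to lie outside it.
pred = model.predict(scaler.transform(data_b))

inside_b = data_b[pred == 1]                 # the subset of Data_B I am after
percent_inside = 100.0 * np.mean(pred == 1)  # share of Data_B inside the region
print(f"{percent_inside:.1f}% of Data_B falls inside the Data_A region")
```

The percentage printed here changes noticeably when `nu` or `gamma` changes, which is exactly the sensitivity I would like to avoid.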
Is there any other simple technique or model to do this in Python? Is there a Python module that can do this using set theory?
Pardon me if I have not explained the problem statement clearly.