I have a feeling this question has a glaringly obvious and simple solution that I have perhaps overlooked.
Assume I have a model f that is reliant on some inputs x and a parameter set p to produce a binary classification output y. I am using a linear model to serve as f for now but I'd like to keep the architecture flexible so I can easily substitute a neural network or non-linear model in the future.
My question is: instead of training a model to find the parameter set p that gives the highest accuracy, is there a way to calculate the accuracy of a model from manually inserted parameters? The reason I ask is that I want to analyse an ensemble method, but instead of fitting multiple models and averaging/weighting their predictions, I want to bypass training and calculate the accuracies of parameter sets p sampled randomly from a specified distribution.
In other words, I want to specify something like this:
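import numpy as np
model = "linear"  # placeholder for the model architecture
parameter_set = p  # manually inserted array of candidate parameter vectors, one per row
idx = np.random.choice(len(parameter_set), size=100, replace=True)  # np.random.choice needs 1-D input, so sample row indices
sample_params = parameter_set[idx]  # a subset of the true parameter set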
I now need to calculate the accuracies of all 100 sampled parameter sets as if they were 100 independent models, and reject some of them based on some criterion. Importantly, I want to evaluate the models only on the training set. I have looked at many libraries and found something close in validation_curve and GridSearchCV within sklearn.model_selection; however, validation_curve also evaluates on the test set, and GridSearchCV searches for the optimal parameter. I also looked at cross-validation techniques, lmfit, scipy.optimize.curve_fit, and more. I know I can manually write the model function and an accuracy function, loop over the parameter sets, and reject them incrementally based on a threshold I specify, but that defeats the purpose of what I'm trying to do.
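For reference, the manual approach I'd like to avoid would look roughly like the sketch below. Everything in it is made up for illustration: the synthetic X_train and y_train, the helper names linear_predict and training_accuracies, the convention that the last entry of each parameter vector is the bias, and the standard normal distribution the parameters are drawn from. The model is passed in as a function so that a non-linear model could be swapped in later.

```python
import numpy as np

def linear_predict(X, params):
    """Hypothetical linear classifier: params = [w_1, ..., w_d, b]; predicts 1 if X @ w + b > 0."""
    w, b = params[:-1], params[-1]
    return (X @ w + b > 0).astype(int)

def training_accuracies(predict_fn, X, y, param_sets):
    """Accuracy of predict_fn on (X, y) for each row of param_sets; no fitting involved."""
    return np.array([np.mean(predict_fn(X, p) == y) for p in param_sets])

rng = np.random.default_rng(0)

# Synthetic training data (an assumption, only here to make the sketch runnable).
X_train = rng.normal(size=(200, 3))
true_params = np.array([1.0, -2.0, 0.5, 0.1])
y_train = linear_predict(X_train, true_params)

# Candidate parameter vectors drawn from an assumed distribution, then resampled.
parameter_set = rng.normal(size=(1000, 4))
idx = rng.choice(len(parameter_set), size=100, replace=True)
sample_params = parameter_set[idx]

# Score every sampled parameter set on the training data and reject those below 0.5.
accuracies = training_accuracies(linear_predict, X_train, y_train, sample_params)
keep = accuracies >= 0.5
accepted_params = sample_params[keep]
```

Here accuracies holds one training-set accuracy per sampled parameter set, and accepted_params keeps only the rows that pass the 0.5 threshold.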
I'm just checking if there is a nifty way to say: here is an architecture for a model, here are parameter sets, this is the data, tell me what each accuracy is and reject the parameter sets/models which have an accuracy less than 0.5 for example. Any thoughts or suggestions would be greatly appreciated.