I wonder whether there is a way to constrain the range of the predictions before fitting the model.
The target variable in my training data is technically a percentage score, but when I predict on my test set, I get negative values or values >100.
For now, I am manually normalizing the predictions list. I also used to cut off negatives and values >100 and assign them 0 and 100 respectively.
However, it would only really make sense if the fit function itself could be made aware of this constraint, right?
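(For reference, the cut-off workaround I mean looks like this; the prediction values here are made up.)

```python
# Made-up raw model outputs; some fall outside the valid percentage range.
preds = [-3.2, 47.5, 101.8, 60.0]

# Post-hoc clamping: negatives become 0, values above 100 become 100.
clipped = [min(max(p, 0.0), 100.0) for p in preds]
print(clipped)  # [0.0, 47.5, 100.0, 60.0]
```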
Here is a sample row of the data:
import pandas as pd

test_df = pd.DataFrame([[0, 40, 28, 30, 40, 22, 60, 40, 21, 0, 85, 29, 180, 85, 36, 741, 25.0]], columns=['theta_1', 'phi_1', 'value_1', 'theta_2', 'phi_2', 'value_2', 'theta_3', 'phi_3', 'value_3', 'theta_4', 'phi_4', 'value_4', 'theta_5', 'phi_5', 'value_5', 'sum_readings', 'estimated_volume'])
I have been reading around, and a lot of people consider this not to be a linear regression problem, but their reasoning is not sound. Others say one can apply a log scale, but that only works when comparing against a threshold, i.e., manual classification, i.e., using linear regression for a logistic regression problem. In my case, I need the percentages themselves as the required output.
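(To be concrete about the transform people suggest: as I understand it, one variant applies the logit to the target itself rather than using a threshold, fits an ordinary linear model on that scale, and maps predictions back through the sigmoid, which does return percentages strictly inside (0, 100). A minimal pure-Python sketch with made-up 1-D data, in case it helps frame the discussion:)

```python
import math

# Made-up toy data: one feature and a percentage target.
xs = [10.0, 40.0, 90.0, 120.0, 160.0]
ys = [5.0, 20.0, 55.0, 75.0, 95.0]

eps = 1e-6  # keep the logit away from 0 and 1

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Transform the target to the logit scale before fitting.
zs = [logit(min(max(y / 100, eps), 1 - eps)) for y in ys]

# Ordinary least squares for a single feature, done by hand.
n = len(xs)
xbar = sum(xs) / n
zbar = sum(zs) / n
slope = sum((x - xbar) * (z - zbar) for x, z in zip(xs, zs)) \
        / sum((x - xbar) ** 2 for x in xs)
intercept = zbar - slope * xbar

# Mapping back through the sigmoid guarantees outputs in (0, 100).
preds = [100 * sigmoid(intercept + slope * x) for x in xs]
assert all(0 < p < 100 for p in preds)
```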
Your feedback/thoughts are much appreciated.