
I found this answer, which tells me how to get every unique combination, and that is perfect. But I already have a "base" set of variables that I want in my model, and it is only the last few that I need to iterate through and add in. It would have been simple if there were only two or three, but I have a few sets I need to go through.

I already have a function that takes a model, computes the measures I need (accuracy, recall, etc.), and outputs a data frame of them, so I can go through the columns easily and see which model is best in which area.

All the variables are in a data frame, so all I have to do is select the columns I want. I can't share my dataset due to confidentiality agreements, so everything below is made up.

The basic set up is:

train_x = train[["Age", "Gender", "Income", "Seniority"]]
test_x = test[["Age", "Gender", "Income", "Seniority"]]

train_y = train[["Longevity"]]
test_y = test[["Longevity"]]

But now I want to add the variables ["var_1", "var_2", "var_3", "var_4", "var_5", "var_6"] to the end of train_x and test_x in all possible combinations.

The answer I found gives me the for loop and outputs lists, but I can't put a list inside my list of columns. I tried doing it manually and it didn't work.
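A small sketch of what probably goes wrong in the manual attempt, assuming the combination was placed inside the column list as a single element. Column selection needs one flat list of column names, and a list nested inside that list is not a column name. The names base_vars, subset, nested and flat are made up for illustration.

```python
base_vars = ["Age", "Gender", "Income", "Seniority"]
subset = ["var_1", "var_3"]  # one example combination (made up)

# Placing the list inside the list keeps it as ONE nested element.
nested = ["Age", "Gender", "Income", "Seniority", subset]

# Concatenating the two lists gives a single flat list of column names.
flat = base_vars + subset

print(nested)  # the last element is itself a list
print(flat)    # six plain strings
```

With the flat list, train[flat] and test[flat] would select the base columns plus the two extra ones.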

My basic concept is that it should be something like this:

var_set = some iteration loop through ["var_1", "var_2", "var_3", "var_4", "var_5", "var_6"]

train_x = train[["Age", "Gender", "Income", "Seniority", var_set]]
test_x = test[["Age", "Gender", "Income", "Seniority", var_set]]

train_y = train[["Longevity"]]
test_y = test[["Longevity"]]

model = mlp.fit(train_x, train_y)
y_pred = pd.DataFrame(mlp.predict(test_x), columns=["Predicted"])

print(model)

print(metrics.confusion_matrix(test_y, y_pred))

knn_model(model, "model_name")  # this is my function

The set could be just var_1, or all of "var_1", "var_2", "var_3", "var_4", "var_5", "var_6", or anything in between. Order does not matter: var_1, var_2, var_3 is the same as var_3, var_2, var_1.
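For reference, a minimal sketch of enumerating those sets with itertools.combinations, which treats order as irrelevant. The names extra_vars and subsets are made up for illustration.

```python
from itertools import combinations

extra_vars = ["var_1", "var_2", "var_3", "var_4", "var_5", "var_6"]

# combinations() ignores order, so ("var_1", "var_2") appears once
# and ("var_2", "var_1") never appears.
subsets = [
    list(combo)
    for size in range(1, len(extra_vars) + 1)  # subset sizes 1 through 6
    for combo in combinations(extra_vars, size)
]

print(len(subsets))  # 63, i.e. 2**6 - 1 non-empty subsets
print(subsets[0])    # ['var_1']
print(subsets[-1])   # all six variables
```

Starting the range at 1 skips the empty combination, which would just be the base model again.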

Emm

1 Answer


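It's ugly, but it works...ish. The trick is to concatenate the base column list with each combination, so that the result is one flat list of column names.

import pandas as pd
from itertools import combinations
from sklearn import metrics

base_vars = ["Age", "Gender", "Income", "Seniority"]
extra_vars = ["var_1", "var_2", "var_3", "var_4", "var_5", "var_6"]

# Every non-empty combination of the extra variables (order does not matter)
subsets = [list(c) for r in range(1, len(extra_vars) + 1) for c in combinations(extra_vars, r)]

# The target does not change between runs, so set it once
train_y = train[["Longevity"]]
test_y = test[["Longevity"]]

for subset in subsets:
  cols = base_vars + subset  # flat list: base columns plus this combination
  train_x = train[cols]
  test_x = test[cols]

  model = mlp.fit(train_x, train_y)
  y_pred = pd.DataFrame(mlp.predict(test_x), columns=["Predicted"])

  print(cols)
  print(model)

  print(metrics.confusion_matrix(test_y, y_pred))

  knn_model(model, "model_name")  # this is my function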
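
One possible refinement, as a sketch only: the loop passes the same "model_name" string for every combination, so the resulting measures could be hard to tell apart. Building a label from the subset is one way around that. The helper subset_label and the "base+" prefix are hypothetical, and the list is shortened to three variables to keep the output small.

```python
from itertools import combinations

extra_vars = ["var_1", "var_2", "var_3"]  # shortened list, for illustration only

def subset_label(subset):
    """Hypothetical helper: build a readable model name from a column subset."""
    return "base+" + "+".join(subset)

labels = [
    subset_label(subset)
    for size in range(1, len(extra_vars) + 1)
    for subset in combinations(extra_vars, size)
]

for label in labels:
    print(label)  # one distinct name per combination
```

Inside the loop, knn_model(model, subset_label(subset)) would then give each fitted model its own name.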
Emm