I'm building a classification model to predict the outcome of a sports event (win/loss) and have run into a data setup conundrum. Currently the data is set up as follows:
example_data = [team_a_feat_1, team_a_feat_2...team_b_feat_1, team_b_feat_2... OUTCOME_A_B]
But I am wondering whether the following would be possible or more logical:
example_data = [[team_a_feat_1, team_a_feat_2...],
[team_b_feat_1, team_b_feat_2...], OUTCOME_A_B]
Does sklearn allow data to be passed in like this, and if so, would it make a difference to the outcome of the model? I ask because I want each feature to be treated the same way for both teams rather than as separate variables.
Thoughts and suggestions? Am I overthinking this step and does this really affect performance?
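To make the two layouts concrete, here is a minimal sketch in plain Python. The numbers, the two-features-per-team size, and the names games, X_flat, X_nested and X_roundtrip are made up for illustration and are not my real data.

```python
# Hypothetical data: two games, two features per team.
# Each record: (team A features, team B features, outcome for A: 1 = win, 0 = loss)
games = [
    ([0.61, 102.3], [0.48, 97.1], 1),
    ([0.55, 99.0], [0.70, 105.4], 0),
]

# Layout 1 (current): one flat row per game, with the label kept separately.
X_flat = [team_a + team_b for team_a, team_b, _ in games]
y = [outcome for _, _, outcome in games]

# Layout 2 (proposed): each game keeps one feature list per team.
X_nested = [[team_a, team_b] for team_a, team_b, _ in games]

# Both layouts hold the same values: flattening layout 2 gives layout 1 back.
X_roundtrip = [team_a + team_b for team_a, team_b in X_nested]

print(len(X_flat), len(X_flat[0]))                            # games, features per row
print(len(X_nested), len(X_nested[0]), len(X_nested[0][0]))   # games, teams, features per team
```

So the question is really whether the extra nesting in the second layout carries any information that the flat row does not.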