Suppose I have a column, test_df['review_id'], that contains the id of each row in the dataframe. I need to pair each id with data from other arrays. My code looks like the following.
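from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def classify_nb_report(X_train_vectorized, y_train, X_test_vectorized, y_test):
    clf = MultinomialNB()
    # TRAIN THE CLASSIFIER WITH AVAILABLE TRAINING DATA
    clf.fit(X_train_vectorized, y_train)
    # Predict a class label for every row of the test set
    y_pred_class = clf.predict(X_test_vectorized)
    return y_pred_class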
for i in range(0, n_loop):
    train_df, test_df = train_test_split(df, test_size=0.3)
    ....
    nb_y = classify_nb_report(X_train_vectorized, y_train, X_test_vectorized, y_test)
As you can see above, each iteration produces a new nb_y, which is a NumPy array, along with a different test_df and train_df (randomly chosen by train_test_split). I want to pair each value of nb_y from each iteration with the id it belongs to in test_df['review_id'].
With the following code, I can get each id of test_df side by side with its value from nb_y.
for f, b in zip(test_df['review_id'], nb_y):
    print(f, b)
Result:
17377 5.0
18505 5.0
24825 1.0
16032 5.0
23721 1.0
18008 5.0
Now, starting from the result above, I want to append the values of nb_y from the next iterations to their corresponding ids.
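To make the goal concrete, here is a minimal sketch of the kind of structure I have in mind: a dictionary that maps each id to the list of predictions it received across iterations. The helper name collect_predictions and the sample values are made up for illustration; in the real loop each pair would be (test_df['review_id'], nb_y). An id only gains a value in iterations where it lands in the test split, so the lists may have different lengths.

```python
from collections import defaultdict


def collect_predictions(iterations):
    """Group predicted labels by review id across iterations.

    `iterations` is an iterable of (ids, predictions) pairs, one pair per
    pass of the loop (hypothetical helper, not part of any library).
    """
    preds_by_id = defaultdict(list)
    for ids, preds in iterations:
        # zip pairs each id with the prediction at the same position
        for review_id, pred in zip(ids, preds):
            preds_by_id[review_id].append(pred)
    return dict(preds_by_id)


# Made-up values standing in for two iterations of the loop.
example_iterations = [
    ([17377, 18505, 24825], [5.0, 5.0, 1.0]),
    ([18505, 16032, 17377], [1.0, 5.0, 5.0]),
]

result = collect_predictions(example_iterations)
for review_id, preds in result.items():
    print(review_id, preds)
```

This assumes the rows of nb_y are in the same order as the rows of test_df, which is what the zip above already relies on.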
I hope this is not too confusing; I will expand on it if my question is not clear enough. Thanks in advance.