I am trying to get the intersection of the 'game' and 'sample' dataframes, i.e. the rows that appear in both. The dataframes are of unequal sizes, and I don't want a row to be counted twice in the intersection.
E.g., the sample dataframe has rows [0,1,1],[1,1,0],[1,0,1],[0,1,1]
and the game dataframe has rows [1,1,0],[1,1,0],[1,0,1],[1,1,1],[1,0,1].
The intersection dataframe should then have the rows [1,1,0],[1,0,1].
import pandas as pd
import numpy as np
import random

trials = 1000
games = 3

# Random 0/1 outcomes: one row per trial, one column per game
data = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        data.loc[i,j] = random.choice([0,1])

# Rows of data that contain at least two 1s
sample = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        if ((data.loc[i,:]).sum()) >= 2:
            sample.loc[i,j] = data.loc[i,j]

# Rows of data whose first game is 1
game = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        if (data.loc[i,0]) == 1:
            game.loc[i,j] = data.loc[i,j]

# Attempted intersection: compares the two dataframes row by row, by position
intersection = pd.DataFrame()
for i in range(len(sample)):
    if np.all(sample.iloc[i,:] == game.iloc[i,:]):
        for j in range(games):
            intersection.loc[i,j] = sample.loc[i,j]
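
For reference, here is a minimal sketch of the behaviour I am after, using the small example above rather than the random data. It is one possible approach, not necessarily the best one: it numbers each repeat of an identical row with `groupby(...).cumcount()` and then does an inner `merge` on the row values plus that repeat number, so a row that occurs m times in one dataframe and n times in the other is kept min(m, n) times. The function name `row_intersection` and the helper column `_occ` are names made up for this illustration, and the sketch assumes both dataframes have the same columns and no missing values.

```python
import pandas as pd

def row_intersection(left, right):
    """Multiset intersection of rows (hypothetical helper, for illustration).

    A row that appears m times in `left` and n times in `right`
    appears min(m, n) times in the result.
    """
    cols = list(left.columns)
    # Number each repeat of an identical row: 0 for the first, 1 for the second, ...
    left_k = left.assign(_occ=left.groupby(cols).cumcount())
    right_k = right.assign(_occ=right.groupby(cols).cumcount())
    # Matching on the row values plus the repeat number pairs each row
    # in `left` with at most one row in `right`, so nothing is counted twice.
    merged = left_k.merge(right_k, on=cols + ["_occ"], how="inner")
    return merged.drop(columns="_occ")

sample = pd.DataFrame([[0, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]])
game = pd.DataFrame([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 1]])

intersection = row_intersection(sample, game)
print(intersection)
```

On the example this keeps one copy of [1,1,0] and one copy of [1,0,1], even though each of them appears twice in game.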