I am working on the IPL dataset from Kaggle (https://www.kaggle.com/manasgarg/ipl). It has two .csv files that share a key to connect the data. I want to drop the rows where the batting team lost the match. df_deliv has the batting team, and df_match has the winner of the match.
I achieved it using the code below, but it's very slow because of the for loop.
import pandas as pd
import numpy as np
df_deliv = pd.read_csv("deliveries.csv")
df_match = pd.read_csv("matches.csv")
df_deliv = df_deliv[["match_id", "batting_team", "batsman", "batsman_runs"]].copy()  # copy so the new column is not set on a slice
df_deliv["winner"] = [df_match.loc[i-1]["winner"] for i in df_deliv["match_id"]]  # one lookup per delivery row, which makes it very slow
df_deliv.drop(df_deliv[df_deliv["batting_team"] != df_deliv["winner"]].index, inplace = True)
print(df_deliv)
Is there a way to do this in a single df.drop statement, without the for loop?
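
For reference, here is a minimal sketch of the kind of vectorized lookup I have in mind. It is not tested on the real files: the two small frames are made-up stand-ins, and it assumes matches.csv has an `id` column that matches `match_id` in deliveries.csv (my loop above relies on row position instead). The names `winner_by_match` and `df_won` are just illustrative.

```python
import pandas as pd

# Made-up stand-ins for the two CSV files (hypothetical data, not the real dataset).
df_match = pd.DataFrame({"id": [1, 2], "winner": ["A", "D"]})
df_deliv = pd.DataFrame({
    "match_id": [1, 1, 2, 2],
    "batting_team": ["A", "B", "C", "D"],
    "batsman": ["p1", "p2", "p3", "p4"],
    "batsman_runs": [4, 0, 1, 6],
})

# Build a match id -> winner lookup once (assumes an "id" key column in df_match),
# then map it over the whole match_id column instead of looping row by row.
winner_by_match = df_match.set_index("id")["winner"]
df_deliv["winner"] = df_deliv["match_id"].map(winner_by_match)

# Keep only the deliveries where the batting team won the match.
df_won = df_deliv[df_deliv["batting_team"] == df_deliv["winner"]]
print(df_won)
```

Matches with no recorded winner would get NaN in the `winner` column and be filtered out by the comparison, which is also what my original drop does.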