This is not a complete solution, but I think I found a way to help you make some progress.
Here are the steps:
- Create a sample dataframe, just for illustration.
- Convert the 'HomeTeam' column into a list (this is the target column).
- Create an empty list to store the results of searching the 'HomeTeam' column.
- Loop through the teams in the 'AwayTeam' column.
- Use Python's list.index() method to return the index of the match, but wrap it in a try-except in case there is no match.
- Store the result in the results list.
- When the for-loop finishes, add the list as a new column in the pandas dataframe.
import pandas as pd
import numpy as np
# create sample dataframe
df = pd.DataFrame({
    'Date': ['2019-08-18', '2019-08-25'],
    'HomeTeam': ['Rennes', 'Strasbourg'],
    'AwayTeam': ['Paris SG', 'Rennes'],
    # np.nan is used because the np.NaN alias was removed in NumPy 2.0
    'home_form': [np.nan, 1.0],
    'away_form': [np.nan, 3.0],
})
# convert your 'HomeTeam' column into a Python list
list_HomeTeam = list(df['HomeTeam'])
print(list_HomeTeam)
# create an empty list to capture the index position of matches in 'HomeTeam'
list_results_in_home = []
# loop through each team in the 'AwayTeam' column
for each_team in df['AwayTeam']:
    # if you find a match in the list, store its index as the result
    try:
        result = list_HomeTeam.index(each_team)
    # if there is no match, list.index() raises ValueError, so store a string instead
    except ValueError:
        result = 'team not in list'
    # add the result to the list that is capturing the index position in 'HomeTeam'
    list_results_in_home.append(result)
print(list_results_in_home)
# add column to dataframe with the index position
df['index_match_in_HomeTeam'] = list_results_in_home
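
If the dataframe is large, the same lookup can be done without calling list.index() on every row. The following is only a sketch of one alternative, not part of the approach above: it builds a dictionary of first positions (the name `first_home_position` is made up for this example) and uses `Series.map`. It assumes the dataframe has the default 0..n-1 index, so list positions and row labels coincide, and it stores NaN rather than a string when a team never appears in 'HomeTeam'.

```python
import pandas as pd

# same teams as the sample dataframe, trimmed to the two columns needed here
df = pd.DataFrame({
    'HomeTeam': ['Rennes', 'Strasbourg'],
    'AwayTeam': ['Paris SG', 'Rennes'],
})

# map each home team to the position of its first appearance,
# which is what list.index() would return
first_home_position = {}
for position, team in enumerate(df['HomeTeam']):
    first_home_position.setdefault(team, position)

# look up every away team at once; teams missing from the dict become NaN
df['index_match_in_HomeTeam'] = df['AwayTeam'].map(first_home_position)
print(df)
```

Because unmatched teams become NaN, the new column is a float column, which makes it easy to filter with `df['index_match_in_HomeTeam'].notna()` before using the positions.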