I am having difficulty creating two columns, "Home Score" and "Away Score", in the Wikipedia table I am trying to parse.
I tried the following script with two try-except-else statements to see if that would work.
import pandas as pd

# Parse every table on the page and keep the second one
test_matches = pd.read_html('https://en.wikipedia.org/wiki/List_of_Wales_national_rugby_union_team_results')
test_matches = test_matches[1]
# The year is the last four characters of the date string
test_matches['Year'] = test_matches['Date'].str[-4:].apply(pd.to_numeric)
# Matches since 2007 in a "Nations" competition that Wales did not win
test_matches_worst = test_matches[(test_matches['Winner'] != 'Wales') & (test_matches['Year'] >= 2007) & (test_matches['Competition'].str.contains('Nations'))]
try:
    # Split the score on an en dash
    test_matches_worst['Home Score'] = test_matches_worst['Score'].str.split("–").str[0].apply(pd.to_numeric)
except:
    print("let's try again")
else:
    # Split the score on a plain hyphen
    test_matches_worst['Home Score'] = test_matches_worst['Score'].str.split("-").str[0].apply(pd.to_numeric)
try:
    # Split the score on an en dash
    test_matches_worst['Away Score'] = test_matches_worst['Score'].str.split("–").str[1].apply(pd.to_numeric)
except:
    print("let's try again")
else:
    # Split the score on a plain hyphen
    test_matches_worst['Away Score'] = test_matches_worst['Score'].str.split("-").str[1].apply(pd.to_numeric)
# Absolute points difference between the two sides
test_matches_worst['Margin'] = (test_matches_worst['Home Score'] - test_matches_worst['Away Score']).abs()
test_matches_worst.sort_values('Margin', ascending=False).reset_index(drop = True)#.head(20)
However, I get a KeyError, and when I shorten the code the "Home Score" column does not appear in the DataFrame. What is the best way to handle this particular table and generate the columns I want? Any assistance on this would be greatly appreciated. Thanks in advance.
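
For reference, here is a minimal sketch of the result I am after, assuming the "Score" values are strings such as "19–26" in which the separator may be either an en dash or a plain hyphen. The sample rows are made up for illustration and are not taken from the Wikipedia table, and `add_score_columns` is a hypothetical helper name. It swaps the two try-except-else blocks for a single regular expression, which may or may not be the best fit for the real table:

```python
import pandas as pd

# Hypothetical sample rows standing in for the parsed table;
# the scores are invented and mix an en dash with a plain hyphen.
sample = pd.DataFrame({
    "Winner": ["England", "Ireland", "Draw"],
    "Score": ["19–26", "10-16", "9–9"],
})

def add_score_columns(df, score_col="Score"):
    """Return a copy of df with numeric 'Home Score', 'Away Score' and 'Margin' columns."""
    out = df.copy()  # work on a copy so a filtered slice is not modified in place
    # Capture the digits on either side of a plain hyphen or an en dash.
    parts = out[score_col].astype(str).str.extract(r"(\d+)\s*[-–]\s*(\d+)")
    # Rows that do not match the pattern become NaN instead of raising.
    out["Home Score"] = pd.to_numeric(parts[0], errors="coerce")
    out["Away Score"] = pd.to_numeric(parts[1], errors="coerce")
    out["Margin"] = (out["Home Score"] - out["Away Score"]).abs()
    return out

result = (
    add_score_columns(sample)
    .sort_values("Margin", ascending=False)
    .reset_index(drop=True)
)
print(result)
```

With the real data, the same helper would be called on `test_matches_worst` in place of `sample`.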