This is a follow-up question from here.
I have the code below, which I use to scrape tables from a website:

    import bs4
    import pandas

    # driver is assumed to be an already-initialised Selenium WebDriver
    soup = bs4.BeautifulSoup(driver.page_source, "html.parser")
    for thead in soup.select(".data-point-container table thead"):
        tbody = thead.find_next_sibling("tbody")
        table = "<table>%s</table>" % (str(thead) + str(tbody))
        df = pandas.read_html(str(table), header=0, index_col=0)[0]
        df = df.drop(['Unnamed: 6'], axis=1)
        # Renaming columns to just have FY-YEAR
        for each_column in df.columns:
            if each_column[:3] == "LTM":
                df.rename(columns={each_column: "Last 12 Months"}, inplace=True)
            else:
                df.rename(columns={each_column: each_column[:6]}, inplace=True)
        df = df.T
        print(df)
        print("-------------------------------------")

Upon execution, it produces this result.
The screenshot shows just 2 DataFrames; there are 6 DataFrames in total when the code runs.
What I want to do is merge them together, so that the row axis shows just FY2012, FY2013, FY2014, FY2015 and Last 12 Months, and the column axis shows a combination of all the rows from the 6 DataFrames scraped from the website.
I think I can do this by storing each DataFrame in its own variable and then using some form of df.join() to combine them, but I'm having trouble separating this df into different variables in the first place.
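To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes the frames could be collected in a list instead of separate variables and then combined along the column axis with pandas.concat (a swapped-in alternative to df.join()). The two frames, their column names and the numbers are made up; they only stand in for the transposed tables built in the loop.

```python
import pandas as pd

# Hypothetical stand-ins for the transposed tables built inside the loop;
# the labels and values are invented for illustration only.
index = ["FY2012", "FY2013", "Last 12 Months"]
frames = [
    pd.DataFrame({"Revenue": [10, 12, 13], "Gross Profit": [4, 5, 6]}, index=index),
    pd.DataFrame({"Total Assets": [100, 110, 120]}, index=index),
]

# In the scraping loop, this list would be filled with frames.append(df)
# in place of print(df).

# Align the frames on their shared row labels and place them side by side.
combined = pd.concat(frames, axis=1)
print(combined)
```

Because every frame shares the same row labels, concatenating along axis=1 should keep one row per period and add the columns of each table next to each other.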
What do you think?
Update
Initially I thought I could just do print(df[0]), but this gives me KeyError: 0.
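A small sketch of what I believe is happening, using a made-up frame shaped like one of the transposed tables: df[0] looks for a column labelled 0 rather than the first of the six tables, and df only ever holds the table from the latest loop iteration.

```python
import pandas as pd

# Made-up frame shaped like one of the transposed tables.
df = pd.DataFrame({"Revenue": [10, 12]}, index=["FY2012", "FY2013"])

try:
    df[0]  # column lookup: there is no column labelled 0
except KeyError as exc:
    error = exc
    print("KeyError:", exc)

first_row = df.iloc[0]      # positional row access
fy2012 = df.loc["FY2012"]   # label-based row access
print(first_row)
```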