I would like to retrieve an inner element (a table) using Beautiful Soup 4 (BS4).
Using the Firefox Inspector, I found that the element is located at the XPath below:
/html/body/form/table/tbody/tr/td/table/tbody/tr[1]/td/table[2]/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[4]/td[2]/table
I could not find any way in the BS4 documentation to retrieve an element given an XPath, but I was able to do it with the following code:
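import requests
from bs4 import BeautifulSoup as bs
from urllib.parse import urljoin

# BASE_URL and path are defined elsewhere
response = requests.get(urljoin(BASE_URL, path))
soup = bs(response.text, 'html.parser')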
#%% Retrieve the inner table
node = soup.html.body.form
node = node.find("table", recursive=False)
node = node.tr.td.table.tr.td
node = node.findAll("table", recursive=False)[1]
node = node.tr.td.table.tr
node = node.findAll("td", recursive=False)[1]
node = node.table
node = node.findAll("tr", recursive=False)[3]
node = node.findAll("td", recursive=False)[1]
node = node.table
As you can see, it mirrors the XPath exactly (minus the tbody steps), but I have to call findAll()
whenever more than one element exists at a given depth and I don't want the first one.
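To make the idea concrete, the manual traversal could be wrapped in a small helper. This is only a rough sketch: `walk_path` is a hypothetical name, not part of BS4, and it assumes a simplified path made of tag names with optional 1-based indexes and no tbody steps. It uses find_all(), which is the current spelling of findAll() in BS4. The sample HTML is made up for illustration.

```python
from bs4 import BeautifulSoup


def walk_path(root, path):
    """Follow a simplified XPath-like path of direct children (hypothetical helper).

    Each step is a tag name with an optional 1-based index, e.g. "tr[4]".
    Returns the matching Tag, or None if any step has no match.
    """
    node = root
    for step in path.strip("/").split("/"):
        name, _, index = step.partition("[")
        position = int(index.rstrip("]")) if index else 1
        # Only look at direct children, like a single "/" step in XPath.
        children = node.find_all(name, recursive=False)
        if len(children) < position:
            return None
        node = children[position - 1]
    return node


# Made-up sample document, parsed the same way as in the question.
html = """
<html><body><table>
  <tr><td>a</td><td>b</td></tr>
  <tr><td>c</td><td><table id="inner"><tr><td>x</td></tr></table></td></tr>
</table></body></html>
"""
soup = BeautifulSoup(html, "html.parser")
inner = walk_path(soup, "/html/body/table/tr[2]/td[2]/table")
```

This still walks the tree one step at a time; it only hides the repeated calls behind a path string.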
Is there a way to do it with just an XPath, or with some similar approach?
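One candidate I am considering is a CSS selector through select_one(), where `:nth-of-type(n)` plays the role of the `[n]` index and `>` restricts each step to direct children. The sketch below uses a made-up sample document, and I am assuming a BS4 version recent enough to support these selectors; I have not tried it on the real page.

```python
from bs4 import BeautifulSoup

# Made-up sample document; html.parser does not insert tbody elements.
html = """
<html><body><table>
  <tr><td>a</td><td>b</td></tr>
  <tr><td>c</td><td><table id="inner"><tr><td>x</td></tr></table></td></tr>
</table></body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# Rough equivalent of the XPath /html/body/table/tr[2]/td[2]/table
selector = "body > table > tr:nth-of-type(2) > td:nth-of-type(2) > table"
inner = soup.select_one(selector)
```

Like the XPath index, `:nth-of-type` counts only siblings with the same tag name, so the two should line up step by step.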