I would like to retrieve an inner element (a table) using Beautiful Soup 4 (BS4).

Using the Firefox Inspector, I found that the element is located at the XPath below:

/html/body/form/table/tbody/tr/td/table/tbody/tr[1]/td/table[2]/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[4]/td[2]/table

I could not find any way in the BS4 documentation to retrieve an element given an XPath, but I was able to do it with the following code:

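import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup as bs

# BASE_URL and path are assumed to be defined earlier in the script
response = requests.get(urljoin(BASE_URL, path))
soup = bs(response.text, 'html.parser')

#%% Retrieve the inner table
# The tree parsed by html.parser has no <tbody> elements here,
# so those XPath steps are skipped.
node = soup.html.body.form
node = node.find("table", recursive=False)
node = node.tr.td.table.tr.td
node = node.findAll("table", recursive=False)[1]   # table[2]
node = node.tr.td.table.tr
node = node.findAll("td", recursive=False)[1]      # td[2]
node = node.table
node = node.findAll("tr", recursive=False)[3]      # tr[4]
node = node.findAll("td", recursive=False)[1]      # td[2]
node = node.table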

As you can see, it mimics the XPath exactly, but I have to call findAll() whenever more than one element exists at a given depth and I don't want the first one.

Is there a way to do this directly with the XPath, or with some similar approach?

Lin
  • What's the URL, and the _exact_ element you are searching for? – MendelG Jun 29 '21 at 20:54
  • [No xpath on BS](https://stackoverflow.com/questions/11465555/can-we-use-xpath-with-beautifulsoup/11469854#11469854). – LMC Jun 29 '21 at 23:42
  • You can _use_ CSS selectors. But without providing the URL/relevant HTML and showing what you want to scrape, we can't help you. – MendelG Jun 30 '21 at 02:10

1 Answer

I am afraid you cannot use XPath with bs4. If XPath is mandatory, you can combine Selenium or Scrapy with bs4, or use either of them on its own.
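If staying within bs4 is preferred, CSS selectors (as suggested in the comments) can express the same kind of path. The following is only a sketch under assumptions: the HTML is a made-up miniature document, not the asker's page, and the `id` attributes exist purely for illustration. It also assumes the tree has no `tbody` elements, as in the question's own code, which never steps through `tbody` even though the Firefox XPath does.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature document standing in for the real page.
HTML = """
<html><body><form>
  <table><tr><td>
    <table id="first"><tr><td>skip me</td></tr></table>
    <table id="second"><tr>
      <td>left</td>
      <td><table id="target"><tr><td>wanted</td></tr></table></td>
    </tr></table>
  </td></tr></table>
</form></body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# ">" matches direct children only (comparable to recursive=False).
# ":nth-of-type(n)" is 1-based, like a positional index in XPath.
selector = (
    "html > body > form > table > tr > td "
    "> table:nth-of-type(2) > tr > td:nth-of-type(2) > table"
)
target = soup.select_one(selector)
print(target["id"])
```

`select_one()` returns the first match or `None`, so a selector that no longer fits the page shows up as a `None` result rather than as an exception partway through a chain of attribute accesses.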

See also: Can we use XPath with BeautifulSoup?
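Another option is to wrap the `find_all(..., recursive=False)` pattern from the question in a small helper that walks a simple absolute path. This is not XPath support, only a sketch: `find_by_path` is a hypothetical name, it understands nothing beyond `tag` and `tag[n]` steps, and the sample HTML is invented for illustration. Steps such as `tbody` that appear in the browser's XPath but not in the parsed tree would have to be removed from the path first.

```python
import re
from bs4 import BeautifulSoup

# One path step: a tag name with an optional 1-based index, e.g. "td[2]".
STEP = re.compile(r"^(\w+)(?:\[(\d+)\])?$")


def find_by_path(root, path):
    """Follow a simple absolute path such as '/html/body/table[2]/tr'.

    Returns the matching tag, or None if any step has no match.
    """
    node = root
    for step in path.strip("/").split("/"):
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"unsupported step: {step!r}")
        name = match.group(1)
        index = int(match.group(2) or 1)  # a missing index means the first match
        # Only direct children count, as in the question's code.
        children = node.find_all(name, recursive=False)
        if index < 1 or index > len(children):
            return None
        node = children[index - 1]
    return node


# Hypothetical sample document.
HTML = (
    "<html><body>"
    "<table><tr><td>a</td></tr></table>"
    "<table><tr><td>b</td><td>c</td></tr></table>"
    "</body></html>"
)
soup = BeautifulSoup(HTML, "html.parser")
cell = find_by_path(soup, "/html/body/table[2]/tr/td[2]")
print(cell.get_text())
```

Returning `None` for a missing step keeps the helper's behaviour close to `find()`, which also returns `None` when nothing matches.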

pullidea-dev