Accessing a second tag with BeautifulSoup

Question

I started working on some website scraping projects and ran into difficulty selecting a second tag within the same parent tag. I've tried searching Google, but I still couldn't clearly understand how to do it.

My code looks like this:

import requests
from bs4 import BeautifulSoup

url = 'url to site'
content = requests.get(url).text
soup = BeautifulSoup(content, 'lxml')

car_add = soup.find('div', class_='offer-wrapper')

ad_title = car_add.find('h3', class_='lheight22 margintop5').a.strong.text
ad_price = car_add.find('p', class_='price').text
ad_location = car_add.find('td', class_='bottom-cell').div.p.small.span.text
ad_time_and_location = car_add.find('td', class_='bottom-cell').div.p
print(ad_time_and_location.prettify())

This prints out the following:

<p class="lheight16">
 <small class="breadcrumb x-normal">
  <span>
   <i data-icon="location-filled">
   </i>
   Otopeni
  </span>
 </small>
 <small class="breadcrumb x-normal">
  <span>
   <i data-icon="clock">
   </i>
    09:25
  </span>
 </small>
</p>

What I want to do is access the string '09:25', but when I type:

ad_location = car_add.find('td', class_='bottom-cell').div.p.small.span.text

it always returns the text of the first <small> tag ('Otopeni') instead of the second one.

I've also tried the select() method, but it returned an empty list. Could anyone help me with this?

Thank you!

Please supply an MCVE. Since we don't have all your input HTML, please skip the requests code and just give us the HTML snippet needed to reproduce this. — smci, Jan 11 '20 at 13:05
You can use `find_all('span')` to get a list of all the `span` tags and then use `[1]` to get the second element from that list. — furas, Jan 11 '20 at 13:06
lxml .xpath syntax is better, you can have subscript [1] directly inside the expression — smci, Jan 11 '20 at 13:13
The `.xpath` solution is [How to select first element via XPath?](https://stackoverflow.com/questions/3319341/why-do-indexes-in-xpath-start-with-1-and-not-0) — smci, Jan 11 '20 at 13:17
@smci , thanks for the tip buddy . I'm really new to webscraping, I'm just now learning about these parsers. Thanks again! — Ghirasim Daniel, Jan 12 '20 at 10:51
XPath is incredibly powerful: it lets you describe the element you want in one expression, instead of writing code that assumes the HTML has a certain structure and navigates it manually. If you post the actual URL (to make this example an MCVE), I'll post the XPath code. — smci, Jan 12 '20 at 12:39

score 0 · Accepted Answer · answered Jan 11 '20 at 13:10

You can use find_all('span') to get a list of all the span tags, then index it with [1] to get the second element.

from bs4 import BeautifulSoup as BS

text = '''<p class="lheight16">
 <small class="breadcrumb x-normal">
  <span>
   <i data-icon="location-filled">
   </i>
   Otopeni
  </span>
 </small>
 <small class="breadcrumb x-normal">
  <span>
   <i data-icon="clock">
   </i>
    09:25
  </span>
 </small>
</p>'''

soup = BS(text, 'html.parser')

# find_all() returns every matching <span>; index 1 is the second one (the time)
item = soup.find('p').find_all('span')[1].get_text(strip=True)

print(item)
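
Since the question mentions that select() returned an empty list, here is a minimal sketch of two CSS-selector alternatives, run against a condensed copy of the snippet above. The variable names (html, time_by_position, time_by_icon) are made up for illustration, and the selectors assume the markup looks exactly like the snippet; on the real page they may need adjusting.

```python
from bs4 import BeautifulSoup

# Condensed copy of the HTML from the question (assumed to match the real page)
html = '''<p class="lheight16">
 <small class="breadcrumb x-normal">
  <span><i data-icon="location-filled"></i> Otopeni</span>
 </small>
 <small class="breadcrumb x-normal">
  <span><i data-icon="clock"></i> 09:25</span>
 </small>
</p>'''

soup = BeautifulSoup(html, 'html.parser')

# Option 1: pick the second <small> inside the <p> by position
span_by_position = soup.select_one('p.lheight16 small:nth-of-type(2) span')
time_by_position = span_by_position.get_text(strip=True) if span_by_position else None

# Option 2: find the clock icon by its attribute, then read its parent <span>.
# This does not depend on the order of the <small> tags.
clock_icon = soup.select_one('i[data-icon="clock"]')
time_by_icon = clock_icon.parent.get_text(strip=True) if clock_icon else None

print(time_by_position)
print(time_by_icon)
```

The second option is probably the more robust of the two if some ads omit the location, because it keys on the icon rather than on the position of the tag.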

answered Jan 11 '20 at 13:10 by furas

Thank you so much! It worked out really nicely. I wish you a nice day :) – Ghirasim Daniel Jan 11 '20 at 13:16
@GhirasimDaniel you can mark the answer as accepted, and once you have enough reputation you can upvote it. – furas Jan 11 '20 at 13:18
I just marked it; I forgot about that yesterday. Thanks for the reminder! :) – Ghirasim Daniel Jan 12 '20 at 10:50
