
I am using Beautiful Soup and I have this bit of HTML:

<div class="tomato"></div>

And when I do:

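from bs4 import BeautifulSoup

# Assumed setup: parse the div shown above with the built-in parser.
soup = BeautifulSoup('<div class="tomato"></div>', 'html.parser')

for div in soup.find_all('div'):
    print(div.get('class'))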

I get [u'tomato']. Is there any way to change this code to get only "tomato" without all the other characters?

I also have a few divs that have multiple classes.

Martin Gergov
Chor May
  • Possible duplicate of [Python string prints as \[u'String'\]](https://stackoverflow.com/questions/599625/python-string-prints-as-ustring) – NissimL Jan 08 '18 at 21:34
  • It's possible for a `div` to have [multiple `class` attributes](https://stackoverflow.com/questions/14884523/can-a-div-have-multiple-classes-twitter-bootstrap), so it's returning a list of class names. What would you want it to print if there were 2 classes? – Peter Wood Jan 08 '18 at 21:38
  • So use `classes = div.get('class', [])` and `print u' '.join(classes)` to print all classes on a div. – Martijn Pieters Jan 08 '18 at 21:44
  • @PeterWood I said it can have multiple classes because, if it does, it will display [u'something', u'something', u'something'] – Chor May Jan 08 '18 at 22:00
  • @MartijnPieters Thanks so much, man! Could you explain what the commands do, please? – Chor May Jan 08 '18 at 22:01
  • @ChorMay: The `class` attribute returns a *list* of unicode strings. I gave `get()` a second argument to return an empty list instead of the default `None` if the attribute is missing. `unicode.join()`, like `str.join()`, takes a sequence and joins the elements into a single string, here using a space. – Martijn Pieters Jan 08 '18 at 22:04
  • `div.find_all('div', {'class': 'tomato'})` would also work. Which, afaik, basically tests `if 'tomato' in div.get('class')`. – sytech Jan 08 '18 at 23:01
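The suggestions in the comments above can be put together as a minimal sketch. It is written for Python 3, where `str.join()` plays the role of `unicode.join()`. The sample markup (in particular the second and third divs) and the helper name `class_string` are assumptions made up for illustration, not part of the question. The `u'...'` in the original output is just how Python 2 shows a unicode string inside a printed list; joining the list gives plain text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one single-class div, one multi-class div,
# and one div with no class attribute at all.
html = '<div class="tomato"></div><div class="tomato red ripe"></div><div></div>'
soup = BeautifulSoup(html, "html.parser")


def class_string(tag):
    """Return the tag's class names joined by spaces ('' if it has none)."""
    # `class` is a multi-valued attribute, so Beautiful Soup returns a list.
    # The [] default avoids calling join() on None when the attribute is missing.
    return " ".join(tag.get("class", []))


names = [class_string(div) for div in soup.find_all("div")]
for name in names:
    print(name)

# Filtering by class matches any div whose class list contains "tomato".
tomato_divs = soup.find_all("div", class_="tomato")
```

If only the first class is wanted rather than all of them, indexing the list (after checking it is not empty) would be the alternative to joining.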

0 Answers