>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<div class="class1 class2 class3">...</div>', 'lxml')
>>> soup.find('div')['class']
['class1', 'class2', 'class3']

How can I force BS4 to treat the class attribute as a single string instead of splitting it into a list?

user3664862
  • Related (or may be a duplicate): http://stackoverflow.com/questions/34295928/disable-special-class-attribute-handling. – alecxe Dec 19 '15 at 18:29

1 Answer

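You could use "xml" as the parser, which leaves multi-valued attributes such as class as a single string (this parser requires lxml to be installed):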

from bs4 import BeautifulSoup

soup = BeautifulSoup('<div class="class1 class2 class3">...</div>', "xml")
print(soup.find('div')['class'])
class1 class2 class3

Or you could remove 'class' from builder.cdata_list_attributes['*']:

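# Remove "class" by name rather than by index. This mutates state shared by
# the HTML tree builders, so soups created afterwards are affected too.
BeautifulSoup("", "lxml").builder.cdata_list_attributes["*"].remove("class")

soup = BeautifulSoup('<div class="class1 class2 class3">...</div>', "lxml")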
print(soup.find('div')['class'])
class1 class2 class3
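
A third option, sketched here as an addition rather than as part of the original answer: newer Beautiful Soup 4 releases (4.8.0 and later, as far as I know) accept a multi_valued_attributes keyword in the constructor, and passing None turns off the list splitting for that one soup without touching shared builder state. The snippet assumes the built-in "html.parser" so it does not depend on lxml; the variable names are illustrative. It also shows the plain fallback of joining the list back together.

```python
from bs4 import BeautifulSoup

html = '<div class="class1 class2 class3">...</div>'

# Assumption: Beautiful Soup 4.8.0 or later, where the constructor accepts
# multi_valued_attributes. Passing None disables list splitting for this soup only.
soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)
single = soup.find("div")["class"]
print(single)

# Fallback with default settings: the value is a list, so join it back together.
default_soup = BeautifulSoup(html, "html.parser")
joined = " ".join(default_soup.find("div")["class"])
print(joined)
```

Unlike editing cdata_list_attributes, this keeps the change local to the one BeautifulSoup object, which is probably safer when other code in the same process still expects class to be a list.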
Padraic Cunningham