How to get data from a span tag that has custom attributes? (BeautifulSoup)

Question

I have the following span tag. How can I scrape the value xuRMlBoIUcI7nAJktBcJvPByp1DLE4aPGzq3JNiRKsdNqUkVSJBY%2BggxRhp0GcRx4Gw4lWQxbTk%3D, which is assigned to data-slug?

    <span data-ju-jspjrvxy="" 
    data-slug="xuRMlBoIUcI7nAJktBcJvPByp1DLE4aPGzq3JNiRKsdNqUkVSJBY%2BggxRhp0GcRx4Gw4lWQxbTk%3D" 
    data-gtm-clickedelement="CTA button" data-gtm-offer="" data-ju-wvxjoly-pk="303795"
    data-gtm-voucher-id="303795" class="businessinsiderus-voucher-button-holder clear">

Roshin Raphel · Answer 1 · 2020-07-16T10:24:29.987

If s is your HTML string, you can use the re module:

    import re

    # The capture group collects everything between the quotes after data-slug=
    match = re.findall(r'data-slug="([^"]*)"', str(s))

edited Jul 16 '20 at 10:24

answered Jul 16 '20 at 10:14

Roshin Raphel

I want that full string assigned to data-slug. – saurabh Shah Jul 16 '20 at 10:17
Yes, this will return a list of all matching strings. If only one string is required, use match[0]. – Roshin Raphel Jul 16 '20 at 10:19
I have multiple span tags and each has a unique value assigned to data-slug. Will it work on those too? – saurabh Shah Jul 16 '20 at 10:22
Yes, it will return a list of multiple strings if the input string s contains multiple data-slug attributes. – Roshin Raphel Jul 16 '20 at 10:23
It is better to use str(s) in the findall function; I have edited the answer. – Roshin Raphel Jul 16 '20 at 10:24
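
To illustrate the multiple-span case discussed in these comments, here is a minimal sketch. The two span tags and their shortened data-slug values are made up for the example, and the pattern assumes the attribute is always wrapped in double quotes, which may not hold for arbitrary HTML.

```python
import re

# Hypothetical sample input: two span tags with shortened, made-up data-slug values.
html = (
    '<span data-slug="abc%2Bdef%3D" data-gtm-voucher-id="1" class="clear"></span>'
    '<span data-slug="ghi%3D" data-gtm-voucher-id="2" class="clear"></span>'
)

# [^"]* captures every character up to the closing double quote.
slugs = re.findall(r'data-slug="([^"]*)"', html)
print(slugs)  # ['abc%2Bdef%3D', 'ghi%3D']
```

Because re.findall returns one entry per match, each span contributes its own value to the list. An HTML parser is usually more robust than a regex when the markup varies.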

score 0 · Accepted Answer · answered Jul 16 '20 at 10:18

If I understand your problem correctly, you want to scrape the value of an attribute on a tag. If so, the following question provides a solution: "Extracting an attribute value with beautifulsoup".
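
As a rough sketch of what extracting an attribute value with BeautifulSoup can look like, assuming the spans resemble the one in the question: the sample markup and its shortened values are hypothetical, and html.parser is chosen here only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the data-slug values are shortened stand-ins.
html = '''
<span data-slug="abc%2Bdef%3D" data-gtm-voucher-id="1" class="clear"></span>
<span data-slug="ghi%3D" data-gtm-voucher-id="2" class="clear"></span>
<span class="no-slug"></span>
'''

soup = BeautifulSoup(html, 'html.parser')

# attrs={'data-slug': True} matches only spans that carry the attribute at all.
spans = soup.find_all('span', attrs={'data-slug': True})

# A tag's attributes can be read like dictionary keys.
slugs = [span['data-slug'] for span in spans]
print(slugs)  # ['abc%2Bdef%3D', 'ghi%3D']
```

For a single tag, soup.span.get('data-slug') returns None instead of raising KeyError when the attribute is missing.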

PavelNikov

yrnr · Answer 3 · 2020-07-16T10:51:12.733

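    from bs4 import BeautifulSoup as BS

    content = 'your html span text here'

    # Parse the markup with the lxml parser (requires the lxml package)
    soup = BS(content, features='lxml')

    # .attrs is a dict of every attribute on the first <span>, including data-slug
    dict_of_spantag_attributes_and_values = soup.span.attrs

    for i, j in dict_of_spantag_attributes_and_values.items():
        print(f'{i}:{j}')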

edited Jul 16 '20 at 10:51

answered Jul 16 '20 at 10:45

yrnr