Getting meta property with beautifulsoup

Question

I am trying to extract the Open Graph ("og") properties from a website. What I want is a list of the property values of all meta tags in the document whose property starts with "og".

What I've tried is:

soup.find_all("meta", property="og:")

and

soup.find_all("meta", property="og")

But neither call finds anything unless I specify the complete property value.

A few examples are:

 <meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>,
 <meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:secure_url"/>,
 <meta content="text/html" property="og:video:type"/>,
 <meta content="1280" property="og:video:width"/>,
 <meta content="720" property="og:video:height"/>

Expected output would be:

l = ["og:video:url", "og:video:secure_url", "og:video:type", "og:video:width", "og:video:height"]

How can I do this?

Thank you

This may help: https://stackoverflow.com/questions/36768068/get-meta-tag-content-property-with-beautifulsoup-and-python — Samsul Islam, Feb 18 '21 at 20:26

score 2 · Accepted Answer · answered Feb 19 '21 at 11:45


Use the CSS selector meta[property], which matches every meta tag that has a property attribute (not only the "og" ones):

metas = soup.select('meta[property]')
propValue = [v['property'] for v in metas]
print(propValue)
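
If only the Open Graph tags are wanted, the selector can probably be narrowed with the CSS prefix operator `^=`. The following is a minimal sketch, not part of the original answer; the sample markup and the name `og_props` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two og: tags, one non-og property,
# and one meta tag without any property attribute.
html = """
<head>
<meta name="description" content="demo page"/>
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>
<meta content="text/html" property="og:video:type"/>
<meta content="someone" property="article:author"/>
</head>
"""

soup = BeautifulSoup(html, "html.parser")

# [property^="og:"] keeps only tags whose property value starts with "og:".
og_props = [tag["property"] for tag in soup.select('meta[property^="og:"]')]
print(og_props)  # ['og:video:url', 'og:video:type']
```

Because the selector already requires the attribute, `tag["property"]` cannot raise a `KeyError` here.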

uingtea


score 1 · Answer 2 · answered Feb 18 '21 at 20:33

Is this what you want?

from bs4 import BeautifulSoup

sample = """
<html>
<body>
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>,
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:secure_url"/>,
<meta content="text/html" property="og:video:type"/>,
<meta content="1280" property="og:video:width"/>,
<meta content="720" property="og:video:height"/>
</body>
</html>
"""

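# property=True skips <meta> tags without a property attribute, which would otherwise raise KeyError
print([m["property"] for m in BeautifulSoup(sample, "html.parser").find_all("meta", property=True)])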

Output:

['og:video:url', 'og:video:secure_url', 'og:video:type', 'og:video:width', 'og:video:height']

score 1 · Answer 3 · answered Feb 18 '21 at 20:39


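You can check whether the property starts with "og:" as follows:

...
soup = BeautifulSoup(html, "html.parser")

# The filter function may be called with None for <meta> tags that have no
# property attribute, so guard against that before testing the prefix.
og_elements = [
    tag["property"]
    for tag in soup.find_all("meta", property=lambda t: t is not None and t.startswith("og:"))
]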

print(og_elements)
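
A compiled regular expression is another option. A plain string filter such as property="og:" is compared against the whole attribute value, which is why the attempts in the question match nothing, whereas a pattern is searched for within the value. This is a sketch with made-up sample markup; `og_props_re` is a hypothetical name.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup; "blog:section" contains "og" but is not
# an Open Graph property, so the anchored pattern must skip it.
html = """
<head>
<meta name="description" content="demo page"/>
<meta content="1280" property="og:video:width"/>
<meta content="720" property="og:video:height"/>
<meta content="news" property="blog:section"/>
</head>
"""

soup = BeautifulSoup(html, "html.parser")

# ^og: anchors the match at the start of the property value.
og_props_re = [
    tag["property"]
    for tag in soup.find_all("meta", property=re.compile(r"^og:"))
]
print(og_props_re)  # ['og:video:width', 'og:video:height']
```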

MendelG

