I am scraping a website whose DOM looks something like this:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Document</title>
</head>
<body>
    
    <p style='color: #ff9900'></p>
    <p></p>
    <p></p>
    <p style='color: #ff0000'>2</p>

    <p></p>
    <p></p>
    <p style='color: #ff9900'>3</p>
    <p></p>
    <p style='color: #ffffff'>4</p>
    <p></p>





</body>
</html>

As you can see, some of the <p> tags have a style attribute. I would like to get only the elements that have that attribute.

const $ = cheerio.load(page, {
            normalizeWhitespace: true,
            xmlMode: false
        });


const item = [];
$('p:style="color:#ff9900').each(function(){
    item.push($(this).text())
})
console.log(item)

Is there any way to use a style as a selector in cheerio? I can use p[style], but that returns every element that has the attribute. Say I want only the elements with a certain style, style="color:#ff9900", and not style="color:#ffffff".

DavidB
  • Possible duplicate of [CSS selector by inline style attribute](https://stackoverflow.com/questions/8426882/css-selector-by-inline-style-attribute). **tldr;** `p[style]` – Ionut Necula Jan 31 '19 at 15:09
  • @Ionut I edited the post. I am interested in having `p[style=color:#ff9900]` returned only. – DavidB Jan 31 '19 at 15:21
  • Same principle applies: `$('p[style="color: #ff9900"]')` – Ionut Necula Jan 31 '19 at 15:24
  • I've made you a working fiddle [here](https://jsfiddle.net/Lk403m6z/) – Ionut Necula Jan 31 '19 at 15:35
  • @Ionut It works now! `$('p[style="color: #ff9900;"]')` The only thing missing was a semicolon at the end. I don't really understand why, though. It seems this was a duplicate of CSS selector by inline style attribute. – DavidB Jan 31 '19 at 15:36
  • You can use `*=` for partial match: `p[style*=ff9900]` - Cheerio might be adding the ; to those attributes. – pguardiario Jan 31 '19 at 23:16

1 Answer


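I had the same kind of problem and resolved it by replacing p:style="color:#ff9900' with the attribute selector p[style="color: #ff9900"] (no colon between p and the bracket). The value has to match the attribute text exactly, including spaces and any trailing semicolon; as noted in the comments, the real page needed "color: #ff9900;". Here is the solution to your problem:

const cheerio = require('cheerio');

// `page` is the HTML string fetched from the site
const $ = cheerio.load(page, {
    normalizeWhitespace: true,
    xmlMode: false
});

const item = [];
// Attribute selector: the value must match the style attribute exactly
$('p[style="color: #ff9900"]').each(function () {
    item.push($(this).text());
});
console.log(item);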
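
Because an exact attribute match is sensitive to whitespace and the trailing semicolon, one alternative is to parse the style value and compare the color property itself. The following is only a minimal sketch: `parseStyle` and `hasColor` are hypothetical helper names, not part of cheerio, and the naive split on `;` assumes simple declarations (it would break on values that contain a semicolon, such as data URLs).

```javascript
// Hypothetical helper: turn an inline style string into a { property: value } map.
function parseStyle(style) {
  const out = {};
  for (const decl of String(style || '').split(';')) {
    const idx = decl.indexOf(':');
    if (idx === -1) continue; // skip empty or malformed declarations
    const prop = decl.slice(0, idx).trim().toLowerCase();
    const value = decl.slice(idx + 1).trim().toLowerCase();
    if (prop) out[prop] = value;
  }
  return out;
}

// Hypothetical helper: true when the style sets the `color` property to `color`.
function hasColor(style, color) {
  return parseStyle(style).color === color.toLowerCase();
}

console.log(hasColor('color: #ff9900', '#ff9900')); // true
console.log(hasColor('color:#FF9900;', '#ff9900')); // true
console.log(hasColor('color: #ffffff', '#ff9900')); // false
```

With cheerio this could be combined with the plain attribute selector, for example `$('p[style]').filter((i, el) => hasColor($(el).attr('style'), '#ff9900'))`, so that the match no longer depends on how the page happens to write the attribute.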
Muhammad Raheel