Regex match any li of ul that contains text

Question

I have a string

<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>

Trying to match

<li>Men&#39;s Sizes: XS-6XL</li>

I want to match only the single <li></li> element that contains a given word.

So for li that contains sizes I am looking to run something like:

(<li>).*?\b[sS]izes[ :]{1}.*?<\/li>

but the match starts at the first <li> in the string instead of the <li> closest to the keyword.

EDIT: I can't use an HTML parser like HTMLAgilityPack here.

Hang on.. I've got a great link somewhere round here that talks in some depth about using Regex to parse HTML... — Caius Jard, Mar 19 '21 at 20:20
Ok but I am not looking for a htmlparser like HTMLAgilityPack here — Ya Wang, Mar 19 '21 at 20:21
..[found it](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) .. By the way, have you thought about setting [RTL search direction on your Regex](https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options), if you have a pattern that matches the first but you want the last? — Caius Jard, Mar 19 '21 at 20:27
Eh. This tells me to forget about it and use htmlagilitypack. — Ya Wang, Mar 19 '21 at 20:29

score 1 · Accepted Answer · answered Mar 19 '21 at 20:29

I'd use the pattern:

<li>[^<]*[Ss]izes[^<]*<\/li>

It breaks down as follows:

| Element | Matches |
| --- | --- |
| `<li>` | The opening tag |
| `[^<]*` | Zero or more characters that are not the start of a new tag (`<`) |
| `[Ss]izes` | The keyword we are looking for |
| `[^<]*` | Zero or more characters that are not the start of a new tag (`<`) |
| `<\/li>` | The closing tag |
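
For comparison, here is a minimal sketch in JavaScript (not the asker's environment, so treat it as an illustration only) of why the lazy pattern from the question starts too early while the negated character class does not. A regex engine reports the leftmost match, and `.*?` only limits how far the match extends to the right. The shortened `html` string is an assumption made for readability.

```javascript
// Shortened sample markup (hypothetical, modeled on the string in the question).
const html =
  "<ul><li>Option A<br/>details</li><li>Men&#39;s Sizes: XS-6XL</li><li>Ready to ship</li></ul>";

// Pattern from the question: lazy, but still anchored at the first <li> it finds.
const lazy = /(<li>).*?\b[sS]izes[ :]{1}.*?<\/li>/;

// Pattern from this answer: [^<]* cannot cross into another tag.
const negated = /<li>[^<]*[Ss]izes[^<]*<\/li>/;

const lazyMatch = html.match(lazy)[0];
const negatedMatch = html.match(negated)[0];

console.log(lazyMatch);    // starts at "<li>Option A" and spans two list items
console.log(negatedMatch); // only the list item that contains "Sizes"
```

One consequence of `[^<]*` is that a list item with a nested tag (such as `<br/>`) between `<li>` and `</li>` will not match at all.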


And I'd take the last such matching element.
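
Taking the last match could be sketched like this in JavaScript. The helper name `lastSizesItem` is hypothetical, and the same idea should carry over to other regex APIs that can enumerate all matches.

```javascript
// Sample markup from the question (the &#39; entity is left undecoded).
const html =
  "<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li>" +
  "<li>Men&#39;s Sizes: XS-6XL</li>" +
  "<li>Individually folded and bagged with size sticker for easy distribution</li>" +
  "<li>Ready to ship in 7 business days after art approval</li></ul>";

// Pattern from the answer; the g flag is required by matchAll.
const pattern = /<li>[^<]*[Ss]izes[^<]*<\/li>/g;

// Hypothetical helper: returns the last matching <li>...</li>, or null if none match.
function lastSizesItem(input) {
  const matches = [...input.matchAll(pattern)].map((m) => m[0]);
  return matches.length > 0 ? matches[matches.length - 1] : null;
}

console.log(lastSizesItem(html)); // <li>Men&#39;s Sizes: XS-6XL</li>
```

With this input only one item matches, so the first and last match coincide; the third item contains "size sticker", which `[Ss]izes` does not match.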

score 0 · Answer 2 · answered Mar 19 '21 at 20:32

You can use the innerHTML and innerText properties like this:

// Requires a browser DOM (document.createElement).
const str = "<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>"

// Let the browser parse the markup inside a detached container element.
const el1 = document.createElement('div')
el1.innerHTML = str

const liArr = el1.getElementsByTagName('li')
const resultsText = [] // text of each matching <li>
const resultsHTML = [] // the matching <li> elements themselves

for (const listElement of liArr) {
    // Case-sensitive check: 'Size' matches "Sizes" but not "size sticker".
    if (listElement.innerText.includes('Size')) {
        resultsText.push(listElement.innerText)
        resultsHTML.push(listElement)
    }
}

console.log('resultsText:', resultsText)
console.log('resultsHTML:', resultsHTML)
