Regex(PHP): Avoid Capturing a certain word list

Question

If we take text like this:

 <p>Portable <span class="shlt">Adobe</span> <span class="shlt">After</span>
 <span class="shlt">Effects</span> CC <span class="shlt">2018</span> 15.1.1.12 (x64)</p>

There are words between those <span> tags. I need to capture only the title.

(You can clearly see that it contains Portable Adobe After Effects CC 2018 15.1.1.12 (x64))

Is it possible to avoid capturing the <span class="shlt"> and </span> parts?

And capture only the text Portable Adobe After Effects CC 2018 15.1.1.12 (x64)?

What I am currently trying to do is capture the words in between those tags. Is there a better way? Sample regex code would be useful, in PHP please.

https://stackoverflow.com/q/590747/4265352 – axiac Jun 01 '18 at 16:56

The fourth bird · Answer 1 · 2018-06-01T17:00:01.687

2

Instead of using a regex, you might use DOMDocument and getElementsByTagName to find your <p> element.

Then take the first match from the result and get the textContent:

$dom = new DOMDocument();
$dom->loadHTML($data);
echo $dom->getElementsByTagName("p")[0]->textContent;

That will give you:

Portable Adobe After Effects CC 2018 15.1.1.12 (x64)

edited Jun 01 '18 at 17:00

answered Jun 01 '18 at 16:50

The fourth bird


Doesn't it also give those span tags? Also, I do not have the document file I need. I am just using cURL to get it... – The Bang Bandit Jun 01 '18 at 16:52
According to the [docs](http://at.php.net/manual/en/class.domnode.php#domnode.props.textcontent), this returns `The text content of this node and its descendants.` – The fourth bird Jun 01 '18 at 16:59
But how do I use it if I am using cURL? – The Bang Bandit Jun 01 '18 at 17:02
Then you load the html from the curl request into DOMDocument. When the response is a more complicated structure, you could then use [DOMXPath](http://php.net/manual/en/class.domxpath.php) and create an xpath expression with [query](http://php.net/manual/en/domxpath.query.php). – The fourth bird Jun 01 '18 at 17:08
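
The suggestion in the comments above (fetch with cURL, load the response into DOMDocument, query with DOMXPath) could be sketched roughly as follows. The helper names `fetchHtml` and `firstParagraphText` are hypothetical, the `//p` expression assumes the title sits in the first `<p>` of the response, and the demo runs on the sample markup from the question rather than a live request.

```php
<?php
// Hypothetical helper: fetch a page body with cURL (not called in this demo).
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}

// Hypothetical helper: text of the first <p>, including its descendants' text.
function firstParagraphText(string $html): ?string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely perfect.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//p');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup from the question, standing in for a cURL response.
$html = '<p>Portable <span class="shlt">Adobe</span> <span class="shlt">After</span> '
      . '<span class="shlt">Effects</span> CC <span class="shlt">2018</span> 15.1.1.12 (x64)</p>';

$title = firstParagraphText($html);
echo $title, PHP_EOL;
```

With a real request, the call would presumably look like `firstParagraphText(fetchHtml($url))`, with a more specific XPath expression if the page contains more than one paragraph.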

jarchuleta · Accepted Answer · 2018-06-01T17:07:25.930

0

You can capture groups inside the regex by using (). Then you can parse the resulting array.
Here is an example:

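$re = '/\<span class="shlt">([^<]*)<\/span>/m';
// Build the string without line breaks inside the tags, otherwise the pattern
// (which expects a single space after "span") misses the wrapped spans
$str = 'Portable <span class="shlt">Adobe</span> <span class="shlt">After</span> '
     . '<span class="shlt">Effects</span> CC <span class="shlt">2018</span> 15.1.1.12 (x64)';

// PREG_SET_ORDER groups results per match: $matches[$i][1] is the captured word
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
var_dump($matches);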
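
As a rough illustration of parsing that array, the sketch below uses the two-group variant of the pattern mentioned in the comments (the second group picks up the text following each closing tag). The variable names `$highlighted` and `$trailing` are made up for the example. Note that the text before the first span ("Portable ") is not captured by either group, which is one reason removing the tags turned out to be the simpler route.

```php
<?php
$re = '/<span class="shlt">([^<]*)<\/span>([^<]*)/';
$str = 'Portable <span class="shlt">Adobe</span> <span class="shlt">After</span> '
     . '<span class="shlt">Effects</span> CC <span class="shlt">2018</span> 15.1.1.12 (x64)';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each entry is [full match, group 1, group 2].
$highlighted = array_column($matches, 1); // words inside the span tags
$trailing    = array_column($matches, 2); // text following each closing tag

print_r($highlighted);
print_r($trailing);
```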

This will remove the span tags:

$str = 'Portable <span class="shlt">Adobe</span> <span 
class="shlt">After</span> <span class="shlt">Effects</span> CC <span 
class="shlt">2018</span> 15.1.1.12 (x64)';

// preg_replace returns the new string; it does not modify $str in place
$str = preg_replace("/<\/?span[^>]*>/", "", $str);
echo $str;
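
Since the asker notes that the span tags differ each time, here is a minimal sketch of the same removal wrapped in a function. The name `stripSpanTags`, the case-insensitive flag, the `\b` word boundary, and the whitespace collapsing are additions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: drop <span ...> and </span> tags, keep everything else.
function stripSpanTags(string $html): string
{
    // Match opening or closing span tags regardless of attributes or letter case.
    $text = preg_replace('/<\/?span\b[^>]*>/i', '', $html);
    // Collapse whitespace runs, for example those left by line-wrapped markup.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$input = 'Portable <span class="shlt">Adobe</span> <SPAN id="x" class="other">After</SPAN> '
       . "<span\nclass=\"shlt\">Effects</span> CC <span>2018</span> 15.1.1.12 (x64)";

$clean = stripSpanTags($input);
echo $clean, PHP_EOL;
```

Other tags are left untouched, so this only fits input where span is the sole markup to discard.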

edited Jun 01 '18 at 17:07

answered Jun 01 '18 at 16:43

jarchuleta


OK, but what about the other words, the ones that are not in span tags? – The Bang Bandit Jun 01 '18 at 16:54
\<span class="shlt">([^<]*)<\/span>([^<]*) this one will capture the words after too. – jarchuleta Jun 01 '18 at 16:57
No buddy, it's not what I need. Those span tags are not always like that; they differ each time, so this may return wrong output sometimes. Also, it does not return similar output all the time. I need to capture everything except those span tags. – The Bang Bandit Jun 01 '18 at 16:59
you can make it as complex as it needs to be. I don't have all the cases to write the perfect regex, but this will get you there. – jarchuleta Jun 01 '18 at 17:01
maybe you can use regex to remove the span tags? – jarchuleta Jun 01 '18 at 17:01
preg_replace("/<\/?span[^>]*>/", "", $str); will remove the span tags. – jarchuleta Jun 01 '18 at 17:04
Thanks man, it will work. I totally forgot about that function. Awesome. Thanks Dude!!! – The Bang Bandit Jun 01 '18 at 17:14
Welcome, can you select my answer? Thanks. – jarchuleta Jun 01 '18 at 17:15
Of course, yes! Post it as an answer. I'll select it. Thanks Dude. – The Bang Bandit Jun 01 '18 at 17:19
