-1

I am trying to capture the URLs of the images on a specific site (however many there may be). I am able to do so, but as soon as I try to capture other things after them, the entire thing falls apart. I would greatly appreciate any help.

Working regex:

.(?:src="(http:\/\/website\.bla\.com\/Live.+?)".+?)

Non-working regex:

.(?:src="(http:\/\/website\.bla\.com\/Live.+?)".+?).*Status.*\s(Sld|Rtr)

Sample HTML:

 <div ng-class="{
    'active': active
  }" class="item text-center ng-isolate-scope" ng-transclude="" ng-repeat="slide in slides" active="slide.active">
                          <img class="image-circle ng-scope" ng-src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg" src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg">
                  </div><!-- end ngRepeat: slide in slides --><div ng-class="{
    'active': active
  }" class="item text-center ng-isolate-scope" ng-transclude="" ng-repeat="slide in slides" active="slide.active">
                          <img class="image-circle ng-scope" ng-src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg" src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg">
                  </div><!-- end ngRepeat: slide in slides --><div ng-class="{
    'active': active
  }" class="item text-center ng-isolate-scope" ng-transclude="" ng-repeat="slide in slides" active="slide.active">
                          <img class="image-circle ng-scope" ng-src="http://website.bla.com/Live/photos/FULL/20/134/W3764134_20.jpg" src="http://website.bla.com/Live/photos/FULL/20/134/W3764134_20.jpg">
                  </div><!-- end ngRepeat: slide in slides -->
                </div>
<b class="ng-binding">Status:</b> &nbsp; &nbsp; Sld
nicky
  • Don't parse HTML with regex; use an XML parser. What is your OS? – RomanPerekhrest May 19 '17 at 22:21
  • Try alternation: https://regex101.com/r/sHWJMi/1 for this example. But seriously, this can get complicated. – Khanna111 May 19 '17 at 22:23
  • What language is the regex in? Why do you have the non-capturing group around the whole thing? Why the `.` at the front? How are you matching across newlines? – NetMage May 19 '17 at 22:24
  • It's in PHP. The `.` at the front helped at some point in testing but is not necessary. The OS is Windows for testing but Linux for the deployed environment. – nicky May 20 '17 at 01:10

2 Answers

1

For this simple example, use alternation. Please see this.
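As a rough illustration of the alternation idea, here is a minimal Python sketch (the asker uses PHP, where the same pattern should work with `preg_match_all`). The pattern below is an assumption about what "alternation" means here, not a copy of the linked demo; `extract` and `SAMPLE` are hypothetical names, and `SAMPLE` is a shortened copy of the question's markup.

```python
import re

# Shortened copy of the question's sample markup.
SAMPLE = (
    '<img class="image-circle ng-scope" '
    'ng-src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg" '
    'src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg">\n'
    '</div>\n'
    '<img class="image-circle ng-scope" '
    'ng-src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg" '
    'src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg">\n'
    '</div>\n'
    '<b class="ng-binding">Status:</b> &nbsp; &nbsp; Sld\n'
)

# Two independent branches: each match is EITHER an image URL OR the status.
# The leading \s keeps the first branch from matching inside ng-src="...".
PATTERN = re.compile(
    r'\ssrc="(http://website\.bla\.com/Live[^"]+)"'  # group 1: image URL
    r'|Status:.*?\b(Sld|Rtr)\b'                       # group 2: status code
)


def extract(html):
    """Return (list of image URLs, status or None)."""
    urls, status = [], None
    for match in PATTERN.finditer(html):
        if match.group(1):
            urls.append(match.group(1))
        elif match.group(2):
            status = match.group(2)
    return urls, status


urls, status = extract(SAMPLE)
```

Because the two branches are independent, the number of images no longer matters: every match carries either a URL or the status, and the caller sorts them out.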

But this can get complicated once additional requirements have to be implemented. In that case you might want to use an HTML parser such as JSoup.
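JSoup is a Java library, so purely as an illustration of the parser approach, here is a sketch using Python's standard-library `html.parser` instead. The class name `ListingParser` and the shortened `SAMPLE` are hypothetical, and the sketch assumes the status value is the first word of text after a `<b>` element whose text is `Status:`.

```python
from html.parser import HTMLParser

# Shortened copy of the question's sample markup.
SAMPLE = (
    '<div class="item text-center ng-isolate-scope">\n'
    '<img class="image-circle ng-scope" '
    'ng-src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg" '
    'src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg">\n'
    '</div>\n'
    '<b class="ng-binding">Status:</b> &nbsp; &nbsp; Sld\n'
)

URL_PREFIX = "http://website.bla.com/Live"


class ListingParser(HTMLParser):
    """Collects <img src> URLs and the text that follows a 'Status:' label."""

    def __init__(self):
        super().__init__()
        self.image_urls = []
        self.status = None
        self._in_label = False       # currently inside a <b> element
        self._expect_status = False  # the last <b> text was "Status:"

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")  # reads src only, ignores ng-src
            if src and src.startswith(URL_PREFIX):
                self.image_urls.append(src)
        elif tag == "b":
            self._in_label = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_label = False

    def handle_data(self, data):
        text = data.strip()  # also strips the decoded &nbsp; characters
        if self._in_label:
            self._expect_status = (text == "Status:")
        elif self._expect_status and text:
            self.status = text.split()[0]
            self._expect_status = False


parser = ListingParser()
parser.feed(SAMPLE)
parser.close()  # flush any buffered trailing text
```

After parsing, `parser.image_urls` holds the URLs and `parser.status` the status word. Reading the `src` attribute directly sidesteps the `ng-src` duplicate and any line-break issues.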

See this one - it is already answered:

Khanna111
0

With lots of assumptions, you could try this:

src="(http://website\.bla\.com/Live.+?)"(?:(?:[^s]|s[^r]|sr[^c])*?Status.*? (Sld|Rtr))?
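A minimal sketch of how that pattern might be applied, written in Python for illustration (in PHP the same pattern should work with `preg_match_all`, using a delimiter other than `/` since the pattern contains unescaped slashes). `SAMPLE` is a shortened copy of the question's markup. Because `src="` also matches inside `ng-src="`, the sketch deduplicates the URLs, and it takes the status from whichever match captured one.

```python
import re

# Shortened copy of the question's sample markup.
SAMPLE = (
    '<img class="image-circle ng-scope" '
    'ng-src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg" '
    'src="http://website.bla.com/Live/photos/FULL/18/134/W3764134_18.jpg">\n'
    '</div>\n'
    '<img class="image-circle ng-scope" '
    'ng-src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg" '
    'src="http://website.bla.com/Live/photos/FULL/19/134/W3764134_19.jpg">\n'
    '</div>\n'
    '<b class="ng-binding">Status:</b> &nbsp; &nbsp; Sld\n'
)

# The pattern from the answer, split over two lines for readability.
PATTERN = re.compile(
    r'src="(http://website\.bla\.com/Live.+?)"'
    # Optional tail: skip ahead without crossing another "src", then
    # capture the status if "Status" comes before the next src attribute.
    r'(?:(?:[^s]|s[^r]|sr[^c])*?Status.*? (Sld|Rtr))?'
)

# Each match is a (url, status) pair; status is '' when the tail did not match.
matches = PATTERN.findall(SAMPLE)

# Drop duplicate URLs while keeping their original order.
urls = list(dict.fromkeys(url for url, _ in matches))

# Only the last src before "Status" carries a status value.
status = next((s for _, s in matches if s), None)
```

The inner group `(?:[^s]|s[^r]|sr[^c])*?` is what keeps one match from swallowing the following images, which is why only the final image's match carries the status.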
NetMage