
I need to extract the values between the quotes that follow each href=. For example, for an input like href="x" ...dkjads... href="y", the regex should return x and y.

[<a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie.jpg" title=""><img alt="Sprachschule EC San Diego" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie_d1def1bf4a.jpg" title="Sprachschule EC San Diego (Copyright EC San Diego. All rights reserved.)" width="80"/></a>, <a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_1_.jpg" title=""><img alt="Sprachschule EC San Diego 2" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie-_1__fd87630014.jpg" title="Sprachschule EC San Diego 2 (Copyright EC San Diego. All rights reserved.)" width="80"/></a>, <a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_10_.jpg" title=""><img alt="Sprachschule EC San Diego 3" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie-_10__a8ed60c277.jpg" title="Sprachschule EC San Diego 3 (Copyright EC San Diego. All rights reserved.)"

How can I write a regex that requires an exact sequence of several characters (here href=") in front of the match?

This one, (?<=\").*?(?=\"), returns everything between any pair of quotes, and something like (?<=\{href="}).*?(?=\") does not work.

Stefan Badertscher
  • Do you need regex? Python should be able to parse the HTML for you. https://stackoverflow.com/questions/2782097/python-is-there-a-built-in-package-to-parse-html-into-dom – Lanting Apr 09 '17 at 10:36
  • Thanks. I have just started learning regex, and that's why I want to use it even in cases where other solutions are available. – Stefan Badertscher Apr 09 '17 at 12:22

1 Answer


If you want to match the <content> in href="<content>", the pattern to use is href=\"(.*?)\" (regex101 demo). The parentheses form a capturing group, so only the text between the quotes is returned.

With Python's re module, you can do:

>>> a= """
... [<a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie.jpg" title=""><img alt="Sprachschule EC San Diego" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie_d1def1bf4a.jpg" title="Sprachschule EC San Diego (Copyright EC San Diego. All rights reserved.)" width="80"/></a>, <a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_1_.jpg" title=""><img alt="Sprachschule EC San Diego 2" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie-_1__fd87630014.jpg" title="Sprachschule EC San Diego 2 (Copyright EC San Diego. All rights reserved.)" width="80"/></a>, <a class="lightbox" href="fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_10_.jpg" title=""><img alt="Sprachschule EC San Diego 3" border="0" height="80" src="typo3temp/pics/EC_San_Diego_Galerie-_10__a8ed60c277.jpg" title="Sprachschule EC San Diego 3 (Copyright EC San Diego. All rights reserved.)"
... 
... """
>>> import re
>>> re.findall(r'href=\"(.*?)\"',a)
['fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie.jpg', 'fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_1_.jpg', 'fileadmin/user_upload/images/Sprachen/Englisch/USA/San_Diego/San_Diego_EC/EC_San_Diego_Galerie-_10_.jpg']
>>> 
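
The lookbehind attempt from the question can also be made to work. In (?<=\{href="}) the \{ and } are matched as literal braces, so the lookbehind only succeeds when the text {href="} actually precedes the match; writing the prefix without braces fixes that. Below is a minimal sketch comparing both approaches. The sample string and its paths are shortened, made-up stand-ins for the real markup, and the variable names are only illustrative. Note that Python's re requires lookbehinds to have a fixed width, which href=" satisfies.

```python
import re

# Hypothetical, shortened sample; the paths are invented for illustration.
html = (
    '<a class="lightbox" href="images/a.jpg" title="">'
    '<img src="thumbs/a.jpg"/></a>, '
    '<a class="lightbox" href="images/b.jpg" title="">'
    '<img src="thumbs/b.jpg"/></a>'
)

# Capturing group: findall returns only what the group matched.
group_pattern = re.compile(r'href="(.*?)"')

# Lookbehind/lookahead: the prefix href=" is written literally, no braces.
lookbehind_pattern = re.compile(r'(?<=href=").*?(?=")')

group_hits = group_pattern.findall(html)
lookbehind_hits = lookbehind_pattern.findall(html)

print(group_hits)
print(lookbehind_hits)
```

Both patterns should give the same list here; the capturing group is usually the easier one to read and has no fixed-width restriction on the prefix.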

Hope this helps.

Ahsanul Haque