Get all substrings between two different start and ending delimiters

Question

I am trying, in Python 3, to get a list of all substrings of a given string that start after a delimiter x and end right before a delimiter y. I have found solutions that return only the first occurrence, but the result needs to be a list of all occurrences.

start = '>'
end = '</'
s = '<script>a=eval;b=alert;a(b(/XSS/.source));</script><script>a=eval;b=alert;a(b(/XSS/.source));</script>\'"><marquee><h1>XSS by Xylitol</h1></marquee>'
print((s.split(start))[1].split(end)[0])

The above example is what I've got so far, but I am looking for a more elegant and robust way to get all the occurrences.

So the expected return value, as a list, would contain the JavaScript code as the following entries:

a=eval;b=alert;a(b(/XSS/.source));

a=eval;b=alert;a(b(/XSS/.source));

Does this answer your question? [Parsing HTML using Python](https://stackoverflow.com/questions/11709079/parsing-html-using-python) — mkrieger1, Apr 22 '20 at 22:46
Sadly not... I am actually working with Beautiful Soup and Esprima. The input strings, on the other hand, don't necessarily contain a full HTML structure that could be parsed. They will rather be URLs which contain XSS payloads and can therefore contain JavaScript. I need to manually extract all tags out of the URL. — marcels93, Apr 22 '20 at 22:49

score 1 · Accepted Answer · answered Apr 22 '20 at 23:44

Looking for patterns in strings seems like a decent job for regular expressions. This should return a list of anything between a pair of <script> and </script>:

import re
pattern = re.compile(r'<script>(.*?)</script>')
s = '<script>a=eval;b=alert;a(b(/XSS/.source));</script><script>a=eval;b=alert;a(b(/XSS/.source));</script>\'"><marquee><h1>XSS by Xylitol</h1></marquee>'
print(pattern.findall(s))

Result:

['a=eval;b=alert;a(b(/XSS/.source));', 'a=eval;b=alert;a(b(/XSS/.source));']
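The same idea can probably be generalized to arbitrary start and end delimiters, as asked in the question. The following is only a minimal sketch: the helper name substrings_between is made up for illustration, re.escape is used so that delimiters containing regex metacharacters are matched literally, and re.DOTALL is an assumption that payloads may span several lines.

```python
import re

def substrings_between(text, start, end):
    """Return all non-overlapping substrings found between start and end."""
    # re.escape treats the delimiters as literal text, so characters such as
    # '(' or '.' in a delimiter are not interpreted as regex syntax.
    # (.*?) is non-greedy: it stops at the first end delimiter it meets.
    # re.DOTALL lets '.' match newlines too (an assumption about the input).
    pattern = re.compile(re.escape(start) + r'(.*?)' + re.escape(end), re.DOTALL)
    return pattern.findall(text)

s = '<script>a=eval;b=alert;a(b(/XSS/.source));</script><script>a=eval;b=alert;a(b(/XSS/.source));</script>\'"><marquee><h1>XSS by Xylitol</h1></marquee>'
print(substrings_between(s, '<script>', '</script>'))
```

Note that matches are found left to right and do not overlap, so very short delimiters such as '>' and '</' will also pick up text between unrelated tags. Passing the full tags as delimiters is likely the safer choice for this input.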

Thank you, this does exactly what I was hoping for! — marcels93, Apr 23 '20 at 02:21
