I have this text:
a href="#" class="s-navigation--item js-gps-track js-products-menu" aria-controls="products-popover" data-controller="s-popover" data-action="s-popover#toggle" data-s-popover-placement="bottom" data-s-popover-toggle-class="is-selected" data-gps-track="top_nav.products.click({location:2, destination:1})" data-ga="["top navigation","products menu click",null,null,null]" aria-expanded="false"
With this regex:
attr_regex = '(?:\w+[-\.]*)+(?:=+[\'\"][\w\d\s:;,$@#!\[\]^&?%*\/+(){}.=-]*[\'\"])*'
I want to split this text into its individual attributes, keeping each attribute name together with its complete quoted value.
Instead, the Python code returns this list:
['a', 'aria-controls="products-popover"', 'aria-expanded="false"', 'class="s-navigation--item js-gps-track js-products-menu"', 'data-action="s-popover#toggle"', 'data-controller="s-popover"', 'data-ga', 'top', 'navigation', 'products', 'menu', 'click', 'null', 'null', 'null', 'data-gps-track="top_nav.products.click({location:2, destination:1})"', 'data-s-popover-placement="bottom"', 'data-s-popover-toggle-class="is-selected"', 'href="#"']
As you can see, the value of data-ga is broken up into separate items ('top', 'navigation', 'products', 'menu', 'click', 'null', ...), even though all of those words belong inside that one attribute's value.
Python code:
import re

# text holds the tag shown above
elements = re.findall(attr_regex, str(text))
print(elements)
Using a raw string for the pattern doesn't fix the problem.
How can I fix this, and, better still, how can I make the regex work reliably on any text of this kind?
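
For comparison, one direction that avoids the regex altogether is to let an HTML parser split the attributes. The following is only a minimal sketch under assumptions that are not confirmed by the text above: that the real input is a complete tag including the angle brackets, and that the inner quotes of data-ga are escaped as &quot; in the source. The names AttrCollector, extract_attrs and sample are hypothetical, and the sample tag is a shortened stand-in for the real one.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects (name, value) pairs from every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples; entities such as
        # &quot; inside values have already been unescaped by the parser.
        self.attrs.extend(attrs)


def extract_attrs(tag_text):
    """Return the attributes of the tag(s) in tag_text, in document order."""
    parser = AttrCollector()
    parser.feed(tag_text)
    parser.close()
    return parser.attrs


# Assumed input: a complete tag, with inner quotes escaped as &quot;
sample = ('<a href="#" class="s-navigation--item js-gps-track" '
          'data-ga="[&quot;top navigation&quot;,null]" aria-expanded="false">')

for name, value in extract_attrs(sample):
    print(name, '=', value)
```

With this approach each value stays in one piece regardless of which punctuation it contains, so there is no character class to maintain. If the inner quotes are not escaped in the actual input, the tag is ambiguous and neither a parser nor a regex can be expected to split it reliably.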