I'm using a regex to find all selectors in CSS files, and sometimes it runs for minutes. After looking at the files, I found that the sourceMappingURL value is very large and causes the issue:

sourceMappingURL=data:application/json;charset=utf8;base64,eyJ2ZXJzaW9uIjozLCJzb3VyY2VzIjpbIndvb2QuZnVsbC5taW4uY3NzIl0sIm5hbWVzIjpbXSwibWFwcGluZ3MiOiJpQkFFQSw4QkFBOEIsU0FBUyxPQUFPLGlCQUFpQixPQUFPLEtBQUssb0JBQW9CLEtBQUssUUFBUSxPQUFPLEVBQUUsU0FBUyxtQkFBbUIsSUFBSSxRQUFRLFdBQVcsT0FBTyxvQkFBb0IsNEJBQTRCLE9BTyxL...

Here's the full CSS file: https://jsfiddle.net/jj_jaq/32d7hpc0/3/

Here's my regex:

selectors = re.findall(r'([.#\w][-\w,\s.]+)(\{(.*?)\})', content)

Is there a way to speed up my regex?

jjyoh

1 Answer

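You can tell the regex engine to anchor each match at a left-hand word boundary, so it no longer tries to start a match at every character inside a long run of word characters such as the base64 source map. However, just adding \b won't work, because the first character you want to match can also be a . or #, which are non-word characters.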

Use

[.#]?\b([-\w,\s.]+){([^{}]*)}

Here, [.#]? matches an optional . or # before the word boundary check, and [^{}]* replaces .*? so the rule body can never run past a closing brace.
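As a rough sketch of how this could be dropped into the Python code from the question: the names SELECTOR_RE, find_rules, sample and with_map are made up for illustration, the braces are escaped only for readability, and the 50,000-character run of "A" is a stand-in for a real base64 source map, not data from the linked file.

```python
import re

# Pattern from the answer; braces escaped so they clearly read as literals.
SELECTOR_RE = re.compile(r'[.#]?\b([-\w,\s.]+)\{([^{}]*)\}')


def find_rules(content):
    """Return (selector, body) tuples for each rule found in content."""
    return SELECTOR_RE.findall(content)


# Hypothetical sample input, not taken from the linked CSS file.
sample = ".btn{color:red}#main .nav{margin:0}"
rules = find_rules(sample)

# Stand-in for a large inline source map: one long run of word characters.
with_map = (sample
            + "/*# sourceMappingURL=data:application/json;base64,"
            + "A" * 50000
            + " */")
rules_with_map = find_rules(with_map)

print(rules)
print(rules_with_map == rules)
```

Note that the leading . or # sits outside the first capture group, so group 1 holds "btn" rather than ".btn". If you need the prefix, one option is to wrap [.#]? and the selector part in a single group.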

Wiktor Stribiżew