Is there a maximum length limit for a Python 3 regular expression?

I have a long list of file extensions that I need to validate. The following is the regex I intend to use:

.*\.(adm|asa|asmx|asp|aspx|axd|bak|bas|bat|bkp|cc|cfg|cfm|cfml|cgi|conf|copy|cs|css|csv|dat|dbm|dll|dn|do|docx|dot|env|eot|exe|exp|fcgi|fts|gif|gz|htm|html|htr|htw|ico|ida|idq|idx|inc|ini|iso|jar|jpeg|jpg|js|jsf|json|jsp|kspx|lock|log|mako|mdb|meta|mo|mp3|mp4|mwsl|nasl|nlm|nsf|old|ovpl|page|pdf|perl|php|php3|php4|pid|pl|plx|png|pot|pt|qap|rar|rs|sh|sql|stm|svg|swp|tar|tcl|temp|test|tmp|ttf|txt|vb|vtl|w|wdm|web|woff|xml|xsl|xsql|zip)$

Python 3's re.compile appears to truncate the compiled regex. The following is the output of re.compile:

>>> re.compile('.*\.(adm|asa|asmx|asp|aspx|axd|bak|bas|bat|bkp|cc|cfg|cfm|cfml|cgi|conf|copy|cs|css|csv|dat|dbm|dll|dn|do|docx|dot|env|eot|exe|exp|fcgi|fts|gif|gz|htm|html|htr|htw|ico|ida|idq|idx|inc|ini|iso|jar|jpeg|jpg|js|jsf|json|jsp|kspx|lock|log|mako|mdb|meta|mo|mp3|mp4|mwsl|nasl|nlm|nsf|old|ovpl|page|pdf|perl|php|php3|php4|pid|pl|plx|png|pot|pt|qap|rar|rs|sh|sql|stm|svg|swp|tar|tcl|temp|test|tmp|ttf|txt|vb|vtl|w|wdm|web|woff|xml|xsl|xsql|zip)$')
re.compile('.*\\.(adm|asa|asmx|asp|aspx|axd|bak|bas|bat|bkp|cc|cfg|cfm|cfml|cgi|conf|copy|cs|css|csv|dat|dbm|dll|dn|do|docx|dot|env|eot|exe|exp|fcgi|fts|gif|gz|htm|html|htr|htw|ico|ida|idq|idx|inc|ini|iso|jar|jp)

Is there an alternative or a better way of validating this if there is a max length limit?

Som
  • https://stackoverflow.com/questions/3640359/regular-expressions-search-in-list, this might answer your question. Put all your extensions in a list to match the pattern. – NMAK Dec 19 '19 at 12:29
  • Unless you get an `OverflowError: regular expression code size limit exceeded`, the string should not be too long. – jerch Dec 19 '19 at 12:30
  • Thanks, @wiktor-stribiżew. The compiled regular expression seems to indicate that the pattern is truncated; however, when a match is performed, it matches against the entire pattern, as mentioned here: https://stackoverflow.com/a/30222089/1162305. – Som Dec 19 '19 at 12:34
  • @Som So it is fine, right? – Wiktor Stribiżew Dec 19 '19 at 12:37
  • Yes, it is fine. – Som Dec 23 '19 at 05:42
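
The conclusion reached in the comments can be checked with a minimal sketch, assuming CPython, where only the repr() shown in the interactive prompt is shortened while the compiled pattern itself stays complete. The second half sketches the list-based idea from the first comment as a set lookup. The names EXTENSIONS, ALLOWED and has_allowed_extension are hypothetical and chosen only for illustration.

```python
import re

# Extension list copied from the question.
EXTENSIONS = (
    "adm|asa|asmx|asp|aspx|axd|bak|bas|bat|bkp|cc|cfg|cfm|cfml|cgi|conf|copy|"
    "cs|css|csv|dat|dbm|dll|dn|do|docx|dot|env|eot|exe|exp|fcgi|fts|gif|gz|"
    "htm|html|htr|htw|ico|ida|idq|idx|inc|ini|iso|jar|jpeg|jpg|js|jsf|json|"
    "jsp|kspx|lock|log|mako|mdb|meta|mo|mp3|mp4|mwsl|nasl|nlm|nsf|old|ovpl|"
    "page|pdf|perl|php|php3|php4|pid|pl|plx|png|pot|pt|qap|rar|rs|sh|sql|stm|"
    "svg|swp|tar|tcl|temp|test|tmp|ttf|txt|vb|vtl|w|wdm|web|woff|xml|xsl|"
    "xsql|zip"
)

# Raw strings avoid the invalid-escape warning that '\.' triggers in a
# normal string literal on recent Python versions.
pattern = re.compile(r".*\.(" + EXTENSIONS + r")$")

# Assumption: CPython. The repr is shortened for display only.
print(len(repr(pattern)), len(pattern.pattern))

# The .pattern attribute still holds the full source, including the tail.
print(pattern.pattern.endswith("|zip)$"))        # True

# Matching uses the whole pattern, including alternatives past the cut-off.
print(bool(pattern.match("archive.zip")))        # True
print(bool(pattern.match("archive.unknown")))    # False

# Hypothetical alternative without a regex: a set lookup on the text after
# the last dot. Like the regex above, it is case-sensitive.
ALLOWED = frozenset(EXTENSIONS.split("|"))


def has_allowed_extension(filename):
    """Return True if the text after the last '.' is an allowed extension."""
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext in ALLOWED


print(has_allowed_extension("report.xsql"))      # True
print(has_allowed_extension("zip"))              # False, there is no dot
```

For ordinary single-line filenames the set lookup should accept the same names as the regex, and the extension list can then grow without any concern about pattern size. Names with a trailing newline are one case where the two differ, since `$` also matches just before a final newline.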
