0

Regular Expressions: Search in list

I want to have a fast way to match a regex in a list. The above solution has a linear complexity wrt the length of the list. Given things like suffix-array or suffix-tree for pure string search, is there something similar for regex search in a list that can have a sublinear complexity in the length of the list? (I am currently interested in a python solution. Solutions in other langauges can be mentioned, but not a priority here.) Thanks.

user1424739
  • 11,937
  • 17
  • 63
  • 152
  • Just making sure: you've already pre-compiled and pre-optimized your regex patterns and still the performance is too low for you? – Shloim Nov 03 '19 at 14:25
  • My question is that the solution should have a sublinear time complexity wrt the list length. No matter how you optimize the regex, the current solution has a linear complexity wrt the length of the list, which is not what I am looking for. – user1424739 Nov 03 '19 at 14:29
  • How can it be sublinear if each element in the list must be processed? Does your data have particular qualities that might allow for this? – snakecharmerb Nov 03 '19 at 14:30
  • When you say "sublinear", you mean faster than O(n) using "Big O notation" where n is the size of the list? – Booboo Nov 03 '19 at 14:31
  • What would the regex consist of ? If you're searching using a string literal that is exactly one of the elements, it can be done a different way. Set up a full blown trie of all the elements of the array, then just search using the term. Would consist of a max of 5 comparisons to match or fail, no matter how many elements there are. –  Nov 03 '19 at 15:52
  • Not in the general case. Matching a single character in a list can't be done in sublinear time. And this is a (very simple) regex. – m.raynal Dec 11 '19 at 16:30

0 Answers0