2

I am trying to capture console output and find a keyword in it. Which approach should be faster? For example: run the "ls" command through Python and check whether a keyword appears in the console output.

1st approach: keep a list of strings and search each string in turn

list_output = ["abc", "def", "xyz"]
for i in list_output:
    # stop at the first item that contains the keyword
    if "ab" in i:
        print("found")
        break

Please note: Each item in list_output might be a big string itself

2nd approach: keep one big string and search for a substring in it

string_output = "abcdefxyz"
if "ab" in string_output:
    print("found")

Please note: string_output could be a huge string

I need to stop searching as soon as the first occurrence is found; there is no need to search the rest of the string or list.

Pradip Das
  • I think first approach is fine. Second approach can cause memory issues. – nice_dev Feb 10 '20 at 15:44
  • You don't want to use the second because with some bad input, e.g. `efx` you could get "found" when it would probably be wrong. – Guy Coder Feb 10 '20 at 15:57
  • How big are these "big strings"? Millions or billions of bytes? What are you splitting them on? If this is the output of `ls`, I'd expect it to be pretty small (normally much less than a MB). Also, why not use `pathlib`? – Sam Mason Feb 10 '20 at 16:05
  • "ls" is just a sample command; it could be any Linux command. And yes, the output of those commands should not go beyond 1 MB. I am splitting on "\n". – Pradip Das Feb 10 '20 at 16:29
  • Instead of piping the output of `ls` straight to python, why not pass it through `grep` first to extract the terms you're looking for? It's bound to be faster at matching strings than anything you could write in python. – r3mainer Feb 10 '20 at 16:41
  • Need to show the entire output and then print message if the keyword is in the output – Pradip Das Feb 11 '20 at 04:55
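
For the requirement in the last comment (show the entire output, then report whether the keyword is present), a minimal sketch might look like the following. It assumes Python 3.7+ and a command whose output fits comfortably in memory; `run_and_check` and `demo_command` are hypothetical names used only for illustration, and the demo command is a stand-in so the sketch runs without relying on `ls`.

```python
import subprocess
import sys

def run_and_check(command, keyword):
    """Run a command, print all of its stdout, then report whether keyword occurs."""
    # capture_output/text need Python 3.7+; stdout comes back as one string
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout
    print(output, end="")        # show the entire output first
    found = keyword in output    # one substring search, stops at the first match
    if found:
        print("found")
    return found

# Stand-in command (an assumption); replace with e.g. ["ls"] on Linux
demo_command = [sys.executable, "-c", "print('abc'); print('def'); print('xyz')"]
run_and_check(demo_command, "ab")
```

Since the output is already one string here, searching it directly avoids the split on "\n" altogether.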

1 Answer

-1

According to this, your second approach is much faster than the first one: the first approach does not set up a hash table on the fly, it just does a linear search through the list items one by one, whereas the second runs a single substring search over one string.

Also (still referring to the link above), if using a collection is a must, a set is better than a list.
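
To check the claim on your own data rather than take it on faith, here is a rough timing sketch. It assumes Python 3; the function names and the sample data are made up for illustration, and the numbers will vary by machine and by how large each line is. Both versions stop at the first match.

```python
import timeit

def found_in_list(lines, keyword):
    # any() short-circuits at the first line that contains the keyword
    return any(keyword in line for line in lines)

def found_in_string(text, keyword):
    # a single substring search over the whole output
    return keyword in text

# Hypothetical sample: 10,000 lines, keyword only on the last one (worst case)
lines = ["some output line %d" % i for i in range(10000)] + ["needle here"]
text = "\n".join(lines)

print(found_in_list(lines, "needle"))
print(found_in_string(text, "needle"))
print("list  :", timeit.timeit(lambda: found_in_list(lines, "needle"), number=100))
print("string:", timeit.timeit(lambda: found_in_string(text, "needle"), number=100))
```

Note that joining the items with "\n" (instead of concatenating them directly) keeps a keyword without a newline from matching across two neighbouring items, which is the false positive raised in the comments on the question.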

Aven Desta
  • Did you read the question? You reference an answer using a set, but then reference the OP idea which is one long string. – Guy Coder Feb 10 '20 at 16:00
  • @GuyCoder Yes, I've read the question, and the link has multiple answers. Check out the second answer; it's about strings. – Aven Desta Feb 10 '20 at 16:10
  • With this approach there is an overhead to convert the big string into a set. Could you please provide an example? – monster Oct 09 '20 at 16:13
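
As a small illustration for the set question above (a sketch with made-up variable names, not the linked answer's code): a set only speeds up exact matches of whole items. A substring test still has to scan every element, so building the set is pure overhead for this use case.

```python
lines = ["abc", "def", "xyz"]
line_set = set(lines)  # building the set costs one pass over all lines

# Hash lookup: fast, but only true when a whole item equals the key
exact_match = "abc" in line_set

# "ab" is not an element of the set, so the hash lookup fails
substring_via_lookup = "ab" in line_set

# A substring search over a set is still a linear scan of its elements
substring_via_scan = any("ab" in line for line in line_set)

print(exact_match, substring_via_lookup, substring_via_scan)
```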