
Good day. This should be fairly straightforward, but my googling and experimenting have not worked.

I have a Python scraping script that uses Selenium/geckodriver/Firefox and runs on an Ubuntu 18 server. Sometimes Selenium crashes mid-script without closing properly and leaves many Web Content processes open. If they are not closed, they use up all the memory, Selenium can no longer start, and the script fails.

If I run pkill 'Web Content' from the command line, it kills those processes and frees up the memory.

In my Python script I use the subprocess module to try to automate this when Selenium crashes. I've tried a number of options, including:

  • subprocess.call("pkill 'Web Content'".split())
  • subprocess.call("pkill 'Web\ Content'".split())
  • subprocess.call("pkill Web\ Content".split())
  • subprocess.call("pkill -f Web\ Content".split())

And all of these throw the same error: pkill: only one pattern can be provided

Yet, if I do something like subprocess.call("pkill firefox".split()) the code is able to run without an error.

What must I do to resolve this issue? Thank you.

Reily Bourne
  • `subprocess.call(["pkill", "Web Content"])` – Marat Oct 09 '20 at 16:05
  • It appears to be working; I am just testing to see whether it is actually killing the process. I probably don't understand subprocess fully, as I thought it needed one word per list item as its input. – Reily Bourne Oct 09 '20 at 16:10
  • It is one argument per list item, not one word. That seems to be what caused the confusion. – Marat Oct 09 '20 at 16:18

2 Answers


You have 2 options:

Use subprocess.call("pkill 'Web Content'", shell=True) or

subprocess.call(shlex.split("pkill 'Web Content'"))

Option 1

From the docs:

On POSIX with shell=True, the shell defaults to /bin/sh. If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt.

str.split() with no arguments splits a Python string on whitespace and knows nothing about shell quoting:

>>> "pkill 'Web Content'".split()
['pkill', "'Web", "Content'"]

So subprocess.call("pkill 'Web Content'".split()) passes two pattern arguments to pkill, "'Web" and "Content'", while it expects only one. That is why the error pkill: only one pattern can be provided appears.
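To see what the child process actually receives, here is a minimal sketch that swaps pkill for a harmless stand-in: a second Python interpreter that prints its own arguments. The names SHOW_ARGS and received_args are made up for this illustration.

```python
import subprocess
import sys

# Stand-in for pkill (an assumption for illustration): a child Python
# process that prints the list of arguments it was started with.
SHOW_ARGS = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def received_args(extra_args):
    """Run the stand-in child with extra_args and return what it printed."""
    result = subprocess.run(
        SHOW_ARGS + extra_args, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# After str.split(), the child gets two arguments with stray quote characters.
print(received_args("'Web Content'".split()))

# Passed as one list element, the pattern arrives as a single argument.
print(received_args(["Web Content"]))
```

The first call mirrors what pkill was given in the failing attempts; the second mirrors subprocess.call(["pkill", "Web Content"]).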

Note that subprocess.call passes its arguments on to subprocess.Popen, whose signature begins:

subprocess.Popen(args,..

From the docs:

args should be a sequence of program arguments or else a single string or path-like object. By default, the program to execute is the first item in args if args is a sequence.

Also note the security considerations when using shell=True.

Option 2

If you want to supply the args sequence use shlex.split:

>>> s = "pkill 'Web Content'"
>>> import shlex
>>> args = shlex.split(s)
>>> import subprocess
>>> subprocess.call(args)

shlex.split splits the string s using shell-like syntax, so the quoted pattern stays a single argument.
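As a quick check, the sketch below runs shlex.split over several of the quoting styles tried in the question. The strings are written as raw strings here (an adjustment for this example) so that the backslash reaches shlex exactly as it would be typed at a shell prompt.

```python
import shlex

# Raw strings keep the backslash as a literal character in the Python source.
variants = [
    "pkill 'Web Content'",
    r"pkill Web\ Content",
    r"pkill -f Web\ Content",
]

# Parse each command line the way a POSIX shell would tokenize it.
parsed = [shlex.split(v) for v in variants]

for command, args in zip(variants, parsed):
    print(f"{command!r} -> {args}")
```

In each case the pattern comes out as the single element 'Web Content', which is what pkill expects.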

Which option to use is up to you. Note the relevant information from this answer:

Understand shell=True vs shell=False

With shell=True you pass a single string to your shell, and the shell takes it from there.

With shell=False you pass a list of arguments to the OS, bypassing the shell.

When you don't have a shell, you save a process and get rid of a fairly substantial amount of hidden complexity, which may or may not harbor bugs or even security problems.

On the other hand, when you don't have a shell, you don't have redirection, wildcard expansion, job control, and a large number of other shell features.
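The two routes can be compared side by side. This is a sketch under assumptions: it targets a POSIX system, it uses a child Python interpreter that prints its arguments in place of pkill, and CHILD_CODE and pattern are names invented for the example. shlex.quote is used to build the shell string so that the shell's own parsing reproduces the original arguments.

```python
import shlex
import subprocess
import sys

CHILD_CODE = "import sys; print(sys.argv[1:])"  # stand-in for pkill
pattern = "Web Content"

# shell=False (the default): each list element reaches the child unchanged.
direct = subprocess.run(
    [sys.executable, "-c", CHILD_CODE, pattern],
    capture_output=True, text=True, check=True,
)

# shell=True: a single string is handed to /bin/sh, so every piece has to be
# quoted for the shell; shlex.quote() does that quoting.
command = " ".join(
    shlex.quote(part) for part in [sys.executable, "-c", CHILD_CODE, pattern]
)
via_shell = subprocess.run(
    command, shell=True, capture_output=True, text=True, check=True
)

print(direct.stdout.strip())
print(via_shell.stdout.strip())
```

Both calls deliver the same single argument to the child; the shell route just starts an extra process and adds a quoting step to get there.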

rok
  • Supplying `shell=True` just to get the shell to perform splitting of the command back into a list of strings seems like severe overkill. See also [Actual meaning of `shell=True` in `subprocess`](https://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess) – tripleee Oct 10 '20 at 09:58
  • Thanks. I've offered 2 options. Second uses `shlex.split`. – rok Oct 10 '20 at 10:21

Manually splitting the command line yourself simplifies the entire question significantly.

subprocess.run(['pkill', 'Web Content'], check=True)

If you genuinely need Python to perform the splitting, shlex.split() implements the rules which you assumed Python's regular split function would obey. It doesn't; str.split() simply splits on whitespace (or on the separator you supply), with no support for quoting or escaping.
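For the cleanup step described in the question, one possible shape is a small helper like the sketch below. The name kill_by_name and its runner parameter are hypothetical, and the exit-status handling assumes the procps pkill convention (0 when something matched, 1 when nothing matched), which is worth confirming against the man page on the target server.

```python
import subprocess


def kill_by_name(pattern, runner=subprocess.run):
    """Ask pkill to signal processes whose name matches pattern.

    Returns True if pkill reported a match and False if nothing matched.
    `runner` is a hypothetical injection point so the helper can be tried
    without killing anything.
    """
    # The pattern is a single list element, so the space in it is preserved.
    result = runner(["pkill", pattern])
    if result.returncode == 0:
        return True
    if result.returncode == 1:  # assumed pkill convention: no process matched
        return False
    raise RuntimeError(f"pkill failed with exit status {result.returncode}")
```

Calling kill_by_name("Web Content") from the handler that catches the Selenium crash would then clean up without failing when no stray processes exist. Under the same exit-status assumption, check=True would raise CalledProcessError in that no-match case, so it may be stricter than a cleanup step needs.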

tripleee