I'm on Python 2.6, whose old version of toprettyxml() doesn't format my XML as expected, so I'm trying to call xmllint using subprocess instead. This is my simplified code:

      import shlex
      import subprocess

      xmlParseCmd = "xmllint -format - <<< '%s'" % '<?xml version="1.0" encoding="UTF-8"?> <insertion> <mytag>123456</mytag> <mytag2>789</mytag2> </insertion>'
      print shlex.split(xmlParseCmd)
      pxmlParser = subprocess.Popen(shlex.split(xmlParseCmd), stdout=subprocess.PIPE)
      pretty_xml = pxmlParser.communicate()[0]
      print pretty_xml

The program hangs indefinitely after the output below. I guess it's waiting for some input.

 -> python ~/myscripts/resources/test_xtract.py
['xmllint', '-format', '-', '<<<', '<?xml version="1.0" encoding="UTF-8"?> <insertion> <mytag>123456</mytag> <mytag2>789</mytag2> </insertion>']

I've used a here string as input for xmllint, so why is it still waiting for input? I've been trying to debug this but haven't found anything concrete that solves it. Any pointers would be a great help.

tripleee
vibz
  • Don't use `Popen()` if you really want `check_output` (or with modern Python 3.6+ simply `run()`). – tripleee Dec 15 '17 at 10:37
  • Even if you are stuck on Python 2, you *really* should consider moving to 2.7. For new development you *really* *really* want to target Python 3. Py2 was slated to be end-of-lifed next year, though it was put on an additional couple of years of terminal care. – tripleee Dec 15 '17 at 10:47

1 Answer

The here string <<< is a shell construct. shlex.split() splits the command line into arguments the way the shell would, so you don't need shell=True, but shlex doesn't know (and couldn't know) whether what you are parsing still requires the shell, which of course is exactly the problem here. Without a shell, <<< and the XML string are handed to xmllint as two ordinary arguments, and the lone - tells xmllint to read standard input, so it sits there waiting for input that never arrives.
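To see this for yourself, here is a minimal sketch with a shortened XML string (the `<a>1</a>` payload is just a stand-in for the real document). It shows that shlex.split() treats `<<<` as one more word and performs no redirection:

```python
import shlex

# shlex.split() only tokenizes; it has no idea that <<< means
# "redirect this string to stdin" in Bash.
args = shlex.split("xmllint -format - <<< '<a>1</a>'")
print(args)

# The here-string operator survives as a literal argument.
print('<<<' in args)
```

Passing a list like this to Popen() without shell=True hands every element to xmllint as a plain argument.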

If you are really desperate, you can of course call on the shell to simply print a string (in which case, take out shlex and pass the long string with shell=True), but, you know, Python can do that too.

from subprocess import run, PIPE

xml = '<?xml version="1.0" encoding="UTF-8"?> <insertion> <mytag>123456</mytag> <mytag2>789</mytag2> </insertion>'
xmllint = run(['xmllint', '-format', '-'], input=xml, stdout=PIPE, universal_newlines=True)
print(xmllint.stdout)

With this simple static command, shlex is kind of overkill, though it will of course save you from figuring out exactly how the shell will parse your command line. I just hard-coded the command here.

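If you are really stuck on Python 2, consider switching to 2.7, which has subprocess.check_output(). It does pretty much the same thing, though the interface is somewhat clunkier; in particular, the 2.7 version has no input argument, so you still have to supply standard input yourself.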

If you are really stuck on Python 2.6, then you have to interact with Popen() directly, and the process will be pretty much like in your existing code. You have two options. Either pass the input through communicate(), as in p = Popen(['xmllint', etc], stdin=PIPE, stdout=PIPE); p.communicate('string'), or cave in to the sinful temptation of Popen("xmllint etc <<<'%s'" % string, shell=True). In the latter case, without shlex, you'll have to think about how to escape any single quotes in the input string, or live with the fact that they will cause a syntax error. Here strings are also a Bash feature, so a plain /bin/sh may reject them outright. So maybe the temptation isn't very strong here, when the first alternative is so much clearer and simpler.
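The first alternative could look roughly like the sketch below. The helper name `pipe_through` is made up for this illustration, and the code is written so that it should run unchanged on both 2.6 and current Python 3, which is why it encodes and decodes explicitly rather than relying on universal_newlines:

```python
from __future__ import print_function  # lets print() work on 2.6 as well

from subprocess import Popen, PIPE


def pipe_through(cmd, text):
    """Feed text to cmd on stdin and return what it writes to stdout."""
    # stdin=PIPE is required; without it communicate() cannot deliver input.
    proc = Popen(cmd, stdin=PIPE, stdout=PIPE)
    out, _ = proc.communicate(text.encode('utf-8'))
    if proc.returncode != 0:
        raise RuntimeError('%s exited with status %d' % (cmd[0], proc.returncode))
    return out.decode('utf-8')


xml = ('<?xml version="1.0" encoding="UTF-8"?> <insertion> '
       '<mytag>123456</mytag> <mytag2>789</mytag2> </insertion>')

try:
    print(pipe_through(['xmllint', '-format', '-'], xml))
except OSError:
    # Raised when the xmllint binary cannot be found.
    print('xmllint is not installed or not on PATH')
```

No shell is involved, so quotes in the XML need no escaping at all.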

tripleee
  • Here's the scoop on why you want to avoid `shell=True` if you can: https://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess – tripleee Dec 15 '17 at 10:55
  • thanks for the details, but for the time being I'll have to be content with 2.6. Is there any other way to do this with python 2.6? – vibz Dec 15 '17 at 15:22
  • Other than either passing `Popen(cmd, input="string")` or with `shell=True` like this answer suggests...? Why can't you use either of those? – tripleee Dec 15 '17 at 19:14
  • Expanded that part of the answer; maybe it was unclear before. – tripleee Dec 15 '17 at 19:24
  • Thanks for the updates. `Popen("xmllint -format - <<< '%s'" % xml, shell=True)` gives me `/bin/sh: syntax error at line 1: '<' unexpected error`. What am I doing wrong here? My variable is in `xml=' 123456 789 '` format – vibz Dec 18 '17 at 10:44
  • `sh` doesn't support here strings. I'll repeat my strong recommendation to avoid the shell if you can, especially if you are not intimately familiar with it. – tripleee Dec 18 '17 at 10:49
  • Hmm. Meanwhile, I don't think there is an `input` argument for Popen, at least in Python 2.6. That's why I'm left with option 2, using the shell – vibz Dec 18 '17 at 11:17
  • Right, sorry about that oversight. The way to do that is to pass the input as an argument to `communicate()`. I updated the answer slightly to reflect this. – tripleee Dec 18 '17 at 11:22