I am running a Java tool called SnpSift and would like to run it on multiple files. To do so, I am using a Python script with subprocess. The command I am trying to build as a string is:

java -jar SnpSift.jar filter "ANN[0].EFFECT has 'variant'" input.vcf > ~/output.vcf

This command is correct as I have used it directly on the command line myself. I have created a list called variantType which contains strings of different variants that I intend to use as a variable when running snpSift.

I am trying to create another list (command) that will contain the entire command line input as a string for each file and each variantType. My script is below:

command = [] 
for file in os.listdir("filepath"):  
    absfile = os.path.abspath(file)  
    if(file.endswith(".vcf")):
        for i in variantType:  
            w = 'java -jar SnpSift.jar filter "'
            x = "ANN[0].EFFECT has "  
            y = "'" + i + "'"
            z = '" ' + absfile +  " > output." + i + "." + str(file)
            command.append(w+x+y+z)

My issue is that the input for the filter has to have quotation marks in this manner: "ANN[0].EFFECT has 'variant'"

My attempts to do this using double quotation marks have failed, and result in the following output:

'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'transcript_ablation\'" input.vcf > ~output.vcf'

How can I remove those '\' characters? If I print y (the variable containing that part of the entire string) these characters are not printed, but when I print the entire command, they are there, and therefore I cannot run the command properly.

EDIT

When I use:

print(command[0])

This prints the desired command (without the '\').

It is only when I use:

command[0]

That the issue occurs.

spiral01

1 Answer

I am not yet allowed to comment, but maybe this answer still helps you.

You can check which character is at a specific position in your string by accessing that character directly (somestr[position]). I ran your code example (slightly modified, e.g. using format to create the command) and tested which character is at position 48. It turns out that it is indeed a single quote, with ASCII code 39:

import os

variantType = ['variant', 'test']

command = []

absfile = "/tmp.vcf"
file = "tmp.vcp"

for i in variantType:
    w = 'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'{v}\'" input.vcf > ~/output.vcf'.format(v=i)

    command.append(w)

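# Printing the list itself shows the repr of each element.
print(command)
# Joining the elements shows their str form.
print(", ".join([str(x) for x in command]))

for x in command:
    print(repr(x))     # representation with escaped quotes
    print(str(x))      # the actual string content
    print(x[48])       # the character at position 48
    print(ord(x[48]))  # its ASCII code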

The output then is as follows:

['java -jar SnpSift.jar filter "ANN[0].EFFECT has \'variant\'" input.vcf > ~/output.vcf', 'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'test\'" input.vcf > ~/output.vcf']
java -jar SnpSift.jar filter "ANN[0].EFFECT has 'variant'" input.vcf > ~/output.vcf, java -jar SnpSift.jar filter "ANN[0].EFFECT has 'test'" input.vcf > ~/output.vcf
'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'variant\'" input.vcf > ~/output.vcf'
java -jar SnpSift.jar filter "ANN[0].EFFECT has 'variant'" input.vcf > ~/output.vcf
'
39
'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'test\'" input.vcf > ~/output.vcf'
java -jar SnpSift.jar filter "ANN[0].EFFECT has 'test'" input.vcf > ~/output.vcf
'
39

As you can see, for each command, the first print statement produces the "malformed" string representation (using the object's __repr__ method), and the second produces the nicely formatted one (using the object's __str__ method).

When you call print(somelist), Python prints the repr of each object in somelist, so the printed text could be passed to eval to recreate the list. The same happens when you evaluate command[0] at the interactive prompt: the backslashes belong to the displayed representation only, not to the string itself.
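A minimal sketch of the difference, using one of the command strings from above (the variable names are only for illustration):

```python
cmd = 'java -jar SnpSift.jar filter "ANN[0].EFFECT has \'variant\'" input.vcf'

shown_by_repr = repr(cmd)  # what the interactive prompt and print(a_list) display
shown_by_str = str(cmd)    # what print(cmd) displays

print(shown_by_repr)
print(shown_by_str)

# The backslash exists only in the repr text, never in the string itself.
print("\\" in shown_by_repr)
print("\\" in cmd)
```

The last two lines check for a backslash in the representation and in the string; only the representation contains one.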

Check the following links for more information on the difference between __repr__() and __str__():

mjoppich
  • Thanks for your reply. Whilst what you recommend works when printing a statement, when I wish to run the subprocess by using: 'for i in command: subprocess.call(i)' the same issue with the \ occurs. – spiral01 Jun 13 '17 at 17:54
  • You can try altering my example by replacing the assignment with `w = '/bin/echo "\'{v}\'"'.format(v=i)`. Then launch this command in the for loop with `print(subprocess.call(x, shell=True))` (the shell=True is important, as otherwise subprocess will interpret _x_ differently). What I get as output from this call to _echo_ is `'test'`. Does that work for you? What is the error message in detail? – mjoppich Jun 13 '17 at 20:19
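Without shell=True, subprocess.call on POSIX treats a single string as the name of the program to run, so the quotes and the `>` redirection are never interpreted. One alternative to shell=True, sketched here under assumptions, is to pass the arguments as a list and redirect output with the stdout parameter. The helper name run_to_file and the output path are hypothetical, and a small Python child process stands in for the java command so the sketch runs without SnpSift installed:

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(args, out_path):
    """Run args (a list, no shell involved) and write its stdout to out_path."""
    with open(out_path, "w") as out:
        return subprocess.call(args, stdout=out)


# The filter expression is passed as ONE list element, so no outer quotes are needed.
expr = "ANN[0].EFFECT has 'variant'"

# Stand-in for: ["java", "-jar", "SnpSift.jar", "filter", expr, "input.vcf"]
# The child process simply prints the argument it received.
args = [sys.executable, "-c", "import sys; print(sys.argv[1])", expr]

out_path = os.path.join(tempfile.mkdtemp(), "output.variant.vcf")  # hypothetical path
status = run_to_file(args, out_path)

with open(out_path) as fh:
    result = fh.read().strip()
print(status, result)
```

Because no shell parses the command, the single quotes inside the expression reach the child process unchanged.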