
I am having trouble calling an EMBOSS program (which runs via command line) called sixpack through Python.

I run Python on Windows 7, with Python version 3.2.3, Biopython version 1.59, and EMBOSS version 6.4.0.4. Sixpack translates a DNA sequence in all six reading frames and creates two output files: a sequence file identifying ORFs, and a file containing the protein sequences.

There are three required arguments, which I can pass successfully from the command line: -sequence [input file], -outseq [output sequence file], and -outfile [protein sequence file]. I have been using the subprocess module instead of os.system because I have read that it is more powerful and versatile.

The following is my Python code, which runs without error but does not produce the desired output files.

from Bio import SeqIO
import re
import os
import subprocess

infile = input('Full path to EXISTING .fasta file would you like to open: ')
outdir = input('NEW Directory to write outfiles to: ')
os.mkdir(outdir)
for record in SeqIO.parse(infile, "fasta"):

    print("Translating (6-Frame): " + record.id)

    ident=re.sub("\|", "-", record.id)

    print (infile)
    print ("Old record ID: " + record.id)
    print ("New record ID: " + ident)

    subprocess.call (['C:\memboss\sixpack.exe', '-sequence ' + infile, '-outseq ' + outdir + ident + '.sixpack', '-outfile ' + outdir + ident + '.format'])

    print ("Translation of: " + infile + "\nWritten to: " + outdir + ident)
user1426421
  • Does it give an error? Is the command exactly the same as you would type it on the command-line? – Simeon Visser Jul 09 '12 at 15:49
  • Well, not exactly the same... I am using (strings, variables?) such as 'infile', 'outdir' and 'ident' to represent what would be the actual path and filenames that would be typed into command line. Can the subprocess module handle these strings/ variables? – user1426421 Jul 09 '12 at 16:45
  • Using variables is fine but what I mean is: if I would copy and paste the command from your code (which is a string basically) to the actual command-line, would it work? You may need to specify full paths instead of relative paths for file access to work. – Simeon Visser Jul 09 '12 at 16:50
  • command line: sixpack -sequence c:\python32\multi.fasta -outseq c:\memboss\OUTSEQ.sixpack -outfile c:\memboss\OUTFILE.sixpack – user1426421 Jul 09 '12 at 16:55
  • What is the difference between a relative path and full path? As long as the user types the full path when prompted, that full path gets assigned to the variable 'infile', 'outdir', etc... – user1426421 Jul 09 '12 at 16:57
  • A relative path is a part of a full path (such as `folder1\folder2\file.ext`). Anyway, if you run that command, does it work? Is the directory writeable for the output files? Does anything happen that should not happen? – Simeon Visser Jul 09 '12 at 17:00
  • The command "subprocess.call (['C:\memboss\sixpack.exe', '-sequence c:\python32\multi.fasta -outseq c:\memboss\OUTSEQ.sixpack -outfile c:\memboss\OUTFILE.sixpack'])" does not work when it replaces the above code. – user1426421 Jul 09 '12 at 17:06
  • No, the command does not work if I copy/paste into command line. That makes sense though, because those variables have no meaning outside of python – user1426421 Jul 09 '12 at 17:07
  • You shouldn't copy/paste with variables of course. I mean the command that your Python code produces. Either way, "it doesn't work" won't tell us much. Can you try using subprocess to call sixpack without any parameters? If that works, you should get an error from sixpack saying that it is missing some parameters. – Simeon Visser Jul 09 '12 at 17:10
  • Is there a method to see what errors are being produced? Or a way to see what the command that Python is producing? – user1426421 Jul 09 '12 at 17:14
  • IOError: [Errno 22] Invalid argument: 'C:\\python32\\myfiles\test\\multi.fasta'..... do the double slashes mean anything? – user1426421 Jul 09 '12 at 17:19
  • You can use `subprocess.check_call` to get an exception when the command exits with a non-zero status, and `subprocess.check_output` to capture what the command prints. Also read the last section of http://docs.python.org/library/subprocess.html as you're using Windows and minor issues can cause the command not to work. – Simeon Visser Jul 09 '12 at 17:23
  • I've read the entire subprocess document, still can't figure it out. Anyone have any ideas? It seems as though sixpack does execute, but no output files are created. – user1426421 Jul 11 '12 at 18:48
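On the question raised in the comments about seeing the command Python produces and any errors from the program, here is a minimal sketch rather than a definitive recipe. It assumes that `subprocess.list2cmdline` (an undocumented helper in the standard library) reflects how an argument list is joined into a command string on Windows. The helper names `show_command` and `run_and_report` are hypothetical, and the Python interpreter stands in for sixpack so the example runs anywhere.

```python
import subprocess
import sys


def show_command(args):
    """Return the command string built from an argument list.

    Assumption: list2cmdline is an undocumented subprocess helper that
    mirrors the quoting applied to argument lists on Windows.
    """
    return subprocess.list2cmdline(args)


def run_and_report(args):
    """Run a command and return (exit_code, combined stdout/stderr text)."""
    try:
        out = subprocess.check_output(args, stderr=subprocess.STDOUT)
        return 0, out.decode(errors="replace")
    except subprocess.CalledProcessError as exc:
        # A non-zero exit status lands here; the captured text is in exc.output.
        return exc.returncode, exc.output.decode(errors="replace")


# Option and value glued into one list element: quoted as ONE argument.
glued = show_command(['sixpack.exe', '-sequence c:\\x.fasta'])
# Option and value as separate list elements: passed as TWO arguments.
split = show_command(['sixpack.exe', '-sequence', 'c:\\x.fasta'])
print(glued)
print(split)

# Stand-in for a failing program: exits with status 1 and writes to stderr.
code, text = run_and_report(
    [sys.executable, '-c', 'import sys; sys.exit("missing parameters")']
)
print(code, text)
```

Printing the joined string makes it visible when an option and its value have been quoted together as a single argument, which the called program may not recognise as an option at all.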

1 Answer


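Found the answer. I was using the wrong syntax for the subprocess argument list: each option name and its value must be a separate list element, rather than one string joined by a space. This is the correct syntax: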

subprocess.call([r'C:\memboss\sixpack.exe', '-sequence', infile, '-outseq', outdir + ident + '.sixpack', '-outfile', outdir + ident + '.format'])
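As a possible refinement, the argument list can be built in a small helper before it is handed to `subprocess.call`. This is only a sketch: `build_sixpack_args` is a hypothetical name, the `'out'` directory and `'seq1'` identifier are made-up example values, and `os.path.join` is swapped in for plain string concatenation so that a missing trailing separator on the output directory does not silently change the file name.

```python
import os


def build_sixpack_args(exe, infile, outdir, ident):
    """Build the sixpack argument list, one list element per token."""
    # Join directory and record ID with the platform's path separator.
    base = os.path.join(outdir, ident)
    return [
        exe,
        '-sequence', infile,
        '-outseq', base + '.sixpack',
        '-outfile', base + '.format',
    ]


# Raw strings keep every backslash in a Windows path literal.
args = build_sixpack_args(
    r'C:\memboss\sixpack.exe',   # path taken from the question
    r'c:\python32\multi.fasta',  # path taken from the comments
    'out',                       # hypothetical output directory
    'seq1',                      # hypothetical record ID
)
print(args)

# In a normal string '\t' is a single tab character; in a raw string it is
# a backslash followed by the letter t.
print(len('\t'), len(r'\t'))
```

The resulting list can be passed directly as `subprocess.call(args)`. The last line hints at why a path such as `myfiles\test` typed into a normal string literal can trigger an "Invalid argument" error like the one quoted in the comments.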
user1426421