58

I am trying to pass BioPython sequences to Ilya Stepanov's implementation of Ukkonen's suffix tree algorithm in iPython's notebook environment. I am stumbling on the argparse component.

I have never had to deal directly with argparse before. How can I use this without rewriting main()?

By the by, this writeup of Ukkonen's algorithm is fantastic.

Niels

11 Answers

52

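Calling args = parser.parse_args(args=[]) avoids the error, because argparse then parses an empty list instead of the sys.argv that the notebook kernel was started with.

Alternatively, you can declare the arguments as a class: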

import argparse

# Stand-in for the parsed arguments: the class attributes act as the defaults.
class Args(argparse.Namespace):
    data = './data/penn'
    model = 'LSTM'
    emsize = 200
    nhid = 200

args = Args()
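
As a minimal sketch of the parse_args(args=[]) route, assuming a parser whose arguments all have defaults (the option names below only mirror the class above and are illustrative, not taken from any particular script):

```python
import argparse

# Illustrative parser: every argument has a default, so no input is required.
parser = argparse.ArgumentParser()
parser.add_argument('--data', default='./data/penn')
parser.add_argument('--model', default='LSTM')
parser.add_argument('--emsize', type=int, default=200)

# An explicit empty list stops argparse from reading the kernel's sys.argv.
args = parser.parse_args(args=[])

# Values can still be overridden afterwards, as with any Namespace.
args.emsize = 400
print(args.data, args.model, args.emsize)
```

Compared with a hand-written class, this keeps the defaults and type conversions defined in one place, the parser itself.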
sngjuk
  • 3
    For me this was the most useful way to add arguments to Jupyter notebook. Thank you – novastar Apr 13 '20 at 14:33
  • I didnt want to deep dive into the whole topic, just get my 3rd party code, called from a jupyter notebook, to work. This did. – DISC-O May 14 '22 at 18:57
  • such a simple trick, I don't know why I didn't think about it, Thanks ! – zanga Oct 21 '22 at 08:24
50

Another way to use argparse in IPython notebooks is to pass a list of strings to:

args = parser.parse_args() (line 303 in the git repo you referenced).

That would be something like:

import argparse

parser = argparse.ArgumentParser(
    description='Searching longest common substring. '
                'Uses Ukkonen\'s suffix tree algorithm and generalized suffix tree. '
                'Written by Ilya Stepanov (c) 2013')

parser.add_argument(
    'strings',
    metavar='STRING',
    nargs='*',
    help='String for searching',
)

parser.add_argument(
    '-f',
    '--file',
    help='Path for input file. First line should contain number of lines to search in'
)

and

args = parser.parse_args("AAA --file /path/to/sequences.txt".split())

Edit: It works

tbrittoborges
  • 4
    @mithrado @rjurney Almost, this works: `args = parser.parse_args(['--file', '/path/to/sequences.txt'])`, i.e. you need to pass an array of strings where each element is an argument that would normally be separated by a space in the command line. – jjs May 10 '16 at 22:21
  • 2
    @jjs the way to split the sequence automatically is to use `shlex.split`: `args = parser.parse_args(shlex.split("AAA --file /path/to/sequences.txt"))` – zenpoy Jul 30 '17 at 14:36
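
The difference between str.split and shlex.split only shows up once an argument contains spaces or quotes. A small sketch, using a reduced version of the parser above and a made-up path with a space in it:

```python
import argparse
import shlex

parser = argparse.ArgumentParser()
parser.add_argument('strings', nargs='*')
parser.add_argument('-f', '--file')

# Hypothetical command line; the quoted path contains a space.
command = 'AAA --file "/path/with spaces/sequences.txt"'

# str.split cuts the quoted path into two tokens and keeps the quote marks.
naive_tokens = command.split()

# shlex.split follows shell quoting rules and keeps the path as one token.
tokens = shlex.split(command)
args = parser.parse_args(tokens)
print(naive_tokens)
print(args.strings, args.file)
```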
25

I've had a similar problem before, but using optparse instead of argparse.

You don't need to change anything in the original script, just assign a new list to sys.argv like so:

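import sys
from Bio import SeqIO

if __name__ == "__main__":
    path = '/path/to/sequences.txt'
    sequences = [str(record.seq) for record in SeqIO.parse(path, 'fasta')]
    # The parser skips sys.argv[0] (normally the script name), so '-f' here
    # only fills that slot; the sequences are parsed as positional arguments.
    sys.argv = ['-f'] + sequences
    main()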
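
To see why the first element of sys.argv is never parsed, here is a self-contained sketch with a reduced stand-in for the script's parser and two made-up sequences; it also restores sys.argv afterwards so the rest of the notebook is unaffected:

```python
import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('strings', nargs='*')
parser.add_argument('-f', '--file')

saved_argv = sys.argv
try:
    # The first element plays the role of the script name and is not parsed.
    sys.argv = ['-f', 'GATTACA', 'TTACAG']
    args = parser.parse_args()
finally:
    sys.argv = saved_argv

print(args.strings, args.file)
```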
BioGeek
25

If all arguments have a default value, then adding this to the top of the notebook should be enough:

import sys
sys.argv = ['']

(otherwise, append the necessary arguments after the empty string, which stands in for the script name)
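
For context, a notebook kernel is typically started with arguments of its own (such as -f followed by a connection-file path), and that is what a script's parser trips over. A minimal sketch of the failure and the fix, using a hypothetical --epochs option and a made-up path:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=10)

# Hypothetical stand-in for the arguments a notebook kernel is started with.
kernel_like_args = ['-f', '/hypothetical/kernel-1234.json']

try:
    parser.parse_args(kernel_like_args)
    exit_code = None
except SystemExit as exc:
    # argparse reports the unrecognized arguments on stderr and exits.
    exit_code = exc.code

# With an empty argument list, the defaults are used instead.
args = parser.parse_args([])
print(exit_code, args.epochs)
```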

nivniv
4

Clean sys.argv

import sys; sys.argv=['']; del sys

https://github.com/spyder-ide/spyder/issues/3883#issuecomment-269131039

hyun woo Cho
3

I ended up using BioPython to extract the sequences and then editing Ilya Stepanov's implementation to remove the argparse methods.

import imp  # deprecated; removed in Python 3.12
from Bio import SeqIO

seqs = []
lcsm = imp.load_source('lcsm', '/path/to/ukkonen.py')
for record in SeqIO.parse('/path/to/sequences.txt', 'fasta'):
    seqs.append(record)
lcsm.main(seqs)

For the algorithm, I had main() take one argument, his strings variable. This sends the algorithm a list of BioPython SeqRecord objects, which the re module doesn't like, so I had to extract the sequence string by changing

suffix_tree.append_string(s)

to

suffix_tree.append_string(str(s.seq))

which seems kind of brittle, but that's all I've got for now.
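
A less invasive variant of the same idea, sketched under the assumption that main() can be edited, is to keep argparse but let main() accept an optional argument list. The names build_parser and the return value are placeholders for illustration, not part of the original script:

```python
import argparse

def build_parser():
    # Reduced stand-in for the script's parser.
    parser = argparse.ArgumentParser()
    parser.add_argument('strings', metavar='STRING', nargs='*')
    parser.add_argument('-f', '--file')
    return parser

def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:], so the script still
    # works from the command line; a notebook passes an explicit list.
    args = build_parser().parse_args(argv)
    # Placeholder for the real work: return the strings that would be searched.
    return [str(s) for s in args.strings]

result = main(['GATTACA', 'TTACAG'])
print(result)
```

Converting each record with str(record.seq) before the call would keep the BioPython-specific handling in the notebook rather than in the algorithm.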

Niels
3

I faced a similar problem when invoking argparse: the string '-f' was causing it. Just removing that from sys.argv does the trick.

import sys
if __name__ == '__main__':
    if '-f' in sys.argv:
        sys.argv.remove('-f')
    main()
drew_psy
2

Here is my code, which works well and restores sys.argv afterwards, so I don't have to worry about the environment being changed.

import sys

temp_argv = sys.argv

try:
    sys.argv = ['']
    print(sys.argv)
    # parser is the argparse.ArgumentParser defined by the script
    args = parser.parse_args()
finally:
    sys.argv = temp_argv
    print(sys.argv)
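
The same save-and-restore pattern can be wrapped in a context manager so it is reusable across cells. This is only a sketch; patched_argv is a hypothetical helper name and the --verbose flag is illustrative:

```python
import argparse
import sys
from contextlib import contextmanager

@contextmanager
def patched_argv(argv):
    """Temporarily replace sys.argv and restore it on exit."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        sys.argv = saved

parser = argparse.ArgumentParser()
parser.add_argument('--verbose', action='store_true')

before = sys.argv
with patched_argv(['']):
    args = parser.parse_args()
print(args.verbose, sys.argv is before)
```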
Veagau
1

Suppose you have this small piece of Python code:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", help="increase output verbosity",
                    action="store_true")
parser.add_argument("-v_1", "--verbose_1", help="increase output verbosity",
                    action="store_true")
args = parser.parse_args()

To run it in a Jupyter notebook, replace the parser with a hand-built Namespace:

import argparse

args = argparse.Namespace(verbose=False, verbose_1=False)

Note: A script can receive arguments on the command line at run time, but a Jupyter notebook cannot, so make sure the values you hard-code have the same data types the parser would have produced.

Arzam Abid
1

If arguments passed by the iPython environment can be ignored (do not conflict with the specified arguments), then the following works like a charm:

# REPLACE   args = parser.parse_args()   with:
args, unknown = parser.parse_known_args()

From: https://stackoverflow.com/a/12818237/11750716
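
A short sketch of what parse_known_args returns, using an illustrative --verbose flag and a made-up kernel-style argument pair:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbose', action='store_true')

# Hypothetical argument list mixing a known flag with kernel-style extras.
argv = ['--verbose', '-f', '/hypothetical/kernel-1234.json']

# Recognized options go into args; everything else is returned in unknown.
args, unknown = parser.parse_known_args(argv)
print(args.verbose, unknown)
```

If the parser defines its own -f option or a positional argument, those extras are consumed rather than ignored, which is the conflict mentioned above.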

Jean Monet
-1

If you don't want to change any of the arguments or the workings of the argparse code you have written or copied, there is a simple solution that works most of the time.

You can install jupyter-argparser using the command below:

pip install jupyter_argparser

The code then works without any changes, thanks to the maintainer of the package.

PaladiN