I'm writing a script that takes one required parameter and then interprets the arguments that follow based on its value. Most of the combinations are going well, but one is giving me trouble: a repeating group of three parameters, all strings. For example:

$ python script.py p1 p2_1 p2_2 p2_3 p3_1 p3_2 p3_3

or as pseudo-regex:

$ python script.py p1 (px_1 px_2 px_3)+

I have no control over the format of the input. I do have the option of receiving it via stdin instead of on the command line, though. It's probably easier to treat the input as a string and use a regex, which would also let me handle both cases by joining argv.
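For comparison, here is a minimal sketch of that regex idea, assuming tokens never contain whitespace (joining argv loses that distinction). The names `parse_line` and `read_input` and both patterns are hypothetical, made up for illustration:

```python
import re
import sys

# Hypothetical patterns: one leading token, then one or more groups of
# exactly three whitespace-separated tokens.
LINE = re.compile(r"^\s*(\S+)((?:\s+\S+\s+\S+\s+\S+)+)\s*$")
TRIPLE = re.compile(r"(\S+)\s+(\S+)\s+(\S+)")


def read_input(argv):
    """Return the raw input as one string, from argv if given, else stdin."""
    return " ".join(argv) if argv else sys.stdin.read()


def parse_line(line):
    """Split 'p1 a b c d e f' into ('p1', [('a', 'b', 'c'), ('d', 'e', 'f')])."""
    match = LINE.match(line)
    if match is None:
        raise ValueError("expected one parameter followed by groups of three")
    first, rest = match.group(1), match.group(2)
    return first, TRIPLE.findall(rest)
```

It would be called as `parse_line(read_input(sys.argv[1:]))`. It rejects input whose trailing tokens are not a multiple of three, but it gives up argparse's help and error messages.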

There are several other SO answers that sort of address doing something similar with argparse.

hpaulj has two helpful responses here: Argparse, handle repeatable set of items and here: Python argparser repeat subparse

After several hours, I have yet to figure out how to make this work with argparse without some hackery: first stripping off the first param, then iterating until the remaining params are gone. I'd like to keep everything in the same namespace object, as it would be if I could figure out how to do this properly. Some demo code based on one of the answers above:

#!/usr/bin/env python
import argparse

# First pass: consume only the leading parameter and leave the rest unparsed
parser = argparse.ArgumentParser()
parser.add_argument('param1', type=str)
param1, remaining = parser.parse_known_args()

# Second parser: one group of three positionals, each collected into a list
parser = argparse.ArgumentParser()
parser.add_argument('first', type=str, action='append')
parser.add_argument('second', type=str, action='append')
parser.add_argument('third', type=str, action='append')

# Parse the first group, then keep parsing until no arguments remain
repeating_args, remaining = parser.parse_known_args(remaining)
while remaining:
    another_set, remaining = parser.parse_known_args(remaining)
    # Each parse yields one-element lists; merge them into the running result
    repeating_args.first.append(another_set.first[0])
    repeating_args.second.append(another_set.second[0])
    repeating_args.third.append(another_set.third[0])

But this just feels kludgy to me and it forces me to modify the code in a way that impacts the other parameter combinations. Is there a better way to do this with argparse? Or if I'm not happy with this, should I just not be using argparse? Unfortunately, that would mean I get to rewrite a lot of my code...

Thanks.

UPDATED CODE:

Based on hpaulj's answers, this is a compacted and working version of the code I'm using. It's much better than the above code, since it's relatively generic with regard to the parser configuration. I hope it helps.

#!/usr/bin/env python
import sys
import argparse

def parse_args():

    # Create the first parser object and get just the first parameter
    parser = argparse.ArgumentParser(description='Argument format parser')
    parser.add_argument('arg_format', type=str, help='The first argument. ' +
                        'It tells us what input to expect next.')
    args_ns, remaining = parser.parse_known_args()

    # Generate a new parser based on the first parameter
    parser = formatSpecificParser(args_ns.arg_format)

    # There will always be at least one set of input (in this case at least)
    args_ns, remaining = parser.parse_known_args(args=remaining, namespace=args_ns)

    # Iterate over the remaining input, if any, adding to the namespace
    while remaining:
        args_ns, remaining = parser.parse_known_args(args=remaining,
                                                     namespace=args_ns)

    return args_ns

def formatSpecificParser(arg_format):
    parser = argparse.ArgumentParser(
        description="Command line parser for %s" % arg_format)
    if (arg_format == "format_1"):
        addArgsFormat1(parser)
    # elif (...):
        # other format function calls
    return parser

def addArgsFormat1(parser):
    parser.add_argument('arg1', type=str, action='append', help='helpful text')
    parser.add_argument('arg2', type=str, action='append', help='helpful text')
    parser.add_argument('arg3', type=str, action='append', help='helpful text')

def main(argv):
    args = parse_args()
    print(args)

if __name__ == "__main__":
    main(sys.argv[1:])

Command line output:

$ ./example.py format_1 foo bar baz meh meh meh e pluribus unum
Namespace(arg1=['foo', 'meh', 'e'], arg2=['bar', 'meh', 'pluribus'], arg3=['baz', 'meh', 'unum'], arg_format='format_1')
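If the downstream code wants each group back as a unit, the three parallel lists can be regrouped with `zip`. This is a small sketch that rebuilds the namespace shown above by hand; the name `triples` is only illustrative:

```python
import argparse

# Namespace as printed above, reconstructed by hand for the example
ns = argparse.Namespace(
    arg1=['foo', 'meh', 'e'],
    arg2=['bar', 'meh', 'pluribus'],
    arg3=['baz', 'meh', 'unum'],
    arg_format='format_1',
)

# Pair up the i-th element of each list to recover the original groups
triples = list(zip(ns.arg1, ns.arg2, ns.arg3))
print(triples)
# [('foo', 'bar', 'baz'), ('meh', 'meh', 'meh'), ('e', 'pluribus', 'unum')]
```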
  • May I ask where are the arguments coming from? Why do you have no control over them? – Reut Sharabani Dec 28 '14 at 23:42
  • Upstream application that I can't change. – Over_canvassed Dec 28 '14 at 23:54
  • I don't think argparse is the way to go... I'm just not sure what your case is. Argparse is meant to parse arguments (surprisingly). While it could work, it doesn't sound like the tool for the job. If you can call a python script with the data as its parameters, maybe you can stream the parameters somehow directly to the script using a messaging system? – Reut Sharabani Dec 28 '14 at 23:58
  • Yeah, it started off as simple arg handling, and it works well for a dozen other cases. But this seems to be pushing what argparse is meant to do. I think taking stdin, or converting argv to a string and using a regex, is maybe the cleaner and least disruptive solution. I'll still wait to see if someone has a suggestion. – Over_canvassed Dec 29 '14 at 00:15
  • What is calling the script? – Reut Sharabani Dec 29 '14 at 00:16

1 Answer

Here's a rough sequence that could simplify the repeated part:

In [10]: p=argparse.ArgumentParser()

In [11]: p.add_argument('p2',nargs=3,action='append')

In [12]: ns,rest=p.parse_known_args('p21 p22 p23 p31 p32 p33'.split())

In [13]: ns
Out[13]: Namespace(p2=[['p21', 'p22', 'p23']])

In [14]: rest
Out[14]: ['p31', 'p32', 'p33']

In [15]: ns,rest=p.parse_known_args(rest,ns)  # repeat as needed

In [16]: ns
Out[16]: Namespace(p2=[['p21', 'p22', 'p23'], ['p31', 'p32', 'p33']])

Normally 'append' doesn't make sense with positionals, since they can't be repeated. But here it conveniently produces a list of sublists. Passing an earlier Namespace to the next parsing step lets you build on values that have already been parsed. This should work just as well with your 3 positional arguments as my one with nargs=3.
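The "repeat as needed" step can be wrapped in a loop. The following is only a sketch of that idea: the function name `parse_triples` and the up-front length check are my additions, not part of argparse:

```python
import argparse


def parse_triples(argv):
    """Parse argv as repeated groups of three into a list of sublists."""
    parser = argparse.ArgumentParser()
    parser.add_argument('p2', nargs=3, action='append')

    # Assumed policy: reject input that is empty or not a multiple of three
    if not argv or len(argv) % 3 != 0:
        parser.error('expected one or more groups of three arguments')

    # The first parse creates the namespace; later parses append to it
    ns, rest = parser.parse_known_args(argv)
    while rest:
        ns, rest = parser.parse_known_args(rest, ns)
    return ns


ns = parse_triples('p21 p22 p23 p31 p32 p33'.split())
print(ns.p2)  # [['p21', 'p22', 'p23'], ['p31', 'p32', 'p33']]
```

Each pass either consumes exactly three strings or exits through the parser's error handling, so the loop terminates.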

hpaulj
  • Thanks! Passing the namespace back in is what I came up with last night after posting. You've just confirmed that it's probably the best option other than a full refactor. It's clean enough. The ability to iterate through the input with different parsers as needed, but working in the same namespace is actually a pretty cool approach. Just need to be careful not to clobber anything by accident. I'll put my updated code in the question as a reference for others. Thanks again! – Over_canvassed Dec 29 '14 at 17:09
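To illustrate the clobbering risk mentioned in that comment: when a namespace is reused, an argument with the default 'store' action is overwritten on each pass, while an 'append' argument accumulates. A small sketch with made-up argument names (`name`, `items`):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('name')                    # default 'store' action
parser.add_argument('items', action='append')  # accumulates across parses

# First pass consumes 'a' and 'x'; 'b' and 'y' are left over
ns, rest = parser.parse_known_args(['a', 'x', 'b', 'y'])

# Second pass reuses the namespace: 'name' is replaced, 'items' grows
ns, rest = parser.parse_known_args(rest, ns)
print(ns.name)   # b
print(ns.items)  # ['x', 'y']
```

So any argument that should survive repeated parsing needs 'append' (or a separate parser), otherwise only its last value is kept.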