I am writing a script that accepts two optional command-line arguments: --top, to return the top words by count (e.g. --top 5 returns the top 5), and --lower, to lowercase the list of words before counting the unique values.

I got to this stage and I am getting no output:

import collections
import argparse

def counts(text, top = 10, case = None):
    """ returns counts. Default is top 10 frequent words without change of case"""
    # split on whitespace
    word_list = text.split()

    if case is None:
        c = collections.Counter(word_list)
        return c.most_common(top)
    else:
        c = collections.Counter([w.lower() for w in word_list])
        return c.most_common(top)

# declare parser
parser = argparse.ArgumentParser()

# add argument --top
parser.add_argument("--top", help="returns top N words. If not specified it returns top 10", type=int)

# add argument --lower
parser.add_argument("--lower", help = "lowercase all the words.('StackOverFlow' and 'stackoverflow' are counted equally.")

# add argument filename
parser.add_argument("filename", help = "accepts txt file")

args = parser.parse_args()

# read text file
file = open(args.filename, 'r').read()

counts(text = file, top = args.top, case = args.lower)

When I run the script with

$ python script.py text.txt --top 5 --lower

I get no output. Any clue where I am going wrong?

If the script were to output something, I would expect:

(word1 count1)
(word2 count2)
(word3 count3)
(word4 count4)
(word5 count5)
tavalendo
  • If you are using ArgParse, do this properly and make it handle all your arguments. Do not use `sys.argv` – Marcin Orlowski Feb 12 '19 at 05:29
  • Check out this question for help with using file arguments with argparse: https://stackoverflow.com/questions/18862836/how-to-open-file-using-argparse – Steve Boyd Feb 12 '19 at 05:31
  • `parse_args` parses `sys.argv[1:]`. I'd expect it to raise an `unknown value` error if it encountered the 'text.txt' string first. The error appears to be produced by the `open` command, using `open('--top')`. That too makes it look like you did not include 'text.txt' or you put it last. – hpaulj Feb 12 '19 at 05:45
  • Thanks for the comments, I fixed the post based on the suggestions. But the script still does not give any output. – tavalendo Feb 12 '19 at 05:58
  • `print(args)` before running `counts`. That way we can be sure the parsing is correct. Whether `counts` works is another matter. – hpaulj Feb 12 '19 at 06:17
  • Doesn't the parser complain about `--lower` missing an argument? – hpaulj Feb 12 '19 at 06:20
  • Yes, I get: script.py: error: argument --lower: expected one argument – tavalendo Feb 12 '19 at 06:26
  • @feijao That's because you use the default [action](https://docs.python.org/3/library/argparse.html#action) `store` for `--lower`; it needs a value in that case, e.g. `--lower 1`. You could also use action `store_true`, then you can omit a value for `--lower`. You'd have to change the condition in `counts()` in that case though, because `args.lower` can only ever be `True` or `False` then. Oh, and the missing output ... you don't print anything. `print(counts(text = file, top = args.top, case = args.lower))` – shmee Feb 12 '19 at 07:08
  • Thank you @shmee, I did the changes as answer and works well. :-) – tavalendo Feb 12 '19 at 07:30
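The difference between the two actions discussed in the comments can be seen in a minimal sketch. The parser names `p_store` and `p_flag` are made up for this illustration, and the argument lists are passed explicitly instead of being read from the command line:

```python
import argparse

# Default action "store": --lower must be followed by a value.
p_store = argparse.ArgumentParser()
p_store.add_argument("--lower")
print(p_store.parse_args(["--lower", "1"]).lower)  # the string '1'
print(p_store.parse_args([]).lower)                # None when the option is omitted

# Action "store_true": --lower is a bare flag that takes no value.
p_flag = argparse.ArgumentParser()
p_flag.add_argument("--lower", action="store_true")
print(p_flag.parse_args(["--lower"]).lower)        # True
print(p_flag.parse_args([]).lower)                 # False
```

With the default action, passing `--lower` on its own makes the parser exit with "expected one argument", which matches the error reported above.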

1 Answer

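Based on the helpful comments above, two changes were needed: --lower now uses action='store_true' so it no longer expects a value, and the result of counts() is printed. The working code is: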

import collections
import argparse

def counts(text, top = 10, case = False):
    """ returns counts. Default is top 10 frequent words without change of case"""
    # split on whitespace
    word_list = text.split()

    if case is False:
        c = collections.Counter(word_list)
        return c.most_common(top)
    else:
        c = collections.Counter([w.lower() for w in word_list])
        return c.most_common(top)

# declare parser
parser = argparse.ArgumentParser()

# add argument --top
parser.add_argument("--top", help="returns top N words. If not specified it returns top 10", type=int)

# add argument --lower
parser.add_argument("--lower", help = "lowercase all the words ('StackOverFlow' and 'stackoverflow' are counted equally).", action='store_true')

# add argument filename
parser.add_argument("filename", help = "accepts txt file")

args = parser.parse_args()

# read text file (the with-block closes the file handle afterwards)
with open(args.filename, 'r') as f:
    file = f.read()

if args.top:
    print(counts(text = file, top = args.top, case = args.lower))
else:
    print(counts(text = file, case = args.lower))
tavalendo
  • You could also add [`default=10`](https://docs.python.org/3/library/argparse.html#default) to your definition of `--top`. Then you would not need to worry about `args.top` being `None` when calling `counts` ;) – shmee Feb 12 '19 at 08:06
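The `default=10` suggestion might look like the sketch below. The helper `build_parser` and the parameter name `lower` are assumptions introduced here for illustration, not part of the original script, and the argument lists are passed explicitly so the example runs without a real file:

```python
import argparse
import collections

def counts(text, top=10, lower=False):
    """Return the `top` most common whitespace-separated words."""
    words = text.split()
    if lower:
        words = [w.lower() for w in words]
    return collections.Counter(words).most_common(top)

def build_parser():
    # Hypothetical helper: same three arguments as the script above,
    # but --top carries its own default, so args.top is never None.
    parser = argparse.ArgumentParser()
    parser.add_argument("--top", type=int, default=10,
                        help="return the top N words (default: 10)")
    parser.add_argument("--lower", action="store_true",
                        help="lowercase all words before counting")
    parser.add_argument("filename", help="path to a text file")
    return parser

# Without --top the default applies; no if/else is needed at the call site.
args = build_parser().parse_args(["text.txt"])
print(args.top, args.lower)

# Sample text stands in for the file contents in this sketch.
sample = "Stack stack overflow Stack"
for word, count in counts(sample, top=args.top, lower=True):
    print(word, count)
```

Because `args.top` always holds an integer, the call can be written once as `counts(text, top=args.top, lower=args.lower)`.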