1

I am trying to write a program to read a text document and output the longest word in the document. If there are multiple longest words (i.e., all of equal length) then I need to output them all in the same order in which they occur. For example, if the longest words were dog and cat your code should produce:

dog cat

I am having trouble working out how to select multiple words of equal maximum length and print them. This is as far as I've gotten; I am just struggling to think of how to select all words with equal max length:

# open the file for reading

fh = open('poem.txt', 'r')

longestlist = []  
longestword = ''  

for line in fh:
    words = (line.strip().split(' '))  
    for word in words:  
        word = ''.join(c for c in word if c.isalpha())  
        if len(word) > (longestword):  
            longest.append(word)

for i in longestlist:  
    print i  
jeff carey
Tamjid
  • Shouldn't `longestword` be an integer? Also, you need to update it every time you find a longer word. – afsafzal Sep 26 '16 at 00:22
  • Think about your check: if the lengths are equal, add this word to the longestlist. If the length of word is greater than length of longestword, then you have a new longest word so you should erase your old list and create a new list containing the new longest word. – jeff carey Sep 26 '16 at 00:25

3 Answers

2

Ok, first off, you should probably use a with ... as statement; it simplifies things and makes sure the file gets closed for you. So

fh = open('poem.txt', 'r')

becomes

with open('poem.txt','r') as file:

and since you're just concerned with words, you might as well use a built-in from the start:

    words = file.read().split()

Then you just set a counter of the max word length (initialized to 0), and an empty list. If the word has broken the max length, set a new maxlength and rewrite the list to include only that word. If it's equal to the maxlength, include it in the list. Then just print out the list members. If you want to include some checks like .isalpha() feel free to put it in the relevant portions of the code.

maxlength = 0
longestlist = []  
for word in words:
    if len(word) > maxlength:
        maxlength = len(word)
        longestlist = [word]
    elif len(word) == maxlength:
        longestlist.append(word)
for item in longestlist:
    print(item)
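
Putting those pieces together with the .isalpha() clean-up from the question, the whole thing might look like the sketch below. It is written for Python 3, and the function name longest_words is made up for illustration; it accepts any iterable of lines, so an open file object works as well as the sample list used here.

```python
def longest_words(lines):
    """Return every longest word, in the order they occur (hypothetical helper)."""
    maxlength = 0
    longestlist = []
    for line in lines:
        for raw in line.split():
            # Keep letters only, as in the question, so "cat," counts as "cat".
            word = ''.join(c for c in raw if c.isalpha())
            if not word:
                continue  # token was all punctuation or digits
            if len(word) > maxlength:
                # New record: restart the list with this word.
                maxlength = len(word)
                longestlist = [word]
            elif len(word) == maxlength:
                # Tie with the current record: keep it too.
                longestlist.append(word)
    return longestlist


if __name__ == '__main__':
    sample = ["a dog, is by", "a cat to go hi!"]
    print(' '.join(longest_words(sample)))  # dog cat
```

With a real file, you would call longest_words(file) inside the with block instead of passing the sample list.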

-MLP

MLP
1

What you need to do is keep a list of all the longest words you've seen so far, along with the longest length. For example, if the longest word so far has length 5, your list will hold every 5-character word seen so far. As soon as you see a word with 6 or more characters, clear that list, put only that one word in it, and update the longest length. If you see a word with the same length as the current longest, add it to the list.

P.S. I did not put the code so you can do it yourself.

afsafzal
0

TLDR

Showing the results for a file named poem.txt whose contents are:

a dog is by a cat to go hi

>>> with open('poem.txt', 'r') as file:
...   words = file.read().split()
...
>>> [this_word for this_word in words if len(this_word) == len(max(words,key=len))]
['dog', 'cat']

Explanation

You can also do this more concisely by using the fact that <file-handle>.read().split() returns a list object and that Python's max function can take a function (as the keyword argument key). After that, you can use a list comprehension to find all of the longest words.

Let's clarify that. I'll start by making a file with the example properties you mentioned:

For example, if the longest words were dog and cat your code should produce:

dog cat

{If on Windows - here I specifically use cmd}

>echo a dog is by a cat to go hi > poem.txt

{If on a *NIX system - here I specifically use bash}

$ echo "a dog is by a cat to go hi" > poem.txt

Let's look at the result of the <file-handle>.read().split() call. Let's follow the advice of @MLP and use the with open ... as statement.

{Windows}

>python

or possibly (with conda, for example)

>py

{*NIX}

$ python3

From here, it's the same.

>>> with open('poem.txt', 'r') as file:
...   words = file.read().split()
...
>>> type(words)
<class 'list'>

From the Python documentation for max

max(iterable, *[, key, default])

max(arg1, arg2, *args[, key])

Return the largest item in an iterable or the largest of two or more arguments.

If one positional argument is provided, it should be an iterable. The largest item in the iterable is returned. If two or more positional arguments are provided, the largest of the positional arguments is returned.

There are two optional keyword-only arguments. The key argument specifies a one-argument ordering function like that used for list.sort(). The default argument specifies an object to return if the provided iterable is empty. If the iterable is empty and default is not provided, a ValueError is raised.

If multiple items are maximal, the function returns the first one encountered. This is consistent with other sort-stability preserving tools such as sorted(iterable, key=keyfunc, reverse=True)[0] and heapq.nlargest(1, iterable, key=keyfunc).

New in version 3.4: The default keyword-only argument.

Changed in version 3.8: The key can be None.

Let's use a quick, not-so-robust way to see if we meet the iterable requirement (this SO Q&A gives a variety of other ways).

>>> hasattr(words, '__iter__')
True

Armed with this knowledge, and remembering the caveat, "If multiple items are maximal, the function returns the first one encountered.", we can go about solving the problem. We'll use the len function (use >>> help(len) if you want to know more).

>>> max(words, key=len)
'dog'

Not quite there. We just have the first such word. Now it's time to use a list comprehension to find all words with that length. First, get that length:

>>> max_word_length = len(max(words, key=len))
>>> max_word_length
3

Now for the kicker.

>>> [this_word for this_word in words if len(this_word) == len(max(words,key=len))]
['dog', 'cat']

or, reusing max_word_length from before, which is more readable and avoids recomputing the maximum for every word

>>> [this_word for this_word in words if len(this_word) == max_word_length]
['dog', 'cat']
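
One caveat with the max approach: if poem.txt is empty, words is an empty list and max(words, key=len) raises the ValueError mentioned in the documentation quoted above. Here is a minimal sketch that sidesteps this, assuming Python 3.4+ for the default argument (the name all_longest is mine, purely for illustration):

```python
def all_longest(words):
    """Return all words whose length equals the maximum length (hypothetical helper)."""
    # default=0 avoids a ValueError when words is empty.
    max_word_length = max(map(len, words), default=0)
    # Compute the maximum once, then filter; order of occurrence is preserved.
    return [w for w in words if len(w) == max_word_length]


words = "a dog is by a cat to go hi".split()
print(all_longest(words))  # ['dog', 'cat']
print(all_longest([]))     # []
```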

There are various ways to reformat the output if you don't want a list, i.e. if you actually want

dog cat

but I need to go, so I'll leave it where it is.
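
For the dog cat format specifically, one option is str.join; another is unpacking the list into print. A small sketch, with the list hard-coded so it runs on its own:

```python
longest = ['dog', 'cat']  # stand-in for the list built above

line = ' '.join(longest)  # join the words with single spaces
print(line)               # dog cat

# Alternative: unpack the list; print separates its arguments with a space by default.
print(*longest)           # dog cat
```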

Community
bballdave025