
I got stuck while learning Python 3 online (on Coursera, which I love).

I have a .txt file with several lines, and each line has several words in it. I need to split the file into lines; I'm already using:

for line in fh:
    line = line.rstrip()
    line.split()

Now I need to sort the words from the entire file (from all the lines) alphabetically, but what I get in the list I already created (named lst) are whole strings. So the strings themselves get sorted alphabetically, but I couldn't sort the words inside them.

I see that a similar question has already been asked, but mine is a little different. Take this text for example:

"In the beginning of creation of the heavens and the earth
Now the earth was astonishingly empty and darkness was"

What I need is to sort the entire list of words alphabetically (after removing duplicate words), so the result should be ['In', 'Now', 'and', 'astonishingly', 'beginning'] and so on.

I really tried to work this out myself, and I'm taking this course seriously. I need some help here.

Thank you in advance,

Dadep
  • Welcome to SO. Usually you should include some example data and your attempt to solve the problem. Please read [ask] and [mcve] – wwii Aug 08 '17 at 19:02
  • Create an empty list `all_words = []` at the top, and for each `line`, you can `all_words.extend(line.rstrip().split())` and `all_words.sort()` at the end. – ryugie Aug 08 '17 at 19:05
  • Possible duplicate of [Python - arranging words in alphabetical order](https://stackoverflow.com/questions/13809542/python-arranging-words-in-alphabetical-order) – wwii Aug 08 '17 at 19:06
  • No need for all that iteration, because `str.split()` without any arguments splits on *any* whitespace and treats consecutive whitespace as a single delimiter. So `"a b\t\tc\nd e f\n \ng".split()` returns `['a', 'b', 'c', 'd', 'e', 'f', 'g']`. So just do `words = fh.read().split()` and then `words.sort()`. – Steven Rumbalski Aug 08 '17 at 19:06
  • Do you want unique words or all words? Does case matter? Is `"Now"` the same as `"now"`? What about `"James"` and `"james"`? Does your text include punctuation that you need to ignore? In `"Python 3"` are both `"Python"` and `"3"` words? – Steven Rumbalski Aug 08 '17 at 19:53

1 Answer


If all you want is a list of all unique words in the file, and assuming there is no punctuation or other non-word content,

word_list = sorted({ w for line in fh for w in line.strip().split() })

will do it.
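
As a minimal sketch of that one-liner in action, the snippet below runs it on the sample text from the question. It uses `io.StringIO` as a stand-in for the opened file (an assumption, so the example runs without a file on disk); a real script would iterate over the handle returned by `open()`. Note that default string sorting places uppercase letters before lowercase ones, which is why 'In' and 'Now' come first.

```python
import io

# Stand-in for the opened file handle (assumption: the real code uses open()).
sample_text = (
    "In the beginning of creation of the heavens and the earth\n"
    "Now the earth was astonishingly empty and darkness was\n"
)
fh = io.StringIO(sample_text)

# The set comprehension drops duplicate words; sorted() returns a new list.
word_list = sorted({w for line in fh for w in line.strip().split()})

print(word_list)
# ['In', 'Now', 'and', 'astonishingly', 'beginning', 'creation', 'darkness',
#  'earth', 'empty', 'heavens', 'of', 'the', 'was']
```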

If you want the words sorted on a line-by-line basis,

words = [ sorted(set(line.strip().split())) for line in fh ]

will do that.
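
To illustrate the per-line variant, here is a short sketch on the same sample text from the question. Again, `io.StringIO` only stands in for the real file handle (an assumption made so the example is self-contained). The result is a list of lists, one sorted list of unique words per line.

```python
import io

# Stand-in for the opened file handle (assumption: the real code uses open()).
sample_text = (
    "In the beginning of creation of the heavens and the earth\n"
    "Now the earth was astonishingly empty and darkness was\n"
)
fh = io.StringIO(sample_text)

# One sorted list of unique words for each line of the file.
words = [sorted(set(line.strip().split())) for line in fh]

for line_words in words:
    print(line_words)
# ['In', 'and', 'beginning', 'creation', 'earth', 'heavens', 'of', 'the']
# ['Now', 'and', 'astonishingly', 'darkness', 'earth', 'empty', 'the', 'was']
```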

kdopen
  • `line.strip().split()` will always equal `line.split()`. But why not even more directly with `word_list = sorted(fh.read().split())`? – Steven Rumbalski Aug 08 '17 at 19:38
  • Thank you! It works, though we haven't covered the sorted() function yet and did it a little bit differently. I appreciate your time and help. Thanks –  Aug 08 '17 at 20:28
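
For readers who have not met `sorted()` yet, the same result can be sketched with a plain loop and the list method `sort()`. This is only one possible way to write it, not necessarily what the course expects. The list name `lst` comes from the question; `io.StringIO` is a stand-in for the real file handle (an assumption, so the example runs without a file on disk).

```python
import io

# Stand-in for the opened file handle (assumption: the real code uses open()).
sample_text = (
    "In the beginning of creation of the heavens and the earth\n"
    "Now the earth was astonishingly empty and darkness was\n"
)
fh = io.StringIO(sample_text)

lst = []
for line in fh:
    # split() with no arguments splits on any whitespace, so rstrip() is optional.
    for word in line.split():
        # Only keep the first occurrence of each word.
        if word not in lst:
            lst.append(word)

# sort() orders the list in place (uppercase letters sort before lowercase).
lst.sort()
print(lst)
```

The key difference from the loop in the question is that the result of `split()` is actually used: each word is appended individually, so the list holds words rather than whole lines.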