1

I read the contents of a file into a string. Then I loop over the string character by character to collect the words. Everything works fine except for the last word in the file, when there is no separator after it.

My code:

tmpList = []
for symbol in text:
    if symbol == ' ' or symbol == '-' or symbol == ',' or symbol == '\n':
        # A separator ends the current word.
        lastWord = ''.join(tmpList)
        del tmpList[:]
        print(lastWord)
    else:
        tmpList.append(symbol)

I've realized that Python strings have no NULL terminator. Maybe I'm trying to solve this the C way, and this kind of algorithm is the wrong approach in Python?

I've added a count variable and one more check to the "else" block, and it works fine. I wonder whether this is correct, or whether there is an easier way to do the same thing in Python. The else block now looks like this:

else:
    tmpList.append(symbol)
    count += 1
    if count == len(text):
        lastWord = ''.join(tmpList)
        del tmpList[:]
        print(lastWord)
lazyexpert

3 Answers

1

The Pythonic way of writing this:

if symbol == ' ' or symbol == '-' or symbol == ',' or symbol == '\n':

is:

if symbol in ' -,\n':

It would help to say what you are actually trying to do. Do you just want to print the text without the ' ', '-', ',' and '\n' characters?

If so, the Pythonic way is:

for char in '-,\n':
    text = text.replace(char, ' ')
for word in text.split():
    print(word)

If the string is large or performance matters, take a look at the re module, which is well suited to this kind of job (see its split function).
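As a rough sketch of the re approach (the helper name split_words and the sample string are made up for illustration), re.split can cut on any run of the four separator characters in a single pass. Leading or trailing separators produce empty strings, so those are filtered out:

```python
import re

# Character class matching one or more of: space, hyphen, comma, newline.
SEPARATORS = re.compile(r"[ \-,\n]+")


def split_words(text):
    """Return the words in text, split on runs of separator characters."""
    # re.split yields '' at the edges when text starts or ends with a
    # separator (and for an empty string), so drop the empty pieces.
    return [word for word in SEPARATORS.split(text) if word]


words = split_words("one two-three,four\nfive")
for word in words:
    print(word)
```

Because the whole string is handled at once, the last word needs no special treatment even when nothing follows it.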

Ludovic Viaud
  • No secret here, it's one of the Google tutorial tasks, mimic.py. I have to build a dict of {key: value} where each key is a word in the file and the value is a list of the words that follow it. As you can see, I'm at the very beginning :D – lazyexpert Nov 15 '14 at 10:24
  • Probably should rename `symbol` to `word` in the text.split() snippet. – Yann Vernier Nov 15 '14 at 13:47
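
For the dict described in the comment above, one possible shape is sketched below. The function name build_followers and the sample sentence are assumptions for illustration, not part of the exercise's starter code. The function expects a list of words that has already been split:

```python
def build_followers(words):
    """Map each word to the list of words that come right after it."""
    followers = {}
    # Pair every word with its successor; the final word has no successor,
    # so it only becomes a key if it also appears earlier in the list.
    for current, following in zip(words, words[1:]):
        followers.setdefault(current, []).append(following)
    return followers


sample = "we are not what we should be".split()
mimic = build_followers(sample)
print(mimic["we"])
```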
0

Your code collects symbols in tmpList, and empties it out when it encounters a separator. One way to find the last word is simply to check if tmpList contains anything when the loop is done:

tmpList = []
for symbol in text:
    if symbol == ' ' or symbol == '-' or symbol == ',' or symbol == '\n':
        lastWord = ''.join(tmpList)
        del tmpList[:]
        print(lastWord)
    else:
        tmpList.append(symbol)
# Anything still in tmpList is a final word with no separator after it.
if tmpList:
    lastWord = ''.join(tmpList)
    del tmpList[:]
    print(lastWord)

But clearly Ludovic's answer has a cleaner solution.

Yann Vernier
0

An easier solution may be to use the split() function.

words = text.split(' -,\n')

which will give you a list of words for you to process.

Simon Callan
  • No. str.split will split on whitespace if not given a separator, but if given a separator will require the whole separator, not just any character in it. – Yann Vernier Nov 15 '14 at 13:58
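
The difference described in that comment can be checked with a short sketch (the sample strings are made up for illustration):

```python
text = "one two-three,four\nfive"

# With a separator argument, str.split looks for the WHOLE string ' -,\n'.
# It never occurs in text, so the result is a single-element list.
whole_separator = text.split(' -,\n')

# With no argument, str.split cuts on runs of whitespace only, so the
# hyphen and the comma stay inside the middle piece.
whitespace_only = text.split()

# The four-character separator matches only when it appears verbatim.
verbatim = "a -,\nb".split(' -,\n')

print(whole_separator)
print(whitespace_only)
print(verbatim)
```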