
My program tries to read a file and process its contents. The file to be processed contains

core-001
core-001
core-002
core-003
core-003
...
core-nnn

To process it, I want to read every line into a list, remove the duplicates, and then write the result to another file. The code I used for the first three steps is as follows:

content = []
with open(file,'r') as openFile:
        content = [line.strip('\n') for line in openFile]
content = list(set(content))

(Why I use list and set)
As far as I can see, this should not cause any problems; however, two errors are returned:

Traceback (most recent call last):
  File "/path/to/file", line 1, in <module>
    core-004
NameError: name 'core' is not defined

and

File "/path/to/file", line 21
    core-009
           ^
SyntaxError: invalid token

What causes these errors and, more importantly, how can I avoid them?

EDIT As noted in the comments, but repeated here: this was not an error in the code, but a mistake in how I ran it. The errors came from Python trying to execute the input file, because I had forgotten to pass the script name and gave only the parameters. With the script name added, it works perfectly. Thank you for your time and your kind comments.

Simon Klaver

3 Answers


A better way to do that is

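import sys

# Read every line from standard input, drop duplicates, and sort what is left.
lines = sys.stdin.readlines()
# Each line keeps its trailing newline, so write the joined text as-is.
sys.stdout.write(''.join(sorted(set(lines))))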

Here the program reads its input from standard input and writes the sorted, de-duplicated lines to standard output. You can use it as follows:

python run.py < input.txt > output.txt
Bhargav Rao

This answer is a bit late, and I neither remember the details nor still have the code, so I am answering based on the comments of MightyPork, Tom Dalton and myself.

Apparently the problem was that I did not run the program.
Instead of running

python <name>.py param1 param2 ...

I ran

python param1 param2 ...

which failed because my param1 seems to have been the data file shown at the top of the question, so Python tried to execute it as a script.

I am, however, unaware of how I got two different error messages: I might have been passing different files as param1, or something similar.
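
One possible explanation for the two messages, offered as a sketch rather than a confirmed diagnosis: Python first compiles a file and only then executes it. The `invalid token` wording suggests Python 2, where a number with a leading zero is read as octal. Under that assumption `core-004` compiles and then fails at run time because `core` is undefined, while `core-009` cannot compile at all because 9 is not an octal digit. The helper `classify` below is hypothetical and only separates the two stages; it uses `core-4` for the run-time case because Python 3 rejects leading zeros outright.

```python
def classify(source):
    """Return the name of the error raised when `source` is run as Python code."""
    try:
        # Stage 1: parsing/compiling. Invalid literals fail here.
        code = compile(source, "<data file>", "exec")
    except SyntaxError:
        return "SyntaxError"
    try:
        # Stage 2: execution in an empty namespace. Unknown names fail here.
        exec(code, {})
    except NameError:
        return "NameError"
    return "no error"


print(classify("core-4"))    # NameError: `core` is not defined
print(classify("core-009"))  # SyntaxError: `009` is not a valid literal
```

So a data file whose first offending line is a compile-time problem reports a SyntaxError with that line number, while a file that compiles cleanly only fails once its first line is executed.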

Therefore it was not an error in the code, as the other answers (and my question) suggested.

Simon Klaver

Here's what I suggest. You should use a set, which is a built-in data type that stores only unique values. This means that there won't be any repeats just as you would like. Try this:

  1. Read the lines of your file.
  2. Add lines to set.
  3. Convert set to list.

    with open('file.txt') as f: # the file is closed automatically afterwards
        content = f.readlines() # read file's lines
    content = [item.strip('\n') for item in content] # remove newlines
    
    content_set = set(content) # to set to remove repeats
    content_list = list(content_set) # back to list
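
The steps above, plus the "write to another file" step from the question, could be sketched as follows. This is a minimal sketch under assumptions: the function names `unique_lines` and `dedupe_file` and both paths are made up for illustration, and it swaps the set for `dict.fromkeys`, because a set does not keep the original line order while a dict does (Python 3.7+).

```python
def unique_lines(lines):
    """Strip trailing newlines and drop duplicates, keeping first-seen order."""
    stripped = (line.rstrip('\n') for line in lines)
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(stripped))


def dedupe_file(in_path, out_path):
    """Read in_path, remove duplicate lines, and write the result to out_path."""
    with open(in_path) as src:
        content = unique_lines(src)
    with open(out_path, 'w') as dst:
        dst.writelines(line + '\n' for line in content)
    return content


# Example with in-memory data instead of a real file:
sample = ["core-001\n", "core-001\n", "core-002\n", "core-003\n", "core-003\n"]
print(unique_lines(sample))  # ['core-001', 'core-002', 'core-003']
```

If the order of the lines does not matter, the set-based version is just as good.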
    

EDIT Your code actually seems to work. Perhaps the error lies in your use of the built-in name file as a variable. Could you provide the code before and after this segment? It seems that your file is being evaluated.

Malik Brahimi
  • You are aware my code does the exact same thing, only shorter? And to your edit: The only place I used the file is under the `with ... as ...:`. And at last, the answer is already found, see the comments below the question. – Simon Klaver Jan 15 '15 at 15:00
  • Yes, I added an edit mentioning that and a possible source of error. – Malik Brahimi Jan 15 '15 at 15:02