1

I have the following script, which identifies the lines in a file that I want to remove (based on an array of strings) but does not actually remove them.

What should I change?

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin.readlines(): 
        for item in offending: 
                print "got one",line 
                line = line.replace( item, "MUST DELETE" ) 
                line=line.strip()
                fout.write(line)  
    fin.close() 
    fout.close() 

fixup(sourcefile)
romesub

4 Answers

5
sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin: 
        if True in [item in line for item in offending]:
            continue
        fout.write(line)
    fin.close() 
    fout.close() 

fixup(sourcefile)

EDIT: Or even better:

for line in fin: 
    if not True in [item in line for item in offending]:
        fout.write(line)
jamylak
zifot
  • or: `if any(item in line for item in offending):` Also, if you're using `fout.write`, then you should be able to go `for line in fin`. – Ryan Ginstrom Jun 15 '10 at 06:16
  • Shouldn't you split the line into tokens first? Otherwise `item` would be characters. – fortran Jun 15 '10 at 06:56
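A minimal sketch of the `any()` variant suggested in the first comment, written for Python 3 (an assumption; the scripts in this thread target Python 2.5, where `any()` is also available). The helper names `has_offending_substring` and `has_offending_word` are hypothetical. Note that `item` iterates over the strings in `offending`, not over characters, so `item in line` is a substring test; splitting the line into tokens gives a stricter whole-word match instead.

```python
offending = ["Exception", "Integer", "RuntimeException"]

def has_offending_substring(line, offending):
    # True if any offending string occurs anywhere in the line
    return any(item in line for item in offending)

def has_offending_word(line, offending):
    # True only if a whitespace-separated token equals an offending string
    return any(word in offending for word in line.split())

lines = ["int x = 1;\n", "throw new RuntimeException();\n", "Integer y;\n"]

# Keep only the lines that contain none of the offending strings
kept = [line for line in lines if not has_offending_substring(line, offending)]
```

The two helpers can disagree: a token such as `RuntimeException();` contains an offending string but is not equal to one, so only the substring test flags that line.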
2

The basic strategy is to write a copy of the input file to the output file, but with changes. In your case, the changes are very simple: you just omit the lines you don't want.

Once you have your copy safely written, you can delete the original file and use 'os.rename()' to rename your temp file to the original file name. I like to write the temp file in the same directory as the original file, to make sure I have permission to write in that directory and because I don't know if os.rename() can move a file from one volume to another.

You don't need to say `for line in fin.readlines()`; it is enough to say `for line in fin`. When you use `.readlines()` you are telling Python to read every line of the input file, all at once, into memory; when you just use `fin` by itself you read one line at a time.
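As a small illustration of that difference, here is a sketch written for Python 3 (an assumption; the code in this thread is Python 2). It writes a throwaway temporary file, then reads it both ways: both approaches see the same lines, but only `readlines()` builds the whole list before the loop starts.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first\nsecond\nthird\n")

# readlines() builds the full list of lines in memory up front
with open(path) as fin:
    all_lines = fin.readlines()

# Iterating over the file object yields one line at a time
streamed = []
with open(path) as fin:
    for line in fin:
        streamed.append(line)

os.remove(path)
```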

Here is your code, modified to do these changes.

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def line_offends(line, offending):
    for word in line.split():
        if word in offending:
            return True
    return False

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin:
        if line_offends(line, offending):
            continue
        fout.write(line)
    fin.close()
    fout.close()
    #os.rename() left as an exercise for the student

fixup(sourcefile)

If `line_offends()` returns `True`, we execute `continue` and the loop moves on to the next line without executing the rest of the loop body. That means the line never gets written. For this simple example, it would really be just as good to do it this way:

    for line in fin:
        if not line_offends(line, offending):
            fout.write(line)

I wrote it with the `continue` because often there is non-trivial work being done in the main loop, and you want to avoid all of it if the test is true. IMHO it is nicer to have a simple "if this line is unwanted, continue" rather than indenting a whole bunch of stuff inside an `if` for a condition that might be very rare.
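The rename step left as an exercise could be sketched roughly as follows. This is not part of the original answer: it assumes Python 3.3 or later, where `os.replace()` is available and overwrites an existing destination (plain `os.rename()` fails on Windows if the target exists), and the function name `remove_offending_lines` is hypothetical. The temporary file is created in the same directory as the original, as the answer recommends.

```python
import os
import tempfile

def line_offends(line, offending):
    # Same whole-word test as in the answer above
    for word in line.split():
        if word in offending:
            return True
    return False

def remove_offending_lines(path, offending):
    """Rewrite `path` in place, dropping lines that contain an offending word."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives next to the original, so the final rename
    # never has to cross a volume boundary
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fout, open(path) as fin:
            for line in fin:
                if line_offends(line, offending):
                    continue
                fout.write(line)
        # Swap the filtered copy into place only after it is fully written
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the partial temp file and leave the original untouched
        os.remove(tmp_path)
        raise
```

Because the original is only replaced after the copy is complete, a failure partway through leaves the source file as it was.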

steveha
0

You're not writing it to the output file. Also, I would use "in" to check for the string existing in the line. See the modified script below (not tested):

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 

    for line in fin.readlines(): 
        if not offending in line:
            # There are no offending words in this line
            # write it to the output file
            fout.write(line)

    fin.close() 
    fout.close() 

fixup(sourcefile)
Sam Dolan
  • Well, you were faster by about 30 secs but your version is not going to work. :) – zifot Jun 15 '10 at 06:13
  • @zifot - I told you I didn't test it :) Embarrassing, but that's what I get for answering questions on SO after working for 14 hours. – Sam Dolan Jun 15 '10 at 06:25
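The failure pointed out in the comments comes from the test `offending in line`: `offending` is a list, and the `in` operator on a string requires a string as its left operand, so the check raises `TypeError`. A hedged sketch of a corrected check, written for Python 3 (an assumption) and using a hypothetical helper `keep_line`, tests each string separately:

```python
offending = ["Exception", "Integer", "RuntimeException"]

def keep_line(line, offending):
    # Keep the line only if none of the offending strings occurs in it
    return not any(item in line for item in offending)

# The untested version fails because a list cannot be the
# left operand of `in` when the right operand is a str
try:
    offending in "some line"
    raised = False
except TypeError:
    raised = True
```

Inside the loop, `if keep_line(line, offending): fout.write(line)` would then take the place of the `if not offending in line:` test.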
0

'''This is a rather simple implementation but should do what you are searching for'''

sourcefile = "C:\\Python25\\PC_New.txt"

filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 

    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin.readlines(): 
        for item in offending: 
            if item in line:
                # offending string found: report the line and skip it
                print "got one",line 
                break
        else:
            # no offending string matched, so keep the line
            fout.write(line)
    fin.close() 
    fout.close() 

fixup(sourcefile)
badp