I am trying to replace text in a text file by reading each line, testing it, then writing if it needs to be updated. I DO NOT want to save as a new file, as my script already backs up the files first and operates on the backups.

Here is what I have so far... I get fpath from os.walk() and I guarantee that the pathmatch var returns correctly:

fpath = os.path.join(thisdir, filename)
with open(fpath, 'r+') as f:
    for line in f.readlines():
        if '<a href="' in line:
            for test in filelist:
                pathmatch = file_match(line, test)
                if pathmatch is not None:
                    repstring = filelist[test] + pathmatch
                    print 'old line:', line
                    line = line.replace(test, repstring)
                    print 'new line:', line
                    f.write(line)

But what ends up happening is that I only get a few lines (updated correctly, mind you, but repeated from earlier in the file) corrected. I think this is a scoping issue, afaict.

Also: I would like to know how to replace only the first instance of the match. For example, I don't want to match the display text, only the underlying href.

jml
  • Have you considered simply using `sed` instead? – Amber Jan 24 '11 at 04:41
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Ignacio Vazquez-Abrams Jan 24 '11 at 04:44
  • @Amber: In ways. I really want to just finish this up and learn sed later. I am almost done with this... :) – jml Jan 24 '11 at 04:44
  • @Ignacio: that's not helpful at all (yes, I have read it). I am not building an all-encompassing parser, so it really doesn't apply. – jml Jan 24 '11 at 04:44
  • I saw this: http://stackoverflow.com/questions/39086/search-and-replace-a-line-in-a-file-in-python/290494#290494 Is this the preferred methodology? – jml Jan 24 '11 at 04:48
  • Why not read from the backup file and write to the original file? – Dyno Fu Jan 24 '11 at 04:48
  • How about maintaining a list of all the lines post the transformation, and then reopening the file in write mode and writing back all those lines in the list into the file? – inspectorG4dget Jan 24 '11 at 05:04
  • @Dyno Fu: I could do this... Thanks for the suggestion. – jml Jan 24 '11 at 05:36
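
Dyno Fu's suggestion above could be sketched roughly as follows. This is only an illustration in Python 3: `rewrite_from_backup`, `backup_path`, `target_path`, and `transform` are hypothetical names, not part of the original script. It streams the backup line by line and writes every line, changed or not, to the original path.

```python
def rewrite_from_backup(backup_path, target_path, transform):
    """Read backup_path line by line and write transformed lines to target_path.

    All names here are hypothetical. transform is any function that takes a
    line (including its newline) and returns the line to write.
    """
    # The two paths must differ: opening target_path with "w" truncates it.
    with open(backup_path, "r") as src, open(target_path, "w") as dst:
        for line in src:
            dst.write(transform(line))
```

Because the source and destination are different files, nothing has to be held in memory and no seek or truncate is needed.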

3 Answers

First, you want to write the line whether it matches the pattern or not. Otherwise, you're writing out only the matched lines.

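Second, between reading the lines and writing the results, you'll need to either truncate the file (call f.seek(0), then f.truncate()) or close the original and reopen it for writing. After f.readlines() the file position is at the end of the file, so without the seek your writes land after the existing content instead of replacing it. Picking the former, I'd end up with something like: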

import os

fpath = os.path.join(thisdir, filename)
with open(fpath, 'r+') as f:
    lines = f.readlines()   # read everything first
    f.seek(0)               # go back to the start of the file
    f.truncate()            # and discard the old contents
    for line in lines:
        if '<a href="' in line:
            for test in filelist:
                pathmatch = file_match(line, test)
                if pathmatch is not None:
                    repstring = filelist[test] + pathmatch
                    line = line.replace(test, repstring)
        f.write(line)       # write every line, changed or not
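
To see why the seek and truncate matter, here is a small self-contained demonstration. It is a sketch written for Python 3, and the temporary file and its contents are made up for illustration. The first block writes straight after reading and ends up appending; the second rewinds and truncates first, so the file is replaced.

```python
import os
import tempfile

# Create a throwaway file with two lines.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("one\ntwo\n")

# Writing right after readlines(): the position is at end of file,
# so the new text is appended instead of replacing anything.
with open(path, "r+") as f:
    lines = f.readlines()
    f.write("TWO\n")
with open(path) as f:
    appended = f.read()

# Rewinding and truncating first gives a clean rewrite.
with open(path, "r+") as f:
    lines = f.readlines()
    f.seek(0)
    f.truncate()
    f.writelines(line.upper() for line in lines)
with open(path) as f:
    rewritten = f.read()

os.remove(path)
```

After this runs, `appended` still begins with the original two lines, while `rewritten` contains only the transformed lines.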
Raph Levien
  • very helpful... thanks tons. one weird thing is that it seemed to "almost" work the first time and then i ran it again and it doesn't do any of the replacement... any thoughts? i can try to track it down... – jml Jan 24 '11 at 05:11
  • hi again raph: i have modified my question a bit- would you be so kind as to explain to me how to update the code to allow for this type of single-match replacement? thanks... – jml Jan 24 '11 at 07:08
  • You can use the .sub() method on the regular expression object, with a count of 1. Another possibility is to refine your regex to match the `a href` in addition to just the path. – Raph Levien Jan 24 '11 at 16:37
  • K; I'll give it a try and start a new q if necessary. Thanks a ton. – jml Jan 24 '11 at 22:35
  • Hey again Raph- I have posted a new q here: http://stackoverflow.com/questions/4788532/returning-a-single-instance-of-a-regex-objects-contents – jml Jan 24 '11 at 23:47
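
The two options Raph mentions in the comments can be sketched like this. It is a minimal Python 3 illustration, not the asker's actual `file_match` logic: `replace_first`, `replace_in_href`, and the sample paths are hypothetical. The first helper uses `.sub()` with a count of 1; the second widens the pattern so that it only matches the path directly after `<a href="`.

```python
import re

def replace_first(line, old, new):
    """Replace only the first occurrence of the literal text old."""
    pattern = re.compile(re.escape(old))
    # A function is used as the replacement so that backslashes in
    # new are not interpreted as regex escapes.
    return pattern.sub(lambda m: new, line, count=1)

def replace_in_href(line, old, new):
    """Replace old only where it directly follows <a href=" ."""
    pattern = re.compile(r'(<a href=")' + re.escape(old))
    return pattern.sub(lambda m: m.group(1) + new, line)

link = '<a href="docs/page.html">docs/page.html</a>'
first_only = replace_first(link, "docs/page.html", "new/docs/page.html")
href_only = replace_in_href(link, "docs/page.html", "new/docs/page.html")
```

For a plain string replacement, `line.replace(old, new, 1)` has the same effect as `replace_first`. The href-anchored version is the safer of the two when the path can also appear earlier in the line as ordinary text.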
  1. Open the file for read and copy all of the lines into memory. Close the file.
  2. Apply your transformations on the lines in memory.
  3. Open the file for write and write out all the lines of text in memory.

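# regex and some_func stand in for your own pattern and transformation
with open(filename, "r") as f:
    # note: rstrip() also removes trailing spaces and tabs, not just the newline
    lines = [line.rstrip() for line in f]
# the file is closed here; transform the lines in memory
altered_lines = [some_func(line) if regex.match(line) else line for line in lines]
with open(filename, "w") as f:
    f.write('\n'.join(altered_lines) + '\n')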
Anmol Singh Jaggi
hughdbrown
  • thanks for the alt suggestion hugh, but i think that i like the first solution better in terms of it matching my attempt. – jml Jan 24 '11 at 06:24

A (relatively) safe way to replace a line in a file.

#!/usr/bin/python
# defensive programming style
# function to replace a line in a file
# and not destroy data in case of error

import os
from tempfile import NamedTemporaryFile

def replace_line(filepath, oldline, newline):
  """
  replace a line in a temporary file,
  then copy it over into the
  original file if everything goes well

  oldline and newline must include the trailing newline,
  because they are compared against the output of readlines()
  """

  # quick parameter checks
  assert os.path.exists(filepath)
  assert oldline and isinstance(oldline, str)  # is not empty and is a string
  assert newline and isinstance(newline, str)

  replaced = False
  written  = False

  try:

    with open(filepath, 'r+') as f:    # open for read/write -- alias to f

      lines = f.readlines()            # get all lines in file

      if oldline in lines:             # otherwise: line not found, do nothing

        # temp file opened in text mode for writing and reading back
        tmpfile = NamedTemporaryFile(mode='w+', delete=True)
        try:
          for line in lines:           # process each line
            if line == oldline:        # find the line we want
              tmpfile.write(newline)   # replace it
              replaced = True
            else:
              tmpfile.write(line)      # write other lines unchanged

          if replaced:                 # overwrite the original file
            tmpfile.seek(0)            # rewind the temp file so it can be read
            f.seek(0)                  # beginning of file
            f.truncate()               # empties out original file

            for tmpline in tmpfile:
              f.write(tmpline)         # writes each line to original file
            written = True
        finally:
          tmpfile.close()              # tmpfile auto deleted

  except IOError as ioe:               # if something bad happened
    print("ERROR: %s" % ioe)
    return False

  return replaced and written          # replacement happened with no errors = True

(Note: this replaces entire lines only, and it replaces every line in the file that matches.)
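
A related approach, and not what the answer's code does, is to write the new contents to a temporary file in the same directory and then rename it over the original, so the original is never left half-written. The sketch below assumes Python 3.3 or later (for `os.replace`); `replace_line_atomic` is a hypothetical name, and `oldline`/`newline` are again expected to include their trailing newline.

```python
import os
import tempfile

def replace_line_atomic(filepath, oldline, newline):
    """Replace every line equal to oldline; return True if any was replaced."""
    with open(filepath, "r") as src:
        lines = src.readlines()
    if oldline not in lines:
        return False  # nothing to do, original left untouched

    # The temp file lives next to the original so the rename stays on
    # the same filesystem.
    dirname = os.path.dirname(os.path.abspath(filepath))
    fd, tmppath = tempfile.mkstemp(dir=dirname, text=True)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.writelines(newline if line == oldline else line
                           for line in lines)
        os.replace(tmppath, filepath)  # swap the new file into place
    except Exception:
        if os.path.exists(tmppath):
            os.remove(tmppath)  # clean up the temp file on failure
        raise
    return True
```

One trade-off to be aware of: the file created by `mkstemp` gets its own permissions and ownership, so those of the original are not carried over unless you copy them yourself.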

Chris Reid