
I am trying to read in a file and then write it back out with all the "blank" lines removed. I don't mean stripping the lines of all whitespace; I want to keep all the indentation and only get rid of the blank lines between the lines of text. I was going to read the string with a loop and take out the blank lines, but I got stumped on exactly how to define a blank line. You shouldn't have to loop over a string char by char to determine this, right? It seems like it would be a "\n" with no characters before or after it, basically a "\n" alone, but so far I haven't been able to define this. So I've fallen back to doing it line by line when reading and writing the file. The file needs to keep its indentation intact because it will not run otherwise.

lines = []  

with open(file_name, mode='r+', encoding='utf-8') as f: 
   for line in f:
      if len(line) > 2:
        lines.append(line) 
        print(len(line)) 

   for line in f:
     f.write((str(line) for line in lines))

I only called print to tell me how many characters were in each line as it is read in, and it seems the blank lines read as only 1 char ("\n", I assumed?), so I tried to only append anything > 1. So far it just writes the file back exactly as it was read in. I'm not 100% sure about my use of the write method as shown here either. I took up Python about 10 days ago, and I'm sure I'm doing something wrong that is most likely really easy to fix.

  • This code isn't writing anything to the output file. You left the original data in place, and the second loop never executed because `f` was exhausted (and if it did execute, it would die with a `TypeError` because you can't write generator expressions themselves to a file). – ShadowRanger Sep 17 '21 at 12:53
  • It's well explained in the question: it tells me how many chars are in each line, so I know how many a blank line contains. I assume it's at least a "\n". It was there to debug. – gendisarray Sep 17 '21 at 12:54
  • That is most likely the problem. Forgive me, I just started learning this less than 2 weeks ago. – gendisarray Sep 17 '21 at 12:56
  • So I need to open the file again then? – gendisarray Sep 17 '21 at 12:58

1 Answer


Your code has several problems:

  1. Testing length isn't a safe way to check for empty lines (the final line of a file might not be newline terminated, and you'd discard it if it only had one character).
  2. You don't seek back to the beginning of the file, so you'd never replace the current data; the file is exhausted, so the second loop never occurs.
  3. You can't call write on a generator expression; it's not a string, so it would die if it executed.
  4. Thanks to #2 though, #3 doesn't bite you, because the second loop never runs (it tries to loop over an exhausted file, which has no lines to loop over).
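
Points 2 and 4 can be seen in isolation with a small sketch. It uses an in-memory `io.StringIO` as a stand-in for the real file, and the sample contents are made up for illustration; an actual file object opened with `open` behaves the same way when iterated.

```python
import io

# In-memory stand-in for the file; the contents are invented for this demo.
f = io.StringIO("a = 1\n\n    b = 2\n")

first_pass = [line for line in f]    # consumes the file up to EOF
second_pass = [line for line in f]   # already at EOF, so nothing is yielded
f.seek(0)                            # rewind to the beginning
third_pass = [line for line in f]    # the lines are available again

print(len(first_pass), len(second_pass), len(third_pass))
```

The second list is empty because the file position is already at the end, which is why the original second `for` loop body never ran.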

Fixed code:

lines = []  

with open(file_name, mode='r+', encoding='utf-8') as f: 
   for line in f:
      if line.rstrip('\n'):  # Check if it's empty after removing EOL character
        lines.append(line) 
        print(len(line)) 

   f.seek(0)            # Seek back to beginning of file so you overwrite existing data
   f.writelines(lines)  # Write out all the lines (they're already str, so no need to convert,
                        # and writelines handles an iterable of lines; write doesn't)
   f.truncate()         # Truncate file to current file location so excess data removed
ShadowRanger
  • Note that a safer way to do this is to write to a *new* file in the same directory, then perform an atomic rename, replacing the old file with the new one; as is, sudden power loss could lead to a mish-mash of old and new data. There are plenty of guides for doing this out there, so I'll avoid getting into the weeds. – ShadowRanger Sep 17 '21 at 13:02
  • The .strip function removes all leading indentation. I can read and write it fine. I understand the second loop in this code is useless, but I don't want to strip the leading whitespace off any of the lines, and I don't want to strip the \n from any lines with text. – gendisarray Sep 17 '21 at 13:03
  • @gendisarray: I didn't use `strip`, I used `rstrip`, which only strips from the end. Also, I passed an argument so it only strips the newline, not arbitrary whitespace. And lastly, I don't save the result, so it doesn't matter anyway; the original unmodified line is what gets stored and rewritten. So your concern is unfounded on three separate levels. :-) – ShadowRanger Sep 17 '21 at 13:04
  • @gendisarray: If you want to strip lines that are nothing but whitespace, you could just remove the `'\n'` argument to `rstrip` in the test; it still won't change the original string (it returns a modified copy without the whitespace), but it will eliminate all trailing whitespace and test if what remains is non-empty. – ShadowRanger Sep 18 '21 at 23:44
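
The two suggestions from the comments (write to a new file and rename it over the old one, and optionally treat whitespace-only lines as blank) could be combined roughly as follows. This is a minimal sketch rather than a hardened implementation: the function name `remove_blank_lines` and the parameter `whitespace_is_blank` are made up for illustration, and it does not call `fsync`, so it is not a full guarantee against power loss.

```python
import os
import shutil
import tempfile


def remove_blank_lines(path, whitespace_is_blank=False, encoding="utf-8"):
    """Rewrite `path` without blank lines, going through a temporary file.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as dst, \
                open(path, encoding=encoding) as src:
            for line in src:
                # rstrip returns a copy; `line` itself keeps its indentation
                # and its trailing newline.
                rest = line.rstrip() if whitespace_is_blank else line.rstrip("\n")
                if rest:
                    dst.write(line)
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)       # swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the replace.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling `remove_blank_lines(file_name)` drops only lines that are exactly a newline, while `remove_blank_lines(file_name, whitespace_is_blank=True)` also drops lines containing nothing but spaces or tabs. In both cases the lines that are kept are written back unmodified.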