Removing specific characters from multiple lines in a list by indexing

Question

I have a folder with multiple ASCII-encoded txt files. I would like to open each of them, read all the lines, remove any whitespace, delete the first digit of the 4th object in the list, and write the result back to the file.

One file content looks like that, as a list:

['        0.200000\n', '        0.000000\n', '        0.000000\n', '       -0.200000\n', '  3400000.100000\n', '  5867999.900000\n']

At the end it should look like that:

['0.200000\n', '0.000000\n', '0.000000\n', '-0.200000\n', '400000.100000\n', '5867999.900000\n']

Without the whitespace and without the first digit of the 4th object.

My code so far:

import glob, fileinput, os, shutil, string, tempfile, linecache,sys

pfad =  "D:\\Test\\"

filelist = glob.glob(pfad+"*.tfw")
if not filelist:
    print "none tfw-file found"
    sys.exit("nothing to convert")

for fileName in fileinput.input(filelist,inplace=True):
    data_list = [''.join(s.split()) for s in data_list]
    data_list[4]= data_list[4][1:]
print(data_list)
sys.stdout.write(data_list)

I have managed to modify the files at the same time but still can't overwrite them with the new content. I receive the following error: "data_list = [''.join(s.split()) for s in data_list] NameError: name 'data_list' is not defined"

score 2 · Answer 1 · edited May 23 '17 at 10:33

You want to str.lstrip the leading whitespace:

for fileName in filelist:
    with open(fileName, "r" ) as f:
        lines = [line.lstrip()  for line in f]
        lines[4] = lines[4][1:]

Using a with block closes your files automatically. Also note that ' 3400000.100000\n' is the fifth object in the list, not the fourth.
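To see why lstrip fits here better than the ''.join(s.split()) from the question, a minimal sketch comparing the two on one sample line may help. The variable names are only illustrative.

```python
line = "  3400000.100000\n"

# The question's approach: split() discards all whitespace, including "\n".
joined = "".join(line.split())

# lstrip() removes only the leading whitespace, so the newline survives.
left = line.lstrip()

print(repr(joined))  # '3400000.100000'
print(repr(left))    # '3400000.100000\n'
```

Keeping the trailing newline matters because the lines are written straight back to a file.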

It is not clear what you want to do after extracting the lines: you don't store the data anywhere as you iterate, you just rebind the same name on each iteration. If you want to write the data to a file, write as you iterate, using file.writelines on the list:

for fileName in filelist:
    with open(fileName, "r") as f, open("{}_new".format(fileName), "w") as out:
        lines = [line.lstrip() for line in f]
        lines[4] = lines[4][1:]
        out.writelines(lines)

If you want to replace the original use either approach from this answer

from tempfile import NamedTemporaryFile
from shutil import move
import os

for fileName in filelist:
    with open(fileName) as f, NamedTemporaryFile("w",dir=".", delete=False) as temp:
        for ind, line in enumerate(f):
            if ind == 4:
                temp.write(line.lstrip()[1:])
            else:
                temp.write(line.lstrip())
    move(temp.name, fileName)
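The question's own starting point, fileinput with inplace=True, can also be made to work. The following is only a sketch, assuming Python 3; the helper name strip_in_place and the demo file name are hypothetical. With inplace=True, whatever is printed inside the loop is written into the file currently being read.

```python
import fileinput
import os
import tempfile


def strip_in_place(paths, target_index=4):
    """lstrip every line and drop the first character of one line, in place."""
    if not paths:
        # fileinput reads standard input when given an empty list.
        return
    with fileinput.input(files=paths, inplace=True) as stream:
        for line in stream:
            cleaned = line.lstrip()
            # filelineno() is 1-based and restarts at 1 for each file.
            if stream.filelineno() == target_index + 1:
                cleaned = cleaned[1:]
            # With inplace=True, standard output is redirected into the file.
            print(cleaned, end="")


# Demonstration on a throwaway file (hypothetical name "demo.tfw").
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.tfw")
with open(demo_path, "w") as handle:
    handle.write(
        "        0.200000\n"
        "        0.000000\n"
        "        0.000000\n"
        "       -0.200000\n"
        "  3400000.100000\n"
        "  5867999.900000\n"
    )

strip_in_place([demo_path])

with open(demo_path) as handle:
    result = handle.readlines()
```

In the question's setting the call would be strip_in_place(filelist), after the check that filelist is not empty.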

...following your file.writelines example an attribute-error occures: " with open(fileName, "r" ) as f, open("{}_new".fileName,"w") as out: AttributeError: 'str' object has no attribute 'fileName' " — script80, Jul 12 '15 at 10:28
I want simply overwrite the existing files with a new content. I will update my code in a sec. — script80, Jul 12 '15 at 11:20
@script80, the answer I linked does exactly that, I added exactly how to do it — Padraic Cunningham, Jul 12 '15 at 11:44

raymelfrancisco · Accepted Answer · 2015-07-13T14:29:13.373

1

Python lists are zero-indexed. In your code, the string you want to change is data_list[4] (the 5th element, or the 4th if we start counting at zero), and its first character is data_list[4][0].

Using slicing, data_list[4][1:] gives that element without its first character.

Sample script (interactive session):

>>> # original list
>>> lst = ['        0.200000\n', '        0.000000\n', '        0.000000\n', '       -0.200000\n', '  3400000.100000\n', '  5867999.900000\n']
>>>
>>> # removes leading whitespaces from each string of the list
>>> lst = [ s.lstrip() for s in lst ]
>>>
>>> # removes the first character of the 4th string of the list
>>> lst[4] = lst[4][1:]
>>>
>>> # prints the modified list
>>> print(lst)
['0.200000\n', '0.000000\n', '0.000000\n', '-0.200000\n', '400000.100000\n', '5867999.900000\n']

Overwriting the file with the modified list:

Way 1: Closing and reopening in write mode:

for fileName in filelist:

    # open in read mode
    with open(fileName, 'r') as data_file:
        data_list = data_file.readlines()

        # list modification
        data_list = [ s.lstrip() for s in data_list ]
        data_list[4] = data_list[4][1:]

    # reopens file in write mode, deletes contents
    with open(fileName, 'w') as data_file:

        # overwriting
        for line in data_list:
            data_file.write(line)

Way 2: Using file.truncate() so that the file won't be closed and reopened:

for fileName in filelist:

    # open in read/write mode
    with open(fileName, 'r+') as data_file:
        data_list = data_file.readlines()

        # list modification
        data_list = [ s.lstrip() for s in data_list ]
        data_list[4] = data_list[4][1:]

        # removes file contents from first character to end
        data_file.truncate(0)

        # puts cursor to the start of the file
        data_file.seek(0)

        # overwriting
        for line in data_list:
            data_file.write(line)
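A common variant of Way 2 seeks back to the start, writes, and truncates afterwards, so the file is never empty while the new content is still unwritten. The sketch below wraps that in a helper and tries it on a temporary file. The names rewrite_in_place and demo.tfw are hypothetical, and the length check is an added assumption for files with fewer than five lines.

```python
import os
import tempfile


def rewrite_in_place(path, target_index=4):
    """Strip leading whitespace from each line and trim one line, in place."""
    with open(path, "r+") as data_file:
        data_list = [s.lstrip() for s in data_file.readlines()]

        # Assumption: files that are too short are left without the trim.
        if len(data_list) > target_index:
            data_list[target_index] = data_list[target_index][1:]

        data_file.seek(0)                # back to the start of the file
        data_file.writelines(data_list)  # overwrite with the new lines
        data_file.truncate()             # cut off leftover old content


# Demonstration on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.tfw")
with open(demo_path, "w") as handle:
    handle.write(
        "        0.200000\n"
        "        0.000000\n"
        "        0.000000\n"
        "       -0.200000\n"
        "  3400000.100000\n"
        "  5867999.900000\n"
    )

rewrite_in_place(demo_path)

with open(demo_path) as handle:
    rewritten = handle.readlines()
```

Because the new content is shorter than the old one, the final truncate() is what removes the stale bytes at the end of the file.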

edited Jul 13 '15 at 14:29

answered Jul 12 '15 at 08:28


joining and splitting is a completely wrong approach – Padraic Cunningham Jul 12 '15 at 09:18
@PadraicCunningham Sir, can you explain why? I edited my answer. – raymelfrancisco Jul 12 '15 at 09:24
Firstly the OP seems to want to write the lines to a file and they keep all the newlines in their expected output. str.strip would do exactly the same job your code is doing with creating unnecessary lists and having to call join. Using lis as a variable name is also not a great idea – Padraic Cunningham Jul 12 '15 at 09:30
..and how can I write my modified lists to all txt files ("filelist") in that folder "pfad"? – script80 Jul 12 '15 at 09:32
@PadraicCunningham I didn't notice the newlines. Thank you! – raymelfrancisco Jul 12 '15 at 09:33
You can simply `str.lstrip` to remove leading whitespace – Padraic Cunningham Jul 12 '15 at 09:33
@script80 You need to open all the files using `filelist` file names in write or append mode, then do the writing. – raymelfrancisco Jul 12 '15 at 09:40
@PadraicCunningham Thank you very much sir! – raymelfrancisco Jul 12 '15 at 09:40
@script80 Are you trying to edit the contents of a file? Am I right? – raymelfrancisco Jul 12 '15 at 09:47
@raymelfrancisco Yes exactly. In my code i use the glob module, but it seems that it doesn't support modes such as "w" for write. – script80 Jul 12 '15 at 10:02
@script80 You use `glob` to grab filenames, you use `open()` in opening files, and it supports `w` for write. I suggest `r+` for both reading and writing. [The opening modes are exactly the same that C fopen() std library function.](http://stackoverflow.com/a/1466036/4895040) – raymelfrancisco Jul 12 '15 at 10:08
@script80 You can take a look here on how to overwrite a file: [Read and overwrite a file in Python](http://stackoverflow.com/a/2424410/4895040). Alternatively, you can close your file and open it again when you need to write the edited list, then iterate through the edited list and write each element to the same file. `file.write()` and `print()` are two ways to do it. Note that `print()` takes a `file` argument, which must be an open file object rather than a file name, for example `print(line, file=f)`. – raymelfrancisco Jul 12 '15 at 10:26
@raymelfrancisco I tried your Way2, but unfortunately I get a "data_list.write(line) AttributeError: 'list' object has no attribute 'write' " attribute-error. – script80 Jul 12 '15 at 11:55
@script80 I'm sorry, it's just a typo. It should be `data_file.write()` – raymelfrancisco Jul 12 '15 at 12:46

score 0 · Answer 3 · answered Jul 12 '15 at 08:25

this does what you want:

import io

file0 = io.StringIO(''' 0.200000
 0.000000
 0.000000
 -0.200000
 3400000.100000
 5867999.900000
''')

def read_data(fle):
    out_str = ''
    for (i, line) in enumerate(fle.readlines()):
        if i != 4:
            out_str += '{}\n'.format(line.strip())
        else:
            out_str += '{}\n'.format(line.strip()[1:])
    return out_str

print(read_data(file0))

I am not entirely sure what you mean by "indexing characters". In Python, strings behave much like lists of characters, except that they cannot be modified in place. You can address an individual character with string[5] or take a slice with string[5:-1]. Does that answer your question?
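A small sketch of the indexing and slicing mentioned above, using one of the values from the question (the variable names are only illustrative):

```python
s = "3400000.100000"

first = s[0]      # the first character
rest = s[1:]      # everything after the first character
middle = s[5:-1]  # from index 5 up to, but not including, the last character

print(first)   # 3
print(rest)    # 400000.100000
print(middle)  # 00.10000

# Strings are immutable, so item assignment fails; build a new string instead.
try:
    s[0] = "x"
except TypeError:
    s = "x" + s[1:]
```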
