I have a folder with source txt files and a destination folder. The source txt files could look like these two examples:

File1:

0;122214;stringvalue1;10;string;value;1012;1014
0;1222155;stringvalue20;10;anotherstring;v;value;10000015;0
0;1222155;stringvalue20;10;anotherstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;;value;10000016;0
0;1222155;stringvalue20;10;;value;7;0

File2:

0;122214;stringvalue1;10;string;value;1012;1014
0;1222155;stringvalue20;10;anotherstring;v;value;10000015;0
0;1222155;stringvalue20;10;anotherstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;;value;10000016;0
0;1222155;stringvalue20;10;;value;7;0
1;122214;stringvalue1;10;string;value;1012;1014
1;1222155;stringvalue20;10;another;"string;v;value;10000015;0
1;1222155;stringvalue20;10;anoth";erstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;;value;10000016;0
1;1222155;stringvalue20;10;--;value;7;0

I currently have code that inserts quote characters around a specific column. It looks like this:

import glob
import os

def findnth(string, substring, n):
    parts = string.split(substring, n + 1)
    if len(parts) <= n + 1:
        return -1
    return len(string) - len(parts[-1]) - len(substring)

path = r"D:\source\*.txt"
path2 = r"D:\destination"
for fname in glob.glob(path):
    with open(fname) as f:
        content = f.readline()
        content2 = content[:findnth(content, ";", 3)+1]+'"'+content[findnth(content, ";", 3)+1:(len(content)-findnth(content[::-1], ";", 2))-1]+'"'+content[(len(content)-findnth(content[::-1], ";", 2))-1:]
        print(content2)
        with open(os.path.join(path2,os.path.basename(fname)), "w") as output:
            output.write(content2)

The code runs without errors. However, only the first line of each file is written to the new file:

0;122214;stringvalue1;10;"string";value;1012;1014

0;122214;stringvalue1;10;"string";value;1012;1014

So inserting the '"' characters works, but I cannot get it to run line by line and write the result to a new file. I tried read, readline and readlines, but did not get any of them working. How can I make the code process every line and not just the first line of each file? Furthermore, I do not want empty lines between the lines written to the final file.

Update: Desired output:

File1:

0;122214;stringvalue1;10;"string";value;1012;1014
0;1222155;stringvalue20;10;"anotherstring;v";value;10000015;0
0;1222155;stringvalue20;10;"anotherstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;";value;10000016;0
0;1222155;stringvalue20;10;"";value;7;0

File2:

0;122214;stringvalue1;10;"string";value;1012;1014
0;1222155;stringvalue20;10;"anotherstring;v";value;10000015;0
0;1222155;stringvalue20;10;"anotherstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;";value;10000016;0
0;1222155;stringvalue20;10;"";value;7;0
1;122214;stringvalue1;10;"string";value;1012;1014
1;1222155;stringvalue20;10;"another;"string;v";value;10000015;0
1;1222155;stringvalue20;10;"anoth";erstring;with;;;;;;;;;;;;;;a lot of ;;;;;;;;;;;;;;;;;;;";value;10000016;0
1;1222155;stringvalue20;10;"--";value;7;0
PSt
  • 1
    `readline()` only reads one line from the file. – theherk Feb 01 '22 at 10:18
  • Yes, I am aware of that. I tried to use readlines and iterate over each line, modify it and output it, but I did not get it working. I also tried read, but did not get that working either. – PSt Feb 01 '22 at 10:24
  • Open both input and output file within the same context so you can iterate both at the same time: https://stackoverflow.com/questions/4617034/how-can-i-open-multiple-files-using-with-open-in-python – Tzane Feb 01 '22 at 10:27
  • Use `splitlines()` and iterate through all of those. – AI - 2821 Feb 01 '22 at 10:29
  • @Tzane I tried the following: with open(fname) as f, open(os.path.join(path2,os.path.basename(fname)), "w") as output and then I used read or readline again, but got the same result. It does not fix it. – PSt Feb 01 '22 at 10:31
  • @AI - 2821 I tried it with iterating, also iterate over the result of readlines(), as it gives a list. But I was not able to get it running. – PSt Feb 01 '22 at 10:32
  • Can you please write about how you are actually trying to manipulate the txt files and what is the expected output ? – AI - 2821 Feb 01 '22 at 10:48
  • I am using content2 to add quotation marks to a column. I want to have the same output as in the input files, except that I want to add these quotation marks. I store the new line in content2. It works for the first line, however not for the others. Not because the insertion of the quotation marks is wrong, but because just one line is read and I don't know how to extend the code in order to get it working for all lines. For each line to be specific. – PSt Feb 01 '22 at 11:27
  • 1
    `I tried to use readlines and iterate over each line, modify it and output it` Show us the code. –  Feb 01 '22 at 11:40
  • @SembeiNorimaki I don't know how to do it. When I tried I got an error that I cannot iterate it, as my function to add the quotes uses .split. I have no code to post. – PSt Feb 01 '22 at 11:50
  • `When I tried I got an error`. So how you tried? Put the code that you tried, then you can ask why that code gives you an error. –  Feb 01 '22 at 11:53

1 Answer

This should solve the problem. I tested it on my system and it works:

import glob
import os

def findnth(string, substring, n):
    # Return the index of the (n+1)-th occurrence of substring, or -1 if there are fewer.
    parts = string.split(substring, n + 1)
    if len(parts) <= n + 1:
        return -1
    return len(string) - len(parts[-1]) - len(substring)

# Raw strings keep the backslashes in the Windows paths literal.
path = r"D:\source\*.txt"
path2 = r"D:\destination"
for fname in glob.glob(path):
    newcontent = ""  # reset for every file so files do not accumulate each other's lines
    with open(fname) as f:
        content = f.read().splitlines()

    for line in content:
        start = findnth(line, ";", 3) + 1                  # just after the 4th ";"
        end = len(line) - findnth(line[::-1], ";", 2) - 1  # position of the 3rd ";" from the end
        content2 = line[:start] + '"' + line[start:end] + '"' + line[end:]
        print(content2)
        newcontent = newcontent + content2 + "\n"

    with open(os.path.join(path2, os.path.basename(fname)), "w") as output:
        output.write(newcontent)

Explanation:

The variable content holds a list of the lines in the text file.

We then iterate over the lines and insert the quotation marks at the correct positions; the result for each line is stored in the variable content2.
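As an illustration of where the quotes land, here is a minimal sketch of the same per-line step written with split and rsplit instead of index arithmetic. The helper name quote_field and its parameters lead, trail and sep are hypothetical, not part of the code above. It also assumes that lines with too few separators should be returned unchanged, which is a choice the original code does not make.

```python
def quote_field(line, lead=4, trail=3, sep=";"):
    """Wrap everything between the first `lead` and the last `trail` fields in quotes.

    Hypothetical helper: lines with too few separators are returned unchanged,
    which is an assumption rather than the behaviour of the original code.
    """
    head = line.split(sep, lead)          # first `lead` fields + the remainder
    if len(head) <= lead:
        return line
    tail = head[-1].rsplit(sep, trail)    # middle part + last `trail` fields
    if len(tail) <= trail:
        return line
    return sep.join([*head[:-1], '"' + tail[0] + '"', *tail[1:]])


print(quote_field("0;1222155;stringvalue20;10;anotherstring;v;value;10000015;0"))
# 0;1222155;stringvalue20;10;"anotherstring;v";value;10000015;0
```

Counting four fields from the left and three from the right gives the same split points as findnth(line, ";", 3) and the reversed findnth call, so any extra semicolons end up inside the quoted column.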

We also have a newcontent variable that temporarily stores the contents of the text file with the quotation marks added.

At the beginning of each file, newcontent is set to "", an empty string. After each line is modified (the quotation marks are added), it is appended to newcontent. The statement newcontent = newcontent + content2 + "\n" takes the previous value of newcontent, appends the new content2 followed by "\n" (which starts a new line in the file), and stores the result back in newcontent.

After the whole text file has been processed, the result is written to a new file with the same name in the destination directory.
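If the files are large, the accumulation step can be skipped by opening the input and output files together and writing each line as soon as it is modified, as suggested in the comments on the question. The following is only a sketch: the function name transform_files and its transform parameter are hypothetical, and transform stands for any function that maps one line (without its newline) to one output line.

```python
import glob
import os


def transform_files(src_pattern, dst_dir, transform):
    """Apply `transform` to every line of each matching file.

    Hypothetical helper: results are written to a file of the same name
    in `dst_dir`, one output line per input line.
    """
    for src_path in glob.glob(src_pattern):
        dst_path = os.path.join(dst_dir, os.path.basename(src_path))
        # Both files are open at once, so each line is written immediately.
        with open(src_path) as src, open(dst_path, "w") as dst:
            for line in src:  # iterating a file object yields one line at a time
                dst.write(transform(line.rstrip("\n")) + "\n")


# Example call with the paths from the question (raw strings keep the backslashes):
# transform_files(r"D:\source\*.txt", r"D:\destination", my_quoting_function)
```

Stripping the trailing newline before transforming and adding exactly one "\n" afterwards keeps the output free of empty lines between records.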

AI - 2821
  • Ok, thanks and I will accept. But to be honest I do not understand it. What is newcontent = newcontent + content2 + "\n"? – PSt Feb 01 '22 at 12:46
  • 1
    Extremely sorry for the inconvenience, I've updated the answer with an explanation. If you still have any confusion, please do reply. I've also corrected an important thing in the code: I think when you ran it, only the first file was created correctly and the second file also contained the contents of the first file. – AI - 2821 Feb 01 '22 at 15:49