I'm a complete beginner (I started today, although I did a little C++ a few years ago). I'm trying to write a program that reads a text file and adds a number to the start of each match of a pattern, with the number incrementing as it reads further.

So far I've written:

import tkinter as tk
import re

master = tk.Tk()
from tkinter.filedialog import askopenfilename

filename = askopenfilename()
file = open(filename, "r+")
filetext = file.read()
pattern = '"name":"(.*?)"'
name = re.findall(pattern, filetext)
print (name)
namereplace = re.sub(pattern, "test", filetext)
print ("this ran")
file.close()

This opens a prompt to select a text file, reads the file, and finds all the strings I need to add the number to, but it does not replace them with "test".

Adrian P
  • After replacing the text, you need to write it to the file. Also `findall` is not needed here. – Vishnudev Krishnadas Mar 02 '21 at 19:46
  • More specifically, you need to close the file and then reopen it in `"w"` mode in order to update it. – martineau Mar 02 '21 at 20:07
  • @martineau I thought opening with r+ would allow read and write functionality? – Adrian P Mar 03 '21 at 12:45
  • Adrian: `'r+'` does allow both reading and writing, but intermixing the two operations (i.e. updating the file at the same time as you're reading it) would be extremely difficult to implement. For that reason it'd probably be best to just rewrite the whole thing in a separate step (which would require keeping track of a lot of information). One tactic to simplify things would be to write the results to a separate, temporary file while reading the original, and then delete the original and rename the temp file so it replaces it at the end. – martineau Mar 03 '21 at 14:20

1 Answer

First, to have your replacement in the file, you need to actually write the result back to the file.

To do this, you have two options (compare "Replace and overwrite instead of appending"):

  1. Just open the file again in w mode after reading it and write the output of your replacement to it:
import re
import tkinter as tk
from tkinter.filedialog import askopenfilename

master = tk.Tk()  # create the Tk root window before showing the dialog

filename = askopenfilename()
pattern = '"name":"(.*?)"'

# Read the whole file; the with block closes it automatically
with open(filename, "r") as infile:
    filetext = infile.read()

# Reopening in "w" mode empties the file before the new text is written
with open(filename, "w") as outfile:
    outfile.write(re.sub(pattern, "test", filetext))
  2. Use seek to move to the beginning of the file and truncate to replace the contents in place:
import re
import tkinter as tk
from tkinter.filedialog import askopenfilename

master = tk.Tk()  # create the Tk root window before showing the dialog

filename = askopenfilename()
pattern = '"name":"(.*?)"'

with open(filename, "r+") as infile:
    filetext = infile.read()
    infile.seek(0)  # jump back to the start before writing
    infile.write(re.sub(pattern, "test", filetext))
    infile.truncate()  # cut off leftovers if the new text is shorter
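
A third route, suggested in the comments on the question, is to write to a temporary file and then swap it in for the original. The following is only a rough sketch of that idea: the function name `rewrite_via_temp_file` and its parameters are made up for illustration, and it assumes the temporary file can be created in the same directory as the original.

```python
import os
import re
import tempfile


def rewrite_via_temp_file(path, pattern, replacement):
    """Write the substituted text to a temp file, then swap it in for `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # stays on the same filesystem
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as outfile, open(path, "r") as infile:
            for line in infile:
                outfile.write(re.sub(pattern, replacement, line))
        # Both files are closed here; replace the original with the temp file
        os.replace(temp_path, path)
    except BaseException:
        os.remove(temp_path)  # do not leave a stray temp file behind
        raise
```

Calling `rewrite_via_temp_file(filename, pattern, "test")` should then behave like the two options above, with the difference that the original file is left untouched if something fails before the final swap.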

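Second, concerning the main part of your question, the replacement by an incrementing number: a single call of re.sub() with a fixed replacement string cannot do this, because every match is replaced by the same string. (re.sub() does also accept a function as the replacement, which is another way to vary the result per match.)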

What you could do is read the file line by line and substitute the current value of a counter variable into each line. Whenever a line contains a match, you increment the counter afterwards. To detect a match, you could use re.subn(), which returns not only the new string but also the number of substitutions made. Note that all matches on the same line receive the same number.

Full example:

import re
import tkinter as tk
from tkinter.filedialog import askopenfilename

master = tk.Tk()  # create the Tk root window before showing the dialog

filename = askopenfilename()
pattern = '"name":"(.*?)"'

filetext = ""
count = 1
with open(filename, "r") as infile:
    for line in infile:
        # re.subn returns (new_string, number_of_substitutions)
        newline, substitutions = re.subn(pattern, str(count), line)
        if substitutions:
            count += 1  # advance only after a line that matched
        filetext += newline

with open(filename, "w") as outfile:
    outfile.write(filetext)

Input:

"bla":"bal"
"name":"baba"
"blah":"blah"
"name":"keke"

Output:

"bla":"bal"
1
"blah":"blah"
2
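
The line-based loop above gives every match on the same line the same number, so it depends on each match sitting on its own line. As a possible alternative, re.sub() accepts a function as the replacement and calls it once per match, which lets the counter advance per match instead of per line. A minimal sketch, where the names `number_matches` and `replacement` are my own and the sample string is invented:

```python
import itertools
import re


def number_matches(text, pattern, start=1):
    """Replace each match of `pattern` in `text` with the next integer."""
    counter = itertools.count(start)

    def replacement(match):
        # Called once per match, so every occurrence gets its own number
        return str(next(counter))

    return re.sub(pattern, replacement, text)


sample = '{"name":"baba","blah":"blah","name":"keke"}'
print(number_matches(sample, '"name":"(.*?)"'))
# prints {1,"blah":"blah",2}
```

To keep the matched text and only put the number in front of it, `replacement` could return `f"{next(counter)}{match.group(0)}"` instead.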
buddemat
  • Your first suggestion seems to be working very well for what I want, thank you. Now I'm looking into replacing each instance of the pattern with the pattern plus an incrementing number for each consecutive occurrence. – Adrian P Mar 03 '21 at 14:27
  • I'm glad my answer works for you. If it does, you can always show that by accepting it and/or upvoting it if you like. In general, if you have a follow-up question that is not directly a part of the original one, please consider posting it as a separate question. That makes it easier for others to find, both if they have a similar question and if they feel they can answer it. – buddemat Mar 03 '21 at 14:47
  • That being said, a way to do what you want could be reading the file line-by-line (using `infile.readline()` instead of `infile.read()`) and then replacing your match with a variable that you increment on each successful substitution. – buddemat Mar 03 '21 at 14:49
  • Come to think of it, I guess the replacement by an incrementing number is actually part of the original question, so I'll modify my answer to add it. – buddemat Mar 03 '21 at 15:23
  • I've just got around to doing this today and the full example is perfect save for one detail: the files I'm trying to apply this to do not have any line breaks, so everything is numbered as 1. I will work out what can be done about this, possibly using the CSV library. Thank you very much! – Adrian P Mar 04 '21 at 13:22