Replace string python txt text

Question

I have a book in txt format. I would like to create two new text files: in the first, I would like to replace all occurrences of the string "Paul" with Paul_1, and in the second with Paul_2. I wrote this code:

with open("book.txt", 'r') as original, \
        open("book_1.txt", 'w') as mod1, \
        open("book_2.txt", 'w') as mod2:
    for line in original:
        words = line.split()
        for word in words:
            s="Paul"
            if(word == s):
                mod1.write(word + "_1 ")
                mod2.write(word + "_2 ")
            else:
                mod1.write(word + " ")
                mod2.write(word + " ")
        mod1.write("\n")
        mod2.write("\n")

The problem is that some occurrences of Paul are often skipped, so in the end I have both Paul and Paul_1 in the same document (and likewise Paul and Paul_2). Where is the problem?

Is it possible the skipped ones are `Paul,` or `Paul.` and such? – bgse, Mar 20 '18 at 17:50
@bgse yes, I have now noticed that it skipped strings like `Paul,` and `Paul'`. How can I solve that? – Camilla8, Mar 20 '18 at 17:52
You can use the `startswith()` method, remove the punctuation marks with a replace (using regex), or compare against `word[:-1]`, i.e. the word without its last letter/symbol. – shahaf, Mar 20 '18 at 17:55
@Camilla8 `str.split()` by default splits your string using whitespace as a delimiter, and it isn't really suitable for your needs, as you can only split by one delimiter if you specify one yourself. You might want to look at [re.split()](https://docs.python.org/3/library/re.html#re.split). – bgse, Mar 20 '18 at 17:57
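
As a small illustration of the comments above, here is a sketch of why the `word == s` comparison misses some occurrences: `str.split()` only splits on whitespace, so punctuation stays attached to the tokens. The sample sentence is made up for the example.

```python
# Hypothetical sample line; any text with punctuation next to "Paul" behaves the same way.
line = "Paul met Paul, and Paul's friend."

# split() with no arguments splits on whitespace only, so "," and "'s" stay attached.
words = line.split()
print(words)
# ['Paul', 'met', 'Paul,', 'and', "Paul's", 'friend.']

# Only the bare token equals "Paul"; the other two occurrences are skipped.
matches = [w for w in words if w == "Paul"]
print(matches)
# ['Paul']
```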

Rakesh · Accepted Answer · 2018-03-23T07:53:16.207

This should help.

import re

with open("book.txt", 'r') as original, \
        open("book_1.txt", 'w') as mod1, \
        open("book_2.txt", 'w') as mod2:
    data = original.read()
    data_1 = re.sub(r"\bPaul\b", 'Paul_1', data)   # Replace each whole-word occurrence of Paul with Paul_1
    data_2 = re.sub(r"\bPaul\b", 'Paul_2', data)   # Replace each whole-word occurrence of Paul with Paul_2
    # "\n" is a real newline; the raw string r"\n" would write a backslash followed by "n"
    mod1.write(data_1 + "\n")
    mod2.write(data_2 + "\n")
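
To see what the `\b` word boundaries do without touching any files, here is a minimal sketch on an in-memory string. The sample sentence is invented for illustration; it includes the `Paula` and `PostPaul` cases raised in the comments below.

```python
import re

# Hypothetical sample text covering punctuation and substring cases.
sample = "Paul, Paula and PostPaul met Paul's friend Paul."

# \b matches between a word character (letter, digit, underscore) and a non-word character,
# so "Paul" is replaced only when it is not part of a longer word.
result = re.sub(r"\bPaul\b", "Paul_1", sample)
print(result)
# Paul_1, Paula and PostPaul met Paul_1's friend Paul_1.
```

Because the underscore counts as a word character, an existing `Paul_1` does not match `\bPaul\b`, so running the substitution twice on the same text should not produce `Paul_1_1`.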

edited Mar 23 '18 at 07:53

answered Mar 20 '18 at 17:50


What does the 'r' in the last two instructions do? – Camilla8 Mar 20 '18 at 17:55
Should take into account edge-cases like `"Paula is a nice lady.".replace("Paul", "Paul_1")` though, given the question is concerning a book text, that isn't too far fetched. – bgse Mar 20 '18 at 18:00
@Rakesh your code has a problem if Paul is a substring of another word. For instance, if there is PostPaul, I get PostPaul_1, while my aim is to replace just Paul and not strings like PostPaul – Camilla8 Mar 22 '18 at 16:44
Oh ok. In that case you probably need regex. Let me try to make one tomorrow morning. – Rakesh Mar 22 '18 at 17:29
Updated snippet. – Rakesh Mar 23 '18 at 07:53
@Rakesh If I have a dynamic word, how should I write the regex? re.sub(r"\b+"word"+"\b", 'Paul_1', data) or re.sub(r"\b+"word"+"r\b", 'Paul_1', data)? – Camilla8 Mar 23 '18 at 11:00
you can use `str.format`. Ex: `re.sub(r"\b{0}\b".format(toChange), 'Paul_1', s)` – Rakesh Mar 23 '18 at 11:08
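
A possible variation on the `str.format` suggestion, as a sketch only: if the dynamic word may contain regex metacharacters (such as `.`), wrapping it in `re.escape` keeps them literal. The helper name `replace_word` and the sample strings are hypothetical, not from the thread.

```python
import re

def replace_word(text, word, replacement):
    """Replace whole-word occurrences of `word` in `text` (hypothetical helper)."""
    # re.escape makes characters like "." match literally instead of acting as regex syntax.
    pattern = r"\b{0}\b".format(re.escape(word))
    return re.sub(pattern, replacement, text)

print(replace_word("Paul and Paula", "Paul", "Paul_1"))
# Paul_1 and Paula

# Without re.escape, the "." in "a.c" would match any character and also replace "abc".
print(replace_word("a.c abc", "a.c", "X"))
# X abc
```

Note that `\b` only behaves as a whole-word guard when the word itself starts and ends with a word character (letter, digit, or underscore).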
