Replace sequences in Python

Question

I want to make a replacement script.

It should replace str1 with str2. My file has an XML-based structure. For example, I have:

...word1'#13#10'word2'#13#10'word3... = ...word1'#13#10'word3...

I want to remove part of the string. I use this in the script:

Lines[i] = Lines[i].replace(key, DataBase[key])

I've already checked that "key" and "DataBase[key]" are correctly defined. If I print them to the console with "print()", they look exactly as they should. But when the script runs, it doesn't change sequences like this one, which contain '#13#10'. Pairs of keys without any special symbols work fine. What can I do? And why doesn't it work? Full script:

import configparser
#import time

config = configparser.ConfigParser()  # init configparser
config.optionxform = str
config.read("SocratToCortesExpress.cfg")  # config file

print("Config file - read")

filePath = config.get("PATH", "old_file")  # config file - with names of files, pairs of words

DataStrings = config.items("DATA")  # read pairs
DataBase = dict()  # initialization of dictionary
print("Dictionary - initialized")

for Dstr in DataStrings:  # old and new words for a replacement
    SocratName = Dstr[0]
    CortesName = Dstr[1]    
    DataBase[SocratName] = CortesName


print("Dictionary - fulfilled")

with open(filePath, "r", encoding='utf-8-sig') as ResultFile:  # input file
    Lines = ResultFile.readlines()

print("Old file - uploaded")

with open('logkeys.txt', 'w', encoding='utf-8') as f1:  # dump the pairs for inspection
    for key in DataBase:
        try:
            f1.write('\n' + key + '\n' + DataBase[key] + '\n')
        except Exception as e:  # log errors (append, so earlier ones are kept)
            with open('log.txt', 'a', encoding='utf-8') as f2:
                f2.write('An exceptional thing happened - %s\n' % e)


for i in range(len(Lines)):  # brute force - go over the whole input file
    #Lines[i] = Lines[i].replace('\ufeff', '') #some weird symbol
    for key in DataBase:
        try:
            Lines[i] = Lines[i].replace(key, DataBase[key])  # replacing
        except Exception as e:  # log errors (append, so earlier ones are kept)
            with open('log.txt', 'a', encoding='utf-8') as f2:
                f2.write('An exceptional thing happened - %s\n' % e)


print("Sequences - replaced")

outFileName = config.get("PATH", "new_file")  # define output file

print("Exit file - initialized")

with open(outFileName, "a", encoding='utf-8-sig') as outFile:  # save
    for line in Lines:      
        outFile.write(line)

print("OK")

Maybe there is something wrong with the parts of the script you haven't shown; can't be sure for obvious reasons. — Scott Hunter, Sep 15 '17 at 12:09
Looking at what you have shown so far, there does not seem to be an error. The problem will probably be in the parts that you didn't show. Perhaps if you posted more of the code, we could help better. — Haroldo_OK, Sep 15 '17 at 12:12
`It should replace str1 to str2` -> What's wrong with `s = s.replace(str1, str2)`? — Right leg, Sep 15 '17 at 12:13
Just added the full code. By "XML-based" I meant that it is an .xprt file with a structure similar to .xml. — Timofey Kargin, Sep 15 '17 at 12:18
Can you please create a [mcve]? Input file as well as expected output? — OneCricketeer, Sep 15 '17 at 12:19
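
A minimal reproduction along the lines requested here might look like the sketch below. The config content is hypothetical (the real SocratToCortesExpress.cfg was not posted), and it is fed in through `read_string` so that the example is self-contained. Printing each pair with `repr()` shows exactly what configparser handed back, which can differ from what is typed in the file: surrounding whitespace is stripped, and both `=` and `:` act as key/value delimiters by default.

```python
import configparser

# Hypothetical config content standing in for SocratToCortesExpress.cfg.
CFG_TEXT = """
[DATA]
'word2'#13#10 =
old name = new name
"""

config = configparser.ConfigParser()
config.optionxform = str  # keep keys case-sensitive, as in the question
config.read_string(CFG_TEXT)

# Build the replacement dictionary and show the exact parsed strings.
pairs = dict(config.items("DATA"))
for key, value in pairs.items():
    print(repr(key), "->", repr(value))

# Made-up input line modelled on the example in the question.
line = "word1'#13#10'word2'#13#10'word3"
for key, value in pairs.items():
    line = line.replace(key, value)
print(line)
```

If the `repr()` of a parsed key is not character-for-character what appears in the input file, the config parsing step is the place to look rather than `str.replace`.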

score 2 · Answer 1 · answered Sep 15 '17 at 12:17

Have you tried this?

>>> s = "word1'#13#10'word2'#13#10'word3"
>>> s.replace("'word2'#13#10'", '')
"word1'#13#10word3"
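
When the same call silently does nothing inside a script, it can help to compare the exact characters on both sides, because `print()` hides differences such as trailing whitespace. The sketch below is a diagnostic aid rather than a confirmed fix; the helper name `explain_miss` and the sample strings are hypothetical and not taken from the original script.

```python
def explain_miss(line, key):
    """Report whether `key` occurs in `line`.

    Hypothetical debugging helper; not part of the original script.
    """
    if key in line:
        return "found at index %d" % line.index(key)
    # repr() exposes escapes and invisible characters that print() hides.
    return "not found: key=%r line=%r" % (key, line)


# Made-up sample data, including Cyrillic text.
line = "слово1'#13#10'слово2'#13#10'слово3"
good_key = "'слово2'#13#10"
bad_key = "'слово2'#13#10 "  # trailing space: looks the same when printed

print(explain_miss(line, good_key))
print(explain_miss(line, bad_key))
print(line.replace(good_key, ""))
```

With these sample strings, the key that matches exactly is removed even though it contains Cyrillic letters and '#13#10', while the key with a stray trailing space is reported as not found.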

George Tseres

So, I tried it. But it is a script, not a shell. And no, it doesn't work. Maybe that's because my word2 is in Cyrillic characters. – Timofey Kargin Sep 15 '17 at 12:24
It makes no difference whether this is run as a script or interactively. For UTF-8 replacement, please check https://stackoverflow.com/questions/13093727/how-to-replace-unicode-characters-in-string-with-something-else-python – George Tseres Sep 15 '17 at 12:26
Thanks a lot. Maybe I'm on the right track now. I'm already using UTF-8 encoding: open(filePath, "r", encoding='utf-8-sig'). It helps me read the file correctly, including the BOM. But I still can't replace my string that includes '#13#10'. I don't know, maybe there is a way to find and replace sequences in bytes, not strings? First encode my strings, do the replacement, and after that decode the result from UTF-8 again. But I'm not sure whether replacement in bytes exists. – Timofey Kargin Sep 18 '17 at 10:46
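
On the last point: replacement on bytes does exist in Python as `bytes.replace`, and both of its arguments must themselves be bytes. The sketch below uses made-up sample data. For valid UTF-8 and a non-empty search string, the byte-level replacement gives the same result as `str.replace`, so it is unlikely to succeed where the string version fails.

```python
# Made-up sample data modelled on the example in the question.
text = "слово1'#13#10'слово2'#13#10'слово3"
old = "'слово2'#13#10"

# Encode, replace at the byte level, then decode back to a string.
raw = text.encode("utf-8")
replaced = raw.replace(old.encode("utf-8"), b"")
result = replaced.decode("utf-8")
print(result)

# The byte-level result matches the plain string replacement.
assert result == text.replace(old, "")
```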