I want to write a replacement script.

It should replace str1 with str2. My file has an XML-based structure. For example, I have:

...word1'#13#10'word2'#13#10'word3... = ...word1'#13#10'word3...

I want to remove part of a string. I use this in the script:

Lines[i] = Lines[i].replace(key, DataBase[key])

I've already checked that "key" and "DataBase[key]" are correctly defined. If I print them to the console with "print()", they look just as they should. But when the script runs, it doesn't change sequences like this one, with '#13#10'. Pairs of keys without any special symbols work fine. What can I do? And why doesn't it work? Full script:

import configparser
#import time

config = configparser.ConfigParser()  # init configparser
config.optionxform = str
config.read("SocratToCortesExpress.cfg")  # config file

print("Config file - readed")

filePath = config.get("PATH", "old_file")  # config file - with names of files, pairs of words

DataStrings = config.items("DATA")  # read pairs
DataBase = dict()  # initialization of dictionary
print("Dictionary - initialized")

for Dstr in DataStrings:  # old and new words for a replacement
    SocratName = Dstr[0]
    CortesName = Dstr[1]    
    DataBase[SocratName] = CortesName


print("Dictionary - fulfilled")

with open(filePath, "r", encoding='utf-8-sig') as ResultFile:  # input file
    Lines = ResultFile.readlines()

print("Old file - uploaded")

f1 = open('logkeys.txt', 'w')
for key in DataBase.keys():
    try:
        f1.write('\n'+key+'\n'+DataBase[key]+'\n')
    except Exception as e: #errors
            f2 = open('log.txt', 'w')
            f2.write('An exceptional thing happed - %s' %e)
            f2.close()
f1.close()


for i in range(len(Lines)):  # brute force: go over every line of the input file
    #Lines[i] = Lines[i].replace('\ufeff', '') #some weird symbol
    for key in DataBase.keys():     
        try:
            Lines[i] = Lines[i].replace(key, DataBase[key]) #replacing  
        except Exception as e: #errors
            f2 = open('log.txt', 'w')           
            f2.write('An exceptional thing happed - %s' %e)
            f2.close()


print("Sequences - replaced")

outFileName = config.get("PATH", "new_file")  # define output file

print("Exit file - initialized")

with open(outFileName, "a", encoding='utf-8-sig') as outFile:  # save
    for line in Lines:      
        outFile.write(line)

print("OK")
Timofey Kargin

1 Answer

Have you tried this?

>>> s = "word1'#13#10'word2'#13#10'word3"
>>> s.replace("'word2'#13#10'", '')
"word1'#13#10word3"
George Tseres
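
The same call behaves identically when saved in a script file, and "#", digits, and quotes are ordinary characters to str.replace. Below is a minimal sketch, assuming hypothetical Cyrillic placeholder words instead of word1..word3 (they are not taken from the real file). The key here starts at the word itself, so the surrounding quotes stay balanced and the result has the shape the question asks for. repr() is used because it exposes hidden differences between the key and the line (a BOM, stray whitespace, different escapes) that print() does not show:

```python
# Hypothetical sample data: the Cyrillic words are placeholders, not real file content.
line = "слово1'#13#10'слово2'#13#10'слово3"
key = "слово2'#13#10'"

# repr() shows invisible characters (BOM, spaces, escapes) that print() hides.
print(repr(key))
print(repr(line))

# If this prints False, replace() has nothing to match and leaves the line unchanged.
print(key in line)

result = line.replace(key, "")
print(repr(result))
```

If "key in line" is False for the real data, the key read from the config probably differs from the text in the file. One thing worth checking, as an assumption rather than a confirmed cause: the script opens the data file with encoding='utf-8-sig' but calls config.read() without an encoding argument, so the config file is decoded with the platform default encoding.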
  • So, I tried. But it is a script, not a shell. And no, it doesn't work. Maybe it's because my word2 is in Cyrillic characters – Timofey Kargin Sep 15 '17 at 12:24
  • It makes no difference whether this runs as a script or interactively. For UTF-8 replacement, please check https://stackoverflow.com/questions/13093727/how-to-replace-unicode-characters-in-string-with-something-else-python – George Tseres Sep 15 '17 at 12:26
  • Thanks a lot. Maybe I'm on the right track now. I'm already using UTF-8 encoding: open(filePath, "r", encoding='utf-8-sig'). It helps me read the file correctly, including the BOM. But I still can't replace my string that includes '#13#10'. Maybe there is a way to find and replace sequences in bytes, not strings? First encode my strings, do the replacement, and after that decode the result from UTF-8 back to a string. But I'm not sure whether replacement in bytes exists. – Timofey Kargin Sep 18 '17 at 10:46
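
On the bytes question in the last comment: bytes.replace does exist and works the same way as str.replace. Below is a minimal sketch, assuming hypothetical sample text standing in for one line of the file and one config pair. For valid UTF-8 data it gives the same result as replacing on str, so it is unlikely to fix a mismatch by itself, but comparing the encoded key with the encoded line can help reveal where they differ:

```python
# Hypothetical sample data standing in for one line of the file and one config pair.
line = "слово1'#13#10'слово2'#13#10'слово3"
key = "слово2'#13#10'"
value = ""

# Encode everything to UTF-8, replace at the byte level, then decode back to str.
line_bytes = line.encode("utf-8")
replaced_bytes = line_bytes.replace(key.encode("utf-8"), value.encode("utf-8"))
result = replaced_bytes.decode("utf-8")

print(result)
# Compare the byte-level result with the plain str.replace result.
print(result == line.replace(key, value))
```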