How to write a Python script that reads text and writes it in XML format

Question

I need a Python script to read the text below from a text file:

"Re-integratieassistent – modelnummer rea 202"

and write it into an XML file.

The problem is that, when I write the XML file with UTF-8 encoding, the text comes out as:

"Re-integratieassistent  modelnummer rea 202"

The "–" between "integratieassistent" and "modelnummer" is missing.

How do I solve this?

My current code:

import codecs

# file, filename, inputFolder_new, outputFolder_new and the counter i are
# assumed to be defined earlier in the script (not shown).
with codecs.open(file, encoding='utf-8', errors='ignore', mode="r") as curr_file:
    for line in curr_file.readlines():
        #line = line.encode('utf-8')

        # Increment the counter because we encountered an XML envelope start or end element
        if line.find("<soapenv:Envelope") != -1 or line.find("</soapenv:Envelope") != -1:
            i = i + 1
        if i == 1:
            file_i = codecs.open(inputFolder_new + "/" + filename, encoding='utf-8', mode="a")
            file_i.writelines(line)
        if i == 3:
            file_o = codecs.open(outputFolder_new + "/" + filename, encoding='utf-8', mode="a")
            file_o.writelines(line)
        if i == 4:
            file_i.writelines("</soapenv:Envelope>")
            file_o.writelines("</soapenv:Envelope>")

Please fix your version tags - you're either using Python 2 or Python 3. — Celeo, Nov 19 '15 at 17:44
Is your input file actually encoded in UTF-8? What are the contents of `line` when you try to write it to the file? — memoselyk, Nov 19 '15 at 18:12
Hi, thanks. My input is a text file containing the following: Re-integratieassistent – modelnummer rea 202 — Testuser, Nov 19 '15 at 18:28
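
The double space in the output quoted in the question is consistent with one possible cause, sketched here under the assumption (not confirmed by the asker) that the text file was saved as Windows-1252 (cp1252) rather than UTF-8. In cp1252 the en dash is the single byte 0x96, which is not valid UTF-8, so errors='ignore' drops it silently. The bytes below are built in memory for illustration, assuming Python 3; they are not taken from the real file.

```python
# Assumption: the input file is cp1252-encoded; the real file's encoding is unknown.
text = "Re-integratieassistent \u2013 modelnummer rea 202"
raw = text.encode("cp1252")  # the en dash becomes the single byte 0x96

# Decoding as UTF-8 with errors="ignore" silently drops the invalid byte.
lossy = raw.decode("utf-8", errors="ignore")

# Decoding with the encoding the bytes were written in keeps the dash.
correct = raw.decode("cp1252")

print(repr(lossy))
print(repr(correct))
```

Reading the file as UTF-8 without errors='ignore' would surface this kind of mismatch as a UnicodeDecodeError instead of losing the character.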

score 0 · Answer 1 · answered Nov 19 '15 at 18:35

Python's interfaces for processing XML are grouped in the xml package.

You might want to consider xml.etree.ElementTree for modifying an xml file.

import xml.etree.ElementTree as ET
tree = ET.parse(file)   # parse the whole document into an ElementTree
root = tree.getroot()   # the top-level element

Then use root.findall('{namespace}nodename') or root.find(). You can loop over the matching nodes to find the item you are interested in.
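
As a rough illustration of the '{namespace}nodename' form, here is a minimal sketch assuming Python 3 and a SOAP 1.1 envelope. The document is built inline and the item elements are hypothetical, since the real file was not posted.

```python
import xml.etree.ElementTree as ET

# Hypothetical SOAP document; the <item> elements inside Body are made up.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
xml_text = (
    '<soapenv:Envelope xmlns:soapenv="' + SOAP_NS + '">'
    "<soapenv:Body><item>first</item><item>second</item></soapenv:Body>"
    "</soapenv:Envelope>"
)

root = ET.fromstring(xml_text)

# Namespaced tags are matched with the '{namespace-uri}localname' form.
body = root.find("{%s}Body" % SOAP_NS)

# findall returns the matching direct children; these <item> tags have no namespace.
items = [el.text for el in body.findall("item")]
print(items)
```

Note that find() returns None when nothing matches, so real code would check the result before using it.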

Add a subelement with ET.SubElement(an_element_object, 'your element').
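
A minimal sketch of adding a subelement and serializing the result as UTF-8, assuming Python 3. The element names products and name are hypothetical, and the output goes to an in-memory buffer so the example is self-contained.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen only for illustration.
root = ET.Element("products")
name = ET.SubElement(root, "name")
name.text = "Re-integratieassistent \u2013 modelnummer rea 202"

# Serialize to bytes as UTF-8, including an XML declaration.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
xml_bytes = buffer.getvalue()

print(xml_bytes.decode("utf-8"))
```

Passing a file path instead of the buffer to write() would store the same bytes on disk. Letting ElementTree do the serialization also avoids splicing closing tags into the file by hand.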

It's hard to be more precise without knowing what your XML file looks like.

answered Nov 19 '15 at 18:35

airstrike


My XML file contains both the request and the response; it's too big to post here – Testuser Nov 19 '15 at 18:41
Well, let me know if you have any questions on the above, then – airstrike Nov 19 '15 at 18:43
