
I am writing a Python script that creates an XML document and adds alphanumeric strings from a file to it. Before adding a string, I check whether it meets the criteria (its length must be 32, 40 or 64); otherwise it is not added to the XML. I also read a comment from user input and add it to the XML. When I write the XML and then read it back, I get an lxml syntax error.

I researched how to write XML, but I do not see any difference between my code and the tutorials.

#!/usr/bin/python

from lxml import etree as ET
from StringIO import StringIO

root = ET.Element("root")
file = open("alphanumeric.txt", "r")
invalidalhanumeric= open("invalidalhanumeric.txt", "w+")
print ("Enter the comment of the XML file: ")
comment = raw_input()

for aline in file:
        values = aline.strip()
        length = len(values)
        if length != 32 or length != 40 or length != 64:
                invalidalhanumeric.write(str(values)+ "\n")
        else:
                child = ET.SubElement(root,"child")
                fn = ET.SubElement(child, "alphanumeric")
                fn.text = values
                if length == 32:
                        length32 = ET.SubElement(child, "length32 ")
                        length32 .text = values
                elif length == 40:
                        length40 = ET.SubElement(child, "length40 ")
                        length40 .text = values
                elif length == 64:
                        length64 = ET.SubElement(child, "length64 ")
                        length64 .text = values
                rl = ET.SubElement(child, "ComplexLevel")
                rl.text = "1"
                cm = ET.SubElement(child, "Comment")
                cm.text = comment
                tree = ET.ElementTree(root)
                tree.write("data.xml")
x = ET.parse(StringIO("data.xml"))
print ET.tostring(x, pretty_print=True)


alphanumeric.txt
28c806cb8c91ab66987ac1ec88344296
f6ea268c7e184f580029aec42f2a98f8
d6472dcebce348d693e68b90099d9ede
8aea2ae91cc084731a08aa231e79a430
502fbbdacada9215ed0d026c70f983e1
dd5986339aaf23f2baf8c245923a0f69
6499863d47b68030f0c5ffafaffb1344
752d245f1026482a967a763dae184569
d04f6b2157969a10c2e7421ee624075a2a5f5908
cd206f00306fb902fe25922b95da04af1028be0c
51d4b4cd19ef174a257840f3d1a419f839014f6d
62c2b7723ac79e9b009e3b0a9cb4ffa10542b9da
6e28f9ed9045abbe8321188191f92688ed064c43
93c694deec6c26acecbde4312ddbac9a0fed08e0
2a64742e32d4284640b22422c73e31ae616201be
7f0247d2f4d458ed325def12d8d7a71fc387c18a
3267f0bee5efa5dd2549722357e55fe3f4038e58
ac9fc01c1284bbe9ee4ddf424216a82b5d64a42
2197e35f14ff9960985c982ed6d16d5bd5366062
355603b1922886044884afbdfa9c9a6626b6669a
38599685f23d1840533ce5cbf5bf5114e2252435d191a3d9321093ae0bb8f88b

The output should contain every value except ac9fc01c1284bbe9ee4ddf424216a82b5d64a42, which is 39 characters long and does not meet the criteria. Instead, I get this error:

Enter the comment of the XML file: 
rmasf-231
Traceback (most recent call last):
  File "./convertxml.py", line 36, in <module>
    x = ET.parse(StringIO("data.xml"))
  File "src/lxml/etree.pyx", line 3435, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1857, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1877, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1765, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1127, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1

  • ` x = ET.parse(StringIO("import.xml"))` is not in your code .. – balderman Jun 24 '19 at 11:20
  • @balderman, Apologies. I made a mistake in copy the error. I have changed the files name to data.xml. The error is still happening. –  Jun 24 '19 at 11:28
  • try changing `x = ET.parse(StringIO('data.xml'))` to `x = ET.parse('data.xml')` – balderman Jun 24 '19 at 11:32
  • You may want to re-read the doc for `StringIO` (https://docs.python.org/3.7/library/io.html?highlight=stringio#io.StringIO). A `StringIO` is basically a text buffer that supports the same API as a file object, so here what you're passing to `ET.parse()` is a file-like object with the "data.xml" string as its content. You want to open your "data.xml" file in read mode and pass this instead. – bruno desthuilliers Jun 24 '19 at 11:51
  • Following your suggestion, a different error appeared: IOError: Error reading file 'data.xml': failed to load external entity "data.xml" –  Jun 24 '19 at 11:51
  • @AbdullahNaina there are actually quite a few other issues with your code. I'm only addressing the current one (not passing the proper object to `ET.parse`); for other issues you should post new questions (but first check the answer below, which does fix other issues). – bruno desthuilliers Jun 24 '19 at 11:53
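
To make the point in the comments concrete, here is a minimal sketch of the difference between parsing a `StringIO` built from a file name and parsing the file itself. It is an illustration under assumptions: it uses the standard-library `xml.etree.ElementTree` and Python 3's `io.StringIO` instead of the question's lxml and Python 2 imports, so the exception raised is `ParseError` rather than lxml's `XMLSyntaxError`, and the temporary `data.xml` and its contents are made up for the example.

```python
import io
import os
import tempfile
import xml.etree.ElementTree as ET

# StringIO wraps the *text* "data.xml"; it does not open a file of that name.
buffer = io.StringIO("data.xml")
try:
    ET.parse(buffer)
    parsed_literal = True
except ET.ParseError:
    # The parser received the characters "data.xml", not a '<' start tag.
    parsed_literal = False

# Writing a real file and passing its path behaves as intended.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.xml")  # illustrative location
    root = ET.Element("root")
    ET.SubElement(root, "child").text = "example"
    ET.ElementTree(root).write(path)

    tree = ET.parse(path)  # a path (or an open file), not StringIO(name)
    root_tag = tree.getroot().tag
    child_text = tree.getroot().find("child").text

print(parsed_literal, root_tag, child_text)
```

Passing the path only works if the file was actually written first, which is why the second error in the comments appears once the `StringIO` wrapper is removed.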

2 Answers


See below. I rewrote the code a little, but I hope I kept the logic.

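import xml.etree.ElementTree as ET

# Only lines of these lengths are added to the XML
VALID_LENGTHS = (32, 40, 64)

root = ET.Element("root")
invalid_lines = []

with open("alphanumeric.txt", "r") as f:
    for raw_line in f:
        line = raw_line.strip()
        line_length = len(line)
        if line_length in VALID_LENGTHS:
            child = ET.SubElement(root, "child")
            fn = ET.SubElement(child, "alphanumeric")
            fn.text = line
            # Element name depends on the length: length32, length40 or length64
            e = ET.SubElement(child, 'length{}'.format(line_length))
            e.text = line
            rl = ET.SubElement(child, "ComplexLevel")
            rl.text = "1"
            cm = ET.SubElement(child, "Comment")
            cm.text = 'a comment goes gere'
        else:
            invalid_lines.append(line)

# Write the rejected lines to the file name used in the question
with open("invalidalhanumeric.txt", "w") as invalid_file:
    for line in invalid_lines:
        invalid_file.write(line + "\n")

# Write the XML once, after all lines have been processed
tree = ET.ElementTree(root)
tree.write("data.xml")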

Output ('data.xml'), indented here for readability (tree.write() itself does not add indentation):

<root>
    <child>
        <alphanumeric>28c806cb8c91ab66987ac1ec88344296</alphanumeric>
        <length32>28c806cb8c91ab66987ac1ec88344296</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>f6ea268c7e184f580029aec42f2a98f8</alphanumeric>
        <length32>f6ea268c7e184f580029aec42f2a98f8</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>d6472dcebce348d693e68b90099d9ede</alphanumeric>
        <length32>d6472dcebce348d693e68b90099d9ede</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>8aea2ae91cc084731a08aa231e79a430</alphanumeric>
        <length32>8aea2ae91cc084731a08aa231e79a430</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>502fbbdacada9215ed0d026c70f983e1</alphanumeric>
        <length32>502fbbdacada9215ed0d026c70f983e1</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>dd5986339aaf23f2baf8c245923a0f69</alphanumeric>
        <length32>dd5986339aaf23f2baf8c245923a0f69</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>6499863d47b68030f0c5ffafaffb1344</alphanumeric>
        <length32>6499863d47b68030f0c5ffafaffb1344</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>752d245f1026482a967a763dae184569</alphanumeric>
        <length32>752d245f1026482a967a763dae184569</length32>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>d04f6b2157969a10c2e7421ee624075a2a5f5908</alphanumeric>
        <length40>d04f6b2157969a10c2e7421ee624075a2a5f5908</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>cd206f00306fb902fe25922b95da04af1028be0c</alphanumeric>
        <length40>cd206f00306fb902fe25922b95da04af1028be0c</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>51d4b4cd19ef174a257840f3d1a419f839014f6d</alphanumeric>
        <length40>51d4b4cd19ef174a257840f3d1a419f839014f6d</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>62c2b7723ac79e9b009e3b0a9cb4ffa10542b9da</alphanumeric>
        <length40>62c2b7723ac79e9b009e3b0a9cb4ffa10542b9da</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>6e28f9ed9045abbe8321188191f92688ed064c43</alphanumeric>
        <length40>6e28f9ed9045abbe8321188191f92688ed064c43</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>93c694deec6c26acecbde4312ddbac9a0fed08e0</alphanumeric>
        <length40>93c694deec6c26acecbde4312ddbac9a0fed08e0</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>2a64742e32d4284640b22422c73e31ae616201be</alphanumeric>
        <length40>2a64742e32d4284640b22422c73e31ae616201be</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>7f0247d2f4d458ed325def12d8d7a71fc387c18a</alphanumeric>
        <length40>7f0247d2f4d458ed325def12d8d7a71fc387c18a</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>3267f0bee5efa5dd2549722357e55fe3f4038e58</alphanumeric>
        <length40>3267f0bee5efa5dd2549722357e55fe3f4038e58</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>2197e35f14ff9960985c982ed6d16d5bd5366062</alphanumeric>
        <length40>2197e35f14ff9960985c982ed6d16d5bd5366062</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>355603b1922886044884afbdfa9c9a6626b6669a</alphanumeric>
        <length40>355603b1922886044884afbdfa9c9a6626b6669a</length40>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
    <child>
        <alphanumeric>38599685f23d1840533ce5cbf5bf5114e2252435d191a3d9321093ae0bb8f88b</alphanumeric>
        <length64>38599685f23d1840533ce5cbf5bf5114e2252435d191a3d9321093ae0bb8f88b</length64>
        <ComplexLevel>1</ComplexLevel>
        <Comment>a comment goes gere</Comment>
    </child>
</root>
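
Since the question also wanted to read the file back and print it, here is a small sketch of that step. It assumes Python 3.9 or later, where `xml.etree.ElementTree.indent` is available; with lxml, the `pretty_print=True` argument used in the question serves the same purpose. The temporary path and the single sample element are illustrative only.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Build a tiny tree shaped like the one above (one child only).
root = ET.Element("root")
child = ET.SubElement(root, "child")
ET.SubElement(child, "alphanumeric").text = "28c806cb8c91ab66987ac1ec88344296"
ET.SubElement(child, "ComplexLevel").text = "1"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.xml")  # illustrative location
    ET.ElementTree(root).write(path)

    # What write() produced: everything on one line, no indentation.
    with open(path) as f:
        raw = f.read()

    # Read the file back by path, then indent the parsed tree in place.
    tree = ET.parse(path)
    ET.indent(tree, space="    ")  # requires Python 3.9+
    pretty = ET.tostring(tree.getroot(), encoding="unicode")

print(pretty)
```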
balderman

The problem lies with the condition you have given.

if length != 32 or length != 40 or length != 64:

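No matter what value length has, this condition is always true because of the 'or': length cannot equal 32, 40 and 64 at the same time, so at least two of the three comparisons are always true. Every line is therefore written to the invalid file and the else branch never runs. Because tree.write("data.xml") sits inside that else branch, data.xml is never written, which matches the IOError reported in the comments once the file is parsed by name. The "Start tag expected, '<' not found" message itself comes from ET.parse(StringIO("data.xml")), which parses the literal text "data.xml" rather than the file, as the comments point out.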
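
A quick way to see this is to evaluate the condition for a few lengths, including the three valid ones. This is only an illustrative sketch; the function name and the sample lengths are made up for the example.

```python
def is_rejected_buggy(length):
    # The question's condition: it could only be False if length were
    # equal to 32, 40 and 64 simultaneously, which is impossible.
    return length != 32 or length != 40 or length != 64


sample_lengths = [0, 32, 39, 40, 64, 100]
results = {n: is_rejected_buggy(n) for n in sample_lengths}

# Every entry is True, so even valid lengths are treated as invalid.
print(results)
```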

A recommended solution is to use the opposite condition, i.e.

if length == 32 or length == 40 or length == 64:
    pass  # your code to write to data.xml file
else:
    pass  # your code for invalid alphanumeric

Or keep the negated comparisons and join them with 'and':

if length != 32 and length != 40 and length != 64:
    pass  # your code for invalid alphanumeric
else:
    pass  # your code to write to data.xml file
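
As a sanity check, the two corrected forms can be compared against each other and against a membership test like the one in the other answer. The helper names below are hypothetical and exist only for this sketch.

```python
VALID_LENGTHS = (32, 40, 64)


def is_valid_or(length):
    # Positive form: valid if it matches any one of the allowed lengths.
    return length == 32 or length == 40 or length == 64


def is_invalid_and(length):
    # Negated form: invalid only if it differs from all allowed lengths.
    return length != 32 and length != 40 and length != 64


def is_valid_in(length):
    # Membership test, equivalent to the 'or' chain above.
    return length in VALID_LENGTHS


# The three predicates agree for every length from 0 to 100.
for n in range(101):
    assert is_valid_or(n) == (not is_invalid_and(n)) == is_valid_in(n)

accepted = [n for n in range(101) if is_valid_in(n)]
print(accepted)  # [32, 40, 64]
```

The membership form is usually the easiest to extend if more lengths need to be allowed later.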

Hope this helps!!