I am having trouble parsing log files that contain the & character, but only when it is not followed by amp;. Can something be done before parsing, or do I have to look for the fault elsewhere?
I am getting the error xml.etree.ElementTree.ParseError: not well-formed (invalid token), and I have isolated the & as the only unusual special character on that line. An & followed by amp; poses no issue.
Code:
import xml.etree.ElementTree as ET
import os
import errno

path = "C:\\Users\\SuperUser\\Desktop\\audit\\audit\\saved\\audit"
for filename in os.listdir(path):
    with open(path + "\\" + filename) as myfile:
        lines = myfile.readlines()
        xmlfile = open("logins.xml", "w")
        for line in lines:
            # print(ET.fromstring(line))
            xmlVal = ET.fromstring(line)
            finder = "UserAuthenticated/Action"
            if xmlVal.find(finder) is not None and xmlVal.find(finder).text == 'Login':
                username = xmlVal.find("UserAuthenticated/LocalUsername").text
                timestamp = xmlVal.find("TimeStamp").text
                xmlToWrite = '<?xml version="1.0" encoding="UTF-8"?><root><Username>' + username + '</Username><Timestamp>' + timestamp + '</Timestamp></root>\n'
                xmlfile.write(xmlToWrite)
                print("Writing '" + xmlToWrite + "' to logins.xml")
        xmlfile.close()
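
One option for the "before parsing" step might be to escape every bare & on a line before handing it to ET.fromstring. The sketch below is only an illustration under assumptions: the names BARE_AMP and escape_bare_ampersands and the sample line are hypothetical and not taken from the real logs. It treats an & as already valid only when it starts one of the five predefined XML entities (amp, lt, gt, quot, apos) or a numeric character reference, and rewrites everything else to &amp;.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper pattern: an '&' that does NOT begin a predefined XML
# entity (amp, lt, gt, quot, apos) or a numeric character reference.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(line):
    """Replace each bare '&' with '&amp;' and leave valid references alone."""
    return BARE_AMP.sub("&amp;", line)


# Hypothetical log line containing one bare '&' and one already-escaped '&amp;'.
sample = "<Event><Name>R&D</Name><Note>a &amp; b</Note></Event>"

# Without the escaping step, ET.fromstring(sample) raises ParseError.
root = ET.fromstring(escape_bare_ampersands(sample))
print(root.find("Name").text)
print(root.find("Note").text)
```

In the loop above this would amount to calling ET.fromstring(escape_bare_ampersands(line)) instead of ET.fromstring(line). Note that under this pattern an entity such as &nbsp; is also rewritten to literal text, which may or may not be what the logs need.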