Python: Input data from .XML to .CSV file

Question

I am attempting to parse the "XML" file excerpted below, but I get an error like the following for every variable I define:

NameError: name 'computer_name' is not defined

Here is an excerpt from the "XML" file (it is not a true XML file, so I am trying to set each variable from the line below the matching line):

        <p1:field>
        <p1:name>NewComputerName</p1:name>
        <p1:value>Computer01</p1:value>
        </p1:field>
        <p1:field>
        <p1:name>NewComputerAssetTag</p1:name>
        <p1:value>12345</p1:value>
        </p1:field>
        <p1:field>
        <p1:name>AcquisitionDate</p1:name>
        <p1:value>4/20/69</p1:value>
        </p1:field>

and here is my code:

import csv
import os

with open('csvtest.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(('Computer Name', 'Acquisition Date', 'Asset Tag'))
    for filename in os.listdir('\\\\windb\\f$\\Technology\\V1\\0'):
        if filename.endswith(".xml"):
            with open(os.path.join('\\\\windb\\f$\\Technology\\V1\\0',filename), "r") as input:
                for line in input:
                    if line.startswith('    <p1:name>NewComputerName</p1:name>'):
                            computer_name=next(input, '').strip()
                            computer_name=computer_name.split("<p1:value>")[1].split("</")[0]
                    elif line.startswith('    <p1:name>AcquisitionDate'):
                            acqDate=next(input, '').strip()
                            acqDate=acqDate.split("<p1:value>")[1].split("</")[0]
                    elif line.startswith('    <p1:name>NewComputerAssetTag'):
                            assTag=next(input, '').strip()
                            assTag=assTag.split("<p1:value>")[1].split("</")[0]
                myData = [computer_name,acqDate,assTag]
                writer.writerow(myData)

I expect this to write the three variables to the CSV file, appending one row for each XML file in the directory.

Instead, the output is NameError: name 'computer_name' is not defined

score 0 · Answer 1 · answered Jun 21 '19 at 18:00

Don't parse XML files by hand; use a library made for it, e.g. BeautifulSoup:

data = '''<p1:field>
        <p1:name>NewComputerName</p1:name>
        <p1:value>Computer01</p1:value>
        </p1:field>
        <p1:field>
        <p1:name>NewComputerAssetTag</p1:name>
        <p1:value>12345</p1:value>
        </p1:field>
        <p1:field>
        <p1:name>AcquisitionDate</p1:name>
        <p1:value>4/20/69</p1:value>
        </p1:field>'''

from bs4 import BeautifulSoup
import csv

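# The 'lxml' parser needs the lxml package installed alongside bs4.
soup = BeautifulSoup(data, 'lxml')
# Every <p1:value> tag, in document order.
fields = soup.find_all('p1:value')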

with open('csvtest.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(('Computer Name', 'Acquisition Date', 'Asset Tag'))

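    # The values repeat in the order name, asset tag, date, so a step of 3
    # groups them per item; the row is reordered to match the header.
    for n, a, d in zip(fields[::3], fields[1::3], fields[2::3]):
        writer.writerow([n.text, d.text, a.text])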

The content of csvtest.csv will be:

Computer Name,Acquisition Date,Asset Tag
Computer01,4/20/69,12345
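
To get one row per XML file in a directory, as the question asks, the same idea can be wrapped in a loop. The following is only a sketch under some assumptions: it swaps BeautifulSoup for the standard library's lenient `html.parser` so that it runs without extra packages, it pairs each `<p1:name>` with the `<p1:value>` that follows it instead of relying on position, and the names `FieldParser`, `read_fields` and `export_directory` are made up for illustration.

```python
import csv
import os
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collect <p1:name>/<p1:value> pairs into a dict (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None   # the p1:name / p1:value tag currently open, if any
        self._name = None  # text of the most recent <p1:name>

    def handle_starttag(self, tag, attrs):
        if tag in ("p1:name", "p1:value"):
            self._tag = tag

    def handle_endtag(self, tag):
        self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "p1:name":
            self._name = text
        elif self._tag == "p1:value" and self._name is not None:
            self.fields[self._name] = text
            self._name = None


def read_fields(xml_text):
    """Return {field name: field value} for one file's text."""
    parser = FieldParser()
    parser.feed(xml_text)
    parser.close()
    return parser.fields


def export_directory(xml_dir, csv_path):
    """Write one CSV row per .xml file found in xml_dir."""
    with open(csv_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(("Computer Name", "Acquisition Date", "Asset Tag"))
        for filename in sorted(os.listdir(xml_dir)):
            if not filename.endswith(".xml"):
                continue
            with open(os.path.join(xml_dir, filename), "r") as infile:
                fields = read_fields(infile.read())
            # .get() with a default leaves an empty cell when a field is
            # missing, rather than raising an error for that file.
            writer.writerow([
                fields.get("NewComputerName", ""),
                fields.get("AcquisitionDate", ""),
                fields.get("NewComputerAssetTag", ""),
            ])
```

With the paths from the question, the call would look like `export_directory('\\\\windb\\f$\\Technology\\V1\\0', 'csvtest.csv')`. Looking fields up by name means a file that lacks one of them produces an empty cell instead of an undefined variable.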

Hi Andrej - This works great but as a beginner I am having trouble conceptualizing an iteration for the data variable. I want to go through an entire directory of xml files, pull fields from each, and append them as a new line in the csv. — pythondewd69, Jul 11 '19 at 14:55
@pythondewd69 All the "magic" is done through Python slicing. To explain: we know that each item has 3 `<p1:value>` tags, so we want to "tie" each group of 3 values to one item. For more on how Python slicing works, see https://stackoverflow.com/questions/509211/understanding-slice-notation — Andrej Kesely, Jul 11 '19 at 15:09
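
The slicing described in the comment above can be tried on a plain list. This is a small sketch in which the strings stand in for the text of the tags; the second item's values are invented sample data, not taken from the question.

```python
# Stand-in for the tag texts in document order:
# name, asset tag, acquisition date, repeated once per item.
values = ["Computer01", "12345", "4/20/69",
          "Computer02", "67890", "1/2/03"]

names = values[0::3]  # every third element, starting at index 0
tags = values[1::3]   # every third element, starting at index 1
dates = values[2::3]  # every third element, starting at index 2

# zip() ties the three slices back together, one tuple per item;
# each row is reordered to name, date, tag to match the CSV header.
rows = [[n, d, a] for n, a, d in zip(names, tags, dates)]
print(rows)
```

Each slice picks out one kind of value, so `names` holds only computer names, `tags` only asset tags, and `dates` only acquisition dates.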
