I am trying to use findall to select some XML elements, but I can't get any results.

import xml.etree.ElementTree as ET
import sys

storefront = sys.argv[1]

xmlFileName = 'promotions{0}.xml'

xmlFile = xmlFileName.format(storefront)

csvFileName = 'hrz{0}.csv'
csvFile = csvFileName.format(storefront)
ET.register_namespace('', "http://www.demandware.com/xml/impex/promotion/2008-01-31")
tree = ET.parse(xmlFile)

root = tree.getroot()
print('------------------Generate test-------------\n')



csv = open(csvFile,'w')
n = 0
for child in root.findall('campaign'):
    print(child.attrib['campaign-id'])
    print(n)
    n+=1

The XML looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
    <campaign campaign-id="10off-310781">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>

    <campaign campaign-id="MNT-deals">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <start-date>2017-07-03T22:00:00.000Z</start-date>
        <end-date>2017-07-31T22:00:00.000Z</end-date>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>

    <campaign campaign-id="black-friday">
        <enabled-flag>true</enabled-flag>
        <campaign-scope>
            <applicable-online/>
        </campaign-scope>
        <start-date>2017-11-23T23:00:00.000Z</start-date>
        <end-date>2017-11-24T23:00:00.000Z</end-date>
        <customer-groups match-mode="any">
            <customer-group group-id="Everyone"/>
        </customer-groups>
        <custom-attributes>
            <custom-attribute attribute-id="expires_date">2017-11-29</custom-attribute>
        </custom-attributes>
    </campaign>

    <promotion-campaign-assignment promotion-id="winter17-new-bubble" campaign-id="winter17-new-bubble">
        <qualifiers match-mode="any">
            <customer-groups/>
            <source-codes/>
            <coupons/>
        </qualifiers>
        <rank>100</rank>
    </promotion-campaign-assignment>

    <promotion-campaign-assignment promotion-id="xmas" campaign-id="xmas">
        <qualifiers match-mode="any">
            <customer-groups/>
            <source-codes/>
            <coupons/>
        </qualifiers>
    </promotion-campaign-assignment>

</promotions>

Any ideas what I am doing wrong? I have tried several solutions that I found on Stack Overflow, but none of them worked for me: the list is always empty. Sorry if this is something obvious; I am new to Python.

Cosmin_Victor
  • Possible duplicate of [ElementTree findall() returning empty list](https://stackoverflow.com/questions/9112121/elementtree-findall-returning-empty-list) – James Nov 16 '17 at 19:45
  • I tried something like `for child in root.findall('.//campaign'):` but got no results. @James – Cosmin_Victor Nov 16 '17 at 19:48
  • Possible duplicate of [Parsing XML with namespace in Python via 'ElementTree'](https://stackoverflow.com/questions/14853243/parsing-xml-with-namespace-in-python-via-elementtree) – styvane Nov 16 '17 at 19:48
  • I tried something like `namespaces = {'': 'http://www.demandware.com/xml/impex/promotion/2008-01-31'}` followed by `for child in root.findall('campaign', namespaces):`. @styvane – Cosmin_Victor Nov 16 '17 at 19:54
  • The best solution I have at the moment is to manually delete the default namespace from the XML file. This seems to be the easiest way if you can edit the file by hand. If anyone has a better approach, that would be great. – Cosmin_Victor Nov 16 '17 at 20:06

1 Answer

As mentioned here by @MartijnPieters, ElementTree's .findall() resolves prefixes through its namespaces argument, while register_namespace() only controls the prefixes used when the tree is written back out as XML. Therefore, consider mapping the default namespace to an explicit prefix in your search. The code below uses doc, but any prefix works, even cosmin.
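To make the distinction concrete, here is a minimal sketch using a trimmed-down inline copy of your XML (the `xml_text` string and the variable names are made up for illustration, not taken from your script). It suggests that register_namespace() changes only the serialized output, while findall() still needs the namespace spelled out:

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.demandware.com/xml/impex/promotion/2008-01-31"

# Hypothetical, shortened stand-in for the promotions file
xml_text = (
    '<promotions xmlns="' + NS_URI + '">'
    '<campaign campaign-id="10off-310781"/>'
    '<campaign campaign-id="MNT-deals"/>'
    '</promotions>'
)

# Registers the URI as the default namespace for *output* only
ET.register_namespace('', NS_URI)
root = ET.fromstring(xml_text)

# The parser stores each tag as '{namespace-uri}localname'
root_tag = root.tag

# A bare tag name does not match the namespaced elements
unqualified = root.findall('campaign')

# An explicit prefix mapped through the namespaces argument does match
qualified = root.findall('doc:campaign', namespaces={'doc': NS_URI})

# Serialization is where register_namespace() has an effect:
# the output uses xmlns="..." instead of an auto-generated ns0: prefix
serialized = ET.tostring(root, encoding='unicode')

print(root_tag)
print(len(unqualified), len(qualified))
print(serialized)
```

Here `unqualified` comes back empty and `qualified` holds both campaign elements, which matches the behaviour you are seeing.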

Additionally, consider using a with block, enumerate(), and the csv module to handle the output file, the counter, and the CSV rows more cleanly.

import csv
...

root = tree.getroot()
print('------------------Generate test-------------\n')

with open(csvFile, 'w') as f:
    c = csv.writer(f, lineterminator='\n')

    for n, child in enumerate(root.findall('doc:campaign', namespaces={'doc':'http://www.demandware.com/xml/impex/promotion/2008-01-31'})):
        print(child.attrib['campaign-id'])
        print(n)
        c.writerow([child.attrib['campaign-id']])

# ------------------Generate test-------------

# 10off-310781
# 0
# MNT-deals
# 1
# black-friday
# 2
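If you would rather not invent a prefix, there are a couple of alternatives. The sketch below again uses a made-up inline sample (`xml_text` and the list names are illustrative only). Writing the tag as `{uri}campaign` works on any version; the empty-prefix mapping you tried in the comments and the `{*}` wildcard are, as far as I know, only supported from Python 3.8 onwards, so that part is guarded by a version check:

```python
import sys
import xml.etree.ElementTree as ET

NS_URI = "http://www.demandware.com/xml/impex/promotion/2008-01-31"

# Hypothetical, shortened stand-in for the promotions file
xml_text = (
    '<promotions xmlns="' + NS_URI + '">'
    '<campaign campaign-id="10off-310781"/>'
    '<campaign campaign-id="MNT-deals"/>'
    '<campaign campaign-id="black-friday"/>'
    '<promotion-campaign-assignment promotion-id="xmas" campaign-id="xmas"/>'
    '</promotions>'
)
root = ET.fromstring(xml_text)

# Option 1: put the namespace URI in braces directly in the tag name
ids_braces = [c.attrib['campaign-id']
              for c in root.findall('{%s}campaign' % NS_URI)]
print(ids_braces)

if sys.version_info >= (3, 8):
    # Option 2 (3.8+): map the empty prefix to the default namespace
    ids_default = [c.attrib['campaign-id']
                   for c in root.findall('campaign', {'': NS_URI})]

    # Option 3 (3.8+): match the local name in any namespace
    ids_wildcard = [c.attrib['campaign-id']
                    for c in root.findall('{*}campaign')]

    print(ids_default)
    print(ids_wildcard)
```

All three searches select only the direct campaign children; the promotion-campaign-assignment element is left out because its tag name differs.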
Parfait