I've been looking for a couple of hours now, but I can't seem to find a simple answer on how to find a tag in an XML file and extract its value. For example:
<?xml version="1.0" encoding="UTF-8"?>
<record name="test111.xml" type="content">
  <item name="Metadata">
    <value>
      <item name="label1">
        <value>
          <item name="dummy1">
            <value />
          </item>
        </value>
      </item>
      <item name="Source">
        <value>test</value>
      </item>
      <item name="Channel">
        <value>celebrate</value>
      </item>
      <item name="Catref">
        <value>cat150032</value>
      </item>
      <item name="ParentsSource" />
      <item name="MobileFriendly" />
      <item name="VirtualPath">
        <value>/testFolder/testItem</value>
      </item>
      <item name="VirtualFileName">
        <value>sweets/</value>
      </item>
      <item name="RefreshMode">
        <value>soft</value>
      </item>
      <item name="label2">
        <value>
          <item name="dummy2">
            <value />
          </item>
        </value>
      </item>
    </value>
  </item>
</record>
In this instance, I'm trying to get the value of VirtualPath, which is /testFolder/testItem, and possibly the values of other items as well. This is what I have so far:
from __future__ import print_function
import os

# Raw strings keep '\t' from being read as a tab character
folderArray = [r'\test1', r'\test2', r'\test3', r'\test4']
path = r'D:\Projects\test'

for i in range(0, len(folderArray)):
    tempPath = path + folderArray[i]
    for root, dirs, files in os.walk(tempPath):
        for file in files:
            if file.endswith(".xml"):
                if i == 0:  # first folder only
                    # LOOK ONLY FOR VIRTUAL PATH
                    # Open the file itself, not the folder
                    with open(os.path.join(root, file), 'r') as f:
                        for line in f:
                            if "VirtualPath" in line:
                                # HERE FIND THE VALUE OF THE TAG
                                pass
After opening the XML file, I do not know how to look for a specific tag and read its value. I know I could search each line with an index-of and cut the value out with a substring, but that is neither efficient nor robust.
EDIT
I found the answer after playing around a little. I know this has a duplicate, but I didn't know about it before I created this question. Answer:
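from __future__ import print_function
import os
from xml.dom import minidom

# Raw strings keep '\t' from being read as a tab character
folderArray = [r'\test1', r'\test2', r'\test3', r'\test4']
path = r'D:\Projects\test'

for i in range(0, len(folderArray)):
    tempPath = path + folderArray[i]
    for root, dirs, files in os.walk(tempPath):
        for file in files:
            if file.endswith(".xml"):
                # Join with root so files in subfolders are found too
                xmldoc = minidom.parse(os.path.join(root, file))
                itemlist = xmldoc.getElementsByTagName('item')
                for s in itemlist:
                    if s.attributes['name'].value == 'VirtualPath':
                        # Look up <value> by tag name; childNodes[0] can be
                        # a whitespace text node when the XML is indented
                        values = s.getElementsByTagName('value')
                        if values and values[0].firstChild is not None:
                            print(values[0].firstChild.nodeValue)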
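
For anyone landing here later, the same lookup can probably be done with the standard library's xml.etree.ElementTree instead of minidom. This is only a minimal sketch, not a tested replacement for the loop above: the helper name item_value and the SAMPLE string (a trimmed copy of the record from the question) are made up for illustration, and it assumes every value of interest sits in a <value> element directly under its <item>.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample record from the question (illustration only)
SAMPLE = """<record name="test111.xml" type="content">
  <item name="Metadata">
    <value>
      <item name="Source">
        <value>test</value>
      </item>
      <item name="MobileFriendly" />
      <item name="VirtualPath">
        <value>/testFolder/testItem</value>
      </item>
    </value>
  </item>
</record>"""


def item_value(root, name):
    """Return the text of <item name=NAME>/<value>, or None if absent."""
    # './/' searches at any depth; [@name='...'] filters on the attribute
    value = root.find(".//item[@name='{0}']/value".format(name))
    return value.text if value is not None else None


root = ET.fromstring(SAMPLE)
print(item_value(root, "VirtualPath"))  # prints /testFolder/testItem
```

To use it on files rather than a string, ET.parse(os.path.join(root, file)).getroot() should give the same kind of element to pass to item_value (pick a different variable name than the root coming from os.walk). Getting "more" than VirtualPath is then just a matter of calling the helper with other names such as "Source" or "Channel".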