1

I have this XML style file.

<?xml version="1.0" encoding="UTF-8" ?>
    <searchresults timestamp='ttt' www='www' qqq='qqq' url='url'>
        <obj id='00001' type='Random' name1='Mow' class='Data' >
            <tags>
                <tag key="Car" value="4x4" />
                <tag key="City" value="Big" />
                <tag key="Dog" value="Loud" />
            </tags>
            <details>
                <name desc="fulldesc">Full description</name>
            </details>
            <i1>Empty</i1>
            <i2>False</i2>
            <i3>True</i3>
        </obj>
        <obj id='00002' type='Random' name1='AGP' class='BigData' >
            <tags>
                <tag key="Car" value="Broken" />
                <tag key="City" value="Fresh" />
                <tag key="Dog" value="Long" />
            </tags>
            <details>
                <name desc="fulldesc">Good desc</name>
            </details>
            <i1>True</i1>
            <i2></i2>
            <i3>False</i3>
        </obj>
    </searchresults>

I need export all content to CSV. in particular I need data from tag and details containers. In this child nodes information has such scheme - key and value. I need make (key must be - column name) (value must be - key column value). (fulldesc = column name, "Full description" - content for fulldesc column) Like this

id  type  name  class  Car    City   Dog     fulldesc              i1           i2           i3
--  ----  ----  -----  ----   ----   ----    -------               --           --           --
id1 type1 name1 class1 4x4    Big    Loud    Full description      Information1 Information2 Information3
id2 type2 name2 class2 4x4    Big    Loud    Full description      Information1 Information2 Information3

I use this code for export - it work pretty, but I can't take some content from XML.

[xml]$inputFile = Get-Content ".\xmlFile.xml"
$inputFile.searchresults.ChildNodes |
    Export-Csv ".\xmlFile.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8
Import-Csv -Delimiter ';' -Encoding:UTF8 -Path ".\xmlFile.csv" | Format-Table

Result:

id  type  name  class  tags                  details               i1           i2           i3
--  ----  ----  -----  ----                  -------               --           --           --
id1 type1 name1 class1 System.Xml.XmlElement System.Xml.XmlElement Information1 Information2 Information3
id2 type2 name2 class2 System.Xml.XmlElement System.Xml.XmlElement Information1 Information2 Information3

In tags and details columns no content.

when I use this code:

[xml]$xml = Get-Content .\XML\12.xml
$xml.SelectNodes("//*")

I receive all information from XML:

timestamp : ttt
www       : www
qqq       : qqq
url       : url
obj       : {obj, obj}

id      : 00001
type    : Random
name1   : Mow
class   : Data
tags    : tags
details : details
i1      : Empty
i2      : False
i3      : True

tag : {tag, tag, tag}

key   : Car
value : 4x4

key   : City
value : Big

key   : Dog
value : Loud

name : name

desc  : fulldesc
#text : Full description

#text : Empty

#text : False

#text : True

id      : 00002
type    : Random
name1   : AGP
class   : BigData
tags    : tags
details : details
i1      : True
i2      :
i3      : False

tag : {tag, tag, tag}

key   : Car
value : Broken

key   : City
value : Fresh

key   : Dog
value : Long

name : name

desc  : fulldesc
#text : Good desc

#text : True

Name            : i2
LocalName       : i2
NamespaceURI    :
Prefix          :
NodeType        : Element
ParentNode      : obj
OwnerDocument   : #document
IsEmpty         : False
Attributes      : {}
HasAttributes   : False
SchemaInfo      : System.Xml.XmlName
InnerXml        :
InnerText       :
NextSibling     : i3
PreviousSibling : i1
Value           :
ChildNodes      : {}
FirstChild      :
LastChild       :
HasChildNodes   : False
IsReadOnly      : False
OuterXml        : <i2></i2>
BaseURI         :
PreviousText    :

#text : False
Ansgar Wiechers
  • 193,178
  • 25
  • 254
  • 328
Denis Or
  • 162
  • 2
  • 12
  • 1
    Related: [Convert XML to CSV automatically in Powershell](https://superuser.com/q/1038868/376602) or [similar one](https://stackoverflow.com/questions/30575274/converting-xml-with-namespaces-to-csv-using-powershell) – JosefZ Sep 13 '19 at 21:57
  • 1
    The nodes `` and `
    ` contain structured information. You need to define in your code how you want that mangled into a string. You can use [calculated properties](https://technet.microsoft.com/en-us/library/ff730948.aspx) for the implementation.
    – Ansgar Wiechers Sep 14 '19 at 12:06

1 Answers1

1

If you're willing to hard-code the the CSV column names as well as the number of columns, you can use calculated properties, as Ansgar Wiechers suggests:

# Read the input file into an XML DOM.
[xml] $xml = Get-Content -Raw ".\xmlFile.xml"

# Loop over all <obj> children of <searchresults>
$xml.searchresults.obj | Select-Object id, type, name1, class, 
  @{ n='Car'; e = {$_.tags.tag[0].value} }, 
  @{ n='City'; e = {$_.tags.tag[1].value} }, 
  @{ n='Dog'; e = {$_.tags.tag[2].value} }, 
  @{ n='desc'; e = {$_.details.name.'#text'} }, 
  i1, i2, i3 |
    Export-Csv ".\xmlFile.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8   

If the column names must be derived from the XML document attribute and element names, dynamically, and/or the column count is variable:

Note:

  • For a given input XML document, you need to make sure that at least the first output object has all properties (columns) that you want the CSV file to contain. It is the first object that locks in the output columns.

  • So as not to make up-front assumptions about the names and number of columns, the solution below adds the properties as they're being found in the XML document, which then implies the output CSV column ordering, which means:

    • You must either rely on all <obj> elements containing the same set of attributes / child elements to turn into columns, or pre-populate the $properties hashtable with the maximum set of output columns, if known.

    • Otherwise, if you want to change the output column ordering, extra logic is required.

# Read the input file into an XML DOM.
[xml] $xml = Get-Content -Raw ".\xmlFile.xml"

# Loop over all <obj> children of <searchresults>
$xml.searchresults.obj | ForEach-Object {

  $el = $_  # The input element at hand.

  # Initialize an ordered hashtable in which to collect / construct the properties
  # to send to the CSV file.
  $properties = [ordered] @{}

  # Loop over all attributes / children of the given <obj>
  foreach ($propName in ($el | Get-Member -Type Property).Name) {

    switch ($propName) {
      'tags' {
        # Instead of a single 'tags' property, create individual properties
        # based on the keys and values of the <tag> elements.
        foreach ($child in $el.tags.tag) {
          $properties.($child.Key) = $child.Value    
        }
        break
      }
      'details' {
        # Instead of a 'details' property,  use the <name> element's
        # 'desc' attribute value as the property name and assign it
        # the element's content.
        $properties.($el.$_.name.desc) = $el.$_.name.'#text'
        break
      }
      default {
        # All other properties: pass them through.
        $properties.$propName = $el.$propName 
      }
    }

  }

  # Output the ordered hashtable as a custom object, whose property names will become 
  # the CSV column headers and whose values will become the row values.
  [pscustomobject] $properties

} | Export-Csv ".\xmlFile.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8
  • The core of the approach is $el | Get-Member -Type Property).Name, which extracts all the property names from each <obj> element, because these properties represent the attributes and child elements of the given element.

  • This allows the properties to be looped over, and either passed through or preprocessed, using a switch statement.

mklement0
  • 382,024
  • 64
  • 607
  • 775