I have many XML files containing data I need. I need to use an attribute value as the column header and the InnerXml as the value. I can handle looping through the documents, but extracting the individual attribute and InnerXml values is throwing me.

Example XML file:

<?xml version="1.0" encoding="UTF-8"?>
<files>
    <file type="lcmContract" id="72187_fi20190000046444">
        <title>Bacon Cheeseburger</title>
        <field name="lcmHeaderGrp"></field>
        <field name="lcmHasNote">_image=custom/crmNote.gif</field>
        <field name="lcmSubject">Bacon Cheeseburgers are Good</field>
        <field name="lcmPrincipal">Frontline Rock</field>
        <field name="lcmClosingDate">@20190704</field>
        <field name="lcmAclList">
Royalties
LEGAL_DEPT
LegalExt
MktgRock
AllExceptRestricted
LegalTemp
LEGAL_DEPT_READ
lcmAdmin
CONTRACT-Administrator
</field>
    </file>
</files>

Table I'm trying to get to:

title | lcmSubject | lcmPrincipal | lcmClosingDate | lcmAclList
Bacon Cheeseburger | Bacon Cheeseburgers are Good | Frontline Rock | @20190704 | Royalties,LEGAL_DEPT,LegalExt,MktgRock,AllExceptRestricted,LegalTemp,LEGAL_DEPT_READ,lcmAdmin,CONTRACT-Administrator

I can get the lists of data I'm expecting, but can't pivot them into headers and values. I also don't really need all the data, so it would be helpful to limit the columns based on the attribute name.

Some of the scripts I've used:

[xml]$xmlFile = Get-Content -Path C:\Doc.xml

$xmlFile.files.file.ChildNodes | Select-Object -Expand Name

This brings back the column headers, but as a list.

$xmlFile.files.file.ChildNodes | Select-Object -Expand InnerXml

This brings back the values, but as a list.

Ken White

1 Answer

Instead of extracting names and values separately, do a single pass over ChildNodes and store each name/value pair in a hashtable. A hashtable can be cast to a PSCustomObject, which Format-Table renders as a single row with one column per key.
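
To illustrate the conversion in isolation, here is a minimal sketch. The two entries are hypothetical sample values, not read from any file; the point is only that the cast produces one object whose properties follow the insertion order of the ordered hashtable.

```powershell
# Hypothetical sample values, used only to illustrate the conversion.
$ht = [ordered] @{}
$ht['title']        = 'Bacon Cheeseburger'
$ht['lcmPrincipal'] = 'Frontline Rock'

# Casting the ordered hashtable yields a single object; its properties
# (and therefore the table columns) keep the insertion order.
$row = [PSCustomObject] $ht
$row | Format-Table
```

With a plain `@{}` instead of `[ordered] @{}` the column order would not be guaranteed, which is why the ordered variant is used.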

Filtering can be done using Where-Object on the name property, which works both for the title and the field elements (for the former it's the name of the element itself).

# Note: The following should be preferred over Get-Content which
#       doesn't respect XML encoding!
$docPath = 'C:\Doc.xml'
$xmlFile = [xml]::new(); $xmlFile.Load(( Convert-Path -LiteralPath $docPath ))

# Create an ordered hashtable as a precursor for a PSCustomObject
$ht = [ordered] @{}

# Process all ChildNodes
$xmlFile.files.file.ChildNodes |
    # Filter by Name property (which is either element name or Name attribute).
    # The pattern is anchored so that only exact names match.
    Where-Object Name -match '^(title|lcmSubject|lcmPrincipal|lcmClosingDate|lcmAclList)$' |
    ForEach-Object {
        # Get element text, trim whitespace, replace any line breaks by comma.
        # InnerText is an empty string for empty elements, so Trim() cannot fail on $null.
        $value = $_.InnerText.Trim() -replace '\r?\n', ','

        # Add hashtable entry (associate name with value)
        $ht[ $_.Name ] = $value 
    }

# Convert hashtable to a PSCustomObject so Format-Table prints it as expected.
[PSCustomObject] $ht | Format-Table -Wrap

Output:

title              lcmSubject                   lcmPrincipal   lcmClosingDate lcmAclList
-----              ----------                   ------------   -------------- ----------
Bacon Cheeseburger Bacon Cheeseburgers are Good Frontline Rock @20190704      Royalties,LEGAL_DEPT,LegalExt,MktgRock,Al
                                                                              lExceptRestricted,LegalTemp,LEGAL_DEPT_RE
                                                                              AD,lcmAdmin,CONTRACT-Administrator
zett42
  • This worked beautifully - but now looping through multiple XML documents gives me the header each time - how do I fix that? – baconfeltoon Mar 14 '23 at 15:57
  • @baconfeltoon Move `Format-Table` from inside the loop to outside it, e.g. `$xmlFiles | ForEach-Object { <# process XML file #>; [PSCustomObject] $ht } | Format-Table -Wrap` – zett42 Mar 14 '23 at 16:08
  • @baconfeltoon Feel free to post a new question with your coding attempt. Makes it easier than discussion in comments here. – zett42 Mar 14 '23 at 17:17
  • Thanks here [link](https://stackoverflow.com/questions/75736469/getting-an-empty-pipe-error-when-trying-to-move-format-table-outside-of-loop) – baconfeltoon Mar 14 '23 at 17:25
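
The suggestion in the comments (emit one object per file and call `Format-Table` once, after the loop) could look roughly like the sketch below. It is not code from the answer: the function name `ConvertTo-ContractRow`, the temporary folder `xml-table-demo`, and the sample documents are all assumptions made so the example runs on its own. It also uses `SelectNodes` and `-in` rather than the answer's `-match` filter.

```powershell
# Hypothetical helper: converts each <file> element of one XML document
# into a PSCustomObject (one table row per element).
function ConvertTo-ContractRow {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string[]] $Fields = @('title', 'lcmSubject', 'lcmPrincipal')
    )

    $xml = [xml]::new()
    $xml.Load(( Convert-Path -LiteralPath $Path ))

    foreach ($file in $xml.SelectNodes('/files/file')) {
        $ht = [ordered] @{}
        $file.ChildNodes |
            Where-Object Name -in $Fields |
            ForEach-Object { $ht[ $_.Name ] = $_.InnerText.Trim() -replace '\r?\n', ',' }
        [PSCustomObject] $ht
    }
}

# Sample data written to a temporary folder so the sketch is self-contained;
# point $xmlDir at your own folder instead.
$xmlDir = Join-Path ([System.IO.Path]::GetTempPath()) 'xml-table-demo'
$null = New-Item -ItemType Directory -Path $xmlDir -Force
foreach ($n in 1, 2) {
    @"
<?xml version="1.0" encoding="UTF-8"?>
<files>
    <file type="lcmContract" id="demo$n">
        <title>Title $n</title>
        <field name="lcmSubject">Subject $n</field>
        <field name="lcmPrincipal">Principal $n</field>
    </file>
</files>
"@ | Set-Content -LiteralPath (Join-Path $xmlDir "doc$n.xml") -Encoding UTF8
}

# Emit one object per file inside the loop ...
$rows = Get-ChildItem -LiteralPath $xmlDir -Filter '*.xml' |
    ForEach-Object { ConvertTo-ContractRow -Path $_.FullName }

# ... and format once, outside the loop, so the header is printed a single time.
$rows | Format-Table -Wrap
```

Collecting the objects in `$rows` first also leaves them available for `Export-Csv` or further filtering, since `Format-Table` output is meant for display only.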