8

I have a JSON file that looks like this:

{
    "id": 10011,
    "title": "Test procedure",
    "slug": "slug",
    "url": "http://test.test",
    "email": "test@test.com",
    "link": "http://test.er",
    "subject": "testing",
    "level": 1,
    "disciplines": [
      "discipline_a",
      "discipline_b",
      "discipline_c"
    ],
    "areas": [
      "area_a",
      "area_b"
    ]
  },

I was trying to use the following command to convert that into the CSV file:

(Get-Content "PATH_TO\test.json" -Raw | ConvertFrom-Json)| Convertto-CSV -NoTypeInformation | Set-Content "PATH_TO\test.csv"

However, for disciplines and areas I am getting System.Object[] in the resulting CSV file.

Is there a way to put all those nested values as a separate columns in CSV file like area_1, area_2 etc. And the same for disciplines.

abatishchev
  • 98,240
  • 88
  • 296
  • 433
user2758935
  • 131
  • 1
  • 1
  • 8
  • maybe this link will help: https://stackoverflow.com/questions/30485004/powershell-convertfrom-json-convertto-csv – Anthony McGrath Aug 23 '17 at 03:05
  • In your example, what value would appear in the 'discipline_a', or 'area_a' columns for this record? – andyb Aug 23 '17 at 03:22
  • @andyb those will be just strings specifying different disciplines and areas. So, for example, areas array might have "maths", "chemistry". So what I want in the resulting CSV is column 'area_1' to contain "maths" and columns 'area_2' to contain "chemistry". The number of 'area_.' columns should be determined by the MAX number of areas a particular object might have – user2758935 Aug 23 '17 at 03:30

2 Answers2

10

2017-11-20, Completely rewrote function to improve performance and add features as -ArrayBase and support for PSStandardMembers and grouped objects.

Flatten-Object

Recursively flattens objects containing arrays, hash tables and (custom) objects. All added properties of the supplied objects will be aligned with the rest of the objects.

Requires PowerShell version 2 or higher.

Cmdlet

Function Flatten-Object {                                       # Version 00.02.12, by iRon
    [CmdletBinding()]Param (
        [Parameter(ValueFromPipeLine = $True)][Object[]]$Objects,
        [String]$Separator = ".", [ValidateSet("", 0, 1)]$Base = 1, [Int]$Depth = 5, [Int]$Uncut = 1,
        [String[]]$ToString = ([String], [DateTime], [TimeSpan]), [String[]]$Path = @()
    )
    $PipeLine = $Input | ForEach {$_}; If ($PipeLine) {$Objects = $PipeLine}
    If (@(Get-PSCallStack)[1].Command -eq $MyInvocation.MyCommand.Name -or @(Get-PSCallStack)[1].Command -eq "<position>") {
        $Object = @($Objects)[0]; $Iterate = New-Object System.Collections.Specialized.OrderedDictionary
        If ($ToString | Where {$Object -is $_}) {$Object = $Object.ToString()}
        ElseIf ($Depth) {$Depth--
            If ($Object.GetEnumerator.OverloadDefinitions -match "[\W]IDictionaryEnumerator[\W]") {
                $Iterate = $Object
            } ElseIf ($Object.GetEnumerator.OverloadDefinitions -match "[\W]IEnumerator[\W]") {
                $Object.GetEnumerator() | ForEach -Begin {$i = $Base} {$Iterate.($i) = $_; $i += 1}
            } Else {
                $Names = If ($Uncut) {$Uncut--} Else {$Object.PSStandardMembers.DefaultDisplayPropertySet.ReferencedPropertyNames}
                If (!$Names) {$Names = $Object.PSObject.Properties | Where {$_.IsGettable} | Select -Expand Name}
                If ($Names) {$Names | ForEach {$Iterate.$_ = $Object.$_}}
            }
        }
        If (@($Iterate.Keys).Count) {
            $Iterate.Keys | ForEach {
                Flatten-Object @(,$Iterate.$_) $Separator $Base $Depth $Uncut $ToString ($Path + $_)
            }
        }  Else {$Property.(($Path | Where {$_}) -Join $Separator) = $Object}
    } ElseIf ($Objects -ne $Null) {
        @($Objects) | ForEach -Begin {$Output = @(); $Names = @()} {
            New-Variable -Force -Option AllScope -Name Property -Value (New-Object System.Collections.Specialized.OrderedDictionary)
            Flatten-Object @(,$_) $Separator $Base $Depth $Uncut $ToString $Path
            $Output += New-Object PSObject -Property $Property
            $Names += $Output[-1].PSObject.Properties | Select -Expand Name
        }
        $Output | Select ([String[]]($Names | Select -Unique))
    }
}; Set-Alias Flatten Flatten-Object

Syntax

<Object[]> Flatten-Object [-Separator <String>] [-Base "" | 0 | 1] [-Depth <Int>] [-Uncut<Int>] [ToString <Type[]>]

or:

Flatten-Object <Object[]> [[-Separator] <String>] [[-Base] "" | 0 | 1] [[-Depth] <Int>] [[-Uncut] <Int>] [[ToString] <Type[]>]

Parameters

-Object[] <Object[]>
The object (or objects) to be flatten.

-Separator <String> (Default: .)
The separator used between the recursive property names. .

-Depth <Int> (Default: 5)
The maximal depth of flattening a recursive property. Any negative value will result in an unlimited depth and could cause a infinitive loop.

-Uncut <Int> (Default: 1)
The number of object iterations that will left uncut further object properties will be limited to just the DefaultDisplayPropertySet. Any negative value will reveal all properties of all objects.

-Base "" | 0 | 1 (Default: 1)
The first index name of an embedded array:

  • 1, arrays will be 1 based: <Parent>.1, <Parent>.2, <Parent>.3, ...
  • 0, arrays will be 0 based: <Parent>.0, <Parent>.1, <Parent>.2, ...
  • "", the first item in an array will be unnamed and than followed with 1: <Parent>, <Parent>.1, <Parent>.2, ...

-ToString <Type[]= [String], [DateTime], [TimeSpan]>
A list of value types (default [String], [DateTime], [TimeSpan]) that will be converted to string rather the further flattened. E.g. a [DateTime] could be flattened with additional properties like Date, Day, DayOfWeek etc. but will be converted to a single (String) property instead.

Note:
The parameter -Path is for internal use but could but used to prefix property names.

Examples

Answering the specific question:

(Get-Content "PATH_TO\test.json" -Raw | ConvertFrom-Json) | Flatten-Object | Convertto-CSV -NoTypeInformation | Set-Content "PATH_TO\test.csv"

Result:

{
    "url":  "http://test.test",
    "slug":  "slug",
    "id":  10011,
    "link":  "http://test.er",
    "level":  1,
    "areas.2":  "area_b",
    "areas.1":  "area_a",
    "disciplines.3":  "discipline_c",
    "disciplines.2":  "discipline_b",
    "disciplines.1":  "discipline_a",
    "subject":  "testing",
    "title":  "Test procedure",
    "email":  "test@test.com"
}

Stress testing a more complex custom object:

New-Object PSObject @{
    String    = [String]"Text"
    Char      = [Char]65
    Byte      = [Byte]66
    Int       = [Int]67
    Long      = [Long]68
    Null      = $Null
    Booleans  = $False, $True
    Decimal   = [Decimal]69
    Single    = [Single]70
    Double    = [Double]71
    Array     = @("One", "Two", @("Three", "Four"), "Five")
    HashTable = @{city="New York"; currency="Dollar"; postalCode=10021; Etc = @("Three", "Four", "Five")}
    Object    = New-Object PSObject -Property @{Name = "One";   Value = 1; Text = @("First", "1st")}
} | Flatten

Result:

Double               : 71
Decimal              : 69
Long                 : 68
Array.1              : One
Array.2              : Two
Array.3.1            : Three
Array.3.2            : Four
Array.4              : Five
Object.Name          : One
Object.Value         : 1
Object.Text.1        : First
Object.Text.2        : 1st
Int                  : 67
Byte                 : 66
HashTable.postalCode : 10021
HashTable.currency   : Dollar
HashTable.Etc.1      : Three
HashTable.Etc.2      : Four
HashTable.Etc.3      : Five
HashTable.city       : New York
Booleans.1           : False
Booleans.2           : True
String               : Text
Char                 : A
Single               : 70
Null                 :

Flatting grouped objects:

$csv | Group Name | Flatten | Format-Table # https://stackoverflow.com/a/47409634/1701026

Flatting common objects:

(Get-Process)[0] | Flatten-Object

Or a list (array) of objects:

Get-Service | Flatten-Object -Depth 3 | Export-CSV Service.csv

Note that a command as below takes hours to compute:

Get-Process | Flatten-Object | Export-CSV Process.csv

Why? because it results in a table with a few hundred rows and several thousand columns. So if you if would like to use this for flatting process, you beter limit the number of rows (using the Where-Object cmdlet) or the number of columns (using the Select-Object cmdlet).

For the latest Flatten-Object version, see: https://powersnippets.com/flatten-object/

iRon
  • 20,463
  • 10
  • 53
  • 79
  • 1
    @ste_irl: Technically, yes, although you will loss some details as you can't differentiate between objects and hashtables in the flatten format, besides, there is no difference between an array `@("one", "two")` and a hashtable/object like this: `@{'1' = 'one', '2' = 'two'}` and also a **key** like `asp.net` will be expanded in `asp = @{net = @{...`. Anyway, you might give it a go and if you run into problems, create a new question for help.. – iRon Feb 01 '18 at 14:40
4

The CSV conversion/export cmdlets have no way of "flattening" an object, and I may be missing something, but I know of no way to do this with a built-in cmdlet or feature. If you can guarantee that disciplines and areas will always have the same number of elements, you can trivialize it by using Select-Object with derived properties to do this:

$properties=@('id','title','slug','url','email','link','subject','level',
    @{Name='discipline_1';Expression={$_.disciplines[0]}}
    @{Name='discipline_2';Expression={$_.disciplines[1]}}
    @{Name='discipline_3';Expression={$_.disciplines[2]}}
    @{Name='area_1';Expression={$_.areas[0]}}
    @{Name='area_2';Expression={$_.areas[1]}}
)
(Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json)| Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'

However, I am assuming that disciplines and areas will be variable length for each record. In that case, you will have to loop over the input and pull the highest count value for both disciplines and areas, then build the properties array dynamically:

$inputData = Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json
$counts = $inputData | Select-Object -Property     @{Name='disciplineCount';Expression={$_.disciplines.Count}},@{Name='areaCount';Expression={$_.areas.count}}
$maxDisciplines = $counts | Measure-Object -Maximum -Property disciplineCount | Select-Object -ExpandProperty     Maximum
$maxAreas = $counts | Measure-Object -Maximum -Property areaCount | Select-Object -ExpandProperty Maximum

$properties=@('id','title','slug','url','email','link','subject','level')

1..$maxDisciplines | % {
  $properties += @{Name="discipline_$_";Expression=[scriptblock]::create("`$_.disciplines[$($_ - 1)]")}
}

1..$maxAreas | % {
  $properties += @{Name="area_$_";Expression=[scriptblock]::create("`$_.areas[$($_ - 1)]")}
}

$inputData | Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'

This code hasn't been fully tested, so it may need some tweaking to work 100%, but I believe the ideas are solid =)

Joseph Alcorn
  • 2,322
  • 17
  • 23
  • Thank you so much! It works like magic. Exactly what I needed! I used the second piece of the code you provided. – user2758935 Aug 23 '17 at 05:02