TL;DR: How can I go from "CSV Example" to create an array like "End Result"?

Background
I'm building a lab file system for testing purposes and I want to create a folder structure that looks somewhat like a real file system. I have several CSV files that contain folder information.

CSV Example:

Department       Level1                  Level2                Level3
Human Resources  Personnel               Templates             APAC
                 Job Applications        Customer Relations    EMEA
                 Salaries and Expenses   Directors             NA
                 Vacation Tracking       Human Resources       SA
                 Disputes                Legal Services
                                         Marketing
                                         Production
                                         Finance
                                         IT Services

I want to take each combination above and create a file system that has all the following folders.

End Result:

Human Resources\Personnel\Templates\APAC
Human Resources\Personnel\Templates\EMEA
...
Human Resources\Disputes\IT Services\NA
Human Resources\Disputes\IT Services\SA

Once I have the above array of all full paths, it's as simple as just doing:

foreach($folder in $MyFolderArray){
    New-Item "\\Server\Share$\$folder" -ItemType Directory -Force
}

Problem
I want to be able to do this for any CSV file, regardless of how many columns I have, what the header names are, or how many values each column has. Currently, I'm hardcoding 4 foreach-loops but that solution requires all CSV files to have the same number of columns and header names. I'm looking for something that can take any CSV file of any column count and length.

Getting all headers from any CSV can be done like:

$CSVContent = Import-CSV "C:\PathToMyCSVFile"
$CSVHeaders = $CSVContent[0].PSObject.Properties.Name

This can be used to split the $CSVContent into one array per column with:

for($i=0;$i -lt $CSVHeaders.Count;$i++){
    New-Variable -Name "Header$i" -Value $($CSVContent.$($CSVHeaders[$i]) | Where-Object{$_ -ne ""})
}

This creates arrays from $Header0 to $Header# where # is number of CSV columns minus 1, each array having all values from that column. Going from these arrays to create the final array with all full paths is where I'm stuck.
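A possible alternative to numbered variables is to keep the columns in a single ordered dictionary keyed by header name. This is only a sketch: $Columns is a hypothetical name, and the inline CSV text is made-up sample data standing in for a real file.

```powershell
# Sample data standing in for Import-Csv "C:\PathToMyCSVFile" (assumption for illustration)
$CSVContent = @'
Department,Level1,Level2
Human Resources,Personnel,Templates
,Disputes,Marketing
,,Finance
'@ | ConvertFrom-Csv

# @() keeps the header list an array even if the CSV has a single column
$CSVHeaders = @($CSVContent[0].PSObject.Properties.Name)

# $Columns is a hypothetical name: one entry per header,
# holding the non-empty values of that column
$Columns = [ordered]@{}
foreach ($Header in $CSVHeaders) {
    $Columns[$Header] = @($CSVContent.$Header | Where-Object { $_ -ne "" })
}

# Look up a column by its header name
$Columns['Level2']
```

Because the dictionary is keyed by header, nothing depends on how many columns the CSV has or what they are called.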

Question
How can I solve the logic of building a foreach(foreach(... loop that enumerates all value combinations without hardcoding this? I'm guessing this requires recursively calling the loop itself but I'm not sure how to do that.

Tanaka Saito

1 Answer

For issues like this, you might create a recursive function.

Wikipedia

In computer science, recursion is a method of solving a problem where the solution depends on solutions to smaller instances of the same problem. Such problems can generally be solved by iteration, but this needs to identify and index the smaller instances at programming time. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

Import data

# $Data = Import-Csv .\Data.csv
# https://www.powershellgallery.com/packages/ConvertFrom-SourceTable
$Data = ConvertFrom-SourceTable '
Department       Level1                  Level2                Level3
Human Resources  Personnel               Templates             APAC
                 Job Applications        Customer Relations    EMEA
                 Salaries and Expenses   Directors             NA
                 Vacation Tracking       Human Resources       SA
                 Disputes                Legal Services
                                         Marketing
                                         Production
                                         Finance
                                         IT Services'

Example

function Add-Leaves($Path, $i = 0) {
    $Names = $Data[0].PSObject.Properties.Name
    if ($i -lt $Names.count) {
        Foreach ($Leaf in $Data.($Names[$i])) {
            if ($Leaf) { Add-Leaves "$Path\$Leaf" ($i + 1) }
        }
    } else { $Path }
}

Add-Leaves '\\Server\Share$'

Result

\\Server\Share$\Human Resources\Personnel\Templates\APAC
\\Server\Share$\Human Resources\Personnel\Templates\EMEA
\\Server\Share$\Human Resources\Personnel\Templates\NA
\\Server\Share$\Human Resources\Personnel\Templates\SA
\\Server\Share$\Human Resources\Personnel\Customer Relations\APAC
\\Server\Share$\Human Resources\Personnel\Customer Relations\EMEA
...

Explanation

  1. Add-Leaves($Path, $i = 0)
    The recursive function, which also calls itself from within. Where:
    • $Path is the current path to which the leaves will be added
    • $i is the column index, default: $i = 0 (first column)
  2. $Names = $Data[0].PSObject.Properties.Name
    This will retrieve all the header names (see also: Iterate over PSObject properties in PowerShell) using a PowerShell feature called member enumeration.
    Note that this value does not change between calls, so it could be computed once outside the function, like the $Data table (which you might consider passing to the function as a parameter)
  3. if ($i -lt $Names.count) {
    This will check whether the column index ($i) is still within the number of columns. This is what eventually stops the recursion and prevents an infinite loop (execution then goes to step 6.).
  4. Foreach ($Leaf in $Data.($Names[$i])) {
    This will iterate each value in the current column (using member enumeration).
    For example, $Names[3] is 'Level3' and $Data.'Level3' returns 'APAC', 'EMEA', 'NA', ...
  5. if ($Leaf) { Add-Leaves "$Path\$Leaf" ($i + 1) }
    • if ($Leaf) { excludes the empty fields in the column, e.g. the first column Department has just one item (Human Resources), the rest should be excluded
    • Add-Leaves "$Path\$Leaf" ($i + 1) is the actual recursive call, where the same actions are done:
      • with a $Path that now includes each $Leaf in the current column
      • on the next column ($i + 1 = recursion depth)
  6. else { $Path }
    In case all the columns are processed (see step 3.), output the current $Path
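
Putting the steps above together, the following is a minimal sketch of a variant that receives the table as a parameter instead of reading a $Data variable defined outside the function. The parameter layout and the small inline CSV are assumptions made for illustration, not part of the original answer. The if ($Leaf) test relies on an empty string converting to $false in PowerShell, which is how the empty CSV fields get skipped.

```powershell
# Hypothetical variant of Add-Leaves that takes the table as a parameter
function Add-Leaves {
    param(
        [object[]]$Data,   # rows as returned by Import-Csv or ConvertFrom-Csv
        [string]$Path,     # path built so far
        [int]$i = 0        # index of the column to process
    )
    # @() keeps this an array even when the table has a single column
    $Names = @($Data[0].PSObject.Properties.Name)
    if ($i -lt $Names.Count) {
        foreach ($Leaf in $Data.($Names[$i])) {
            # An empty string converts to $false, so empty fields are skipped
            if ($Leaf) { Add-Leaves -Data $Data -Path "$Path\$Leaf" -i ($i + 1) }
        }
    } else {
        $Path   # all columns processed: output the full path
    }
}

# Made-up sample data (assumption for illustration)
$Sample = @'
Department,Level1,Level2
HR,Personnel,APAC
,Disputes,EMEA
'@ | ConvertFrom-Csv

$Paths = Add-Leaves -Data $Sample -Path '\\Server\Share$'
$Paths
```

With this sample, $Paths holds four entries, one per combination of the two Level1 and two Level2 values under HR.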

Todo

If you actually want to create folders, you probably want to do this between step 4. and 5. and implement something like:

  • If the subfolder $Leaf doesn't exist in the current directory ($Path), create it ("$Path\$Leaf")
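
One way this could look is sketched below. New-FolderTree is a hypothetical name, the table is passed as a parameter, and a scratch directory under the temp path stands in for \\Server\Share$ so the example can be tried without a file server. Test-Path, New-Item and Join-Path are the standard cmdlets; everything else is an assumption for illustration.

```powershell
# Hypothetical variant: creates each folder level while walking the columns
function New-FolderTree {
    param([object[]]$Data, [string]$Path, [int]$i = 0)
    $Names = @($Data[0].PSObject.Properties.Name)
    if ($i -ge $Names.Count) { return }          # no columns left: stop recursing
    foreach ($Leaf in $Data.($Names[$i])) {
        if (-not $Leaf) { continue }             # skip empty fields
        $Folder = Join-Path $Path $Leaf
        # Create the subfolder only if it does not exist yet
        if (-not (Test-Path -LiteralPath $Folder -PathType Container)) {
            New-Item -Path $Folder -ItemType Directory | Out-Null
        }
        New-FolderTree -Data $Data -Path $Folder -i ($i + 1)
    }
}

# Assumption: a scratch directory stands in for \\Server\Share$
$Root = Join-Path ([System.IO.Path]::GetTempPath()) 'FolderTreeDemo'
New-Item -Path $Root -ItemType Directory -Force | Out-Null

# Made-up sample data (assumption for illustration)
$Sample = @'
Department,Level1
HR,Personnel
,Disputes
'@ | ConvertFrom-Csv

New-FolderTree -Data $Sample -Path $Root
```

Creating each level as it is visited means intermediate folders exist before their children are added, and running the function again leaves existing folders untouched.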
iRon
  • This is perfect, it does exactly what I was looking for and I don't need to create variables and keep them in memory, thank you so much! I made an adjustment where I include the $data in the calling of the function (as in the function takes [array]$data too) as I want it to be possible to use anywhere, not necessarily first having $data be defined. I'm guessing that if I have too many columns this will make PS keep a whole lot of information in memory though so I'm not sure that's a good trade-off or not. – Tanaka Saito Dec 08 '20 at 09:09
  • There's only one thing I'm not completely clear on. When you do "if ($Leaf) { Add-Leafs...", is this checking if $Leaf is $null or not? What does this if($Leaf) actually verify? If I run if($null -ne $leaf) I get lots and lots of combinations of entries I'm not looking for. I think this logic here is the only bit I don't understand. – Tanaka Saito Dec 08 '20 at 09:16
  • Ah, was just about to comment on that :) I changed the Foreach there so that the if isn't necessary: Foreach ($Leaf in $($Data.($Names[$i]) | Where-Object{$_ -ne ""})){... which excludes the empty fields already in the topmost Foreach. Again, thank you so much! This has been a headache for me for the past two days, now I can sleep on this :) – Tanaka Saito Dec 08 '20 at 09:25
  • We're all learners here on this glorious day! – Tanaka Saito Dec 08 '20 at 09:36
  • Can I just say that I loved the nice touch that you call the function "Add-Leaf" since we're building a folder tree, that was very much appreciated :) – Tanaka Saito Dec 08 '20 at 12:37