1

I want to compare values in two csv files and return any entries from source2 that do NOT match entries in source1 while disregarding any duplicate entries. Below is my attempt, but it does not return all entries. What would be the best way to get this script to do what I need?

$AD = Import-CSV -Path "source1.csv"
$Student = Import-CSV -Path "source2.csv"

$AD |OuterJoin-Object $Student -on UserPrincipalName | Export-CSV -Path "Path.csv"

Source1 and Source2 csv have columns "Name", "UserPrincipalName", and "TeamDesc". I want to use those to match entries. Ideally input/output would look like this:

Source1.csv
| TeamDesc | UserPrincipalName   |   Name      |
|:---------|:--------------------|:------------|
| Team 1   | student1@domain.com | john smith  |
| Team 1   | student2@domain.com | nancy drew  |
| Team 2   | student3@domain.com | harvey dent |

Source2.csv
| TeamDesc |  UserPrincipalName  |   Name      |
|:---------|:--------------------|:------------|
| Team 1   | student1@domain.com | john smith  |
| Team 2   | student3@domain.com | harvey dent |

Export.csv
| TeamDesc | UserPrincipalName   |  Name      |
|:---------|:--------------------|:-----------|
| Team 1   | student2@domain.com | nancy drew |

2 Answers2

2

Unsure how it's done with OuterJoin-Object. I assume you wanted to do this:

$AD = Import-Csv source1.csv | Group-Object UserPrincipalName -AsHashTable -AsString
$Student = Import-CSV -Path source2.csv

@(
    $AD.Values.ForEach{ $_ }
    $Student.Where{ -not $AD.ContainsKey($_.UserPrincipalName) }
) | Export-CSV -Path Path.csv -NoTypeInformation

If you want to exclude possible duplicates coming from source2.csv, you can use this:

@(
    $AD.Values.ForEach{ $_ }
    $Student.Where{ -not $AD.ContainsKey($_.UserPrincipalName) }.
        ForEach{ $AD[$_.UserPrincipalName] = $null }
) | Export-CSV -Path Path.csv -NoTypeInformation

Looking now at your, now edited answer which provides an expected output, it seems what you actually wanted was:

$set = [System.Collections.Generic.HashSet[string]]::new(
    [string[]] (Import-CSV -Path stundent.csv).UserPrincipalName,
    [System.StringComparer]::InvariantCultureIgnoreCase
)
Import-Csv ad.csv | Where-Object { $set.Add($_.UserPrincipalName) } |
    Export-Csv path\to\output.csv -NoTypeInformation
Santiago Squarzon
  • 41,465
  • 5
  • 14
  • 37
  • 1
    this is close, but not quite what I am looking for. I made some edits to my original question, hopefully it is a little more clear now. – Tyler Laidler Oct 13 '22 at 16:36
  • @TylerLaidler I see, this is why it's important to provide reproducible examples when asking... so basically you're interested in values of `$AD` and not interested in exporting anything from `$Student`, just checking if the `UserPrincipalName` in `$AD` is not in `$student` ? – Santiago Squarzon Oct 13 '22 at 16:43
  • Correct. I want to return the entries that are found in one file and not the other, and disregard any entries found in both files – Tyler Laidler Oct 13 '22 at 17:02
  • is there a way to include the 'TeamDesc' property in the filtering? The script you provided works well, but it seems like I am missing some of the changes. I notice it more with smaller, less diverse files, and want to be sure all changes are captured. What would be the best way to include another filter parameter? – Tyler Laidler Nov 15 '22 at 20:38
  • @TylerLaidler Please ask a new question with an example of what you want and what is currently wrong – Santiago Squarzon Nov 15 '22 at 20:47
0
$Source1 = ConvertFrom-Csv @'
TeamDesc, UserPrincipalName,   Name
"Team 1", student1@domain.com, "john smith"
"Team 1", student2@domain.com, "nancy drew"
"Team 2", student3@domain.com, "harvey dent"
'@

$Source1 = ConvertFrom-Csv @'
TeamDesc, UserPrincipalName,   Name
"Team 1", student1@domain.com, "john smith"
"Team 2", student3@domain.com, "harvey dent"
'@
$Source1 |OuterJoin $Source2 -On Name,TeamDesc

TeamDesc UserPrincipalName            Name
-------- -----------------            ----
Team 1   {student2@domain.com, $null} nancy drew

Note that the UserPrincipalName has two values the one from the left table and the (empty) one from the right table. This will show you on with side the information is missing. You might also further distinguish the properties with the -Name parameter e.g.:

$Source1 |OuterJoin $Source2 -On Name,TeamDesc -Name Source1,Source2

TeamDesc Source1UserPrincipalName Source2UserPrincipalName Name
-------- ------------------------ ------------------------ ----
Team 1   student2@domain.com                               nancy drew

Or just the left properties only:

$Source1 |OuterJoin $Source2 -On Name,TeamDesc -Property 'Left.*'

TeamDesc UserPrincipalName   Name
-------- -----------------   ----
Team 1   student2@domain.com nancy drew
iRon
  • 20,463
  • 10
  • 53
  • 79