1

Very new to Powershell and trying to get my head around how to compare two Excel files, both with the column 'User_ID' in common, and output only the records sharing the User_ID value to a new Excel file.

File 1 'Powershell_Results.xlsx' contains a single column 'User_ID' with 80 rows of data. File 2 'Master_Data.xlsx' contains three columns: 'Host_name', 'Serial_number' and 'User_ID' with 4150 rows of data.

Powershell_results.csv = User_ID AVHQSX BKSAXS CFNSAU

Master_Data.csv = User_ID AVHQSX BKSAXS CFNSAU

Serial_Number 566409 127629 466160

Hostname WDC00001 WDC00006 WDC00007

Unsure if a first step of conversion from XLSX to CSV would make things easier.

I feel like I'm close but can't work out the remaining few kinks. Working on a Mac FWIW:

# load both csv files
$pwsh_results  = Import-Csv -Path 'Desktop\Code\Powershell_results.csv'
$master_data = Import-Csv -Path 'Desktop\Code\Master_Data.csv'

# loop through records
$result = foreach ($User_ID in $master_data){
    # find a record in the pwsh_results.csv where the .User_ID is equal to the .User_ID in the master_data.csv
    $pwsh_results | Where-Object { $_.User_ID -eq $_.User_ID } | ForEach-Object {
        # create an object with properties from both csv files combined
        $obj = $emp | Select-Object @{Name = 'User_ID'; Expression = {$_.User_ID}}, * -ExcludeProperty id
        # add the details from $pwsh_results to this
        $obj | Add-Member -MemberType NoteProperty -Name 'User_ID' -Value $_.User_ID
        $obj | Add-Member -MemberType NoteProperty -Name 'Serial_Number' -Value $_.Serial_Number
        $obj | Add-Member -MemberType NoteProperty -Name 'Hostname' -Value $_.Hostname
        # output the combined object to be collected in variable $result
        $obj
    }
}

# show the results in the console
$result | Format-Table -AutoSize

# save as new csv file
$result | Export-Csv -Path 'Desktop\Code\Matches.csv' -NoTypeInformation
  • What is not working so far? Your code should work tho you should use `Group-Object` to simplify this process. – Santiago Squarzon Mar 12 '23 at 15:23
  • I'm just getting an empty CSV output file. How do you suggest I approach with Group-Object? Thanks for your help. – deadmemorywaste Mar 12 '23 at 15:25
  • please add an example of both CSVs as plain text (just a few lines) and how you expect the result to look – Santiago Squarzon Mar 12 '23 at 15:27
  • Powershell_results.csv = User_ID AVHQSX BKSAXS CFNSAU Master_Data.csv = User_ID Serial_Number Hostname AVHQSX 566409 WDC00001 ABCTYU 567890 WDC00003 ASDYUI 576897 WDC00005 Ideal output = same as Master_Data but *only* showing records where User_ID values match with those in Powershell_results.csv. – deadmemorywaste Mar 12 '23 at 15:36
  • Apologies for formatting - unsure how to get that to look better. – deadmemorywaste Mar 12 '23 at 15:37
  • Add it to your question not to the comment section – Santiago Squarzon Mar 12 '23 at 15:42
  • This is a quiet common question, if performance is a concern, you might use a hashtable and if you do not reinvent the wheel, you might use this [`Join-Object script`](https://www.powershellgallery.com/packages/Join)/[`Join-Object Module`](https://www.powershellgallery.com/packages/JoinModule) (see also: [In Powershell, what's the best way to join two tables into one?](https://stackoverflow.com/a/45483110/1701026)): `Import-Csv .\Master_Data.csv |Join (Import-Csv .\Powershell_results.csv) -on User_ID |Export-Csv .\Matches.csv -NoTypeInformation` – iRon Mar 12 '23 at 20:23

1 Answers1

0

This can be made easier with Group-Object, simply import both CSVs in a single array then group the objects by the User_Id property and filter the objects where the group count is greater than 1. Lastly output the grouped objects by index:

@(
    Import-Csv -Path 'Desktop\Code\Master_Data.csv' # Note, master_data should be first
    Import-Csv -Path 'Desktop\Code\Powershell_results.csv'
) | Group-Object User_Id | Where-Object Count -GT 1 | ForEach-Object {
    $_.Group[0] # Here we output the objects in index 0 (the ones belonging to master_data)
} | Export-Csv -Path 'Desktop\Code\Matches.csv' -NoTypeInformation
Santiago Squarzon
  • 41,465
  • 5
  • 14
  • 37
  • 1
    Thank you, this has worked a treat. Very keen to learn more Powershell and your input has helped motivate me. Have a great day. – deadmemorywaste Mar 12 '23 at 16:01
  • @deadmemorywaste glad it helped. Remember if this answered your question you can mark it as accepted by clicking the tick button – Santiago Squarzon Mar 12 '23 at 16:45
  • 1
    Have done. Thanks again. Just out of interest - why should the master_data file be imported first? – deadmemorywaste Mar 12 '23 at 19:30
  • @deadmemorywaste because if you put master_data second then `$_.Group[0]` would need to be changed to `$_.Group[1]` :) (the objects from master_data are the ones you're interested in outputting so if you change the other of importing the CSVs then the index of oututing those objects has to change too) – Santiago Squarzon Mar 12 '23 at 19:37