Get latest duplicates based on a column sql powershell

Question

I have a table retrived from a sql query like below

   Id   Date-Time            FilesA    FilesB

   556  2020-10-13 21:12:04  temp1.txt fels.csv
   133  2020-09-08 03:22:16  NA        temp2.txt
   556  2020-10-13 22:02:44  temp1.txt fels.csv
   203  2021-08-03 03:22:16  macs.xml  temp2.txt

I'm pulling out FilesA and FilesB column and doing some operation for all Ids of Id column. But there are some duplicates in the Id column, I need to perform the operations only on the latest Id based on the Date-Time column like for 556 I need to work with FilesA and FilesB of the 3rd row as it is generated latest i.e, on 22:02. I can have multiple duplicates of different Ids like this. How do I achieve that with powershell script? All I've found to identify duplicates is using csv which doesn't seem helpful here as I don't need csv and for the time date column is there anything like converting the retrieved column data to date time object and then compare? I'm lost.

$SqlAdapter.SelectCommand = $SqlCmd
$Dataset = New-Object System.Data.Dataset
$SqlAdapter.Fill($Dataset)
$DataSet.Tables[0]
$data1=$DataSet.Tables[0]
$data1.FilesA #some operations
#need to perfrom the actions on all ids and for duplicates only on the latest ones

Have you considered modifying the SQL query to filter it down before it gets to Powershell? See https://stackoverflow.com/questions/6841605/get-top-1-row-of-each-group — Charlieface, Oct 23 '21 at 18:32

score 0 · Accepted Answer · answered Oct 23 '21 at 16:05

Use Group-Object to group the rows by Id, then pick the one with the newest date from each group with Sort-Object -Descending |Select -First:

$newestRowPerId = $data1 |Group-Object Id |ForEach-Object { 
    $_.Group |Sort-Object -Property 'Date-Time' -Descending |Select -First 1 
}

Get latest duplicates based on a column sql powershell

1 Answers1