3

I have an issue I'm hoping someone might be able to provide some guidance on. I have two csv files, File1 and File2.

File1:

value1,value2
A,10
B,30
C,45
D,39

File2:

value1,value2
A,10
B,32
C,44
E,7
F,3

What I am looking for is two things. I need to check if any items in value1 have been removed between the files, and/or I need to check if the corresponding number value2 has decreased.

First, to compare the files and find which lines have been added or removed in File1 vs File2. This is easy enough with Compare-Object if I compare only the value1 items.

So the result of my first compare, I'll see that from File1 to File2, I'll see that line E and F have been added, and line D has been removed. Perfect.

However, It's the next part I'm struggling with. I need to then compare value2 in each file, and determine if the number has decreased (or potentially increased or stayed the same).

The tricky part is I can't just compare line 4, value2 in File1 to line 4, value2 in File2, because one is for D and the other is for E. I don't know how to match the items first, then take essentially only the items that match and compare value2 for just those items? But what then happens there was no previous line to match? (because the line was newly added to file2 or the line was removed and only exists in file1)

In the end what I'm trying to come up with is a list of all the value1's that have been removed, and a list of all the value1's whose corresponding value2 has decreased since File1. I do not care about additions or increases.

Hope someone can provide some guidance. Thank you!

tylergun
  • 33
  • 5
  • For a while now, I am maintaining a [`Join-Object script`](https://www.powershellgallery.com/packages/Join)/[`Join-Object Module`](https://www.powershellgallery.com/packages/JoinModule) (see also: [In Powershell, what's the best way to join two tables into one?](https://stackoverflow.com/a/45483110/1701026)). For the first part of your question, the syntax would be simply something like: `$Csv1 |LeftJoin $Csv2 -on value1 -Name Old,New`. For the second part of your question, you could add a calculated property as e.g.: `-Property *, @{Decreased = {$Right.value2 -lt $Left.value2}}` – iRon Jul 09 '21 at 08:18
  • Or just: `$Csv1 |LeftJoin $Csv2 -on value1 -Name Old,New |Where {$_.Newvalue2 -lt $_.Oldvalue2}` – iRon Jul 09 '21 at 08:31

2 Answers2

4

To provide a single-pipeline alternative to Santiago's helpful answer:

It relies on Add-Member, Group-Object, and calculated properties via Select-Object:

(@'
value1,value2
A,10
B,30
C,45
D,39
'@ | ConvertFrom-Csv | Add-Member -PassThru Source 1) + 
(@'
value1,value2
A,10
B,32
C,44
E,7
F,3
'@ | ConvertFrom-Csv | Add-Member -PassThru Source 2) | 
  Group-Object value1 | ForEach-Object {
    if ($_.Count -eq 1) {
      $_.Group[0] | Add-Member -PassThru Status @{ 1 = '<='; 2 = '=>' }[$_.Group[0].Source]
    }
    else {
      $grp = $_
      $change = ($_.Group[0].value2).CompareTo($_.Group[1].value2)
      $_.Group[0] |
        Select-Object value1,
                      @{ name = 'value2'; expression = { $grp.Group[0].value2, $grp.Group[1].value2 } },
                      @{ name = 'Status'; expression = { $change } }
      }
    }

The above yields the following:

value1 value2   Status
------ ------   ------
A      {10, 10}      0
B      {30, 32}     -1
C      {45, 44}      1
D      39           <=
E      7            =>
F      3            =>
  • <= and => have the same meaning as in the output of Compare-Object: <= indicates value1 values exclusive to the LHS, => those exclusive to the RHS.

  • -1 / 0 / 1 map onto an increase / equality / decrease in the value2 property value.

mklement0
  • 382,024
  • 64
  • 607
  • 775
  • 1
    This is so clever, I couldn't even think of this. My brain just defaults to classic with questions like this one! I re-did my code to give data more meaning. – Santiago Squarzon Jul 08 '21 at 23:21
  • 1
    Thanks, @SantiagoSquarzon. It certainly is a non-obvious combination of multiple cmdlets. – mklement0 Jul 08 '21 at 23:51
3

I decided to edit a bit the code, since I think you're doing some sort of data analysis, in my opinion I think the best way you could display the information is by merging both objects.

It may take a bit more processing time depending on how big the CSVs are but I think this is the cleaner approach of showing the data, specially if this is going to be exported to Excel, filtering the result object would be very easy.

Also, mklement0's answer is much more clever than mine. We could say this is maybe a more classic coding approach and he is using all the PowerShell toolset at his disposal. Big props to his answer too.

# Use Import-Csv here, this is just for testing
$csv1 = @'
value1,value2
A,10
B,30
C,45
D,39
'@ | ConvertFrom-Csv

$csv2 = @'
value1,value2
A,10
B,32
C,44
E,7
F,3
'@ | ConvertFrom-Csv

# Convert CSV1 to hashtable
$map = @{}

foreach($line in $csv1)
{
    $map.Add($line.value1,$line.value2)
}

$result = [system.collections.generic.list[pscustomobject]]::new()

$map.Keys.ForEach({
    if($_ -notin $csv2.value1)
    {
        $result.Add(
            [pscustomobject]@{
                value1 = $_
                oldval2 = $map[$_]
                newval2 = $null
                status = 'REMOVED'
        })
    }
})

foreach($line in $csv2)
{
    $out = [ordered]@{
        value1 = $line.value1
        newval2 = $line.value2
    }
    if(-not $map.ContainsKey($line.value1))
    {
        $out.oldval2 = $null
        $out.status = 'ADDED'
        $result.Add([pscustomobject]$out)
        
        continue
    }

    $out.oldval2 = $map[$line.value1]

    switch($line.value2)
    {
        {$_ -lt $map[$line.value1]}
        {
            $out.status = 'DECREASED'
            continue
        }
        {$_ -gt $map[$line.value1]}
        {
            $out.status = 'INCREASED'
            continue
        }
        Default
        {
            $out.status = 'EQUAL'
        }
    }

    $result.Add([pscustomobject]$out)
}

Looking at $result:

PS /> $result

value1 oldval2 newval2 status   
------ ------- ------- ------   
D      39              REMOVED  
A      10      10      EQUAL    
B      30      32      INCREASED
C      45      44      DECREASED
E              7       ADDED    
F              3       ADDED    

If you wanted to see the values 'REMOVED' or 'DECREASED':

PS /> $result.where({$_.status -match 'REMOVED|DECREASED'})

value1 oldval2 newval2 status   
------ ------- ------- ------   
D      39              REMOVED  
C      45      44      DECREASED
Santiago Squarzon
  • 41,465
  • 5
  • 14
  • 37
  • 1
    I believe you understood perfectly! Now that I see what you've written it makes sense, but I don't think I would have come up with that on my own. One follow up question, at the end your catching values that increased or stayed the same as "Increased", as with "A", what would be the best strategy for separating out values that are the Same? Just add an elseif -gt section before your last else? – tylergun Jul 08 '21 at 22:58
  • @tylergun you're totally right, I missed the equality pointer, and also I decided to re-do the hole code. I hope the data makes more sense now. – Santiago Squarzon Jul 08 '21 at 23:20
  • 1
    That's great, it does make more sense now, and your solution is definitely easier for a beginner like me to walk through and understand than @mklement0's answer :) But I very much appreciate both examples! Thanks again. – tylergun Jul 08 '21 at 23:59
  • I'm happy to help @tylergun – Santiago Squarzon Jul 09 '21 at 00:07