1

I have two csv file where I contain data, I need to check if value from CSV 1 exist in CSV 2 and if so then replace this value in file2 with data from file1, if no just skip to another row,

File1.csv

NO;Description
L001;DREAM
L002;CAR
L003;PHONE
L004;HOUSE
L005;PLANE

File2.csv

ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;DREAM;Scheduled;1524031200;;;1524033000;
149137;CAR;Implementation In Progress;1528588800;;;1548968400;
150564;PHONE;Scheduled;1569456000;;;1569542400;
150564;HOUSE;Scheduled;1569456000;;;1569542400;
150564;PLANE;;;;;;

I tried something like that but it is not working for me:

    $file1 = Import-Csv "C:\Users\file1.csv" |Select-Object -ExpandProperty Description
$file2 = Import-Csv "C:\Users\file1.csv" |Select-Object -ExpandProperty NO
        Import-Csv "C:\Users\file3.csv" |Where-Object {$file1 -like $_.Name} |ForEach-Object {
          $_.Name = $file2($_.NO)
    } |Out-File "C:\Users\File4.csv"

File4.csv should like that:

ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;L001;Scheduled;1524031200;;;1524033000;
149137;L002;Implementation In Progress;1528588800;;;1548968400;
150564;L003;Scheduled;1569456000;;;1569542400;
150564;L004;Scheduled;1569456000;;;1569542400;
150564;L005;;;;;;

Maybe there is another way to achive my goal! Thank you

RoadRunner
  • 25,803
  • 6
  • 42
  • 75
majan
  • 125
  • 12
  • Why is there an asterisk in the `Status` column? Is that for real or a typo when posting? – Theo Apr 23 '20 at 09:51
  • This is how file is exported, it is not typo error, but this column is not using to check values so it shouldnt be a problem, I guess. – majan Apr 23 '20 at 10:02

2 Answers2

0

Here's one approach you can take:

  • Import both CSV files with Import-Csv
  • Create a lookup hash table from the first CSV file, where the Description you want to replace are the keys, and NO are the values.
  • Go through the second CSV file, and replace any values from the Name column from the hash table, if the key exists. We can use System.Collections.Hashtable.ContainsKey to check if the key exists. This is a constant time O(1) operation, so lookups are fast.
  • Then we can export the final CSV with Export-Csv. I used -UseQuotes Never to put no " quotes in your output file. This feature is only available in PowerShell 7. For lower PowerShell versions, you can have a look at How to remove all quotations mark in the csv file using powershell script? for other alternatives to removing quotes from a CSV file.

Demo:

$csvFile1 = Import-Csv -Path .\File1.csv -Delimiter ";"
$csvFile2 = Import-Csv -Path .\File2.csv -Delimiter ";"

$ht = @{}
foreach ($item in $csvFile1) {
    if (-not [string]::IsNullOrEmpty($item.Description)) {
        $ht[$item.Description] = $item.NO
    }
}

& {
    foreach ($line in $csvFile2) {
        if ($ht.ContainsKey($line.Name)) {
            $line.Name = $ht[$line.Name]
        }
        $line
    }
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never

Or instead of wrapping the foreach loop inside a script block using the Call Operator &, we can use Foreach-Object. You can have a look at about_script_blocks for more information about script blocks.

$csvFile2 | ForEach-Object {
    if ($ht.ContainsKey($_.Name)) {
        $_.Name = $ht[$_.Name]
    }
    $_
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never

File4.csv

ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;L001;Scheduled;1524031200;;;1524033000;
149137;L002;Implementation In Progress;1528588800;;;1548968400;
150564;L003;Scheduled;1569456000;;;1569542400;
150564;L004;Scheduled;1569456000;;;1569542400;
150564;L005;;;;;;

Update

For handling multiple values with the same Name, we can transform the above to use a hash table of System.Management.Automation.PSCustomObject, where we have two properties Count to keep track of the current item we're seeing and NO which is an array of numbers:

$csvFile1 = Import-Csv -Path .\File1.csv -Delimiter ";"
$csvFile2 = Import-Csv -Path .\File2.csv -Delimiter ";"

$ht = @{}
foreach ($row in $csvFile1) {
    if (-not $ht.ContainsKey($row.Description) -and 
        -not [string]::IsNullOrEmpty($item.Description)) {
        $ht[$row.Description] = [PSCustomObject]@{
            Count = 0
            NO = @()
        }
    }
    $ht[$row.Description].NO += $row.NO
}

& {
    foreach ($line in $csvFile2) {
        if ($ht.ContainsKey($line.Name)) {
            $name = $line.Name
            $pos = $ht[$name].Count
            $line.Name = $ht[$name].NO[$pos]
            $ht[$name].Count += 1
        }
        $line
    }
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never
RoadRunner
  • 25,803
  • 6
  • 42
  • 75
  • I don't what to do wrong but this part is not working for me `$csvFile2 | ForEach-Object { if ($ht.ContainsKey($_.Name)) { $_.Name = $ht[$_.Name] } $_ } | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never` – majan Apr 23 '20 at 11:09
  • @majan What is the error your getting? Works fine for me. – RoadRunner Apr 23 '20 at 11:10
  • my files are big and contains sometimes in File1.csv empty value in Description Column, is there any way to skip if there is no value. I do somethng wrong with second option of code probalby – majan Apr 23 '20 at 11:20
  • @majan I've added a check to make sure no `$null` or empty values are added with `[string]::IsNullOrEmpty`. The row of the second file will only get updated if a key is found in `$ht` using `ContainsKey`. – RoadRunner Apr 23 '20 at 11:24
  • ok, I have last polite request, sometimes in file1.csv I have multiple instance of same description but another number of it, `NO;Description L001;DREAM L002;CAR L005;PLANE L011;DREAM L111;DREAM` right now only first attemp is adding into file2, how I can add each attemp. The best would be copy whole row and add new ID like `ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA 144862;L001;Scheduled;1524031200;;;1524033000; 144862;L011;Scheduled;1524031200;;;1524033000; 144862;L111;Scheduled;1524031200;;;1524033000;` I hope you get it – majan Apr 23 '20 at 12:01
  • @majan I understand. I'll add that in my answer. – RoadRunner Apr 23 '20 at 12:04
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/212343/discussion-between-roadrunner-and-majan). – RoadRunner Apr 23 '20 at 12:22
0

If your files aren't too big, you could do this with a simple ForEach-Object loop:

$csv1   = Import-Csv -Path 'D:\Test\File1.csv' -Delimiter ';'
$result = Import-Csv -Path 'D:\Test\File2.csv' -Delimiter ';' | 
          ForEach-Object {
              $name = $_.Name
              $item = $csv1 | Where-Object { $_.Description -eq $name } | Select-Object -First 1
              # update the Name property and output the item
              if ($item) { 
                $_.Name = $item.NO
                # if you output the row here, the result wil NOT contain rows that did not match
                # $_   
              }
              # if on the other hand, you would like to retain the items that didn't match unaltered,
              # then output the current row here
              $_
          }

# output on screen
$result | Format-Table -AutoSize

#output to new CSV file
$result | Export-Csv -Path 'D:\Test\File4.csv' -Delimiter ';' -NoTypeInformation

Result on screen:

ID     Name Status*                    Scheduled Start Date Actual Start Date Actual End Date Scheduled End Date SLA
--     ---- -------                    -------------------- ----------------- --------------- ------------------ ---
144862 L001 Scheduled                  1524031200                                             1524033000            
149137 L002 Implementation In Progress 1528588800                                             1548968400            
150564 L003 Scheduled                  1569456000                                             1569542400            
150564 L004 Scheduled                  1569456000                                             1569542400            
150564 L005
Theo
  • 57,719
  • 8
  • 24
  • 41
  • files are really big, but don't know but your code is not working – majan Apr 23 '20 at 11:04
  • @majan it works for me, but in your question you never explained it involves large files. What is "not working"? – Theo Apr 23 '20 at 11:42