
I have two text files that contain many duplicate lines. I would like to run a PowerShell statement that outputs a new file containing only the lines from the second file that are NOT already in the first file. Below is an example of the two files.

File1.txt
-----------
Alpha
Bravo
Charlie


File2.txt
-----------
Alpha
Echo
Foxtrot

In this case, only Echo and Foxtrot are not in the first file. So these would be the desired results.

OutputFile.txt
------------
Echo
Foxtrot

I reviewed the below link which is similar to what I want, but this does not write the results to an output file.

Remove lines from file1 that exist in file2 in Powershell

JadonR
  • This script simply creates an outputfile.txt that is an exact copy of file2.txt. Do you get different results? – JadonR Dec 30 '19 at 00:00

2 Answers


Here's one way to do it:

# Get unique values from first file
$uniqueFile1 = (Get-Content -Path .\File1.txt) | Sort-Object -Unique

# Get lines in second file that aren't in first and save to a file
Get-Content -Path .\File2.txt | Where-Object { $uniqueFile1 -notcontains $_ } | Out-File .\OutputFile.txt
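
If File1.txt is large, note that -notcontains scans the whole array once for every line of File2.txt. Below is a minimal sketch of a variant that looks lines up in a .NET HashSet instead. It assumes PowerShell 5 or later (for the ::new syntax) and that case-insensitive matching is wanted, which roughly mirrors the default behavior of -notcontains. The function name Get-NewLine and its parameter names are hypothetical, chosen only for this illustration.

```powershell
# Hypothetical helper: returns the lines of $DifferencePath that do not occur in $ReferencePath.
function Get-NewLine {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    # @() keeps an empty reference file from producing $null instead of an empty array.
    $lines = [string[]]@(Get-Content -Path $ReferencePath)

    # Case-insensitive set of reference lines (an assumption; use Ordinal for exact matching).
    $seen = [System.Collections.Generic.HashSet[string]]::new($lines, [System.StringComparer]::OrdinalIgnoreCase)

    # Emit only the lines that are not in the set.
    Get-Content -Path $DifferencePath | Where-Object { -not $seen.Contains($_) }
}
```

With the file names from the question, it could be called as: Get-NewLine -ReferencePath .\File1.txt -DifferencePath .\File2.txt | Out-File .\OutputFile.txt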
Glenn
  • Nice! That creates the results I'm looking for. I added " | Out-File .\OutputFile.txt " to the end of your script so that it would create an output file as I need. Thanks! – JadonR Dec 30 '19 at 13:11
  • Thanks, added that to the answer. – Glenn Dec 30 '19 at 15:13

The approach in the referenced link will work; however, for every line in the original file, it triggers a fresh read of the second file from disk. This could be painfully slow depending on the size of your files. I think the following approach would meet your needs.

$file1 = Get-Content .\File1.txt
$file2 = Get-Content .\File2.txt

$compareParams = @{
    ReferenceObject = $file1
    DifferenceObject = $file2
}

Compare-Object @compareParams | 
    Where-Object -Property SideIndicator -eq '=>' |
    Select-Object -ExpandProperty InputObject |
    Out-File -FilePath .\OutputFile.txt

This code does the following:

  1. Reads each file into a separate variable
  2. Creates a hashtable for the parameters of Compare-Object (see about_Splatting for more information)
  3. Compares the two files in memory, keeps only the lines found solely in File2.txt (SideIndicator '=>'), and extracts their text
  4. Writes the contents of the pipeline to "OutputFile.txt"
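
To see how the SideIndicator filter picks the right side, here is a small sketch that runs the same comparison on the sample data from the question, held in memory rather than read from disk. The variable names $diff, $onlyInFile2 and $onlyInFile1 are just for illustration.

```powershell
# Sample data from the question, held in memory instead of read from disk.
$file1 = 'Alpha', 'Bravo', 'Charlie'
$file2 = 'Alpha', 'Echo', 'Foxtrot'

# Without -IncludeEqual, only the differing lines are emitted, so 'Alpha' is left out.
$diff = Compare-Object -ReferenceObject $file1 -DifferenceObject $file2

# '=>' marks lines found only in the difference object ($file2).
$onlyInFile2 = $diff | Where-Object -Property SideIndicator -eq '=>' | Select-Object -ExpandProperty InputObject

# '<=' marks lines found only in the reference object ($file1).
$onlyInFile1 = $diff | Where-Object -Property SideIndicator -eq '<=' | Select-Object -ExpandProperty InputObject
```

Here $onlyInFile2 holds Echo and Foxtrot, and $onlyInFile1 holds Bravo and Charlie. Filtering on '=>' is what limits the output to lines that are new in the second file.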

If you are comfortable with the overall flow of this, and are only using this in one-off situations, the whole thing can be compressed into a one-liner.

(Compare-Object (gc .\File1.txt) (gc .\File2.txt) | ? SideIndicator -eq '=>').InputObject | Out-File .\OutputFile.txt
nabrond
  • Thanks for the quick reply. I tried using your script, however it returned the opposite of what I was really looking for. So using my example above, it created an OutputFile.txt with only the line "Alpha". So it seems to be grabbing everything that is the same, rather than only the unique values from the second text file. – JadonR Dec 29 '19 at 22:21
  • Apologies, I misread your intent as gathering what was the same, not just the differences in the second file. I updated my code to better suit your use case. – nabrond Dec 31 '19 at 00:21