1

First, what is the best and fastest Unix command to get only the differences between two files? I tried using diff to do it (below).

I tried the answer given by Neilvert Noval in "Compare two files line by line and generate the difference in another file".

code -

diff -a --suppress-common-lines -y file1.txt file2.txt >> file3.txt

But I get a lot of spaces and a > symbol before each differing line. How do I fix that? I was thinking of removing the leading spaces and the first '>', but I'm not sure that is a neat fix.

My file1.txt has -

Hello World!
Its such a nice day!
#this is a newline and not a line of text# 

My file2.txt has -

Hello World!
Its such a nice day!
Glad to be here!
#this is a newline and not a line of text# 

Output - " #Many spaces here# > Glad to be here:)"

Expected output - Glad to be here:)

Steam
  • Did you see the `comm` command in the second answer? – squiguy Aug 05 '13 at 19:09
  • @squiguy - yes, but that sorts text. I don't want to sort the differences in text. I want it as is. – Steam Aug 05 '13 at 19:11
  • `diff -u` is the universal way to show differences in text files, familiar to most developers and widely supported by tools. Do you just want lines that are in the second file that don't exist in the first? What about lines that are duplicated? – that other guy Aug 05 '13 at 19:21
  • If the output from `diff` is not acceptable, you must show what output you expect in all cases: 1) line present in file1 but missing in file2; 2) line missing in file1 but present in file2; 3) line present in both files but differing in one or more characters – Jim Garrison Aug 05 '13 at 19:42

3 Answers

4

Another way to get the differences is to use awk. This prints the lines of file2 that do not appear anywhere in file1:

awk 'FNR==NR{a[$0];next}!($0 in a)' file1 file2

Though I must admit that I haven't run any benchmarks and can't say which is the fastest solution.
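As a rough illustration of how the one-liner works, here is the same logic spread over several commented lines. This is only a sketch: the scratch directory, the `seen` array name, and the `only_in_file2` variable are made up for the example, and the sample files mirror the ones in the question. One caveat worth knowing: if the first file is empty, `FNR == NR` also holds while reading the second file, so nothing is printed.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory
# (hypothetical paths, used only for this illustration).
workdir=$(mktemp -d)
printf 'Hello World!\nIts such a nice day!\n' > "$workdir/file1.txt"
printf 'Hello World!\nIts such a nice day!\nGlad to be here!\n' > "$workdir/file2.txt"

# NR counts records across all input files; FNR restarts at 1 for each file.
# FNR == NR is therefore true only while awk reads the first file.
only_in_file2=$(awk '
    FNR == NR {        # first file: remember each line as an array key
        seen[$0]
        next           # skip the rule below for first-file lines
    }
    !($0 in seen)      # second file: print lines never seen in the first
' "$workdir/file1.txt" "$workdir/file2.txt")

printf '%s\n' "$only_in_file2"
rm -rf "$workdir"
```

The lines are printed in the order they occur in the second file, so no sorting is involved.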

anubhava
  • You can search the internet for benchmarks of finding differences between two files. – anubhava Aug 05 '13 at 20:26
  • By the way, can you explain the meaning of FNR, NR, etc.? I am seeing this awk for the first time. – Steam Aug 05 '13 at 20:33
  • Is there any way I could make this script print a newline before showing the differences? – Steam Aug 05 '13 at 20:58
  • See this answer for info on awk (http://stackoverflow.com/questions/12172682/awk-in-shell-script-how-to-compare-and-merge-two-files-based-on-a-shared-key). Yes, it can print a newline. Use this command: `awk 'BEGIN{print ""} FNR==NR{a[$0];next}!($0 in a)' file1 file2` – anubhava Aug 05 '13 at 21:02
  • By the way, how can I break this long line of code into multiple lines in my script file? I tried ; and got many errors. Thanks. – Steam Aug 05 '13 at 21:26
  • You can put a newline inside the awk command, so insert one after `'BEGIN{print ""}` – anubhava Aug 05 '13 at 21:59
1

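The -y option to diff makes it produce a "side by side" diff, which is why you have the spaces. Try -U 0 for the unified format with zero lines of context (note the capital U: lowercase -u does not take a count). Apart from the two file header lines and the @@ hunk marker, that should print: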

+Glad to be here:)

The plus means the line was added, whereas a minus means it was removed.
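If only the added lines themselves are wanted, without the headers or the leading plus, the unified output can be post-processed. The following is a sketch under a few assumptions: a `diff` that supports `-U`, sample files mirroring the question, and the made-up names `workdir` and `added_lines`. It skips the two file header lines with `tail` and then keeps only lines that begin with a plus.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory
# (hypothetical paths, used only for this illustration).
workdir=$(mktemp -d)
printf 'Hello World!\nIts such a nice day!\n' > "$workdir/file1.txt"
printf 'Hello World!\nIts such a nice day!\nGlad to be here!\n' > "$workdir/file2.txt"

# diff exits with status 1 when the files differ; accept that as success
# and let only real errors (status 2 or higher) count as failure.
added_lines=$(
    { diff -U 0 "$workdir/file1.txt" "$workdir/file2.txt" || [ "$?" -eq 1 ]; } \
        | tail -n +3 \
        | sed -n 's/^+//p'
)
# tail -n +3 drops the "---" and "+++" header lines;
# sed prints only lines starting with "+", with that "+" removed.

printf '%s\n' "$added_lines"
rm -rf "$workdir"
```

Lines removed from the first file (those starting with a minus) are dropped by this filter; changing the `sed` pattern to `s/^-//p` would list those instead.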

Spencer Rathbun
-1
diff -a --suppress-common-lines -y file1.txt file2.txt | sed 's/^[[:space:]]*>[[:space:]]*//' >> file3.txt
AstroCB