Need to diff two text files in Linux with some patterns in file lines

Question

File A contains

Test-1.2-3
Test1-2.2-3
Test2-4.2-3

File B contains

Test1

Expected output should be

Test-1.2-3
Test2-4.2-3

diff A B doesn't produce this output.
Please let me know if there is a way to do this.

Oops, you forgot to post your code! StackOverflow is about helping people fix their code. It's not a free coding service. Any code is better than no code at all. Meta-code, even, will demonstrate how you're thinking a program should work, even if you don't know how to write it. Show us your work so far in an [MCVE](http://stackoverflow.com/help/mcve), the result you were expecting and the results you got, and we'll help you figure it out. — ghoti, Apr 12 '16 at 12:22
I've been trying for **years** to get it into the head of my boss that *one* special example case is a really poor way to specify a *pattern*.... — DevSolar, Apr 12 '16 at 12:25

Julien Lopez · Answer 1 · 2016-04-12T12:51:37.830

Using grep:

grep -vf B A

  -f FILE, --file=FILE
          Obtain patterns  from  FILE,  one  per  line.   The  empty  file
          contains zero patterns, and therefore matches nothing.

  -v, --invert-match
          Invert the sense of matching, to select non-matching lines.
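
As a quick check, the command can be tried on the two files from the question. This is only a sketch: the scratch directory and the variable names are made up for the illustration.

```shell
# Recreate the two files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'Test-1.2-3' 'Test1-2.2-3' 'Test2-4.2-3' > "$workdir/A"
printf '%s\n' 'Test1' > "$workdir/B"

# Print every line of A that matches none of the patterns listed in B.
result=$(grep -vf "$workdir/B" "$workdir/A")
printf '%s\n' "$result"

rm -r "$workdir"
```

This should print Test-1.2-3 and Test2-4.2-3, the output asked for in the question. Note that an empty line in B would act as an empty pattern that matches every line, so B should not contain blank lines.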

Edit:

Optionally, you may want to use the -w option for a more precise match on whole "words" only, which seems to be what you need, judging from your example, where the match is followed by '-'. As DevSolar points out, you may also want to use the -F option to prevent the patterns in file B from being interpreted as regular expressions.

grep -vFwf B A

  -w, --word-regexp
          Select only those  lines  containing  matches  that  form  whole
          words.   The  test is that the matching substring must either be
          at the  beginning  of  the  line,  or  preceded  by  a  non-word
          constituent  character.  Similarly, it must be either at the end
          of the line or followed by  a  non-word  constituent  character.
          Word-constituent   characters   are  letters,  digits,  and  the
          underscore.
  -F, --fixed-strings
          Interpret PATTERN as a list of fixed strings (rather than regular
          expressions), separated by newlines, any of which is to be matched.
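
To see what the two extra options change, here is a small sketch. The sample lines Test10-2.2-3 and Test2-4x2-3 and the pattern 4.2 are invented for the illustration and are not part of the question.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Test1-2.2-3' 'Test10-2.2-3' 'Test2-4x2-3' > "$workdir/A"

# -w: the pattern "Test1" no longer matches inside "Test10".
printf '%s\n' 'Test1' > "$workdir/B"
plain=$(grep -vf "$workdir/B" "$workdir/A")
words=$(grep -vwf "$workdir/B" "$workdir/A")

# -F: the pattern "4.2" is taken literally, so "." no longer matches "x".
printf '%s\n' '4.2' > "$workdir/B"
regex=$(grep -vf "$workdir/B" "$workdir/A")
fixed=$(grep -vFf "$workdir/B" "$workdir/A")

printf 'without -w:\n%s\n\nwith -w:\n%s\n\n' "$plain" "$words"
printf 'without -F:\n%s\n\nwith -F:\n%s\n' "$regex" "$fixed"

rm -r "$workdir"
```

Without -w, Test10-2.2-3 is filtered out along with Test1-2.2-3; with -w it is kept. Without -F, the line Test2-4x2-3 is filtered out because the dot matches any character; with -F all three lines are kept.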

Potentially adding `-F`, to use lines from file B as fixed strings instead of interpreting them as patterns. — DevSolar, Apr 12 '16 at 12:31

score 1 · Answer 2 · edited May 23 '17 at 11:45

To complement Julien Lopez's helpful answer:

If you want to ensure that lines from File B only match at the beginning of lines from File A, you can prepend ^ to each line from file B, using sed:

grep -vf <(sed 's/^/^/' fileB) fileA

grep, which by default interprets its search strings as BREs (basic regular expressions), then interprets the ^ as the beginning-of-line anchor.
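
The effect of the anchor can be sketched as follows. The line MyTest1-2.2-3 and the file name anchored are hypothetical, and the sed output is written to a temporary file rather than passed through <(...), so that the sketch also runs in a plain POSIX sh.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Test1-2.2-3' 'MyTest1-2.2-3' 'Test2-4.2-3' > "$workdir/fileA"
printf '%s\n' 'Test1' > "$workdir/fileB"

# Prefix every pattern with ^ and store the result in a pattern file.
sed 's/^/^/' "$workdir/fileB" > "$workdir/anchored"

# Unanchored: "Test1" matches anywhere in the line.
unanchored=$(grep -vf "$workdir/fileB" "$workdir/fileA")
# Anchored: "^Test1" matches only at the start of the line.
anchored=$(grep -vf "$workdir/anchored" "$workdir/fileA")

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"

rm -r "$workdir"
```

With the unanchored patterns only Test2-4.2-3 survives; with the anchored patterns MyTest1-2.2-3 is kept as well, because Test1 does not occur at the start of that line.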

If the lines in File B may contain characters that are regex metacharacters (such as ^, *, ?, ...) but should be treated as literals, you must escape them first:

grep -vf <(sed 's/[^^]/[&]/g; s/\^/\\^/g; s/^/^/' fileB) fileA

An explanation of this grim-looking, but generically robust, sed command can be found in this answer of mine.
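
To make the escaping step more concrete, here is a sketch with an invented pattern a.b* and invented input lines. Again the patterns go through temporary files (raw and patterns are made-up names) instead of a process substitution.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'a.b*c' 'axbbc' 'zzz' > "$workdir/fileA"
printf '%s\n' 'a.b*' > "$workdir/fileB"

# Wrap every character except ^ in a bracket expression, backslash-escape
# any literal ^, then prepend the ^ anchor.
escaped=$(sed 's/[^^]/[&]/g; s/\^/\\^/g; s/^/^/' "$workdir/fileB")
printf '%s\n' "$escaped" > "$workdir/patterns"
printf 'escaped pattern: %s\n' "$escaped"

# For comparison: anchor only, so . and * keep their regex meaning.
sed 's/^/^/' "$workdir/fileB" > "$workdir/raw"
raw_result=$(grep -vf "$workdir/raw" "$workdir/fileA")
escaped_result=$(grep -vf "$workdir/patterns" "$workdir/fileA")

printf 'anchor only:\n%s\n\nescaped:\n%s\n' "$raw_result" "$escaped_result"

rm -r "$workdir"
```

The escaped pattern comes out as ^[a][.][b][*]. With the anchor alone, the regex ^a.b* also matches axbbc, so only zzz is printed; with the escaped pattern, only the line that literally starts with a.b* is filtered out, and axbbc and zzz are printed.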

Note:

Assumes bash, ksh, or zsh due to the use of <(...), a process substitution, which makes the output from sed act as if it were provided via a file.
The sed command s/^/^/ looks like it won't do anything, but the first ^, in the regex part of the call, is the beginning-of-line anchor [1], whereas the second ^, in the substitution part of the call, is a literal character to place at the beginning of the line (which grep will later interpret as the beginning-of-line anchor).

[1] Strictly speaking, to sed it is the beginning-of-pattern-space anchor, because sed can read multiple lines at once, in which case ^ refers to the beginning of the pattern space (input buffer) as a whole, not to individual lines.

A nice and compact way to modify the input for `grep`, neat! I always found it odd that there are so few options in `grep` for matching control; it would be nice to have more than `-w` and `-x` to use with `-f`. — Julien Lopez, Apr 13 '16 at 08:27
