
I have two C source files with lots of defines, and I want to compare them and list the entries in one that have no match in the other. The output of grep NO_BCM_ include/soc/mcm/allenum.h | grep -v 56440 on the first file may look like:

...
...
# if !defined(NO_BCM_5675_A0)
# if !defined(NO_BCM_88660_A0)
# if !defined(NO_BCM_2801PM_A0)
...
...

whereas the output of grep "define NO_BCM" include/sdk_custom_config.h on the second file looks like:

...
...
#define NO_BCM_56260_B0
#define NO_BCM_5675_A0
#define NO_BCM_56160_A0
...
...

So now I want to find every type number inside the parentheses in the first listing that is missing from the #define lines in the second. How do I best go about this? Thank you

codeforester
stdcerr

3 Answers


Use comm this way:

comm -23 <(grep NO_BCM_ include/soc/mcm/allenum.h | cut -f2 -d'(' | cut -f1 -d')' | sort) <(grep "define NO_BCM" include/sdk_custom_config.h | cut -f2 -d' ' | sort)

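This prints the tokens that appear in include/soc/mcm/allenum.h but have no matching #define in include/sdk_custom_config.h. Both inputs are sorted first because comm expects sorted input.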

Output:

NO_BCM_2801PM_A0
NO_BCM_88660_A0
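
To try this without the SDK tree, here is a minimal sketch that recreates the three sample lines from each file in a scratch directory. The scratch file names (used, defined) and the variable names are stand-ins, not part of the SDK. It uses temporary files instead of <(...) so that it also runs under a plain POSIX shell, and it forces LC_ALL=C so that sort and comm agree on the ordering.

```shell
# Recreate the sample lines from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/allenum.h" <<'EOF'
# if !defined(NO_BCM_5675_A0)
# if !defined(NO_BCM_88660_A0)
# if !defined(NO_BCM_2801PM_A0)
EOF
cat > "$tmp/sdk_custom_config.h" <<'EOF'
#define NO_BCM_56260_B0
#define NO_BCM_5675_A0
#define NO_BCM_56160_A0
EOF

# Tokens referenced in allenum.h: the text between the parentheses.
grep NO_BCM_ "$tmp/allenum.h" | cut -f2 -d'(' | cut -f1 -d')' | LC_ALL=C sort > "$tmp/used"
# Tokens defined in sdk_custom_config.h: the second space-separated field.
grep "define NO_BCM" "$tmp/sdk_custom_config.h" | cut -f2 -d' ' | LC_ALL=C sort > "$tmp/defined"

# -23 hides columns 2 and 3, leaving tokens that appear only in "used".
missing=$(LC_ALL=C comm -23 "$tmp/used" "$tmp/defined")
printf '%s\n' "$missing"
rm -rf "$tmp"
```

With the sample lines above this prints the same two tokens as the one-liner.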

If you want the full lines from that file, pass those tokens to grep -F as fixed-string patterns (fgrep is the older name for grep -F):

grep -F -f <(comm -23 <(grep NO_BCM_ include/soc/mcm/allenum.h | cut -f2 -d'(' | cut -f1 -d')' | sort) <(grep "define NO_BCM" include/sdk_custom_config.h | cut -f2 -d' ' | sort)) include/soc/mcm/allenum.h

Output:

# if !defined(NO_BCM_88660_A0)
# if !defined(NO_BCM_2801PM_A0)

About comm:

NAME
       comm - compare two sorted files line by line

SYNOPSIS
       comm [OPTION]... FILE1 FILE2

DESCRIPTION
       Compare sorted files FILE1 and FILE2 line by line.

       With no options, produce three-column output.  Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

       -1     suppress column 1 (lines unique to FILE1)
       -2     suppress column 2 (lines unique to FILE2)
       -3     suppress column 3 (lines that appear in both files)
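
As a small illustration of the three columns, the following sketch compares two made-up word lists (file1 and file2 are hypothetical scratch files, not the headers from the question). Columns are separated by leading tabs, so column 2 is indented by one tab and column 3 by two.

```shell
tmp=$(mktemp -d)
printf 'apple\nbanana\n' > "$tmp/file1"
printf 'banana\ncherry\n' > "$tmp/file2"

# No options: all three columns, told apart by their leading tabs.
all=$(comm "$tmp/file1" "$tmp/file2")
# -23 suppresses columns 2 and 3, keeping lines found only in file1.
only1=$(comm -23 "$tmp/file1" "$tmp/file2")

printf '%s\n---\n%s\n' "$all" "$only1"
rm -rf "$tmp"
```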
codeforester

You could use awk with two process substitutions, one for each grep:

awk 'FNR==NR{seen[$2]; next}!($2 in seen)' FS=" " <(grep "define NO_BCM" include/sdk_custom_config.h) FS="[()]" <(grep NO_BCM_ include/soc/mcm/allenum.h | grep -v 56440)

Output:

# if !defined(NO_BCM_88660_A0)
# if !defined(NO_BCM_2801PM_A0)

Each command inside <() runs, and its output is presented to awk as a file. Setting FS immediately before each input makes awk split that input with the right delimiter, so the token ends up in $2 in both cases.

FS=" " keeps the default whitespace splitting for the first group (the #define lines), and FS="[()]" splits the second group on parentheses, so that $2 is the token between them.

The core logic finds the tokens that do not repeat. FNR==NR is true only while the first group is being read, so that block stores each $2 as a key in the seen array (a hash map) and moves on to the next line. Once the first group is exhausted, !($2 in seen) runs on the second group and prints only the lines whose $2 is not a key in that array.
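
The same logic can be tried on the sample lines from the question. This is only a sketch: defs and uses are hypothetical scratch files standing in for the two grep outputs, and regular files replace <(...) so that it runs under a plain POSIX shell.

```shell
tmp=$(mktemp -d)
cat > "$tmp/defs" <<'EOF'
#define NO_BCM_56260_B0
#define NO_BCM_5675_A0
#define NO_BCM_56160_A0
EOF
cat > "$tmp/uses" <<'EOF'
# if !defined(NO_BCM_5675_A0)
# if !defined(NO_BCM_88660_A0)
# if !defined(NO_BCM_2801PM_A0)
EOF

# An FS=... operand takes effect for the file that follows it.
result=$(awk '
    FNR == NR { seen[$2]; next }   # first file: remember each defined token
    !($2 in seen)                  # second file: print lines with an unseen token
' FS=" " "$tmp/defs" FS="[()]" "$tmp/uses")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Lines are printed in the order they appear in the second input, so no sorting is needed here.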

Inian

It's hard to say without the surrounding context of your sample input files and without the expected output, but it sounds like this is all you need:

awk '!/define.*NO_BCM_/{next} NR==FNR{defined[$2];next} !($2 in defined)' include/sdk_custom_config.h FS='[()]' include/soc/mcm/allenum.h
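
To see the leading regular-expression filter doing the work of the two grep calls, here is a sketch that runs the same program on scratch copies of the headers. The non-NO_BCM lines (SOME_OTHER_FLAG, the #include and the # endif) are invented padding, added only to show that they are skipped.

```shell
tmp=$(mktemp -d)
cat > "$tmp/sdk_custom_config.h" <<'EOF'
#define SOME_OTHER_FLAG 1
#define NO_BCM_56260_B0
#define NO_BCM_5675_A0
#define NO_BCM_56160_A0
EOF
cat > "$tmp/allenum.h" <<'EOF'
#include "something.h"
# if !defined(NO_BCM_5675_A0)
# if !defined(NO_BCM_88660_A0)
# endif
# if !defined(NO_BCM_2801PM_A0)
EOF

result=$(awk '
    !/define.*NO_BCM_/ { next }           # ignore lines that mention no NO_BCM_ define
    NR == FNR { defined[$2]; next }       # first file: collect the defined tokens
    !($2 in defined)                      # second file: print lines with no matching define
' "$tmp/sdk_custom_config.h" FS='[()]' "$tmp/allenum.h")

printf '%s\n' "$result"
rm -rf "$tmp"
```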
Ed Morton
  • +1 for not depending on grep. FS='[()]' means that any single character listed inside [] can act as a field separator, right? – Vicky Jan 26 '17 at 05:25
  • What I meant was: if a file contains : (colon), , (comma) and | (pipe) characters and I want all of them to be treated as field separators, can I specify FS as FS=[:,|]? – Vicky Jan 26 '17 at 15:52
  • @user3369871 correct, a bracket expression can contain character lists as you described and/or character classes and/or character ranges and they match against any single character described by that bracket expression. – Ed Morton Jan 26 '17 at 15:57
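
A quick way to check the bracket-expression behaviour from the comments is to split one made-up record on colon, comma and pipe. The sample string is invented for illustration; note that the bracket expression is quoted so the shell passes it to awk untouched.

```shell
# Any one of ':', ',' or '|' ends a field, so the record has four fields.
fields=$(printf 'a:b,c|d\n' | awk -F'[:,|]' '{ print NF, $3 }')
printf '%s\n' "$fields"
```

This prints the field count followed by the third field.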