14

Looking for an awk (or sed) one-liner to remove lines from the output if the first field is a duplicate.

An example for removing duplicate lines I've seen is:

awk 'a !~ $0; {a=$0}'

I tried using it as a basis, with no luck (I thought changing the $0's to $1's would do the trick, but it didn't seem to work).

codeforester
Kyle
  • You asked to remove lines 'if the first field matches' ... what? I've assumed 'the same value as the first field in some previous input line'; another person assumed 'some particular pattern'. What did you intend? – Jonathan Leffler Apr 08 '10 at 23:24
  • Your changed version `awk 'a !~ $1; {a=$1}'` *works for me* for adjacent duplicates (e.g. a sorted file). **Jonathan Leffler's** version has the advantage that it will work to remove duplicates on an unsorted file, but at the expense of creating a potentially large array. – Dennis Williamson Apr 08 '10 at 23:43
  • I think my main problem was that I was dealing with a few different types of field separators and wasn't defining FS properly. – Kyle Apr 09 '10 at 15:36
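
For the separator problem described in the last comment, here is a minimal sketch. awk's `-F` option accepts a regular expression, so several single-character separators can be handled in one pass. The choice of comma and colon, the array name `seen`, and the sample lines are assumptions for illustration, not taken from the thread.

```shell
# Assumed separators: comma or colon. Adjust the bracket expression to your data.
# The first field is whatever precedes the first separator on each line.
printf 'x,1\ny:2\nx:3\n' | awk -F'[,:]' '!seen[$1]++'
# prints:
# x,1
# y:2
```

If FS is left at its default, fields are split on whitespace only, so `x,1` and `x:3` would count as two different first fields and neither would be dropped.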

4 Answers

26
awk '{ if (a[$1]++ == 0) print $0; }' "$@"

This is a standard (very simple) use for associative arrays.
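
As a rough sketch of why this works: the post-increment on the array element evaluates to the number of earlier lines that had the same first field (0 the first time a key is seen) and then increments it, so only the first line for each key is printed. The wrapper name `dedupe_first_field`, the array name `seen`, and the sample data are hypothetical and used only for illustration.

```shell
# Hypothetical wrapper; the name and the sample input are for illustration only.
dedupe_first_field() {
    # seen[$1]++ yields the count of earlier lines with this first field,
    # so the line is printed only while that count is still 0.
    awk '{ if (seen[$1]++ == 0) print $0 }' "$@"
}

# Unsorted input: the duplicate keys are not adjacent.
printf 'apple 1\nbanana 2\napple 3\ncherry 4\nbanana 5\n' | dedupe_first_field
# prints:
# apple 1
# banana 2
# cherry 4
```

Because every distinct first field is kept in the array, memory use grows with the number of distinct keys, which is the trade-off noted in the comments on the question.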

Jonathan Leffler
  • That worked! I also had another bug that I hadn't noticed, which may have been causing problems as well. Thanks! – Kyle Apr 08 '10 at 23:25
11

This is how to remove lines whose first field duplicates that of an earlier line:

awk '!_[$1]++' file
ghostdog74
1

If you're open to using Perl:

perl -ane 'print if ! $a{$F[0]}++' file

-a autosplits each line into the @F array, which is indexed starting at 0.
The %a hash remembers whether the first field has already been seen.

This related solution assumes your field separator is a comma rather than whitespace:

perl -F, -ane 'print if ! $a{$F[0]}++' file
Chris Koknat
0

This prints every line whose first field is unique, plus the first line of each group of duplicates:

awk '!a[$1]++' file_name
Ravi Saroch