Delete duplicate rows but keep the first 2 instances

Question

I have a column which contains duplicate rows, and I would like to delete them but keep the first 2 instances.

Remove duplicate lines that are repeated more than 2 times.

Example input

i 10
i 10
a 12
a 12
b 12
b 12
c 14
c 14
x 14
x 14
y 14
y 14
a 14
a 14
n 13
n 13
m 13
m 13
x 13
x 13

Desired output:

i 10
i 10
a 12
a 12
c 14
c 14
n 13
n 13

I tried

awk '!a[$2]++' file

Appreciate your help

Possible duplicate of [awk - Remove line if field is duplicate](https://stackoverflow.com/questions/2604088/awk-remove-line-if-field-is-duplicate). You only alter it a little: `awk '{ if ( a[$2]++ <= 1 ) print; }' file` — PesaThe, Dec 01 '17 at 21:01

score 4 · Accepted Answer · answered Dec 01 '17 at 21:04

I think the problem with your command is that you are checking whether a line is the first one with that value, instead of checking whether it is one of the first two. Something like this should work:

awk 'a[$2]++<2' file
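
To make the difference concrete, here is a small sketch. The post-increment `a[$2]++` yields the count seen so far (0, 1, 2, ...) before adding one, so `!a[$2]++` is true only for the first line with a given second field, while `a[$2]++<2` is true for the first two. The function name `keep_first_n`, the array name `seen`, and the variable `n` are made up for illustration, and the sketch assumes whitespace-separated input with the key in the second column.

```shell
# Hypothetical helper: keep at most N lines per distinct value of column 2.
# Reads from standard input; N is passed as the first argument.
keep_first_n() {
    # seen[$2]++ returns the previous count for this key, then increments it.
    # A pattern with no action prints the line whenever the pattern is true.
    awk -v n="$1" 'seen[$2]++ < n'
}

# Demo on a shortened version of the question's input.
printf '%s\n' 'i 10' 'i 10' 'a 12' 'a 12' 'b 12' 'b 12' 'n 13' | keep_first_n 2
# Prints:
# i 10
# i 10
# a 12
# a 12
# n 13
```

Calling `keep_first_n 1` instead should behave like the original `awk '!a[$2]++'` attempt and keep only the first line per value.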

EdmCoff
