
I have a file as below

cat file
a 1 
a 2
b 3

I want to delete the `a 1` row and the `a 2` row, since their first column is the same.

I tried `cat file | uniq -f 1` and I'm getting the desired output, but I want to delete these lines from the file itself.

Alberto Zaccagni
  • Possible duplicate of [How can I delete duplicate lines in a file in Unix?](http://stackoverflow.com/questions/1444406/how-can-i-delete-duplicate-lines-in-a-file-in-unix) – Claudio Nov 19 '15 at 10:01

1 Answer

 awk 'NR==FNR{a[$1]++;next}a[$1]==1{print}' file file

This one-liner does what you need, whether or not your file is sorted.

Some explanation:

This one-liner processes the file twice. In the first pass (`NR==FNR`) it records, in a hash table keyed by the first column, how many times each first-column value occurs. In the second pass it checks whether the first column's count in the hash table equals 1, and if so prints the line, because those lines are unique with respect to column 1.
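To make the two-pass logic and the in-place update concrete, here is a small sketch using the sample file from the question (the temp-file step is an assumption on my part, since awk alone does not edit a file in place):

```shell
# Recreate the sample file from the question.
printf 'a 1\na 2\nb 3\n' > file

# Pass 1 (NR==FNR): count occurrences of column 1 in a[].
# Pass 2: print only lines whose first column occurred exactly once.
awk 'NR==FNR{a[$1]++;next} a[$1]==1' file file
# prints: b 3

# To actually delete the lines from the file, write the result
# to a temp file and move it back over the original.
awk 'NR==FNR{a[$1]++;next} a[$1]==1' file file > file.tmp && mv file.tmp file
```

Note that `a[$1]==1` without an action block is equivalent to `a[$1]==1{print}`, since printing the record is awk's default action for a true pattern.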

Kent
  • Could you explain how does this work? It looks to me it stores the occurrences of each line and just prints it if no other "similar" line was found. – Alberto Zaccagni Nov 19 '15 at 10:40
  • @AlbertoZaccagni I posted something totally wrong just now, it was a copy/paste mistake. Commands look similar, so ... :( now fixed. thanx for the comment. – Kent Nov 19 '15 at 10:46
  • I didn't know it was wrong, I am just learning awk and I was going through it :D Could you please add some comments to explain it? – Alberto Zaccagni Nov 19 '15 at 11:05
  • @AlbertoZaccagni added in answer. – Kent Nov 19 '15 at 11:13