
How can I find out whether my file contains duplicated lines?

Many of my vi files contain a large number of molecular coordinates. Sometimes the software I use writes a duplicate set of coordinates on top of the first one. This goes unnoticed until I start using the molecule in simulations, and only then do I find out that the file had repeated coordinates.

With plain grep, I would need to test every line individually and see whether its pattern is found again.

Is there a better approach?

Example:

C          8.72073       15.19207       10.44503
C          9.57223       14.02835       10.59743
C         10.54225       13.88199        9.86998

This block is repeated in the file.

jaypal singh
quarktosh

1 Answer


Use sort and uniq to count how often each line occurs, then sed to keep only the lines that occur more than once:

Example:

echo -e 'a\nb\nc\na\nb'
a
b
c
a
b

echo -e 'a\nb\nc\na\nb' | sort | uniq -c
      2 a
      2 b
      1 c

echo -e 'a\nb\nc\na\nb' | sort | uniq -c | sed -re '/^\s+1\s+/d; s/^\s+[0-9]+\s+//g'
a
b
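
As a possible shortcut, uniq -d should give the same list without the sed step, since it prints one copy of each line that occurs more than once in sorted input. If you also want to know where the repeats are, a small awk filter can report line numbers. The sketch below assumes duplicates are exact, character-for-character copies (differences in spacing would not match), and show_repeats is only a hypothetical helper name, not something from the answer above.

```shell
# Shorter variant of the pipeline above: uniq -d prints one copy of each
# line that occurs more than once in sorted input.
printf 'a\nb\nc\na\nb\n' | sort | uniq -d

# show_repeats is a hypothetical helper name used for illustration.
# It prints every line that has already appeared earlier in the input,
# prefixed by its line number. Blank lines are skipped (NF is 0 for them).
show_repeats() {
    awk 'NF && seen[$0]++ { print NR ": " $0 }' "$@"
}

# Reads standard input when no file name is given.
printf 'a\nb\nc\na\nb\n' | show_repeats
```

For the sample input, the first command prints a and b, and the second prints 4: a and 5: b. On a real file you would call it as show_repeats followed by the file name.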
Tiago Lopo