List the unique lines based on the ":" delimiter

Question

I am trying to write a script that finds the unique lines (first occurrence only) based on columns/delimiters. In this case, as I understand it, the delimiter is ":".

for example:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log  
May 14 00:00:03  SERVER1 ntp[1006]:  ntpd[Info]: 1430748799.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.780852: ndtpq.c(20544): this is the log  
May 14 00:00:04  SERVER1 ntp[1006]:  ntpd[Info]: 1430748800.790852: ndtpq.c(20544): this is the log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

desired output:

May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log  
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log  
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

I am able to find the unique log messages using the following command, but I lose the timestamp this way:

cat fileName |awk -F: '{print $7}'

What is the criteria for determining entries that should be grouped? Is it the content of the message? If so, what if the same message is generated later on? — Tom Fenech, May 14 '15 at 08:52
It's the content of the message, and it can be ignored if seen later. Only the first occurrence is needed. — , May 14 '15 at 08:59

score 2 · Accepted Answer · answered May 14 '15 at 08:52

This may do:

awk -F: '!seen[$NF]++' file
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log

It splits each line on :, looks at the last field ($NF), and prints a line only the first time that field's value is seen.
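For readers who find the one-liner terse, here is a longer sketch of the same idea. The sample lines are taken from the question; the temporary file and the names sample, result and key are placeholders chosen for this illustration.

```shell
# Write four of the question's log lines to a temporary file (placeholder path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
EOF

result=$(awk -F: '
{
    key = $NF                  # last ":"-separated field, i.e. the message text
    if (!(key in seen))        # true only the first time this message appears
        print                  # print the whole line, timestamp included
    seen[key] = 1              # remember the message for later lines
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

The third sample line is dropped because its message already appeared on the second line, even though its timestamp differs.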

Jotne


I got this error: seen[: Event not found. – May 14 '15 at 09:04
@user2809888 : You are invoking the shell's history substitution. Surround the exclamation point with single quotes – Akshay Hegde May 14 '15 at 09:07
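If the "Event not found" error does come from the interactive shell's history substitution, one way to sidestep it is to write the same condition without an exclamation mark at all. A minimal sketch; the three-line sample file and the name no_bang are made up for this illustration.

```shell
# Tiny placeholder input: the first and third lines share the same last field.
sample=$(mktemp)
printf '%s\n' 'a: one' 'b: two' 'c: one' > "$sample"

# seen[$NF]++ == 0 is true only the first time a given last field appears,
# so it selects the same lines as the negated form but contains no "!".
no_bang=$(awk -F: 'seen[$NF]++ == 0' "$sample")

printf '%s\n' "$no_bang"
rm -f "$sample"
```

Another option along the same lines is to save the awk program in a file and run it with awk -F: -f, so the interactive shell never sees the exclamation mark.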

score 1 · Answer 2 · answered May 14 '15 at 08:54

Try this

Awk

 awk -F: '!x[$NF]++' infile

GNU sort, if the output order doesn't matter

 sort -u -t: -k7 infile
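A short sketch of what the sort variant does, using lines from the question. -t: sets the field separator and -k7 makes the key run from field 7 to the end of the line, so -u keeps one line per distinct message. The output comes back ordered by message text rather than by time, and which of several duplicate lines survives is left to the sort implementation. LC_ALL=C, the temporary file, and the name sorted are additions for this illustration.

```shell
# Write four of the question's log lines to a temporary file (placeholder path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780852: ndtpq.c(20544): this is the log
May 14 00:00:01  SERVER1 ntp[1006]:  ntpd[Info]: 1430748797.780853: ndtpq.c(20544): this is another log
May 14 00:00:02  SERVER1 ntp[1006]:  ntpd[Info]: 1430748798.780852: ndtpq.c(20544): this is another log
May 14 00:00:05  SERVER1 ntp[1006]:  ntpd[Info]: 1430748801.790852: ndtpq.c(20544): thisis really different log
EOF

# -t:  use ":" as the field separator
# -k7  compare from field 7 through the end of the line (the message)
# -u   output only one line for each distinct key
sorted=$(LC_ALL=C sort -u -t: -k7 "$sample")

printf '%s\n' "$sorted"
rm -f "$sample"
```

If the original chronological order matters, the awk form above is the safer choice, since it prints lines in the order they are read.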

Akshay Hegde


Your `awk` is just the same as the one I already posted. – Jotne May 14 '15 at 09:02
x[: Event not found. – May 14 '15 at 09:03
@user2809888 : You are invoking the shell's history substitution. Surround the exclamation point with single quotes – Akshay Hegde May 14 '15 at 09:05
Thanks, that worked for me. I found sort sweeter and simpler :) – May 14 '15 at 09:27
