How to add the number of identical lines next to the line itself?

Question

I have a file, file.txt, which looks like this:

a
b
b
c
c
c

I want to know the command that takes file.txt as input and produces this output:

a 1
b 2
c 3

What is the logic here? To count them? What if they are not ordered? What are your attempts? – fedorqui, Jul 02 '15 at 10:59

score 4 · Accepted Answer · answered Jul 02 '15 at 11:01

I think uniq is the command you are looking for. The output of uniq -c is a little different from your format, but this can be fixed easily.

$ uniq -c file.txt
      1 a
      2 b
      3 c
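
One way to do that fix, as a rough sketch: let awk remember the count, strip the count prefix from the line, and print the line followed by the count. The function name count_last and the temporary sample file are made up for illustration, and this assumes identical lines are already adjacent in the file.

```shell
# Hypothetical helper: print each group of adjacent identical lines once,
# followed by how many times it occurred.
count_last() {
    uniq -c "$1" | awk '{
        n = $1                    # the count that uniq -c put in front
        sub(/^ *[0-9]+ /, "")     # drop the padding, the count and one space
        print $0, n               # whole line first, count last
    }'
}

# Sample input matching the file.txt from the question
sample=$(mktemp)
printf '%s\n' a b b c c c > "$sample"
count_last "$sample"
rm -f "$sample"
```

Stripping the prefix instead of printing $2 keeps lines that contain spaces intact.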

kirelagin

score 2 · Answer 2 · answered Jul 02 '15 at 11:04

If you want to count the occurrences you can use uniq with -c.

If the file is not sorted, you have to run sort first, because uniq only merges identical lines that are adjacent:

$ sort file.txt | uniq -c
1 a
2 b
3 c
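
To see why the sort matters, here is a small illustration with a made-up input in which the two copies of b are not next to each other. The file contents and the variable name are assumptions for the example only.

```shell
# Sample file where identical lines are not adjacent (made up for illustration)
unsorted=$(mktemp)
printf '%s\n' b a b c > "$unsorted"

# Without sort, uniq only merges neighbouring duplicates,
# so "b" is reported twice, each time with a count of 1.
uniq -c "$unsorted"

# With sort, every copy of a line lands in one group,
# so "b" is reported once with a count of 2.
sort "$unsorted" | uniq -c

rm -f "$unsorted"
```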

If you really need the line first followed by the count, swap the columns with awk (this assumes each line is a single word, since $2 is only the first field after the count):

$ sort file.txt | uniq -c | awk '{ print $2 " " $1}'
a 1
b 2
c 3

score 0 · Answer 3 · anubhava · 2015-07-02T11:14:51.040

You can use this awk:

awk '!seen[$0]++{ print $0, (++c) }' file
a 1
b 2
c 3

seen is an array keyed by the line content. The pattern !seen[$0]++ is true only the first time a line appears, because its entry is still 0 at that point and is incremented afterwards. In the action we print the record followed by an incrementing counter.
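
A quick way to check what this command computes is to feed it a line that repeats without being adjacent. The input below is made up for illustration.

```shell
# Hypothetical input: "b" occurs twice, but not on neighbouring lines
printf '%s\n' b a b c |
awk '!seen[$0]++ { print $0, (++c) }'
# b 1
# a 2
# c 3
```

The second column is the running index of each distinct line, not the number of times it occurs: b appears twice but is printed with 1.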

Update: Based on the comments below, if the intent is to get a repeat count in the 2nd column, then use this awk command:

awk '{ seen[$0]++ } END { for (i in seen) print i, seen[i] }' file
a 1
b 2
c 3
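
Note that awk does not specify the order in which for (i in seen) visits the keys, so the lines may come out in a different order than in the file. If order matters, a possible variation is to record each line the first time it is seen. This is only a sketch; the function name count_in_order and the arrays cnt and order are invented for the example.

```shell
# Hypothetical helper: count every line and report the counts
# in order of first appearance, without needing sorted input.
count_in_order() {
    awk '
        !($0 in cnt) { order[++n] = $0 }   # remember each new line once
        { cnt[$0]++ }                      # count every occurrence
        END {
            for (i = 1; i <= n; i++)
                print order[i], cnt[order[i]]
        }
    ' "$1"
}

# Sample input matching the file.txt from the question
sample=$(mktemp)
printf '%s\n' a b b c c c > "$sample"
count_in_order "$sample"
rm -f "$sample"
```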

edited Jul 02 '15 at 11:14

answered Jul 02 '15 at 11:01

This won't work; it only appears to work on OP's example because the letters increase by 1 each time. How could you possibly print the number of unique lines before getting to the end of the file? – 123 Jul 02 '15 at 11:08

I may not have understood the question, and it is not clear whether the 2nd column is an incrementing counter (as I interpreted it) or a repeat count – anubhava Jul 02 '15 at 11:09

We don't know what OP requires. Whether just indexing + uniq, or count – anishsane Jul 02 '15 at 11:10
@anubhava Pretty sure from the context of the question that it is the number of occurrences of each line, but I'll retract my downvote for now as I hadn't considered the other interpretation :) – 123 Jul 02 '15 at 11:11
Thanks, in any case I've added an addendum in my answer to take care of repeat counts (in case). – anubhava Jul 02 '15 at 11:15
