I have this text file:

name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21

I'm trying to get a count of how many times each name appears:

joe 1
jim 1
bob 2
mike 3

Thanks,

C B

6 Answers

$ awk -F, 'NR>1{arr[$1]++}END{for (a in arr) print a, arr[a]}' file.txt
joe 1
jim 1
mike 3
bob 2

EXPLANATIONS

  • -F, sets the field separator to a comma
  • NR>1 runs the action only on lines after the first, which skips the header
  • arr[$1]++ increments the entry of the associative array arr whose key is the first field (the name)
  • the END{} block is executed once, after the whole file has been processed
  • for (a in arr) iterates over the keys of arr
  • print a, arr[a] prints each key followed by its count
ahmet alp balkan
Gilles Quénot
  • +1 for a one-line awk answer (which was the tag in the question)! I love learning here... – Floris Feb 17 '13 at 00:53
  • Any comment why "mike" is printed before "bob", when the first occurrence of "bob" is before "mike" in the file?... – Floris Feb 17 '13 at 00:55
  • The order in which `for (a in arr)` visits the elements of an array is unspecified in `awk`. So, the output order is not guaranteed. – nneonneo Feb 17 '13 at 01:03
  • I see now, NR>1 skips the 1st line, and everything in the END block runs only once. thx! – C B Feb 17 '13 at 01:10
  • A small modification allows you to SUM the ages instead of just counting records: `awk -F, 'NR>1{arr[$1]+=$2}END{for (a in arr) print a, arr[a]}' file.txt` – Dave Sep 20 '15 at 00:10
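
Following up on the ordering question in the comments above: a minimal sketch, in plain POSIX awk, that prints the names in the order they first appear in the file. The helper name count_in_order, the order array, and the temporary sample file are assumptions made for illustration; they are not part of the answer above.

```shell
# Sample data copied from the question; the temp file is only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21
EOF

# count_in_order is a hypothetical helper name.
count_in_order() {
    awk -F, '
        NR > 1 {
            # Record each name the first time it is seen.
            if (!($1 in arr)) order[++n] = $1
            arr[$1]++
        }
        END {
            # Walk the names by insertion index instead of using for (a in arr).
            for (i = 1; i <= n; i++) print order[i], arr[order[i]]
        }' "$1"
}

count_in_order "$sample"
```

With the sample data this prints joe 1, jim 1, bob 2, mike 3, one pair per line, which matches the order requested in the question. The second array costs a little extra memory but keeps everything in a single awk process.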

Strip the header row, drop the age field, group the same names together (sort), count identical runs, output in desired format.

tail -n +2 txt.txt | cut -d',' -f 1 | sort | uniq -c | awk '{ print $2, $1 }'

output

bob 2
jim 1
joe 1
mike 3
nneonneo
  • +1 for a fast and compact answer! I was only halfway through... And you give it in alphabetical order (which wasn't asked for...) – Floris Feb 17 '13 at 00:46
  • We'll see how OP wants it sorted, if at all. (To sort by the count, stick a `sort -n` before the `awk`). – nneonneo Feb 17 '13 at 00:47
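
The suggestion in the last comment can be sketched as follows: the extra sort -n goes between uniq -c and the final awk, while the count is still the first column. The helper name count_by_frequency and the temporary sample file are assumptions made for illustration.

```shell
# Sample data copied from the question; the temp file is only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21
EOF

# count_by_frequency is a hypothetical helper name.
count_by_frequency() {
    tail -n +2 "$1" |              # drop the header row
        cut -d',' -f 1 |           # keep only the name field
        sort |                     # group identical names together
        uniq -c |                  # prefix each name with its count
        sort -n |                  # order by that count, ascending
        awk '{ print $2, $1 }'     # swap to "name count"
}

count_by_frequency "$sample"
```

With the sample data this prints jim 1, joe 1, bob 2, mike 3, one pair per line. Using sort -rn instead would list the most frequent names first.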

It looks like you want sorted output. You could simply pipe the output of print into sort -nk 2:

awk -F, 'NR>1 { a[$1]++ } END { for (i in a) print i, a[i] | "sort -nk 2" }' file

Results:

jim 1
joe 1
bob 2
mike 3

However, if you have GNU awk installed, you can perform the sorting without coreutils. Here's a single-process solution that sorts the array by its values. The solution should still be quite quick. Run like:

awk -f script.awk file

Contents of script.awk:

BEGIN {
    FS=","
}

NR>1 {
    a[$1]++
}

END {
    for (i in a) {
        b[a[i],i] = i
    }

    n = asorti(b)

    for (i=1;i<=n;i++) {
        split (b[i], c, SUBSEP)
        d[++x] = c[2]
    }

    for (j=1;j<=n;j++) {
        print d[j], a[d[j]]
    }
}

Results:

jim 1
joe 1
bob 2
mike 3

Alternatively, here's the one-liner:

awk -F, 'NR>1 { a[$1]++ } END { for (i in a) b[a[i],i] = i; n = asorti(b); for (i=1;i<=n;i++) { split (b[i], c, SUBSEP); d[++x] = c[2] } for (j=1;j<=n;j++) print d[j], a[d[j]] }' file
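
One caveat worth keeping in mind: asorti() compares indices as strings by default, so a count of 10 would sort before a count of 2. As a hedged alternative, here is a sketch that stays in a single awk process, avoids GNU extensions, and compares the counts numerically with a small insertion sort. The helper name count_sorted, the array names cnt and names, and the temporary sample file are assumptions made for illustration, not part of the answer above.

```shell
# Sample data copied from the question; the temp file is only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21
EOF

# count_sorted is a hypothetical helper name.
count_sorted() {
    awk -F, '
        NR > 1 {
            if (!($1 in cnt)) names[++n] = $1
            cnt[$1]++
        }
        END {
            # Insertion sort of names[1..n] by ascending numeric count.
            for (i = 2; i <= n; i++) {
                key = names[i]
                for (j = i - 1; j >= 1 && cnt[names[j]] > cnt[key]; j--)
                    names[j + 1] = names[j]
                names[j + 1] = key
            }
            for (i = 1; i <= n; i++) print names[i], cnt[names[i]]
        }' "$1"
}

count_sorted "$sample"
```

Insertion sort is quadratic in the number of distinct names, which should be fine for a handful of keys; for large key sets, piping into sort -nk 2 as shown at the top of this answer is likely the better choice.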
Steve

A strictly awk solution...

BEGIN { FS = "," }
{ ++x[$1] }
END { for(i in x) print i, x[i] }

If name, age is really in the file, you could adjust the awk program to ignore it...

BEGIN   { FS = "," }
/[0-9]/ { ++x[$1] }
END     { for(i in x) print i, x[i] }
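
Note that /[0-9]/ matches any line containing a digit anywhere, so a header such as "id1, age" would still be counted. A slightly stricter sketch, assuming the second field should consist only of digits (optionally surrounded by blanks), tests that field instead of the whole line. The helper name count_valid_rows and the temporary sample file are assumptions made for illustration.

```shell
# Sample data copied from the question; the temp file is only for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21
EOF

# count_valid_rows is a hypothetical helper name.
count_valid_rows() {
    awk -F, '
        # Count a row only when its second field is a whole number.
        $2 ~ /^[ \t]*[0-9]+[ \t]*$/ { ++x[$1] }
        END { for (i in x) print i, x[i] }' "$1" |
        sort    # for (i in x) has no defined order, so sort for stable output
}

count_valid_rows "$sample"
```

With the sample data this prints bob 2, jim 1, joe 1, mike 3, one pair per line. The header is skipped because its second field, " age", contains no digits.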
DigitalRoss

I came up with two functions based on the answers here:

# Both functions print: command, summed %CPU, summed %MEM.
# topcpu sorts on column 2 (%CPU); topmem sorts on column 3 (%MEM).
topcpu() {
    top -b -n1                                                                                  \
        | tail -n +8                                                                            \
        | awk '{ print $12, $9, $10 }'                                                          \
        | awk '{ CPU[$1] += $2; MEM[$1] += $3 } END { for (k in CPU) print k, CPU[k], MEM[k] }' \
        | sort -k2 -n                                                                           \
        | tail -n 10                                                                            \
        | column -t                                                                             \
        | tac
}

topmem() {
    top -b -n1                                                                                  \
        | tail -n +8                                                                            \
        | awk '{ print $12, $9, $10 }'                                                          \
        | awk '{ CPU[$1] += $2; MEM[$1] += $3 } END { for (k in CPU) print k, CPU[k], MEM[k] }' \
        | sort -k3 -n                                                                           \
        | tail -n 10                                                                            \
        | column -t                                                                             \
        | tac
}
$ topcpu
top              12.5  0
Xorg             6.2   1.6
ibus-x11         6.2   0.7
gnome-shell      6.2   7
chrome           6.2   74.6
adb              6.2   0.1
zsh              0     2.2
xdg-permission-  0     0.2
xdg-document-po  0     0.1
xdg-desktop-por  0     0.4

$ topmem
chrome           0    75.6
gnome-shell      6.2  7
mysqld           0    4.2
zsh              0    2.2
deluge-gtk       0    2.1
Xorg             0    1.6
scrcpy           0    1.6
gnome-session-b  0    0.8
systemd-journal  0    0.7
ibus-x11         6.2  0.7

enjoy!

rodfersou
tail -n +2 file.txt | cut -d',' -f 1 |
sort | uniq -c
2 bob
1 jim
1 joe
3 mike
tripleee
Ajay Ahuja