
I have a file with a header:

name, age, id, address
Smith, 18, 201392, 19 Rand Street, USA
Dan, 19, 029123, 23 Lambert Rd, Australia
Smith, 20, 192837, 61 Apple Rd, UK
Kyle, 25, 245123, 103 Orange Rd, UK

I'd like to remove duplicates by name, so the result will be:

Smith, 18, 201392, 19 Rand Street, USA
Dan, 19, 029123, 23 Lambert Rd, Australia
Kyle, 25, 245123, 103 Orange Rd, UK

# prints 3 for the 3 rows with a unique name

I've tried sort -u -t, -k1,1 file and awk -F"," '!_[$1]++' file, but they don't work because I have commas in my addresses.

1 Answer


Well, you changed the functionality since the original post, but this should get you the unique names in your file (assuming it's named data), unsorted:

#!/bin/bash
# Skip the header line, then print each name (field 1) only the first time it appears
sed '1d' data | awk -F',' '!_[$1]++ { print $1 }'

If you need the names sorted, append | sort to the command line above; to count them, append | wc -l instead.
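For reference, a self-contained sketch of the whole pipeline. It writes the sample data from the question to a file named data (that filename is just an assumption from the answer) and then extracts and counts the unique names. Splitting on commas is safe here because the name is field 1, so the extra commas inside the address fields never reach the dedupe logic:

```shell
#!/bin/bash
# Recreate the sample file from the question (filename "data" is assumed)
cat > data <<'EOF'
name, age, id, address
Smith, 18, 201392, 19 Rand Street, USA
Dan, 19, 029123, 23 Lambert Rd, Australia
Smith, 20, 192837, 61 Apple Rd, UK
Kyle, 25, 245123, 103 Orange Rd, UK
EOF

# Drop the header, then print each name the first time it is seen
sed '1d' data | awk -F',' '!_[$1]++ { print $1 }'
# Smith
# Dan
# Kyle

# Same dedupe, but count the unique names instead of printing them
sed '1d' data | awk -F',' '!_[$1]++' | wc -l
# 3
```

Note that the awk array test keeps the first row for each name; if you wanted the whole row rather than just the name, drop the { print $1 } action, since awk's default action prints the full line.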
