
I have a set of comma-separated files in a directory. They have no headers, and unfortunately the rows are not even all the same length.

I want to find the unique entry in the first column across all files.

What's the quickest way of doing it in shell programming?

awk -F "," '{print $1}' *.txt | uniq

seems to only get unique entries within each file. I want uniqueness across all files.


1 Answer


Shortest is still awk (this prints the whole row):

awk -F, '!a[$1]++' *.txt

To get just the first field:

awk -F, '!a[$1]++ {print $1}' *.txt
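The idiom works because `a[$1]++` evaluates to 0 (falsy) the first time a key is seen and to a positive count afterwards, so `!a[$1]++` is true exactly once per key. A quick sanity check on throwaway sample data (the filenames here are hypothetical):

```shell
# Two hypothetical sample files sharing the key "a" in column 1.
printf 'a,1\nb,2\n' > sample1.txt
printf 'a,3\nc,4\n' > sample2.txt

# a[$1]++ is 0 the first time a key appears, so each key prints once
# across both files, in order of first appearance.
awk -F, '!a[$1]++ {print $1}' sample1.txt sample2.txt
# a
# b
# c
```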
    It doesn't print singular elements, but whole rows on my machine (GNU Awk 3.1.5). You probably meant this `awk -F, '!a[$1]++ {print $1}' *.txt` – Eugeniu Rosca Jun 10 '15 at 16:15
  • @chatraed yes, default is to print the row, need to add `{print $1}` to just to get the first field. – karakfa Jun 10 '15 at 16:24
  • Thanks. but why does my script not work tho? `awk -F "," '{print $1}' *.txt | uniq` – CuriousMind Jun 10 '15 at 16:27
  • @CodeNoob `uniq` only removes consecutive duplicates, so the input must be sorted first; insert a `sort` between `awk` and `uniq`. – karakfa Jun 10 '15 at 16:28
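Following that last comment, the asker's original pipeline can be repaired by sorting before deduplicating; `sort -u` does both in one step (a sketch, assuming standard POSIX `sort`):

```shell
# Extract column 1 from every file, then sort and deduplicate.
# sort -u replaces the separate "sort | uniq" pair.
awk -F, '{print $1}' *.txt | sort -u
```

Note this prints the keys in sorted order, whereas the `!a[$1]++` approach preserves order of first appearance.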