Finding common elements from 100 files all containing 3 columns but different number of rows

Question

I have 100 files, each with 3 columns but a different number of rows. All three columns contain repeated values. I want to find the elements that are common to all 100 files. The files look like:

1.txt

5901 5902   8229
5901 5902  17481
5901 5902  27561
5929 5930  12875

2.txt

5901 5902  8229
5929 5930  12875

and so on. The code I am trying to use is:

for ((i=0;i<=100;i++))
do
    comm -12 file-"$i".txt file-"$((i+1))".txt > common-element-"$i".txt
done

I have used the comm command, but only on 2 files. I have 100 such files.

The files which I have shown above have common elements like 5901 5902 8229 — Abhinav Srivastava, Apr 21 '17 at 12:27
do add the `comm` command you tried for 2 files... did it solve for two files? if so, you could very well use a loop, like it was shown in your previous question: https://stackoverflow.com/questions/43472246/finding-common-value-across-multiple-files-containing-single-column-values — Sundeep, Apr 21 '17 at 12:28
You want to output any number, regardless of row or column, that appears in all 100 files? — jas, Apr 21 '17 at 12:28
Yes, regardless of rows, since the number of columns is the same. I want to output the values that are present in all 100 files — Abhinav Srivastava, Apr 21 '17 at 12:34
I am using the following loop: for ((i=0;i<=100;i++)) do comm -12 -nocheck-order file-"$i".txt file-"$((i+1))".txt > common-element.txt done Will it work for comparing elements among all 100 files? — Abhinav Srivastava, Apr 21 '17 at 12:36
please click https://stackoverflow.com/posts/43542609/edit to add the code you tried to question and use https://stackoverflow.com/editing-help if you face formatting issues — Sundeep, Apr 21 '17 at 12:46

karakfa · Answer 1 · 2017-04-21T13:25:42.893


If the rows are unique within each file, you can count the occurrences of each row across all files and select the rows whose count equals the number of files. This can be done with uniq -c after sorting the combined contents of all the files, but sorting is not required with the awk alternative below.
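The sort and uniq -c route mentioned above could look roughly like the following. This is a minimal sketch, not part of the original answer: the temporary directory and the two sample files are hypothetical stand-ins for the 100 real files, and it assumes each row occurs at most once per file.

```shell
# Hypothetical sample data standing in for the 100 real files.
workdir=$(mktemp -d)
printf '5901 5902   8229\n5901 5902  17481\n5901 5902  27561\n5929 5930  12875\n' > "$workdir/1.txt"
printf '5901 5902  8229\n5929 5930  12875\n' > "$workdir/2.txt"

# Load the file names into the positional parameters so $# is the file count.
set -- "$workdir"/*.txt
nfiles=$#

# 1. awk normalizes whitespace so identical rows compare equal.
# 2. sort puts identical rows next to each other; uniq -c counts them.
# 3. The last awk keeps rows whose count equals the number of files
#    and strips the count that uniq -c put in front.
common=$(awk '{$1=$1} 1' "$@" | LC_ALL=C sort | uniq -c |
         awk -v n="$nfiles" '$1 == n { sub(/^ *[0-9]+ /, ""); print }')

printf '%s\n' "$common"
rm -rf "$workdir"
```

Because the rows pass through sort, the result comes out in sorted order rather than in the order of the input files.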

awk to the rescue!

awk '{$1=$1} ++a[$0]==(ARGC-1)' file{1..100}.txt

5901 5902 8229
5929 5930 12875

The $1=$1 assignment makes awk rebuild each line with single spaces between the fields. This normalizes the whitespace, which is not consistent in your example.
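The one-liner relies on each row appearing at most once per file. If that cannot be guaranteed, one possible variant (a sketch under that assumption, not the answer's own code) counts a row only the first time it is seen in a given file. The array names seen and count and the sample data are made up for illustration.

```shell
workdir=$(mktemp -d)
# Hypothetical data: one row is repeated inside 1.txt and missing from 2.txt.
printf '5901 5902 17481\n5901 5902 17481\n5929 5930 12875\n' > "$workdir/1.txt"
printf '5929 5930  12875\n5901 5902 8229\n' > "$workdir/2.txt"

# Original one-liner: the repeated row reaches a count of 2 inside 1.txt alone.
naive=$(awk '{$1=$1} ++a[$0]==(ARGC-1)' "$workdir/1.txt" "$workdir/2.txt")

# Variant: seen[] is keyed by file name and row, so a row is counted
# at most once per file before the count is compared with the file total.
dedup=$(awk '
    { $1 = $1 }    # normalize whitespace
    !seen[FILENAME, $0]++ && ++count[$0] == ARGC - 1
' "$workdir/1.txt" "$workdir/2.txt")

printf 'naive:\n%s\n' "$naive"
printf 'dedup:\n%s\n' "$dedup"
rm -rf "$workdir"
```

With this sample data the original command also reports the row that is duplicated inside 1.txt, while the variant reports only the row present in both files.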

edited Apr 21 '17 at 13:25

answered Apr 21 '17 at 12:57


How can I use the above awk command for 100 files, since the example above was for two files only? – Abhinav Srivastava Apr 21 '17 at 13:21
There is no limit. You can list all the names, use bash brace expansion as I did, use globbing, pass all the directory contents, etc. – karakfa Apr 21 '17 at 13:27
Thanks a lot. This was really helpful. – Abhinav Srivastava Apr 21 '17 at 13:34
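To illustrate the point about passing file names: since the command compares against ARGC-1, the comparison follows however many files are given. A small sketch with three hypothetical files (the names file1.txt to file3.txt and their contents are made up) shows a glob and an explicit list giving the same result.

```shell
workdir=$(mktemp -d)
# Three hypothetical files sharing exactly one row.
for i in 1 2 3; do
    printf '5901 5902 8229\n%s %s %s\n' "$i" "$i" "$i" > "$workdir/file$i.txt"
done

# Globbing: the shell expands the pattern to every matching file.
by_glob=$(awk '{$1=$1} ++a[$0]==(ARGC-1)' "$workdir"/file*.txt)

# Explicit list of names.
by_list=$(awk '{$1=$1} ++a[$0]==(ARGC-1)' \
    "$workdir/file1.txt" "$workdir/file2.txt" "$workdir/file3.txt")

# In bash, "$workdir"/file{1..3}.txt would expand to the same three names.
printf '%s\n' "$by_glob"
rm -rf "$workdir"
```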
