3

I have 100 text files, each containing a single column. The files look like this:

file1.txt
10032
19873
18326

file2.txt
10032
19873
11254

file3.txt
15478
10032
11254

and so on. The files differ in size. Kindly tell me how to find the numbers that are common to all 100 files.

A given number appears at most once in any one file.

Community

4 Answers

5

This will work whether or not the same number can appear multiple times in 1 file:

$ awk '{a[$0][ARGIND]} END{for (i in a) if (length(a[i])==ARGIND) print i}' file[123]
10032

The above uses GNU awk for true multi-dimensional arrays and ARGIND. There are easy tweaks for other awks if necessary, e.g.:

$ awk '!seen[$0,FILENAME]++{a[$0]++} END{for (i in a) if (a[i]==ARGC-1) print i}' file[123]
10032

If the numbers are unique in each file then all you need is:

$ awk '(++c[$0])==(ARGC-1)' file*
10032
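
As a rough sketch, the portable variant can be wrapped in a shell function and tried on the sample files from the question. The function name common_lines and the scratch directory are assumptions made for illustration only.

```shell
# common_lines is a hypothetical helper name.
# It prints every line that occurs in all of the files given as arguments,
# even when a line is repeated inside a single file.
common_lines() {
    awk '
        !seen[$0, FILENAME]++ { count[$0]++ }   # count each line once per file
        END {
            for (line in count)
                if (count[line] == ARGC - 1)    # ARGC - 1 is the number of file arguments
                    print line
        }
    ' "$@"
}

# Recreate the three sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 10032 19873 18326 > "$dir/file1.txt"
printf '%s\n' 10032 19873 11254 > "$dir/file2.txt"
printf '%s\n' 15478 10032 11254 > "$dir/file3.txt"

common_lines "$dir"/file*.txt   # prints 10032
```

The order of the output follows awk's array traversal, so pipe it through sort if a stable order matters.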
Ed Morton
2

awk to the rescue!

To find the elements common to all files (assuming each value is unique within a file):

awk '{a[$1]++} END{for(k in a) if(a[k]==ARGC-1) print k}' files

This counts all occurrences and prints the values whose count equals the number of files.
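
The uniqueness assumption matters here. The sketch below uses made-up files a, b and c (not from the question) to show what can go wrong when a value repeats inside one file, and how counting each value once per file avoids it.

```shell
# Hypothetical test data: 7 is repeated three times in one file only.
dir=$(mktemp -d)
printf '%s\n' 7 7 7 > "$dir/a"
printf '%s\n' 8     > "$dir/b"
printf '%s\n' 9     > "$dir/c"

# Plain counting: 7 reaches a count of 3, which equals the number of files,
# so it is reported although it occurs in only one file.
naive=$(awk '{a[$1]++} END{for(k in a) if(a[k]==ARGC-1) print k}' \
    "$dir/a" "$dir/b" "$dir/c")

# Counting each value once per file: nothing is common, so nothing is printed.
guarded=$(awk '!seen[$1,FILENAME]++{a[$1]++} END{for(k in a) if(a[k]==ARGC-1) print k}' \
    "$dir/a" "$dir/b" "$dir/c")

echo "naive:   [$naive]"
echo "guarded: [$guarded]"
```

With the question's guarantee that a number appears only once per file, the plain counting version is sufficient.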

karakfa
1

Files with a single column?

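You can sort and compare these files using the shell:

for f in file*.txt; do sort "$f" | uniq; done | sort | uniq -c -d

The final -c is not necessary; it is needed only if you want to count the number of occurrences. Note that -d prints every number that occurs in more than one file, so with more than two files the count from -c has to be compared with the number of files to keep only the numbers common to all of them.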
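
One possible way to keep only the values present in every file is to compare the count from uniq -c with the number of files. This is a minimal sketch: the function name common_by_count is hypothetical, and it assumes each line is a single token without whitespace, as in the question.

```shell
# common_by_count is a hypothetical helper name.
# Assumes single-column input (no whitespace inside a line).
common_by_count() {
    n=$#                          # number of files passed in
    for f in "$@"; do
        sort -u "$f"              # each value at most once per file
    done |
        sort | uniq -c |          # count in how many files each value occurs
        awk -v n="$n" '$1 == n { print $2 }'   # keep values seen in all n files
}

# Sample files from the question, recreated in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 10032 19873 18326 > "$dir/file1.txt"
printf '%s\n' 10032 19873 11254 > "$dir/file2.txt"
printf '%s\n' 15478 10032 11254 > "$dir/file3.txt"

common_by_count "$dir"/file*.txt   # prints 10032
```

Because the per-file output is deduplicated first, this version does not depend on numbers being unique within a file.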

A.N.
0

Here is one using Bash and comm, because I wanted to know whether it would work. My test files were named 1, 2 and 3, hence the for f in ?:

f=$(shuf -n1 -e ?)                     # pick one file randomly for initial comms file

sort "$f" > comms 

for f in ?                             # this time for all files
do 
  comm -1 -2 <(sort "$f") comms > tmp  # comms should be in sorted order always
  # grep -Fxf "$f" comms > tmp         # another solution, thanks @Sundeep
  mv tmp comms
done
James Brown
  • 1
    instead of `comm+sort` I would suggest using `grep -Fxf file1 file2` – Sundeep Apr 18 '17 at 15:44
  • Any ideas on how to pick one of the files to be the initial `comms`? Better than my `for ... break` :D? – James Brown Apr 18 '17 at 15:49
  • 1
    I did try a similar solution but didn't post it... don't forget to quote `"$f"`... as for picking, it would help to know the file names and use bash extglob... `grep -Fxf file1.txt file2.txt > f1` and then `for f in !(file[12]).txt` – Sundeep Apr 18 '17 at 15:55
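
On the question of which file should seed the intersection, one option (a sketch, not something proposed in the thread) is to pass the files as arguments, use the first one as the seed, and loop over the rest. The function name intersect_files and the use of mktemp for scratch files are assumptions.

```shell
# intersect_files is a hypothetical helper name.
# It prints the lines common to all files given as arguments.
intersect_files() {
    acc=$(mktemp) || return 1
    tmp=$(mktemp) || return 1
    sort -u "$1" > "$acc"         # the first argument seeds the running intersection
    shift
    for f in "$@"; do
        # comm needs sorted input on both sides; "-" reads the sorted file from stdin
        sort -u "$f" | comm -12 - "$acc" > "$tmp"
        mv "$tmp" "$acc"          # the result becomes the new running intersection
    done
    cat "$acc"
    rm -f "$acc" "$tmp"
}

# Sample files from the question, recreated in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 10032 19873 18326 > "$dir/file1.txt"
printf '%s\n' 10032 19873 11254 > "$dir/file2.txt"
printf '%s\n' 15478 10032 11254 > "$dir/file3.txt"

intersect_files "$dir"/file*.txt   # prints 10032
```

Since any file can serve as the seed, taking the first argument avoids both the random pick and a for ... break loop.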