I need to filter only duplicated lines from many files using bash

Question

I have the following three files

filea

a
bc
cde

fileb

a
bc
cde
frtdff

filec

a
bc
cddeeer
erer34

I can extract the lines that are present in all three files. First I count the files with the following command

ls file* | wc -l

which returns 3. Then I run

sort file* | uniq --count --repeated | awk '{ if ($1 == 3) { print $2} }'

The last command returns exactly what I need, but only as long as no additional files whose names start with "file" are created.

However, thousands of such files may be created while a script is running, so the file count cannot be hard-coded. I need to obtain the exact number of files at run time and use it in the comparison, like this

n=`ls file* | wc -l`
sort file* | uniq --count --repeated | awk '{ if ($1 == $n) { print $2} }'

Unfortunately, this does not work: I cannot use the value of the shell variable n as the comparison value inside the if condition of the awk program.

`awk -v n=$n '$1 == n { print $2 }'` – Barmar Oct 22 '20 at 00:54
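The original attempt fails because the shell never expands $n inside single quotes, so awk treats $n as a field reference indexed by its own unset variable n, which is $0 (the whole line). Below is a minimal sketch of the -v approach from the comment above, run against the sample data from the question. The scratch directory and the names workdir, files, and common are illustrative assumptions, not part of the question.

```shell
#!/usr/bin/env bash
# Sketch: hand the file count to awk with -v instead of writing $n
# inside the single-quoted awk program.

workdir=$(mktemp -d)               # hypothetical scratch directory for the demo
trap 'rm -rf "$workdir"' EXIT

# Sample data copied from the question.
printf '%s\n' a bc cde            > "$workdir/filea"
printf '%s\n' a bc cde frtdff     > "$workdir/fileb"
printf '%s\n' a bc cddeeer erer34 > "$workdir/filec"

# Count the files with a glob array rather than parsing ls output.
files=("$workdir"/file*)
n=${#files[@]}

# uniq -c -d is the short form of --count --repeated.
# awk receives n as an awk variable; $1 is the count printed by uniq.
common=$(sort "${files[@]}" | uniq -c -d | awk -v n="$n" '$1 == n { print $2 }')
printf '%s\n' "$common"
```

Because the awk program prints $2, a line containing whitespace would be cut off after its first word. The sample lines are single words, so that does not matter here. The count-based test also assumes that no line is repeated within a single file.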

score -1 · Answer 1 · answered Oct 22 '20 at 00:44

You can use:

awk '!line[$0]++' file*

This prints each distinct line only once, even if it appears in several files or more than once in the same file.
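To make the behaviour of this one-liner concrete, here is a small sketch that runs it on the three sample files from the question. The scratch directory and the variable names are assumptions made for the demo.

```shell
#!/usr/bin/env bash
# Sketch: what awk '!line[$0]++' does on the sample data.

workdir=$(mktemp -d)               # hypothetical scratch directory for the demo
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' a bc cde            > "$workdir/filea"
printf '%s\n' a bc cde frtdff     > "$workdir/fileb"
printf '%s\n' a bc cddeeer erer34 > "$workdir/filec"

# line[$0]++ yields the number of times the line was seen before (0 the
# first time), so the negation is true only for the first occurrence.
deduped=$(awk '!line[$0]++' "$workdir"/file*)
printf '%s\n' "$deduped"
```

On this data the command prints a, bc, cde, frtdff, cddeeer, and erer34, one per line. That is every distinct line across the files (the union), not only the lines shared by all of them.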

1218985

They only want the lines that are repeated in all the files. – Barmar Oct 22 '20 at 00:52
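For lines that occur in every file, one possible awk-only variant is sketched below. It is a different technique from both commands above: it counts each line at most once per file and compares that count with the number of file arguments. It assumes every argument after the program is a file name (so ARGC - 1 is the file count). The names workdir, seen, count, and common are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: print only the lines present in every input file, using awk alone.

workdir=$(mktemp -d)               # hypothetical scratch directory for the demo
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' a bc cde            > "$workdir/filea"
printf '%s\n' a bc cde frtdff     > "$workdir/fileb"
printf '%s\n' a bc cddeeer erer34 > "$workdir/filec"

common=$(awk '
    # Count a line once per file, even if it is repeated inside that file.
    !seen[FILENAME, $0]++ { count[$0]++ }
    # ARGC - 1 is the number of file arguments (assumes no var=value args).
    END { for (l in count) if (count[l] == ARGC - 1) print l }
' "$workdir"/file* | sort)          # for-in order is unspecified, so sort
printf '%s\n' "$common"
```

Unlike the sort | uniq pipeline, this does not miscount a line that is duplicated within one file, and it prints whole lines rather than only the first field. It does hold every distinct line in memory, which may matter with thousands of large files.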
