3

I have a directory with several sub-directories; these sub-directories contain many files, and I'm interested in the *.txt files. I want to go into every sub-directory, read the *.txt file, and print any line matching a "pattern". I would prefer to have it as a one-liner.

Here is the command that I tried:

for i in $(ls -d *_fastqc); do cd $i; awk '/FAIL/ {print $0}' ls -l su*.txt; done

I get an error for this command:

awk: cmd. line:1: fatal: cannot open file `-rw-rw-r--' for reading (No such file or directory)

What might be going wrong here?
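
The reported message matches what happens when the output of ls -l is substituted into the awk call as file arguments; a rough, assumed reconstruction (the posted loop as shown would instead complain about a file named ls):

# Assumed reconstruction of the failing call: the long-format fields printed
# by ls -l (permissions, owner, size, ...) are word-split and passed to awk
# as file names, so awk tries to open "-rw-rw-r--" as a file.
awk '/FAIL/ {print $0}' $(ls -l su*.txt)
# awk: cmd. line:1: fatal: cannot open file `-rw-rw-r--' for reading (No such file or directory)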

ROMANIA_engineer
RonnB
  • The error comes from your `ls -l`, whose long-format output ends up as input for awk. Try `ls -1`, but the link in the previous comment is the correct way to go instead of looping. – Tensibai Jan 09 '17 at 13:11
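
Following that comment, a minimal loop-based sketch that avoids parsing ls altogether (assuming the su*.txt files sit directly inside each *_fastqc sub-directory):

# Minimal sketch: glob the directories instead of parsing ls, and run cd in a
# subshell so each iteration starts from the original directory again.
for i in *_fastqc; do (cd "$i" && awk '/FAIL/ {print $0}' su*.txt); done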

1 Answer

3

Awk is not the right tool for this; see why you shouldn't be parsing ls output.

Instead, use GNU find to list the files matching your criterion, xargs to split the output returned from find into arguments, and grep for pattern matching.

find . -type f -name "*.txt" -print0 | xargs -0 grep "FAIL"

-print0 (a GNU-find-specific option) appends a NUL character to each file/directory name so that names containing whitespace or special characters are handled safely, and xargs -0 splits its stdin on \0 as the delimiter. grep then runs on the returned files and prints each line that matches the pattern.
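
As a small illustration with made-up file names, the NUL delimiting keeps a name that contains a space intact:

# Hypothetical example: "summary 1.txt" contains a space, yet -print0 and
# xargs -0 hand it to grep as a single argument.
mkdir -p sample_fastqc
printf 'PASS Basic Statistics\nFAIL Per base sequence quality\n' > 'sample_fastqc/summary 1.txt'
find . -type f -name "*.txt" -print0 | xargs -0 grep "FAIL"
# Prints the FAIL line, prefixed with the file name whenever grep is handed
# more than one file.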

Inian
  • xargs is unnecessary here... `find -type f -name '*.txt' -exec grep -F 'FAIL' {} +` – Sundeep Jan 09 '17 at 10:15
  • also, you can use GNU grep without needing find... `grep --include='*.txt' -rF 'FAIL'` – Sundeep Jan 09 '17 at 10:23
  • @Sundeep the GNU guys really screwed up by giving grep options to find files. There's a perfectly good tool for that with a perfectly obvious name. What will they give grep next - options to sort its output or options to stat files? The tool to **find** files is named `find` - just use it. – Ed Morton Jan 09 '17 at 19:37
  • @EdMorton I don't agree: find is for finding files, grep is for finding an element (text) in a file or stream. In this case you are forking grep when using find first, so it will depend on other criteria which way is best (performance, memory, code readability, compatibility, ...). – NeronLeVelu Oct 10 '18 at 09:02
  • @NeronLeVelu I understand the possible performance impact, but find (in Sundeep's first comment) will be calling grep with multiple files at a time, not just one, which minimizes any potential performance benefit of grep -r. The fact that grep -r does what find already does violates the UNIX principle of each tool doing one thing well, and it made the potential arg list nightmarish. By giving grep arguments to find files they made it inconsistent with every other tool that reads files, e.g. sed, awk, cat, sort, uniq, head, tail, etc. You could argue the performance improvement for all of those too. – Ed Morton Oct 10 '18 at 12:09
  • You could also argue it'd be a performance improvement if grep could `sort` its output or `tr`anslate chars to other chars or `paste` lines from different files or do anything else that other tools already do. Should it? Of course not, because **that violates the UNIX principle** of constructing solutions from multiple tightly cohesive, loosely coupled tools. It'd be different if there were a **need** for this, but there simply isn't. – Ed Morton Oct 10 '18 at 12:15