I have a find command whose output is what I want (it looks for files with strelka in the name that end with .vcf):

find . -name *strelka* | grep '.vcf$'

But I want to then iterate over each and perform additional stuff.

As an experiment, I'm just running a line count on each file (ultimately I want to do more; this is just me playing around), but even that doesn't work.

I have :

for i in find . -name \*strelka\* | grep '.vcf$'; do wc -l $i; done

Anyone know what is wrong with this?

michas
user3412393
  • Please don't improve the commands in the question or both the question and answers get pointless. – michas Mar 12 '14 at 22:35
  • Several solutions fail if filenames contain spaces or other special characters. To address that issue, see https://stackoverflow.com/questions/7039130/iterate-over-a-list-of-files-with-spaces – TextGeek Dec 04 '19 at 17:30

3 Answers

  1. Use find . -name '*strelka*.vcf', thereby avoiding the grep and letting you use find ... -exec

  2. Either pipe into xargs, or use -exec:

    find . -name '*strelka*.vcf' | xargs wc -l  
    
    find . -name '*strelka*.vcf' -exec wc -l '{}' \;
    

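Prefer the latter: -exec hands each filename to wc as a single argument, so names containing spaces or other special characters are handled correctly.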

Your approach (with a $(...) around your find) is OK, except:

  • You'll have grief if there are spaces in filenames. (You'd have grief with xargs too - there's a way round that involving \0, but it's a bit arcane.)
  • You'll exceed the command line length limit if there are too many matching files.
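
The \0 workaround mentioned above can be sketched roughly as follows. This assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do); the demo directory and file names are made up for illustration.

```shell
# Hypothetical demo tree; names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/run 1"
printf 'a\nb\n' > "$demo/run 1/sample strelka.vcf"
printf 'c\n'    > "$demo/other_strelka_calls.vcf"

# NUL-delimited names survive spaces (and even newlines) in filenames.
# xargs also splits the list across several wc calls if it would
# otherwise exceed the command line length limit.
counts=$(find "$demo" -name '*strelka*.vcf' -print0 | xargs -0 wc -l)
printf '%s\n' "$counts"
```

Because wc receives several files at once here, it also prints a total line, which may or may not be wanted.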
user3392484
  • The `xargs` version will call `wc` many fewer times, possibly once (depending on how many files there are). The `-exec` version as written will call `wc` once per file. In the former case, you'll get possibly useful (or possibly undesired) totals from `wc`. You could modify the `-exec` case to do the same by changing the `\;` to `+`. – rici Mar 12 '14 at 21:40
  • For the first option, using `find` and `xargs`, I would suggest `find . -name '*strelka*.vcf' -print0 | xargs -0 wc -l`. This ensures you don't run into problems with weird filenames containing `\n` or other problematic characters. – Lynch Mar 13 '14 at 01:12

Assuming you are using a recent (4.0 or greater) version of bash, you don't need to use find for this; a file pattern is sufficient, along with the globstar option.

shopt -s globstar
for i in **/*strelka*.vcf; do
    wc -l "$i"
done

For a small number of files, you can probably just use

wc -l **/*strelka*.vcf

For a large number of files,

find . -name '*strelka*.vcf' -execdir wc -l '{}' +

is most efficient.
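
One caveat with the glob loop: if nothing matches, bash by default leaves the pattern unexpanded and wc is handed the literal string. A minimal sketch that sets nullglob to avoid this, assuming bash 4.0 or later and a made-up demo directory:

```shell
# Requires bash 4.0+ for globstar; the demo tree is hypothetical.
shopt -s globstar nullglob
demo=$(mktemp -d)
mkdir -p "$demo/a/b c"
printf '1\n2\n3\n' > "$demo/a/b c/x_strelka_y.vcf"

cd "$demo" || exit 1
files=( **/*strelka*.vcf )    # array keeps each name intact, spaces included
for f in "${files[@]}"; do
    wc -l < "$f"              # reading via stdin prints only the count
done
echo "${#files[@]} file(s) matched"
```

With nullglob set, an empty match simply yields an empty array and the loop body never runs.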

chepner

You can change your command as follows:

for i in $(find . -name \*strelka\* | grep '.vcf$'); do wc -l $i; done

or

find . -name \*strelka\* | grep '.vcf$' | while IFS= read -r i; do wc -l "$i"; done

or

find . -name \*strelka\* | grep '.vcf$' | xargs wc -l

You can also improve your find command to look for \*strelka\*.vcf, which avoids the grep.

Also note that filenames might contain spaces or even newlines, which would probably break your command. The best way is therefore probably:

find . -name '*strelka*.vcf' -exec wc -l '{}' \;
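
If each file needs more than a single command, one option is to hand batches of names to a small inline shell script via -exec ... +. A minimal sketch, assuming a made-up demo tree; the per-file steps (a line count and a count of lines starting with #) are only placeholders for the real processing:

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
printf '#header\nrow1\nrow2\n' > "$demo/tumor strelka snvs.vcf"

# find passes the names as arguments; "for f" loops over them safely,
# even when they contain spaces or newlines.
report=$(find "$demo" -name '*strelka*.vcf' -exec sh -c '
    for f; do
        lines=$(($(wc -l < "$f")))      # arithmetic strips any padding
        headers=$(grep -c "^#" "$f")
        echo "$f: $lines lines, $headers header lines"
    done
' sh {} +)
printf '%s\n' "$report"
```

The trailing sh fills $0 of the inline script, so the filenames start at $1 and the for loop sees all of them.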
michas