I'd like to search a directory tree to count the number of times I've loaded various R packages. The source is contained in .org and .R files. I'm willing to assume that "library(" is the first non-blank entry on any line I care about, and that there is at most one such call per line.

 find . -regex ".*/.*\.org" -print 

gets me a list of .org files, and

find . -regex ".*\.\(org\|R\)$" -print 

gets me a list of .org and .R files (thanks to https://unix.stackexchange.com/questions/15308/how-to-use-find-command-to-search-for-multiple-extensions).

Given a particular file,

grep -h "library(" file | sed 's/library(//' | sed 's/)//'

gets me the package name. I'd like to hook them together and then possibly redirect the output to a file, from which I can use R to calculate frequencies.

The seemingly straightforward

find . -regex ".*/.*\.org" -print | xargs -0 grep -h "library("  | sed 's/library(//' | sed 's/)//'

doesn't work; I get

 find . -regex ".*/.*\.org" -print | xargs -0 grep -h "library("  |   sed 's/library(//' | sed 's/)//'
Usage: /usr/bin/grep [OPTION]... PATTERN [FILE]...
Try '/usr/bin/grep --help' for more information.

and I'm not sure what to do next.

I also tried

find . -regex ".*/.*\.org" -exec grep -h "library(" "{}" "\;"

and got

find . -regex ".*/.*\.org" -exec grep -h "library(" "{}" "\;"
find: missing argument to `-exec'

It seems simple. What am I missing?

UPDATE: Adding -t to the above xargs shows me the first command:

grep -h library ./dirname/filename.org

followed by, presumably, a list of all the matching files with paths relative to the PWD. Actually, that works if I only search for .org files; if I add .R files, too, I get "xargs: argument line too long". I think that means xargs is passing the entire list of files as the argument to one invocation of grep.

BillH
  • If it helps, `find . -regex ".*\.\(org\|R\)$" -print0 | xargs -0 grep -h 'library'` returns `Usage: /usr/bin/grep [OPTION]... PATTERN [FILE]... Try '/usr/bin/grep --help' for more information.` twice in a row. If I only use one extension, I get only the first two error lines. – BillH Nov 21 '16 at 23:55
  • I thought http://stackoverflow.com/questions/199266/make-xargs-execute-the-command-once-for-each-line-of-input had the answer, but I couldn't make it work. `find . -type f -regex ".*\.\(org\|R\)$" -exec grep -h 'library(' '{}' ';' | sed 's/library(//' | sed 's/)//' | sed 's/\"//g'` does seem to work, though I'm sure there are more elegant and concise approaches. – BillH Nov 22 '16 at 18:41
  • `find . -type f -regex ".*\.\(org\|R\)" -print0 | xargs -0 grep -h "library(" | sed 's/library(//' | sed 's/)//'` seems to work for me. – NickD Jan 13 '17 at 05:33

1 Answer

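find ... -print | xargs OK (as long as no file name contains whitespace or quotes)

find ... -print0 | xargs -0 OK (safe for any file name)

find ... -print0 | xargs broken

find ... -print | xargs -0 broken (what you used): -print separates names with newlines, but xargs -0 splits only on NUL bytes, so the whole list is passed as a single argument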
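
As a minimal sketch of the second (NUL-separated) pairing, the following builds a throwaway directory and runs the search over it. The directory layout, file names, and the `demo` and `matches` variables are made up for illustration; one directory deliberately has a space in its name to show why the NUL-separated form is the safer choice.

```shell
# Hypothetical scratch tree; names and contents are invented for this example.
demo=$(mktemp -d)
mkdir -p "$demo/my notes"
printf 'library(ggplot2)\n' > "$demo/my notes/a.org"
printf 'library(dplyr)\n'   > "$demo/b.org"

# find emits NUL-terminated names and xargs -0 splits on NUL,
# so the space in "my notes" does not break the file name apart.
matches=$(find "$demo" -type f -name '*.org' -print0 |
          xargs -0 grep -h 'library(' | sort)
printf '%s\n' "$matches"

# Clean up the scratch tree.
rm -rf "$demo"
```

Swapping `-print0` for `-print` while keeping `-0` reproduces the mismatch in the last row above.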

Also, please don't:

grep -h "library(" | sed 's/library(//' | sed 's/)//'

when this is faster:

grep -h "library(" | sed -e 's/library(//' -e 's/)//'

and this is even faster, and more interesting:

grep -h "library(" | grep -o '(.*)' | tr -d ' ()'
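
Putting the pieces together, here is one possible end-to-end sketch, not the only way to do it. It uses `-exec ... {} +` instead of xargs; the `-exec` attempt in the question failed because `"\;"` inside double quotes hands find a literal backslash followed by a semicolon, whereas find needs a bare `;` (written `\;` or `';'`) or `+`. The scratch files, the `demo` and `counts` variables, and the choice to count in the shell with `sort | uniq -c` rather than in R are all assumptions made for illustration.

```shell
# Hypothetical scratch tree; names and contents are invented for this example.
demo=$(mktemp -d)
printf 'library(ggplot2)\n  library("dplyr")\n' > "$demo/a.org"
printf 'library(ggplot2)\nx <- 1\n'              > "$demo/b.R"

# -exec ... {} + passes many file names to each grep run, much like xargs.
# -h drops the file-name prefix; the pattern allows leading blanks before library(.
# grep -o keeps only the parenthesised part; tr strips blanks, parentheses, and quotes.
counts=$(find "$demo" -type f \( -name '*.org' -o -name '*.R' \) \
           -exec grep -h '^[[:space:]]*library(' {} + |
         grep -o '(.*)' | tr -d ' ()"' |
         sort | uniq -c | sort -rn)
printf '%s\n' "$counts"

# Clean up the scratch tree.
rm -rf "$demo"
```

Each output line is a count followed by a package name. To do the tallying in R instead, drop the last three commands and redirect the list of names to a file.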

webb