
I'm using the Node module pdf-text-extract to extract text from PDFs, and I would like to process exactly 10,025 PDFs. The problem is that my Mac (OS X Yosemite) returns the error:

-bash: /usr/local/bin/extract: Argument list too long

At first I thought it was a ulimit problem, but I increased my limit to 15000 and the error still occurs. Is there any way to fix this?

Thanks.

tripleee
  • Did you try to *loop* over the list of PDFs and call `extract` for each one? This can be slower (because the command has to be started again and again) or faster (because the separate commands can be executed in parallel on different processors). – Alfe Oct 05 '15 at 19:26

1 Answer


The limit on the command length isn't something you can change easily. I suspect your problem is that you have a shell pattern that expands to too many files, like

extract *.pdf
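To see the limit involved, `getconf ARG_MAX` reports the maximum combined size, in bytes, of the arguments and environment handed to a new process. It is a size limit rather than a count of files. A rough sketch for comparing it with what the pattern expands to (the byte count is only an estimate, since it ignores the environment and any per-argument overhead):

```shell
# Limit on the combined size of arguments plus environment, in bytes.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# Approximate size of the expanded pattern. printf is a shell builtin,
# so this expansion does not run into the limit itself.
# tr strips the padding some wc implementations add around the number.
glob_bytes=$(printf '%s\n' *.pdf | wc -c | tr -d ' ')
echo "Expanded *.pdf: about $glob_bytes bytes"
```

If the second number is close to or above the first, a single `extract *.pdf` call cannot work, whatever the ulimit settings are.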

One way to manage this is to let find expand the pattern and call extract multiple times, with as many arguments as possible on each call.

find . -maxdepth 1 -name '*.pdf' -exec extract outputfile {} +
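One caveat: if find has to split the files over several calls, every call receives the same output file name, and I don't know whether `extract` appends to that file or overwrites it. A minimal sketch that sidesteps the question by giving each PDF its own output file, along the lines of the loop suggested in the comments. The function name `extract_each`, the output directory, and the `.txt` naming are made up for illustration, and the "output file first, PDF second" argument order is assumed from the command above:

```shell
# extract_each CMD SRC_DIR OUT_DIR
# Runs CMD once per PDF in SRC_DIR, as: CMD OUT_DIR/<name>.txt SRC_DIR/<name>.pdf
extract_each() {
    cmd=$1
    src=$2
    dest=$3
    mkdir -p "$dest"
    # A for loop over a glob is expanded inside the shell, so it is not
    # subject to the argument-length limit that stops `extract *.pdf`.
    for pdf in "$src"/*.pdf; do
        [ -e "$pdf" ] || continue   # the pattern matched nothing
        "$cmd" "$dest/$(basename "$pdf" .pdf).txt" "$pdf"
    done
}

# Usage (assumes `extract` is on your PATH):
#   extract_each extract . out
```

This starts one process per file, so it will be slower than batching, but no call can ever exceed the argument limit.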
chepner
  • Hi chepner, thanks for the quick reply. I have a question: I'm using Node, so `extract` is a Node command, and the complete command should be `extract 'NAME_OF_OUTPUTFILE' 'PATH_TO_PDFS/*.PDF'`. How could I use the find command with mine? Is it possible? –  Oct 05 '15 at 19:05
  • `find . -maxdepth 1 -name '*.pdf' -exec extract NAME_OF_OUTPUTFILE {} +` – tripleee Oct 06 '15 at 05:11