Error: Argument list too long

sudo cp and find... -exec sudo cp yield

/usr/bin/sudo(or find): Argument list too long

I get this error in my Travis CI build output. My .travis.yml file uses a shell script (deploy.sh) to run a git diff against branch2's force-app folder, and the files it lists (several thousand) are copied with cp into a new directory:

sudo cp --parents $(git diff --name-only branch2 force-app/) $DEPLOYDIRECTORY;

The error is two lines:

./deploy.sh: line 122: /usr/bin/git: Argument list too long
./deploy.sh: line 122: /usr/bin/sudo: Argument list too long

I have the same issue with the subsequent find command. I use a naming convention to find and copy a file that corresponds with each file from the git diff:

for FILE in $GIT_DIFF_FILES; do
    if [[ $FILE == *Test.cls ]]; then
        find $classPath -maxdepth 1 -samefile "$FILE-meta.xml" -exec sudo cp --parents {} $DEPLOY_DIRECTORY +
    fi;
done;

Again, I get the error: ./deploy.sh: line 142: /usr/bin/find: Argument list too long.

Note: Travis CI is running an Ubuntu Linux (Xenial) virtual environment for this build.


What I've Tried

1. Resetting the stack size

Travis CI automatically sets the stack size to 8192 and the arg max to 8388608 for each new build. In my deploy.sh file, I run ulimit -s 9999999, which changes the stack size to 9999999 and the arg max to 2559999744.
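Since several commenters ask about the size of the environment, here is a minimal sketch for checking how close a build is to the limit. It only reads values and changes nothing. The variable names are illustrative, and the "longest entry" figure is approximate because it measures lines of env output. On Linux a single argument or environment string is also capped separately (128 KiB with 4 KiB pages), so one very large exported variable can trigger the error even for a short command.

```shell
# Total bytes the kernel allows for arguments plus environment.
arg_max=$(getconf ARG_MAX)

# Approximate size of the current environment in bytes.
env_bytes=$(env | wc -c)

# Length of the longest line printed by env (approximates the largest entry).
longest=$(env | awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')

echo "ARG_MAX:           $arg_max"
echo "environment bytes: $env_bytes"
echo "longest env entry: $longest"
```

Running this both locally and inside the Travis build would show whether the environment differs in size between the two.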

2. Command Variations

For the first sudo cp command, I've tried:

  1. formatting command as a for loop
  2. tar -cf - -C files... | tar xpf - -C target directory...
  3. formatting command as git diff ... | xargs cp ...

For the find command, I've tried:

  1. find ... | xargs -n 1000 sudo cp ....
    • Any different flags added to this command don't work either.
  2. find ... -exec cp ...
    • Any flags or syntax changes return the same error

Changing the command doesn't appear to be the solution

After reproducing the error locally, I found that my commands already work fine when interacting with a smaller number of files (~1000 or less).

However, once the output of git diff reaches roughly 1200-1500 files or more, the cp and find commands return:

Argument list too long

In addition, simply running find or cp from my shell script also returns:

Argument list too long

It therefore does not seem to matter what I do to these commands. Something more fundamental is causing the problem, but only when the file count of the git diff output exceeds approximately 1200 files.

How do I fix my cp and find commands?


How to Reproduce the Error

Here are the steps to reproduce this error for my use case. I used VS Code, so I recommend you do the same to match my setup as closely as possible. Note that you will need to replace USERNAME in the code below with your username.

Step 1: Fork this repo.

Step 2: Follow the steps below in your terminal

cd force-app/main/default
mkdir diff
git checkout -b branch2
cd classes

Add //comment to the bottom of myclass.cls and myclass.cls-meta.xml files. Save changes.

for n in {001..1500}; do cp myclass.cls myclass$n.cls; done
for n in {001..1500}; do cp myclass.cls-meta.xml myclass$n.cls-meta.xml; done
git add .
git commit -m "first commit"
git checkout master
for n in {001..1500}; do cp myclass.cls myclass$n.cls; done
for n in {001..1500}; do cp myclass.cls-meta.xml myclass$n.cls-meta.xml; done
git add .
git commit -m "second commit"
cd ../../../.. #back to the sfdx-travisci folder
sudo cp -p $(git diff --name-only branch2 /Users/USERNAME/sfdx-travisci/force-app/main/default/classes) /Users/USERNAME/sfdx-travisci/force-app/main/default/diff
for file in $(sudo cp -p $(git diff --name-only branch2 /Users/USERNAME/sfdx-travisci/force-app/main/default/classes) /Users/USERNAME/sfdx-travisci/force-app/main/default/diff); do if [[ $file == *.cls ]]; then find /Users/USERNAME/sfdx-travisci/force-app/main/default/classes -samefile "$file-meta.xml" -exec sudo cp -p {} /Users/USERNAME/sfdx-travisci/force-app/main/default/diff +; fi; done;
  • This can happen if you use a wildcard in a directory with thousands of files. – Barmar Aug 04 '20 at 18:40
  • @Barmar That still wouldn't explain why the first `sudo cp` command also returns the error message, right. – Jack Barsotti Aug 04 '20 at 18:43
  • You're saying that the 40 byte command `git diff --name-only branch2 force-app/` gives `/usr/bin/git: Argument list too long`? Sounds like you've put too much data in your environment variables. – that other guy Aug 04 '20 at 19:04
  • Quote your variable expansions... without quoting, if one of the variables is `*` it gets expanded. Try `find "$classPath" -maxdepth 1 -samefile "$FILE-meta.xml" -exec sudo cp --parents {} "$DEPLOYDIR" \;`. Or, I guess, your workstation is just running out of memory and/or you have too many environment variables. What does `printenv` output? – KamilCuk Aug 04 '20 at 20:48
  • @KamilCuk Good idea. Did not work, unfortunately – Jack Barsotti Aug 04 '20 at 20:53
  • Are you still looking into the size of your environment? – that other guy Aug 04 '20 at 21:17
  • @KamilCuk What info would be useful to you from running `printenv`? quite a lot is spit out – Jack Barsotti Aug 04 '20 at 21:42
  • Did you try `xargs -L100 ...`, which should group the arguments in batches of 100? – Walter A Aug 04 '20 at 21:47
  • @WalterA Just tried it, didn't work. Also tried -L10 with no luck. – Jack Barsotti Aug 04 '20 at 22:05
  • Maybe `printenv | wc`? – KamilCuk Aug 04 '20 at 22:15
  • @KamilCuk `printenv | wc` returns `27 37 1370` when run locally. But it returns `140 200 5953` when run through the shell script in my Travis build. – Jack Barsotti Aug 04 '20 at 23:55
  • `git diff --name-only branch2 force-app/ | wc` would be a useful command to run but apparently you ran it in a different directory? Do you have any Git aliases which would affect what `git diff` actually runs? – tripleee Aug 05 '20 at 06:36
  • Please try to come up with a [mcve]. Yes this also applies to shell commands. We don't have your git repo so please avoid git commands in there. What is the smallest command that causes the "argument list too long" error? Try replacing actual commands with `echo` and see what happens. – n. m. could be an AI Aug 05 '20 at 07:04
  • To really get to the bottom of this run `strace -f -v -s 99999999 -o strace.log `. This is a low-level debugging utility that traces system calls. You should be able to find a failing [`execve`](https://linux.die.net/man/2/execve) system call that's the root cause of this error (see `E2BIG` in the linked page). If so it will show the full argument list and environment variables and help you identify the higher-level culprit. – John Kugelman Aug 05 '20 at 13:52
  • @n.'pronouns'm. the smallest command that causes the error is simply `cp` or `find` – Jack Barsotti Aug 05 '20 at 14:11
  • So the answer you already got was correct all along. I see no reason to reopen this only to then close as a duplicate of a common FAQ. – tripleee Aug 05 '20 at 18:16
  • @tripleee I didn't close it. I also don't have a "correct" answer. I still don't know how to fix my approach and/or commands to get rid of the error given that I'm using `cp` and `find` for thousands of files. – Jack Barsotti Aug 05 '20 at 18:25
  • `git diff --name-only branch2 force-app/ | xargs sudo cp --parents -t "$DEPLOYDIRECTORY"` though your use of `sudo` should probably also be reduced or eliminated. – tripleee Aug 05 '20 at 18:26
  • @tripleee both with and without `sudo` I get the same error as before – Jack Barsotti Aug 05 '20 at 20:11
  • @JohnKugelman Seems like a good utility. I had trouble getting it to work though. I ran the command and first was greeted with `bash: strace: command not found`. I then ran `sudo apt install strace`, and now it returns: `/deploy.sh: line 123: /usr/bin/strace: Argument list too long` – Jack Barsotti Aug 05 '20 at 20:51
  • I can see you are doing a lot of work to get this resolved, but still I see no indication that this is different from the very basic [Argument list too long error for rm, cp, mv commands](https://stackoverflow.com/questions/11289551/argument-list-too-long-error-for-rm-cp-mv-commands). If you don't like that canonical, there are other FAQs from 30+ years of Unix bloggers explaining how this works; see e.g. https://www.in-ulm.de/~mascheck/various/argmax/ – tripleee Aug 06 '20 at 06:07
  • I don't mean to imply that removing `sudo` would help at all here; I mean that it's wrong for other reasons. – tripleee Aug 06 '20 at 06:10
  • Your repro steps are absolutely not minimal; for one thing, the final `for` loop is a no-op since it attempts to loop over the output of a command which doesn't output anything. – tripleee Aug 06 '20 at 07:27
  • Anyway, your repro steps don't appear to repro: I don't get anything at all in the `git diff`. https://repl.it/repls/LuminousAfraidDeeplearning#main.sh – tripleee Aug 06 '20 at 07:38
  • Nested command substitutions are a code smell, and command substitutions which produce a lot of output are always problematic. See also [useless use of backticks](http://www.iki.fi/era/unix/award.html#backticks) (which of course is so old that it only discusses backticks, not modern `$(command substitution)` syntax). – tripleee Aug 06 '20 at 08:24
  • @tripleee It's likely that the reason the repro didn't work for you is that you never made a comment to the class files (see my post) before the `git diff`. Otherwise I'm not sure why it wouldn't work - worked on my end. Thanks for sending those links. Some of them I've already previously looked at, tried the suggestions, and had no luck, which is why I don't think it was redundant to be posting my question. A couple of the links you sent look interesting and I will try implementing the suggestions I see... – Jack Barsotti Aug 06 '20 at 18:51
  • @tripleee would it be more prudent to eliminate `$(git diff ...)` and replace with a variable `$diff` that is called in my `sudo cp -p $diff $deploy_directory`? – Jack Barsotti Aug 06 '20 at 19:40
  • No, that'll fail for the same reason. You've got to stop passing thousands of file names to a single command. Whether you do it with `$(git diff ...)` or `$diff` you'll have the same problem. **Use a [while-read loop](https://stackoverflow.com/questions/19570413/how-to-pipe-input-to-a-bash-while-loop-and-preserve-variables-after-loop-ends) instead.** – John Kugelman Aug 06 '20 at 20:20
  • Thanks for pointing out the problem with the repro. I missed it because the instruction was switching between straight up code and messy meandering prose instructions. I still can't repro because repl.it runs out of disk space before getting this to exceed `ARG_MAX` which is 2M on their Linux. (I tried going up to 10000 changed files and that was not enough; now the disk fills up with 20k.) What's the output of `getconf ARG_MAX` for you? https://repl.it/repls/DimgrayDecisiveIde#main.sh – tripleee Aug 07 '20 at 05:13
  • Anyway, here is a refactoring which should hopefully do what you _actually_ wanted to, and avoid the problem. https://repl.it/repls/GloomyHollowBookmarks – tripleee Aug 07 '20 at 05:17
  • @tripleee cc: @JohnKugelman Thank you very much, that code worked for me and fixed my error. I had to make minor tweaks to it by adding back `sudo` and changing `-p` to `--parents`, and then the error disappeared. I also simplified the repro instructions a little, so hopefully it "meanders" less. btw: my getconf ARG_MAX is already listed in the question. Thanks again – Jack Barsotti Aug 07 '20 at 16:58
  • You say you tried to raise it, but I don't think you can actually set it that high; and the error indirectly confirms that you were still exceeding it. If you are on Linux I would guess you have the same 2M limit as everybody else, but then the repro instructions really should repro. – tripleee Aug 07 '20 at 20:27
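The while-read loop recommended in the comments can be sketched roughly as follows. This is not the refactoring linked above (its contents are not reproduced here), only an illustration of the idea. The function names `copy_one` and `copy_listed_files` are hypothetical, `mkdir -p` plus `cp -p` stands in for GNU `cp --parents`, and the sketch assumes each metadata file sits next to its class file so that no `find` is needed.

```shell
# copy_one SRC DEST: copy SRC under DEST, recreating SRC's directory path
# (a portable stand-in for GNU `cp --parents`).
copy_one() {
    mkdir -p "$2/$(dirname "$1")" && cp -p "$1" "$2/$1"
}

# copy_listed_files DEST: read one path per line from stdin and copy each.
# Every cp call receives a single path, so no argument list grows with the
# number of changed files.
copy_listed_files() {
    dest=$1
    while IFS= read -r file; do
        copy_one "$file" "$dest"
        # Assumption: FooTest.cls-meta.xml lives beside FooTest.cls, so its
        # name can be built directly instead of searched for with find.
        case $file in
            *Test.cls)
                if [ -f "$file-meta.xml" ]; then
                    copy_one "$file-meta.xml" "$dest"
                fi
                ;;
        esac
    done
}

# Intended use, with the names from the question:
#   git diff --name-only branch2 force-app/ | copy_listed_files "$DEPLOYDIRECTORY"

# Self-contained demonstration in a scratch directory:
demo=$(mktemp -d)
mkdir -p "$demo/src/classes" "$demo/out"
touch "$demo/src/classes/FooTest.cls" "$demo/src/classes/FooTest.cls-meta.xml"
(cd "$demo" && printf '%s\n' src/classes/FooTest.cls | copy_listed_files out)
```

Because the file names travel through the pipe rather than through a command substitution, the shell never has to build one huge command line or variable.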

1 Answer

Most operating systems limit the length of a command line. The usual solution is to pass the file names through a pipe rather than on the command line, and then use xargs to turn them back into arguments in batches.

For example you can use:

find . | xargs -n 1000 sudo cp ....

In such cases, xargs splits the arguments into batches of 1000 files and executes sudo cp once per batch. This way you should not run into problems (except with very long directory names and filenames).
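To make the batching visible, here is a small sketch that uses dummy names from `seq` in place of real files, so it can be run anywhere. The inner `sh -c` is only there to report how many arguments each run received. The commented command at the end is adapted from a comment on the question and assumes GNU cp for `-t` and `--parents`.

```shell
# Feed 2500 dummy names to xargs and let it run the command once per batch of
# at most 1000 arguments. The inner sh prints how many arguments it received.
batch_sizes=$(seq 1 2500 | xargs -n 1000 sh -c 'echo "$#"' sh)
batches=$(printf '%s\n' "$batch_sizes" | wc -l)

echo "$batch_sizes"
echo "batches: $batches"

# Applied to the question's case (assumes GNU cp):
#   git diff --name-only branch2 force-app/ |
#       xargs -n 1000 sudo cp --parents -t "$DEPLOYDIRECTORY"
```

For 2500 names this gives three runs of the command instead of one oversized one.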

If you want to execute sudo only once, you can move find inside the sudo invocation (warning: as root you may have access to more files, and you must make sure that the pipes and redirections happen inside sudo and not in the calling shell).

Note: you should also change diff to git diff.

We often use -print0 in find and -0 in xargs, so that special characters in filenames cause no problems (the NUL byte, U+0000, is forbidden in filenames on POSIX-compatible systems).
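A minimal sketch of the NUL-separated variant, using scratch directories and made-up file names. The `sh -c 'cp -- "$@" "$0"' "$dest"` construction is just one portable way to put the destination last without GNU `cp -t`: the destination is passed as `$0` and the batch of files arrives as `"$@"`.

```shell
# Scratch source and destination directories for the demonstration.
src=$(mktemp -d)
dest=$(mktemp -d)

# File names that would break a plain `find | xargs` pipeline.
touch "$src/plain.cls" "$src/with space.cls" "$src/it's quoted.cls"

# -print0 ends each name with a NUL byte and -0 splits on NUL only, so spaces
# and quotes inside names are passed through untouched.
find "$src" -type f -print0 |
    xargs -0 sh -c 'cp -- "$@" "$0"' "$dest"

copied=$(ls "$dest" | wc -l)
echo "copied: $copied"
```

When the list comes from git instead of find, `git diff --name-only -z` produces the same NUL-separated format for `xargs -0`.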

Note: if it works, you may want to send a bug report with the patch to the group that created the original file (if it did not originate with you). We often find this kind of bug: commands that fail in special cases.

Giacomo Catenazzi
  • Thanks for your answer. Both with and without `sudo`, I get the same error as before, however. xargs hasn't given me luck so far, regardless of the variations I try. – Jack Barsotti Aug 05 '20 at 20:12
  • You want find . | xargs -n 1000 -i sudo cp {} $DEPLOYDIRECTORY – Cosmicnet Jul 29 '22 at 10:28