48

I want to pass all the files as a single argument on Linux but I am not able to do that.

This works:

ls | sort -n | xargs  -i pdftk  {} cat output combinewd2.pdf

This runs pdftk once per file, with a single argument each time, but I want all the files passed in one command.

phuclv
user2027303

10 Answers

62

Use -I option:

echo prefix | xargs -I % echo % post

Output:

prefix post
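A quick check of how -I behaves when the input has several lines, as in the question. This is only a sketch: the words pre, post and the items a, b, c are made up, and echo stands in for the real command. With -I, xargs treats each input line as one item and runs the command once per item:

```shell
# Three input lines: -I substitutes one line at a time,
# so echo is executed three times rather than once.
per_line=$(printf '%s\n' a b c | xargs -I % echo pre % post)
printf '%s\n' "$per_line"
```

So -I is a good fit for placing a single value inside a command, but not for passing a whole list to one invocation.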
Hongbo Liu
  • 16
    However, this doesn't work as desired (it calls the command three times, rather than once with three arguments) if you use `ls` rather than `echo` as the input, which is what the OP is trying to do... – DNA Aug 14 '17 at 12:37
  • This treats the repl string as a single argument. – brian d foy Feb 14 '20 at 01:26
  • 2
    Don't know why it is top voted. The author is asking `passing all arguments` and this one doesn't answer the question. – Weihang Jian Feb 22 '22 at 06:16
30

This is one way to do it

pdftk $(ls | sort -n) cat output combinewd2.pdf

or using backticks:

pdftk `ls | sort -n` cat output combinewd2.pdf

For example, if the filenames are 100, 2, 9, 3.14, 10, 1 the command will be

pdftk 1 2 3.14 9 10 100 cat output combinewd2.pdf

To handle filenames with spaces or other special characters consider this fixed version of @joeytwiddle's excellent answer (which does not sort numerically, see discussion below):

#-- The following will handle special characters, and
#   will sort filenames numerically
#   e.g. filenames 100, 2, 9, 3.14, 10, 1 results in 
#      ./1 ./2 ./3.14 ./9 ./10 ./100
#
find . -maxdepth 1 -type f -print0 |
  sort -k1.3n -z -t '\0' |
  xargs -0 sh -c 'pdftk "$@" cat output combinewd2.pdf' "$0"

Alternatives to xargs (bash specific)

xargs is an external command; in the previous example it invokes sh, which in turn invokes pdftk.

An alternative is to use the builtin mapfile if available, or use the positional parameters. The following examples use two functions, print0_files generates the NUL terminated filenames and create_pdf invokes pdftk:

print0_files | create_pdf combinewd2.pdf

The functions are defined as follows

#-- Generate the NUL terminated filenames, numerically sorted
print0_files() {
    find . -maxdepth 1 -type f -print0 |
        sort -k1.3n -z -t '\0'
}
#-- Read NUL terminated filenames using mapfile
create_pdf() {
    mapfile -d ''
    pdftk "${MAPFILE[@]}" cat output "$1"
}
#-- Alternative using positional parameters
create_pdf() {
    local -r pdf=$1
    set --
    while IFS= read -r -d '' f; do set -- "$@" "$f"; done
    pdftk "$@" cat output "$pdf"
}

Discussion

As pointed out in the comments the simple initial answer does not work with filenames containing spaces or other special characters. The answer by @joeytwiddle does handle special characters, although it does not sort numerically

#-- The following will not sort numerically due to ./ prefix,
#   e.g. filenames 100, 2, 9, 3.14, 10, 1 results in 
#      ./1 ./10 ./100 ./2 ./3.14 ./9
#
find . -maxdepth 1 -type f -print0 |
  sort -zn |
  xargs -0 sh -c 'pdftk "$@" cat output combinewd2.pdf' "$0"

It does not sort numerically because the find command prefixes each filename with ./. Some versions of the find command support -printf '%P\0', which would not include the ./ prefix. A simpler, portable fix is to add the -d, --dictionary-order option to the sort command so that it considers only blank spaces and alphanumeric characters in comparisons, but this might still produce the wrong ordering:

#-- The following will not sort numerically due to decimals
#   e.g. filenames 100, 2, 9, 3.14, 10, 1 results in 
#      ./1 ./2 ./9 ./10 ./100 ./3.14
#
find . -maxdepth 1 -type f -print0 |
  sort -dzn |
  xargs -0 sh -c 'pdftk "$@" cat output combinewd2.pdf' "$0"

If filenames contain decimals, this can lead to incorrect numeric sorting. The sort command does allow an offset into a field when sorting (sort -k1.3n), but the field separator must be chosen carefully if filenames are to be as general as possible. Fortunately, sort -t '\0' specifies NUL as the field separator, and find -print0 uses NUL as the delimiter between filenames, so sort -z -t '\0' makes NUL both the record delimiter and the field separator; each filename is then a single-field record. Given that, we can skip the ./ prefix by starting the numeric sort at the 3rd character of the 1st field: sort -k1.3n -z -t '\0'.
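The key offset can be tried without touching any files. In this sketch, printf stands in for find -print0 and produces the same ./-prefixed, NUL-terminated names used in the examples above. It assumes GNU sort (for -z and -t '\0'). LC_ALL=C is my addition, so that the period in 3.14 is read as a decimal point regardless of locale:

```shell
# printf stands in for: find . -maxdepth 1 -type f -print0
# tr turns the NUL terminators into spaces so the result is printable.
sorted=$(printf './%s\0' 100 2 9 3.14 10 1 |
  LC_ALL=C sort -z -t '\0' -k1.3n |
  tr '\0' ' ')
printf '%s\n' "$sorted"
```

The names come out as ./1 ./2 ./3.14 ./9 ./10 ./100, matching the ordering given in the comment of the fixed version.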

amdn
  • 11,314
  • 33
  • 45
  • 3
    A very `bash`-centric answer, but cool nonetheless. This won't work in `csh/tcsh`, however. (all comments about shell choice > `/dev/null`) – radical7 Feb 01 '13 at 04:45
  • 1
    Crucially, **this will not work on filenames containing spaces**. The words will be broken into separate arguments. – joeytwiddle Feb 24 '16 at 09:08
  • @joeytwiddle right again, added sed command to escape double quotes – amdn Feb 24 '16 at 20:22
  • 2
    I fear your eval solution won't work on filenames containing `"` quotation marks `"`. I do like the simplicity of your `$(...)` approach for times when we know the filenames are free of whitespace. I've posted my solution as a separate answer. – joeytwiddle Feb 24 '16 at 20:22
  • I don't know how to break your `sed` solution. Good job! :) – joeytwiddle Feb 24 '16 at 20:23
  • Thanks! Yours is shorter, kudos to you as well – amdn Feb 24 '16 at 20:26
  • Can you fix the issue with spaces in filenames by changing the IFS, e.g. `IFS=$'\n'; echo pre $(ls | sort -n) post` ? – DNA Aug 14 '17 at 12:45
  • Use `pdftk "$(ls | sort -n)" cat output combinewd2.pdf` so you'll not have problems with whitespace or other special characters. – Mikko Rantalainen May 28 '20 at 17:23
  • 1
    This is a security bug waiting to happen due to lack of quoting. Never run these commands on untrusted filenames, e.g. an unpacked archive or something similar. – Remember Monica Oct 09 '21 at 14:55
  • @RememberMonica, fair criticism, thank you, I've updated the answer to remove eval and recommended the use of a fixed version of joeytwiddle's answer. – amdn Oct 10 '21 at 09:18
16

It’s ugly, but you can run sh -c and access the list of arguments passed by xargs as "${@}", like so:

ls | sort -n | xargs -d'\n' sh -c 'pdftk "${@}" cat output combinewd2.pdf' "${0}"

The extra "${0}" at the end is there because, as the sh man page says:

-c string

If the -c option is present, then commands are read from string. If there are arguments after the string, they are assigned to the positional parameters, starting with $0.

To test this, let’s first create some files with complicated names that will mess up most other solutions:

$ seq 1 100 | xargs -I{} touch '{} with "spaces"'
$ ls
1 with "spaces"    31 with "spaces"  54 with "spaces"  77 with "spaces"
10 with "spaces"   32 with "spaces"  55 with "spaces"  78 with "spaces"
100 with "spaces"  33 with "spaces"  56 with "spaces"  79 with "spaces"
11 with "spaces"   34 with "spaces"  57 with "spaces"  8 with "spaces"
12 with "spaces"   35 with "spaces"  58 with "spaces"  80 with "spaces"
13 with "spaces"   36 with "spaces"  59 with "spaces"  81 with "spaces"
14 with "spaces"   37 with "spaces"  6 with "spaces"   82 with "spaces"
15 with "spaces"   38 with "spaces"  60 with "spaces"  83 with "spaces"
16 with "spaces"   39 with "spaces"  61 with "spaces"  84 with "spaces"
17 with "spaces"   4 with "spaces"   62 with "spaces"  85 with "spaces"
18 with "spaces"   40 with "spaces"  63 with "spaces"  86 with "spaces"
19 with "spaces"   41 with "spaces"  64 with "spaces"  87 with "spaces"
2 with "spaces"    42 with "spaces"  65 with "spaces"  88 with "spaces"
20 with "spaces"   43 with "spaces"  66 with "spaces"  89 with "spaces"
21 with "spaces"   44 with "spaces"  67 with "spaces"  9 with "spaces"
22 with "spaces"   45 with "spaces"  68 with "spaces"  90 with "spaces"
23 with "spaces"   46 with "spaces"  69 with "spaces"  91 with "spaces"
24 with "spaces"   47 with "spaces"  7 with "spaces"   92 with "spaces"
25 with "spaces"   48 with "spaces"  70 with "spaces"  93 with "spaces"
26 with "spaces"   49 with "spaces"  71 with "spaces"  94 with "spaces"
27 with "spaces"   5 with "spaces"   72 with "spaces"  95 with "spaces"
28 with "spaces"   50 with "spaces"  73 with "spaces"  96 with "spaces"
29 with "spaces"   51 with "spaces"  74 with "spaces"  97 with "spaces"
3 with "spaces"    52 with "spaces"  75 with "spaces"  98 with "spaces"
30 with "spaces"   53 with "spaces"  76 with "spaces"  99 with "spaces"
$  ls | sort -n | xargs -d'\n' sh -c 'set -x; pdftk "${@}" cat output combinewd2.pdf' "${0}"
+ pdftk '1 with "spaces"' '2 with "spaces"' '3 with "spaces"' '4 with "spaces"' '5 with "spaces"' '6 with "spaces"' '7 with "spaces"' '8 with "spaces"' '9 with "spaces"' '10 with "spaces"' '11 with "spaces"' '12 with "spaces"' '13 with "spaces"' '14 with "spaces"' '15 with "spaces"' '16 with "spaces"' '17 with "spaces"' '18 with "spaces"' '19 with "spaces"' '20 with "spaces"' '21 with "spaces"' '22 with "spaces"' '23 with "spaces"' '24 with "spaces"' '25 with "spaces"' '26 with "spaces"' '27 with "spaces"' '28 with "spaces"' '29 with "spaces"' '30 with "spaces"' '31 with "spaces"' '32 with "spaces"' '33 with "spaces"' '34 with "spaces"' '35 with "spaces"' '36 with "spaces"' '37 with "spaces"' '38 with "spaces"' '39 with "spaces"' '40 with "spaces"' '41 with "spaces"' '42 with "spaces"' '43 with "spaces"' '44 with "spaces"' '45 with "spaces"' '46 with "spaces"' '47 with "spaces"' '48 with "spaces"' '49 with "spaces"' '50 with "spaces"' '51 with "spaces"' '52 with "spaces"' '53 with "spaces"' '54 with "spaces"' '55 with "spaces"' '56 with "spaces"' '57 with "spaces"' '58 with "spaces"' '59 with "spaces"' '60 with "spaces"' '61 with "spaces"' '62 with "spaces"' '63 with "spaces"' '64 with "spaces"' '65 with "spaces"' '66 with "spaces"' '67 with "spaces"' '68 with "spaces"' '69 with "spaces"' '70 with "spaces"' '71 with "spaces"' '72 with "spaces"' '73 with "spaces"' '74 with "spaces"' '75 with "spaces"' '76 with "spaces"' '77 with "spaces"' '78 with "spaces"' '79 with "spaces"' '80 with "spaces"' '81 with "spaces"' '82 with "spaces"' '83 with "spaces"' '84 with "spaces"' '85 with "spaces"' '86 with "spaces"' '87 with "spaces"' '88 with "spaces"' '89 with "spaces"' '90 with "spaces"' '91 with "spaces"' '92 with "spaces"' '93 with "spaces"' '94 with "spaces"' '95 with "spaces"' '96 with "spaces"' '97 with "spaces"' '98 with "spaces"' '99 with "spaces"' '100 with "spaces"' cat output combinewd2.pdf

All the arguments are quoted correctly. Note that this will fail if any filenames contain newlines, and that ls -v is basically ls | sort -n.

andrewdotn
  • This works on filenames containing spaces, but not on filenames containing newlines. Although those aren't very common, they can be handled correctly with: `find . -type f -maxdepth 1 -print0 | sort -zn | xargs -0 sh -c ...` – joeytwiddle Feb 24 '16 at 09:15
  • Although if we are using `find` then we don't need `xargs` at all! We can use `find ... -exec [command] {} +` as recommended in [BashFAQ/020](http://mywiki.wooledge.org/BashFAQ/020). – joeytwiddle Feb 24 '16 at 09:16
  • @joeytwiddle Yup, use `find` instead of `ls` if there might be newlines in filenames. – andrewdotn Feb 24 '16 at 16:30
  • But argh, we can't use find directly in this case, because of the OP's desire to sort. By the way, I think the `"$0"` should go inside the shell's command. It's an odd quirk of `sh -c 'foo' bar baz bim` that *bar* gets passed as the $0 argument, with *baz* and *bim* in $@, while *foo* get executed. Experiment with: `sh -c 'echo "$0" "$@"' a b c` – joeytwiddle Feb 24 '16 at 20:20
  • 1
    @joeytwiddle `$0` is outside so that the parent’s `$0` gets passed to the child, but `$@` expands to `$1 ...` on from xargs. It’s very subtle, which is why I lead with “It’s ugly, but …” – andrewdotn Feb 24 '16 at 21:30
  • You are quite right. When there are 0 lines of input, your `"$0"` ***outside works fine***, but my `"$0"` inside sees the first argument as `"sh"` or `"bash"`! Here is an example which breaks (remove the grep for non-empty input): `seq 1 3 | grep X | xargs -d'\n' bash -c 'echo "$0" "${@}"'` – joeytwiddle Mar 12 '16 at 04:46
  • nice solution! i keep coming back to this. the use of `${0}` is a bit confusing, though. it may as well be `sh` or `false` or `ignored`, since it's just used to offset the positional parameters received in from xargs. – RubyTuesdayDONO Apr 16 '20 at 19:29
  • While you can probably get away with passing a dummy value, I pass `${0}` explicitly so that the value of `${0}` in the child process is the same as in the parent. – andrewdotn Apr 16 '20 at 22:54
8

This should work on filenames containing spaces, newlines, apostrophes and quotation marks (all of which are possible on UNIX filesystems):

find . -maxdepth 1 -type f -print0 |
  sort -zn |
  xargs -0 sh -c 'pdftk "$@" cat output combinewd2.pdf' "$0"

That might be overkill compared to the accepted answer, if you know you are working with simple filenames.

But if you are writing a script that will be used again in future, it is desirable that it won't explode one day when it meets unusual (but valid) inputs.

This is basically an adaptation of andrewdotn's answer which terminates input files with a zero-byte, instead of with a newline, hence preserving filenames which contain one or more newline characters.

The respective options -print0, -z and -0 tell each of the programs that input/output should be delimited by the zero-byte. Three different programs, three different arguments!
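One way to convince yourself that the three NUL options line up is to count the arguments the child shell receives. This is a sketch under assumptions: the scratch directory and the three file names are made up, pdftk is replaced by echo "$#", and GNU find, sort and xargs are assumed:

```shell
# Scratch directory with awkward names: a space, double quotes, an embedded newline
dir=$(mktemp -d)
touch "$dir/1 one" "$dir/2 \"two\"" "$dir/3
three"

# Each name should arrive as exactly one argument; $# counts them.
count=$(cd "$dir" && find . -maxdepth 1 -type f -print0 |
  sort -z |
  xargs -0 sh -c 'echo "$#"' sh)
echo "arguments received: $count"

rm -rf "$dir"
```

If any stage split on whitespace or newlines, the count would be higher than the number of files.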

Community
joeytwiddle
7

The most intuitive way I found was to:

  • first construct the commands with -I{} and "echo",
  • then execute the commands with "bash" (as if you are executing a shell script)

Here is an example to rename extensions from ".txt" to ".txt.json":

find .|grep txt$|xargs -I{} echo "mv {} {}.json"|bash

A slightly more advanced example that renames .txt to .json (removing the .txt extension):

find $PWD|grep txt$|cut -d"." -f1|xargs -I{} echo "mv {}.txt {}.json"|bash

I once had a requirement to append the string "End of File" to all files.

find .|grep txt|xargs -I{} echo "echo End of File >> {}"|bash

If you do it right, xargs is the king of all commands!
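One caveat with generated command strings is that bash parses them again, so a filename containing spaces or shell metacharacters can change the command that runs. Here is a sketch of the first rename that avoids the second parse by letting find pass the names as arguments. It assumes a POSIX find and sh, and the scratch directory and file names are made up for illustration:

```shell
# Scratch directory so the sketch does not touch real files
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"

# find hands the names to sh as positional parameters ("$@"),
# so they are never re-parsed as shell code.
find "$dir" -type f -name '*.txt' -exec sh -c '
  for f in "$@"; do
    mv -- "$f" "$f.json"
  done' sh {} +

ls "$dir"
```

The trailing sh after the script fills $0, so every filename lands in "$@".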

Thyag
5

Here's what I did for the same problem, and am actually using in production:

cat chapter_order-pdf-sample.txt | xargs -J % pdftk % cat output sample.pdf
brian d foy
4

You can do this by chaining two calls to xargs. Use the first to chain all of the args together into one string and pass that as a param to echo, and the second using -I to place that chain of args into the place where you want it, as follows:

ls | sort -n | xargs echo | xargs -I {} pdftk {} cat output combinewd2.pdf
JakeRobb
  • 1
    This treats the replstr as a single argument, so pdftk look for a file that has the name of all files concatenated with spaces. – brian d foy Feb 14 '20 at 01:22
  • Interesting. It's been almost two years; I am not a pdftk user and so tested this with some other command (which I don't recall at this time). It worked for me. That could perhaps be a quirk of pdftk or some other factor. – JakeRobb Mar 02 '20 at 18:09
  • @JakeRobb no, it's a consequence of the `xargs echo` and of how xargs processes standard input. If you remove the `xargs echo` from the pipe, it should work as expected. And while `find` is actually the recommended tool to generate a list of files, there are ways to make using `ls` more robust for use in scripting. `/bin/ls -1 --zero | sort -nz | xargs -r0 -I {} pdftk {} cat output combinewd2.pdf`. The `/bin/` prefix is so it doesn't use an alias or function that might be defined in your shell. You may have to use `/usr/bin/ls` if it's not in `/bin` on your system. Or use `env ls`. – blubberdiblub Apr 04 '23 at 21:02
4

Don't follow any of the xargs -I solutions: -I can't handle white space correctly and won't work as you expect.

TLDR

ls | sort -n | xargs -d '\n' sh -c 'echo pdftk "$@" cat output combinewd2.pdf' sh

For the following solutions, I'll use ruby -e 'p ARGV' to inspect arguments, for your reference:

ruby -e 'p ARGV' foo bar buz
["foo", "bar", "buz"]

BSD xargs

This is the easiest way if you use a BSD system (like macOS).

echo 3 4 | xargs -J@ ruby -e 'p ARGV' 1 2 @ 5 6
["1", "2", "3", "4", "5", "6"]

references:

-J  replstr
         If this option is specified, xargs will use the data read from
         standard input to replace the first occurrence of replstr instead
         of appending that data after all other arguments.  This option
         will not affect how many arguments will be read from input (-n),
         or the size of the command(s) xargs will generate (-s).  The
         option just moves where those arguments will be placed in the
         command(s) that are executed.  The replstr must show up as a
         distinct argument to xargs.  It will not be recognized if, for
         instance, it is in the middle of a quoted string.  Furthermore,
         only the first occurrence of the replstr will be replaced.  For
         example, the following command will copy the list of files and
         directories which start with an uppercase letter in the current
         directory to destdir:

           /bin/ls -1d [A-Z]* | xargs -J % cp -Rp % destdir

GNU xargs

Because GNU xargs has no -J flag, here we create a sub-shell and use $@ to solve the problem.

It might be a little tricky but easy to remember if you know how $@ works.

echo 3 4 | xargs sh -c 'ruby -e "p ARGV" 1 2 "$@" 5 6' sh
["1", "2", "3", "4", "5", "6"]

Note that the last sh is required, otherwise $0 would be "3" and the output would be ["1", "2", "4", "5", "6"]. It doesn't need to be sh; any sensible name for $0 will do.

-c string

    If the -c option is present, then commands are read from string.
    If there are arguments after the string, they are assigned to the positional
    parameters, starting with $0. 
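The effect of the placeholder can be seen without ruby. This sketch uses echo in place of ruby -e "p ARGV" and runs the same pipeline with and without the trailing sh. The variable names are made up:

```shell
# Without a placeholder, the first input item becomes $0 and is missing from "$@".
without_placeholder=$(echo 3 4 | xargs sh -c 'echo 1 2 "$@" 5 6')

# With a placeholder (here the arbitrary word "sh"), every item lands in "$@".
with_placeholder=$(echo 3 4 | xargs sh -c 'echo 1 2 "$@" 5 6' sh)

printf 'without: %s\nwith:    %s\n' "$without_placeholder" "$with_placeholder"
```

The first line lacks the 3 and the second contains all six numbers.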

Why not -I?

Because it cannot handle white space correctly, for example:

echo  3 4 | xargs -I @ ruby -e 'p ARGV' 1 2 @ 5 6
["1", "2", "3 4", "5", "6"]

another example:

printf "hello world\0goodbye world" | xargs -0 -I @ ruby -e 'p ARGV' before @ after
["before", "hello world", "after"]
["before", "goodbye world", "after"]

Using a sub-shell and $@ gives the expected output:

printf "hello world\0goodbye world" | xargs -0 sh -c 'ruby -e "p ARGV" before "$@" after' sh
["before", "hello world", "goodbye world", "after"]
Weihang Jian
  • 1
    Note that even in your last example there is no guarantee that ruby will be run just once (hence there might be more than 1 "before" and 1 "after"). `xargs` has a default limit with its `-s` option, but you cannot increase it past the system limit. In effect, your called command or script should always be able to handle the case that it's called multiple times with just a subset of the items from the list for each pass. In the most extreme case it could be called once for each single item in the list. I.e. like your second to last example, which is actually a fine way to use `-I`. – blubberdiblub Apr 04 '23 at 20:50
0

I know this is not the OP's question, but I found this useful.

If you want to reshuffle your arguments you can use parallel in combination with xargs.

# limit the number of arguments per line
echo last first middle last first middle | xargs -n 3
last first middle
last first middle

GNU-parallel can now reshuffle those arguments at will with the --colsep argument.

echo last first middle last first middle | xargs -n 3 | parallel --colsep=" " echo {2} {3} {1}
first middle last
first middle last

You can also add constant arguments in there.

echo last first middle last first middle | xargs -n 3 | parallel --colsep=" " echo {2} {3} middle2 {1}
first middle middle2 last
first middle middle2 last
ssanch
  • Why not just use parallel instead of combining xargs and parallel? – goji Feb 04 '21 at 03:44
  • Good question. It is just to generate a sort of space-delimited `STDIN` list of arguments. An equivalent way to do this would be: `echo -e "last first middle\nlast first middle" | parallel ...` – ssanch Mar 22 '21 at 15:28
0

Here's a version that works even in plain sh across many versions of xargs:

{
  ls | sort -n
  echo cat output combinewd2.pdf
} | xargs pdftk

It uses command grouping to send the output of multiple commands to a single xargs invocation:

{ list; }

Placing a list of commands between curly braces causes the list to be executed in the current shell context.

That way you get the output of ls | sort -n first, then the additional arguments.
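To see the grouping at work without pdftk installed, here is a dry-run sketch. printf stands in for ls | sort -n, and echo pdftk prints the command line that xargs would otherwise execute. The three file names are made up:

```shell
combined=$({
  printf '%s\n' 1 2 10            # stand-in for: ls | sort -n
  echo cat output combinewd2.pdf  # the trailing arguments
} | xargs echo pdftk)             # echo prints the command instead of running it
printf '%s\n' "$combined"
```

This prints pdftk 1 2 10 cat output combinewd2.pdf: the file names first, then the trailing arguments, all in one invocation.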

A Safe Version

Paths can contain spaces (not uncommon) and even line breaks (less common). For anything more than a single-use command, not handling those cases will break unpredictably and unexpectedly. Also, parsing ls should be avoided.

Instead, use find with its -print0 directive to print file names null-character separated. sort has the --zero-terminated switch to support this case. Add the additional arguments with printf '%s\0' cat output combinewd2.pdf, which adds the null separator. Finally, xargs --null passes everything to pdftk again:

{
  find . -maxdepth 1 -name '*.pdf' -print0 \
    | sort --zero-terminated --key 1.3n --field-separator '\0'
  printf '%s\0' cat output combinewd2.pdf
} | xargs --null pdftk

The arguments to sort are discussed in amdn's answer. This has the disadvantage of requiring GNU coreutils.

However, it has the advantage of:

  • working in plain sh
  • allowing the trailing arguments (cat output combinewd2.pdf) to be dynamic, for example "$@" instead of cat output combinewd2.pdf.

Just Use Python

If this needs to be readable and maintainable, just use a fully-fledged scripting language:

from subprocess import run
from pathlib import Path

run(
  ["pdftk"]
  + sorted(p.name for p in Path.cwd().glob("*.pdf") if p.is_file())
  + ["cat", "output", "combinewd2.pdf"]
)
just-max