
I have a variable ($var) that contains a series of numbers, sorted and separated by commas. For example: 3,31,35,57,85,108,120,130,193,234,266,354,369,406,430,438,472,490,503,553,579,591,629,670,715,742,768,792,813

My original plan was to use them to extract a series of columns from file1 with the following command: cat file1 | cut -f "$var"

The terminal reports a problem: "Argument list too long". My objective remains the same: what strategy/alternative could I follow? I need something that lets me obtain all the columns (whether or not they are saved to a file), using a loop or anything else that saves me from doing it manually, column by column.

A (smaller) example of the desired output:

123  299    429
12   0      2 
0    0      2
4    15     20
4    22     27
3    2      7
0    0      0
61   155    77
8327 5961   10023
5    11     17 
5777 8840   5669 
10   3      1 
53   365    199 
1    0      3 
26   31     15 
1    0      0
  • There must be a lot of columns to get 'argument list too long'. Is it the shell or `cut` that is giving the error? If it is the shell, then you probably have in excess of 128 KiB of argument in the string (it would require more than 256 KiB on macOS). That indicates that you have massively long lines in the file, too. If, instead, it is `cut` complaining, then you need to rebuild it with whatever limit is causing it to generate the message raised large enough. – Jonathan Leffler Feb 10 '21 at 18:49
  • See [To check the E2BIG error condition in `exec`](https://stackoverflow.com/q/18559403) for code that checks how big a command line you can use. The 'argument list too long' error corresponds to `E2BIG` in `<errno.h>` (in C code). – Jonathan Leffler Feb 11 '21 at 23:33
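
For a quick look at the numbers involved, the following sketch (assuming bash on a Linux system) compares the variable length against the system limit; note that Linux also enforces a separate per-argument limit, which is likely the 128 KiB figure mentioned in the comments above:

getconf ARG_MAX    # total space allowed for arguments plus environment, in bytes
echo "${#var}"     # number of characters currently held in $var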

2 Answers


Assuming:

  • Your file1 contains a huge number of columns separated by tabs.
  • You want to select the columns listed in the bash variable "$var".
  • The list is too long to be accepted as a command line argument.

Then you could try this awk solution:

#!/bin/bash

var="3,31,35,57,85,108,120,130,193,234,266,354,369,406,430,438,472,490,503,553,579,591,629,670,715,742,768,792,813"
echo "$var" > list
# the "echo" command above is for the demonstration purpose only.
# please create a csv file "list" which contains the column list as "var".

awk '
    BEGIN {FS = OFS = "\t"}                     # assign the field separators to TAB
    NR==FNR {len = split($0, a, ","); next}     # read "list" and split the column numbers into array "a"
    {
        result = $a[1]                          # start with the column whose number is a[1]
        for (i = 2; i <= len; i++)              # loop over the remaining column numbers
            result = result OFS $a[i]           # append the column whose number is a[i]
        print result                            # print the record
    }
' list file1

We need to store the list of column numbers in a separate file "list" to avoid the "Argument list too long" error. I have tested this with a column list of approx. 600 KB and it works.
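
If you would rather not create the temporary file "list", a variation is to feed the column list through process substitution (a sketch assuming bash; printf is a shell builtin, so the long list never becomes an argument to an external command):

#!/bin/bash
# Sketch: same awk logic as above, but the column list is read from a
# process substitution instead of the file "list".
var="3,31,35,57,85"    # shortened here for illustration

awk '
    BEGIN {FS = OFS = "\t"}
    NR==FNR {len = split($0, a, ","); next}
    {
        result = $a[1]
        for (i = 2; i <= len; i++)
            result = result OFS $a[i]
        print result
    }
' <(printf '%s\n' "$var") file1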

tshiono

Use awk:

awk -F, '{ print $4","$7 }' file1

In this example, we set the field delimiter to , with -F and then print the 4th and 7th fields/columns only.

Raman Sailopal
  • If the list is too long for the shell, then the `awk` script on the command line will be too long too. In that case, you'd have to create the `awk` code in a file (e.g. `script.awk`) and then run `awk -F, -f script.awk file1`. – Jonathan Leffler Feb 10 '21 at 18:56
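
A sketch of the approach described in the comment above (assuming bash and keeping this answer's comma delimiter; the file name script.awk is just an example): the awk program is generated into a file, so only the short -f option has to fit on the command line, no matter how long the column list is.

#!/bin/bash
# Sketch: build script.awk from the column list, then run it with -f.
var="3,31,35,57,85"                      # shortened here for illustration

{
    printf '{ print '
    sep=''
    IFS=',' read -ra cols <<< "$var"     # split the comma-separated list
    for c in "${cols[@]}"; do
        printf '%s$%s' "$sep" "$c"       # emit $3, $31, $35, ...
        sep=', '
    done
    printf ' }\n'
} > script.awk

awk -F, -f script.awk file1

The generated script.awk then contains a single line such as { print $3, $31, $35, $57, $85 }.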