
I have a CDR file (.csv) that contains around 150 columns and is very large. I'm trying to get the rows where the 31st column has the value "13".

I'm trying the command below:

awk -F',' '$31~/^13/' report_1.csv > report_2.csv

But I'm getting the following error:

awk: record `,1402786,535,1,47432... has too many fields record number 1`

Any help?

User123

3 Answers


I suggest:

awk -F',' '$31 == "13"' report_1.csv > report_2.csv
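A small sketch (with dummy data, not your real CDR file) of how this exact match differs from the prefix match in the question:

```shell
# Build a tiny sample file where the 31st field is 13, 130, and 12
# on successive lines (30 dummy fields "x," in front).
prefix=$(printf 'x,%.0s' $(seq 30))
printf '%s\n' "${prefix}13" "${prefix}130" "${prefix}12" > sample.csv

# The prefix match from the question keeps both 13 and 130:
awk -F',' '$31 ~ /^13/' sample.csv

# The exact match keeps only the line whose 31st field is 13:
awk -F',' '$31 == "13"' sample.csv
```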
Cyrus

The limit on the number of fields shouldn't be as low as 150, so I'm guessing you're probably not parsing your CSV file properly.

In particular, you should not split on just any comma; you should avoid splitting on , within quoted fields ("like,this").

If you're using GNU awk, proper CSV parsing is pretty simple via FPAT (according to this excellent answer by @Ed Morton):

awk -v FPAT='[^,]*|"[^"]+"' '$31 ~ /^13/' file

or, for an exact match:

awk -v FPAT='[^,]*|"[^"]+"' '$31 == "13"' file

If you're not using GNU awk, refer to the cited answer for an alternative parsing method.
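To see the difference FPAT makes, here's a small hypothetical example with a quoted field containing a comma:

```shell
# A two-field line where the first field contains an embedded comma:
echo '"like,this",13' > quoted.csv

# Naive splitting on every comma sees three fields:
awk -F',' '{print NF}' quoted.csv

# FPAT-based parsing (GNU awk) keeps the quoted field intact,
# so it sees two fields:
gawk -v FPAT='[^,]*|"[^"]+"' '{print NF}' quoted.csv
```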

randomir
  • Thanks for this.... I need to ask: if I want the output where the 31st column is "13", the 56th column is "ABC", and the 80th column, what should the awk command be? – User123 Nov 04 '17 at 17:02
    Simply combine the conditions: ' $31 == "13" && $56 == "ABC" && $80 ~ /.../ '. – randomir Nov 04 '17 at 17:24
  • @User123, I see you already asked this in a [new question](https://stackoverflow.com/questions/47113325/trying-to-get-the-results-using-awk) before I was able to answer your comment :) Anyhow, does this answer help you, and/or does it solve the initial problem you had? – randomir Nov 04 '17 at 20:49

Some implementations of awk have a maximum number of fields; mawk, for example. You can test this easily by assigning to NF, like this:

$ mawk 'BEGIN{NF=32768}'
mawk: program limit exceeded: maximum number of fields size=32767
        FILENAME="" FNR=0 NR=0

To work around this, you can use GNU awk (gawk), which does not have such an explicit limit.

$ gawk 'BEGIN{NF=32768}'
$ gawk 'BEGIN{NF=1000000}'

Well, it is still limited by the amount of available memory, but that should allow you at least millions of fields on a normal PC.
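As a quick sanity check, you can print how many fields your awk actually sees on the first record of the file (report_1.csv is the file name from the question) and compare it against the roughly 150 you expect:

```shell
# Print the field count of the first line only, then stop.
awk -F',' 'NR == 1 {print NF; exit}' report_1.csv
```

If the number printed is far larger than 150, the line is being split on commas inside quoted fields, which points back to the parsing issue rather than an awk limit.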

PS: You might need to install gawk, and of course processing such large files might be slow.

hek2mgl