Unable to remove last field from CSV file

Question

I have a CSV file containing data like the sample below. I need to keep every field as it is, except the last one.

"one","two","this has comment section1"
"one","two","this has comment section2 and ( anything ) can come here ( ok!!!"

gawk 'BEGIN {FS=",";OFS=","}{sub(FS $NF, x)}1'

gives the error: fatal: Unmatched ( or \(:

I know that removing the '(' from the second line solves the problem, but I cannot remove anything from the comment section.

Possible duplicate of [bash method to remove last 4 columns from csv file](https://stackoverflow.com/questions/14418511/bash-method-to-remove-last-4-columns-from-csv-file) — Corentin Limier, Jul 22 '19 at 13:37
The 3 characters important to include in an example of a field that can contain "anything" in a quoted-fields CSV are `,`, `"`, and `\n` (a literal newline). Include those in your example if you really do mean "anything" when you say it or clarify what you mean by "anything" otherwise. — Ed Morton, Jul 22 '19 at 15:07
The example given allows `cut -d"," -f1-2` as an approach, when you can have `...,"a field with , inside",...` please make this clear. — Walter A, Jul 22 '19 at 20:56
Thanks Ed and Walter. Sorry, I did not mean that anything can come here. – CIGEEK, Jul 23 '19 at 06:17

RavinderSingh13 · Accepted Answer · 2019-07-23T06:23:05.240

2

With any awk you could try:

awk 'BEGIN{FS=",";OFS=","}{$NF="";sub(/,$/,"")}1'  Input_file

Or with GNU awk try:

awk 'BEGIN{FS=",";OFS=","}NF{--NF};1' Input_file
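A quick way to check both commands is to run them on the two sample lines from the question. This is only a sketch: the temporary file stands in for Input_file, and the second command relies on awk rebuilding the record when NF is decremented, which GNU awk does but POSIX does not guarantee.

```shell
# Temporary stand-in for Input_file, holding the two lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"one","two","this has comment section1"
"one","two","this has comment section2 and ( anything ) can come here ( ok!!!"
EOF

# Any awk: emptying $NF rebuilds $0 with OFS, leaving a trailing comma
# that sub() then strips. No part of the data is used as a regex.
portable=$(awk 'BEGIN{FS=",";OFS=","}{$NF="";sub(/,$/,"")}1' "$sample")
printf '%s\n' "$portable"

# GNU awk: the pattern NF runs the action only on lines that have fields;
# --NF drops the last field and rebuilds $0 from the remaining ones;
# the final 1 prints every record.
awk 'BEGIN{FS=",";OFS=","}NF{--NF};1' "$sample"

rm -f "$sample"
```

Neither sample comment contains a comma, so with GNU awk both commands should print "one","two" for each line.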

edited Jul 23 '19 at 06:23

answered Jul 22 '19 at 12:42



There's no need to escape a `,` in a regexp, it's not a meta-character. That second script is undefined behavior per POSIX so it'd do different things in different awks. – Ed Morton Jul 22 '19 at 15:05

Thanks Ravi and Ed, this works fine for me. Please explain how the code below works: awk 'BEGIN{FS=",";OFS=","}NF{--NF};1' – CIGEEK Jul 23 '19 at 06:16

kvantour · Answer 2 · 2019-07-22T13:23:25.833

Since you mention that anything can come here, you might also have a line that looks like:

"one","two","comment with a , comma"

So using the comma character alone as a field separator is not reliable.
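The problem can be seen by counting fields on that line. This sketch assumes nothing beyond a POSIX awk; the variable names are only for illustration.

```shell
line='"one","two","comment with a , comma"'

# With FS="," the comma inside the quotes also splits, so awk sees 4 fields.
count=$(printf '%s\n' "$line" | awk -F, '{print NF}')
echo "$count"

# Dropping "the last field" then removes only the tail of the comment
# and leaves the first half behind, with its opening quote unbalanced.
result=$(printf '%s\n' "$line" | awk 'BEGIN{FS=",";OFS=","}{$NF="";sub(/,$/,"")}1')
echo "$result"
```

The three-column line is reported as four fields, and the output still carries the start of the comment.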

The following two posts are now very handy:

What's the most robust way to efficiently parse CSV using awk?
[U&L] How to delete the last column of a file in Linux (Note: this is only for GNU awk)

Since you work with GNU awk, you can use any of the following three equivalent commands, which differ only in how the FPAT string is quoted:

$ awk -v FPAT='[^,]*|"[^"]+"' -v OFS="," 'NF{NF--}1'
$ awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\"";OFS=","}NF{NF--}1'
$ awk 'BEGIN{FPAT="[^,]*|\042[^\042]+\042";OFS=","}NF{NF--}1'
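As a sketch of what FPAT changes, the first command can be run on the line with the embedded comma. It assumes GNU awk is installed under the name gawk; the variable names are hypothetical.

```shell
line='"one","two","comment with a , comma"'

# FPAT is a GNU awk feature, so only run the check when gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    # FPAT describes what a field looks like: either a run of non-commas
    # or a double-quoted string, so the quoted comma stays inside field 3.
    fpat_out=$(printf '%s\n' "$line" | gawk -v FPAT='[^,]*|"[^"]+"' -v OFS="," 'NF{NF--}1')
    printf '%s\n' "$fpat_out"
fi
```

Note that this pattern does not cover quoted fields that contain a doubled "" or a literal newline.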

Why your command fails: awk's sub(ere,repl,in) treats its first argument, ere, as an extended regular expression. Your call builds that expression from FS $NF, so the unmatched ( in the comment is read as the start of a group rather than as a literal character, and the regular expression fails to compile. If you want to replace a field whose position is known, you should not use sub, but just redefine the field:

$ awk '{$NF=""}1'
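The failure and a regex-free alternative can be sketched as follows. The substr/length approach is an illustration added for comparison, not one of the commands from the answers, and it assumes the last field contains no comma.

```shell
line='"one","two","this has comment section2 and ( anything ) can come here ( ok!!!"'

# The question's command builds a regex from the data (FS $NF), so the
# unbalanced "(" makes the regex invalid and awk exits with an error.
if printf '%s\n' "$line" | awk 'BEGIN{FS=",";OFS=","}{sub(FS $NF, x)}1' >/dev/null 2>&1; then
    regex_failed=no
else
    regex_failed=yes
fi
echo "dynamic regex failed: $regex_failed"

# Treating the field as plain text avoids the problem: cut off the last
# field plus the comma in front of it by length instead of by pattern.
trimmed=$(printf '%s\n' "$line" | awk -F, '{print substr($0, 1, length($0) - length($NF) - 1)}')
echo "$trimmed"
```

Because no part of the record is ever interpreted as a pattern, parentheses and other regex metacharacters in the comment are harmless here.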

If you want to replace every literal occurrence of a field's text, use index and substr, which do plain string matching (number is the field's position):

s=$(number);while(i=index($0,s)){$0=substr($0,1,i-1) "repl" substr($0,i+length(s))}
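Wrapped into a complete command, the loop might look like the sketch below. The input line, the field number passed as number, and the replacement text passed as repl are made up for illustration. The loop also assumes repl does not itself contain the searched text, otherwise it would never finish.

```shell
# Hypothetical input: field 1 is "a(b", which would be an invalid regex.
out=$(printf '%s\n' 'a(b,x,a(b' | awk -F, -v number=1 -v repl='Z' '{
    s = $(number)                                  # literal text to look for
    # index() does a plain string search, so "(" needs no escaping.
    while (s != "" && (i = index($0, s)) > 0)
        $0 = substr($0, 1, i - 1) repl substr($0, i + length(s))
    print
}')
echo "$out"
```

Both occurrences of the field's text are replaced, giving Z,x,Z.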
