
I have a CSV file containing data like the following. I need to keep all fields as they are, except the last one, which should be removed.

"one","two","this has comment section1"
"one","two","this has comment section2 and ( anything ) can come here ( ok!!!"

gawk 'BEGIN {FS=",";OFS=","}{sub(FS $NF, x)}1'

This gives the error: fatal: Unmatched ( or \(:

I know that removing the '(' from the second line solves the problem, but I cannot remove anything from the comment section.

RavinderSingh13
CIGEEK
  • Possible duplicate of [bash method to remove last 4 columns from csv file](https://stackoverflow.com/questions/14418511/bash-method-to-remove-last-4-columns-from-csv-file) – Corentin Limier Jul 22 '19 at 13:37
  • The 3 characters important to include in an example of a field that can contain "anything" in a quoted-fields CSV are `,`, `"`, and `\n` (a literal newline). Include those in your example if you really do mean "anything" when you say it or clarify what you mean by "anything" otherwise. – Ed Morton Jul 22 '19 at 15:07
  • The example given allows `cut -d"," -f1-2` as an approach, when you can have `...,"a field with , inside",...` please make this clear. – Walter A Jul 22 '19 at 20:56
  • Thanks Ed and Walter, sorry i do not mean anything can come here. – CIGEEK Jul 23 '19 at 06:17

2 Answers


With any awk you could try:

awk 'BEGIN{FS=",";OFS=","}{$NF="";sub(/,$/,"")}1'  Input_file

Or with GNU awk try:

awk 'BEGIN{FS=",";OFS=","}NF{--NF};1' Input_file
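As an illustration, here is a minimal sketch that runs the first (portable) command on the two sample lines from the question. The wrapper name `drop_last_field` is made up for this example. It relies on the assumption that the comment itself contains no comma, since the field separator is a plain `,`.

```shell
# Hypothetical wrapper around the portable command; reads CSV lines from stdin.
drop_last_field() {
  # Empty the last field (awk rebuilds $0 with OFS), then strip the
  # trailing comma that the emptied field leaves behind.
  awk 'BEGIN{FS=",";OFS=","}{$NF="";sub(/,$/,"")}1'
}

# The two sample lines from the question, including the unmatched "(".
printf '%s\n' \
  '"one","two","this has comment section1"' \
  '"one","two","this has comment section2 and ( anything ) can come here ( ok!!!"' |
  drop_last_field
# prints "one","two" for both lines
```

Because the comment text is never used as a regular expression here, the parentheses in it cause no error.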
RavinderSingh13
  • There's no need to escape a `,` in a regexp, it's not a meta-character. That second script is undefined behavior per POSIX so it'd do different things in different awks. – Ed Morton Jul 22 '19 at 15:05
  • Thanks Ravi and Ed, this works fine for me. Please explain how the code below works: awk 'BEGIN{FS=",";OFS=","}NF{--NF};1' – CIGEEK Jul 23 '19 at 06:16

Since you mention that anything can appear in the comment, you might also have a line that looks like:

"one","two","comment with a , comma"

So a plain comma character is not a reliable field separator here.

The following two posts are now very handy:

Since you work with GNU awk, you can use any of the following three equivalent commands:

$ awk -v FPAT='[^,]*|"[^"]+"' -v OFS="," 'NF{NF--}1'
$ awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\"";OFS=","}NF{NF--}1'
$ awk 'BEGIN{FPAT="[^,]*|\042[^\042]+\042";OFS=","}NF{NF--}1'
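A minimal sketch of the first variant in use, assuming GNU awk is installed under the name `gawk` (FPAT is a gawk extension and is not available in other awks). The wrapper name `drop_last_csv_field` and the second input line are invented for this example. As noted in the comments on the question, this pattern does not cover quoted fields that contain an embedded `"` or a newline.

```shell
# Hypothetical wrapper; reads CSV lines from stdin. Requires gawk for FPAT.
drop_last_csv_field() {
  # FPAT describes what a field looks like: either a run of non-commas,
  # or a double-quoted string (which may itself contain commas).
  # NF-- then drops the last field and rebuilds the record with OFS.
  gawk -v FPAT='[^,]*|"[^"]+"' -v OFS=',' 'NF{NF--}1'
}

printf '%s\n' \
  '"one","two","comment with a , comma"' \
  '"one","two","this has ( an unmatched paren"' |
  drop_last_csv_field
```

With this input, both records should come out as `"one","two"`: the comma inside the quoted comment is treated as part of the field rather than as a separator.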

Why your command fails: awk's sub(ere,repl,in) treats its first argument, ere, as an extended regular expression. Your script passes FS $NF there, so the text of the last field is compiled as a regex, and the unmatched ( in the comment is read as a grouping metacharacter instead of a literal character. If you want to replace fields which are known and unique, you should not use sub, but just redefine the field:

$ awk '{$NF=""}1'

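If you want to replace a literal string taken from a field, without any regex interpretation, you can combine index and substr (note that this loops forever if "repl" itself contains s):

s=$(number);while(i=index($0,s)){$0=substr($0,1,i-1) "repl" substr($0,i+length(s))}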
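The same idea can be sketched as a reusable function. The names `replace_literal` and `lit_replace` are invented for this example and are not part of awk. Unlike an in-place loop over `$0`, this version only scans the part of the string that has not been processed yet, so it terminates even when the replacement contains the search text.

```shell
# Hypothetical helper: replace every literal occurrence of $1 with $2
# in the lines read from stdin, with no regex interpretation.
replace_literal() {
  awk -v find="$1" -v repl="$2" '
    function lit_replace(str, find, repl,    out, i) {
      if (find == "") return str          # nothing to search for
      out = ""
      while ((i = index(str, find)) > 0) {
        out = out substr(str, 1, i - 1) repl   # text before the match + replacement
        str = substr(str, i + length(find))    # continue after the match
      }
      return out str                       # append the unmatched tail
    }
    { print lit_replace($0, find, repl) }'
}

# The "(" is handled as an ordinary character here.
printf '%s\n' '"one","two","anything ) can come here ( ok!!!"' |
  replace_literal '( ok' '[ok'
# prints "one","two","anything ) can come here [ok!!!"
```

One caveat: values passed with `-v` go through awk's escape-sequence processing, so a search string containing backslashes would need extra care.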
kvantour