Append nextline to current line until pattern matched in awk

Question

Input file data:

"1","123","hh
KKK,111,ll
Jk"
"2","124","jj"

Output data:

"1","123","hh KKK,111,ll jk"
"2","124","jj"

I tried the code below in an awk file, but it still does not produce the desired output:

BEGIN {
    FS = "\",\"";
    record_lock_flag = 0;
    total_feilds = 3;
    tmp_field_count = 0;
    tmp_rec_buff = "";
    lines = 0;
}
{
    if (NR > 0) {
        if (record_lock_flag == 0 && NF == total_feilds && substr($NF, length($NF) - 1, length($NF)) ~ /^"/) {
            print $0;
        } else {
            tmp_rec_buff = tmp_rec_buff $0;
            tmp_field_count = tmp_field_count + NF;
            if ($0 != "") {
                lines++;
            }
            rec_lock_flag = 1;
            if (tmp_field_count == exp_fields + lines - 1) {
                print tmp_rec_buff;
                record_lock_flag = 0;
                tmp_field_count = 0;
                tmp_rec_buff = "";
                lines = 0;
            }
        }
    }
}
END {
}

score 3 · Answer 1 · answered Aug 02 '21 at 22:56

Using any awk in any shell on every Unix box:

$ awk 'BEGIN{RS=ORS="\""} !(NR%2){gsub(/\n/," ")} 1' file
"1","123","hh KKK,111,ll Jk"
"2","124","jj"

See also "What's the most robust way to efficiently parse CSV using awk?"
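A note on why this works, with a small sketch: with RS set to a double quote, awk reads the input as chunks delimited by quotes, so every even-numbered chunk is the text inside a quoted field, and only those chunks get their newlines replaced. One thing that may be worth checking on your own system: since the file's final newline forms a last chunk of its own, writing the quote back through ORS appears to leave one extra quote at the very end of the output. The variant below is only a sketch under that assumption; it writes the quote before each chunk instead of after it. The file name sample.csv is made up for the example.

```shell
# Recreate the sample input from the question (sample.csv is an assumed name).
printf '%s\n' '"1","123","hh' 'KKK,111,ll' 'Jk"' '"2","124","jj"' > sample.csv

out=$(awk '
BEGIN { RS = "\""; ORS = "" }      # split the input on double quotes; print appends nothing
NR > 1 { printf "\"" }             # restore the quote that preceded this chunk
NR % 2 == 0 { gsub(/\n/, " ") }    # even chunks are inside quotes: join their lines
{ print }                          # emit the chunk itself
' sample.csv)

printf '%s\n' "$out"
```

Like the original one-liner, this relies only on a single-character RS, so it should not need GNU awk.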

Ed Morton

anubhava · Answer 2 · 2021-08-03T15:12:07.573

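Using GNU awk, we can split records on a closing quote followed by a newline and then either an opening quote or the end of input, replace the newlines remaining inside each record with spaces, and finally put the matched separator back by setting ORS to RT (this assumes there are no blank fields whose opening and closing quotes sit on separate lines):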

awk -v RS='"\n("|$)' '{gsub(/\n/, " "); ORS=RT} 1' file

"1","123","hh KKK,111,ll Jk"
"2","124","jj"

Another version using GNU awk, if you already know the number of fields in each record, as shown in your question:

awk -v n=3 -v FPAT='"[^"]*"' 'p {$0 = p " " $0; p=""}
NF < n {p = $0; next} 1' file

"1","123","hh KKK,111,ll Jk"
"2","124","jj"

score 1 · Answer 3 · answered Aug 03 '21 at 05:37

With only the samples you have shown, you could try the following awk code, written and tested with GNU awk.

awk -v RS="" -v FS="\n" '
{
  for(i=1;i<=NF;i++){
    sum+=gsub(/"/,"&",$i)
    val=(val?val OFS:"")$i
    if(sum%2==0){
      print val
      sum=0
      val=""
    }
  }
}
' Input_file

Explanation: a detailed, line-by-line explanation of the above.

awk -v RS="" -v FS="\n" '    ##Start the awk program with RS set to the empty string (paragraph mode) and the field separator set to a newline.
{
  for(i=1;i<=NF;i++){        ##Loop over every field, that is, every line of the current record.
    sum+=gsub(/"/,"&",$i)    ##Replace each " with itself; gsub returns the number of matches, which is added to sum.
    val=(val?val OFS:"")$i   ##Append the current field to val, separated by OFS when val is not empty.
    if(sum%2==0){            ##An even quote count means every opened quote has been closed.
      print val              ##Print the joined record.
      sum=0                  ##Reset the quote counter.
      val=""                 ##Reset val for the next record.
    }
  }
}
' Input_file                 ##Name of the input file.
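The same quote-counting idea can also be sketched line by line, without paragraph mode, which may be easier to follow and should run in any POSIX awk. This is only an illustration under assumptions: the file name sample.csv and the variable names buf and quotes are made up here, and the lines of a record are joined with a single space as in the desired output.

```shell
# Recreate the sample input from the question (sample.csv is an assumed name).
printf '%s\n' '"1","123","hh' 'KKK,111,ll' 'Jk"' '"2","124","jj"' > sample.csv

out=$(awk '
{
    buf = (buf == "" ? $0 : buf " " $0)   # append this line to the pending record
    tmp = $0
    quotes += gsub(/"/, "", tmp)           # gsub returns how many quotes it replaced in the copy
    if (quotes % 2 == 0) {                 # balanced quotes: the record is complete
        print buf
        buf = ""
        quotes = 0
    }
}
END { if (buf != "") print buf }           # flush an unterminated record, if any
' sample.csv)

printf '%s\n' "$out"
```

Because a quote escaped as "" inside a field adds two to the count, it does not disturb the even/odd test.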

score 0 · Accepted Answer · answered Aug 05 '21 at 07:46

With awk, setting ORS for each line:

awk '{ORS = (!/"$/) ? " " : "\n"} 1' file
"1","123","hh KKK,111,ll Jk"
"2","124","jj"

Carlos Pascual
