I have a file in which the following content repeats n times:

>QDN;6135785008
-------------------------------------------------------------------------------
DN:;;;;;5785008;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
TYPE:;SINGLE;PARTY;LINE
SNPA:;613;;;SIG:;DT;;;;LNATTIDX:;N/A;;;;;;;;;;;;;
LINE;EQUIPMENT;NUMBER:;;;;;BSAC;;39;0;00;01;;;
LINE;CLASS;CODE:;;IBN;;;
IBN;TYPE:;STATION
CUSTGRP:;;;;;;;;BSA_POS;;;;;SUBGRP:;0;;NCOS:;1
CARDCODE:;;V5LOOP;;;;GND:;N;;PADGRP:;NPDGP;;BNV:;NL;MNO:;N
PM;NODE;NUMBER;;;;;:;;;;80
PM;TERMINAL;NUMBER;:;;;;2
OPTIONS:
CWT;DGT;DDN;NOAMA;
;
-------------------------------------------------------------------------------
>QDN;6160160260
-------------------------------------------------------------------------------
DN:;;;;;0160260;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
TYPE:;SINGLE;PARTY;LINE
SNPA:;616;;;SIG:;DT;;;;LNATTIDX:;N/A;;;;;;;;;;;;;
LINE;EQUIPMENT;NUMBER:;;;;;BSAC;;39;0;00;03;;;
LINE;CLASS;CODE:;;IBN;;;
IBN;TYPE:;STATION
CUSTGRP:;;;;;;;;BSA_POS;;;;;SUBGRP:;0;;NCOS:;15
CARDCODE:;;V5LOOP;;;;GND:;N;;PADGRP:;NPDGP;;BNV:;NL;MNO:;N
PM;NODE;NUMBER;;;;;:;;;;80
PM;TERMINAL;NUMBER;:;;;;4
OPTIONS:
CWT;3WC;DGT;DDN;NOAMA;
;
----

I want to read all lines and store some values in four variables: number (the second column of lines starting with ">QDN"), type (from lines starting with TYPE), snpa, and options (the line that follows each occurrence of OPTIONS). The output could be a semicolon-separated text file (e.g. var1;var2;var3;var4). This is partially working: I have the following code, but I couldn't get all those variables together. I tried adding another while loop inside the first one to check for the end of each block (the lone semicolon that separates the blocks of info), but that did not work either.

while IFS= read -r line || [[ -n "$line" ]]; read -r secondline; do
if [[ "$line" =~ ^'>QDN' ]]; then
    number=$(echo "$line" | awk -F ';' 'NF {print $2;}')                
elif [[ "$line" =~ ^'TYPE' ]]; then
    type=$(echo "$line" | awk -F ';' 'NF {print $2" "$3" "$4;}')    
elif [[ "$line" =~ ^'SNPA' ]]; then
    snpa=$(echo "$line" | awk -F ';' 'NF {print $2;}')  
elif [[ "$line" =~ ^'OPTIONS' ]]; then
    options=$(echo "${secondline}") 
fi  
echo $number";"$type";"$snpa";"$options         
done < "file.txt"

The output of the code above is somewhat confusing:

;613;CWT;3WC;DGT;DDN;NOAMA;SACB;ACT;I976;$;$;N;
;613;CWT;3WC;DGT;DDN;NOAMA;SACB;ACT;I976;$;$;N;
;613;CWT;DGT;DDN;NOAMA;
;613;CWT;DGT;DDN;NOAMA;
;613;CWT;DGT;DDN;NOAMA;
;613;CWT;DGT;DDN;NOAMA;
;616;CWT;DGT;DDN;NOAMA;
;616;CWT;DGT;DDN;NOAMA;
;616;CWT;DGT;DDN;NOAMA;
;616;CWT;DGT;DDN;NOAMA;
;616;DGT;ARTY LINE
;616;DGT;ARTY LINE
;616;DGT;ARTY LINE    

Could anyone of you help?

tripleee
hadesungod
  • What do you mean by the output being `confused`? Please update the question with the desired output. – markp-fuso Sep 09 '20 at 15:20
  • Maybe the `read -r secondline` should go in the last elif, and the parameters should be quoted in the echo command: `echo "$number;$type;$snpa;$options"`. That said, reading a file line by line with bash `read` is inefficient. – Nahuel Fouilleul Sep 09 '20 at 15:40
  • It is not showing the result as expected. For the example above, which has only two blocks of information, the output should be something like: **6135785008;SINGLE PARTY LINE;613;CWT;DGT;DDN;NOAMA; 6160160260;SINGLE PARTY LINE;616;CWT;3WC;DGT;DDN;NOAMA;** – hadesungod Sep 09 '20 at 15:40
  • Update: I found the issue. The code works, but there were hidden ^M characters (carriage returns) in the output. I don't know why they were there, but I found them with cat -A, and now I just remove them with sed before processing. Again, thank you all for your attention. – hadesungod Sep 09 '20 at 15:47
  • This means you had carriage returns in your file. See https://stackoverflow.com/q/39527571/3266847 – Benjamin W. Sep 09 '20 at 15:55
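
The carriage-return diagnosis from the comments can be reproduced with standard tools. The following is only a sketch: the sample file is a throwaway temporary file created for illustration, and `tr -d '\r'` is one common way to strip carriage returns, not necessarily the sed command the asker ended up using.

```shell
# Build a tiny sample with DOS (CRLF) line endings in a temporary file.
sample=$(mktemp)
printf 'SNPA:;613\r\nOPTIONS:\r\n' > "$sample"

# A literal carriage return; command substitution only strips trailing newlines.
cr=$(printf '\r')

# Count the lines that end in a carriage return (non-zero means CRLF endings).
cr_lines=$(grep -c "${cr}\$" "$sample")
echo "lines ending in CR: $cr_lines"

# Strip every carriage return and write a clean copy.
clean=$(mktemp)
tr -d '\r' < "$sample" > "$clean"
clean_cr_lines=$(grep -c "${cr}\$" "$clean")
echo "lines ending in CR after cleaning: $clean_cr_lines"

rm -f "$sample" "$clean"
```

Running the parser on the cleaned copy avoids the cursor jumping back to column one mid-line, which is what made the original output look scrambled.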

2 Answers

Repeated similar small snippets of Awk are often a sign that you should rewrite the whole script in Awk instead.

The following assumes that OPTIONS always comes after the other fields. It's not hard to remove this restriction, but with it in place, the code is extraordinarily simple.

awk -F ';' 'BEGIN { OFS=";" }
   /^>QDN/ { number = $2 }
   /^TYPE/ { type = $2 " " $3 " " $4 }
   /^SNPA/ { snpa = $2 }
   /^OPTIONS/ { options = 1; next }
   options { print number, type, snpa, $0;
      number = type = snpa = options = "" }' file.txt
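
If OPTIONS cannot be relied on to come last, one way to lift that restriction is to print a record only when the next >QDN line (or the end of input) is reached. This is a minimal sketch, assuming every block starts with a >QDN line; the `emit` function and the `want_options` flag are names made up for this illustration, and the sample data below was reordered by hand to exercise the case.

```shell
# Sample input in a temporary file; the first block has OPTIONS before TYPE and SNPA.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>QDN;6135785008
OPTIONS:
CWT;DGT;DDN;NOAMA;
TYPE:;SINGLE;PARTY;LINE
SNPA:;613;;;SIG:;DT
;
>QDN;6160160260
TYPE:;SINGLE;PARTY;LINE
SNPA:;616;;;SIG:;DT
OPTIONS:
CWT;3WC;DGT;DDN;NOAMA;
;
EOF

result=$(awk -F ';' '
    # Print the record collected so far (if any), then reset all fields.
    function emit() {
        if (number != "") print number ";" type ";" snpa ";" options
        number = type = snpa = options = ""
    }
    # The line right after OPTIONS: holds the option list.
    want_options { options = $0; want_options = 0; next }
    /^>QDN/    { emit(); number = $2 }
    /^TYPE/    { type = $2 " " $3 " " $4 }
    /^SNPA/    { snpa = $2 }
    /^OPTIONS/ { want_options = 1 }
    END        { emit() }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

The trade-off is that each record is printed one block late, so the END rule is needed to flush the final one.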

You should probably remove the DOS carriage returns from your file separately, but it's easy to add `NF { sub(/\r/, "") }` at the top if you need to cope with broken files, too.
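
As a sketch of what that could look like in practice, the script below adds a carriage-return-stripping rule as the first rule of the same Awk program. It uses an unconditional rule with the anchored pattern `/\r$/`, a slight variation on the snippet above that only touches a trailing carriage return. The input is a hand-made temporary file with CRLF line endings, created only for this illustration.

```shell
# Write a one-block sample with CRLF line endings to a temporary file.
sample=$(mktemp)
printf '%s\r\n' \
    '>QDN;6135785008' \
    'TYPE:;SINGLE;PARTY;LINE' \
    'SNPA:;613;;;SIG:;DT' \
    'OPTIONS:' \
    'CWT;DGT;DDN;NOAMA;' \
    ';' > "$sample"

result=$(awk -F ';' 'BEGIN { OFS = ";" }
    # Drop a trailing carriage return; changing $0 makes awk re-split the fields.
    { sub(/\r$/, "") }
    /^>QDN/    { number = $2 }
    /^TYPE/    { type = $2 " " $3 " " $4 }
    /^SNPA/    { snpa = $2 }
    /^OPTIONS/ { options = 1; next }
    options    { print number, type, snpa, $0
                 number = type = snpa = options = "" }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the `sub` rule runs before the others, `$2` on the >QDN line no longer carries a stray carriage return into the output.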

Demo: https://ideone.com/zP102J

tripleee

If you're calling awk within a line read loop, most likely you are doing it wrong. You should consider doing it either in plain awk or in plain bash. Below is a plain bash version:

#!/bin/bash

while read -r line; do
    line=${line%$'\r'} # in case lines end in \r\n. Otherwise, you can remove this line
    case $line in
        \>QDN* | TYPE* ) printf %s "${line#*;};" ;;
        SNPA* ) line=${line#*;}; printf %s "${line%%;*};" ;;
        OPTIONS* ) read -r line && printf '%s\n' "$line" ;;
    esac
done < file.txt
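
A variation on the loop above, for the case where the values should be held in named variables and the TYPE words joined by spaces, as in the output the asker described. This is only a sketch: `extract`, `rest`, `w1`, `w2`, and `w3` are names invented for the example, the sample file is a trimmed temporary copy of the question's data, and the code assumes the TYPE line carries exactly three words.

```shell
# Sample input in a temporary file, trimmed to the lines that matter here.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>QDN;6135785008
TYPE:;SINGLE;PARTY;LINE
SNPA:;613;;;SIG:;DT;;;;LNATTIDX:;N/A
OPTIONS:
CWT;DGT;DDN;NOAMA;
;
>QDN;6160160260
TYPE:;SINGLE;PARTY;LINE
SNPA:;616;;;SIG:;DT;;;;LNATTIDX:;N/A
OPTIONS:
CWT;3WC;DGT;DDN;NOAMA;
;
EOF

cr=$(printf '\r')   # literal carriage return, a portable spelling of $'\r'

# Read the file named by $1 and print one semicolon-separated record per block.
extract() {
    while IFS= read -r line; do
        line=${line%"$cr"}                     # tolerate CRLF line endings
        case $line in
            '>QDN'*)
                number=${line#*;} ;;           # everything after the first ;
            TYPE*)
                rest=${line#*;}                # SINGLE;PARTY;LINE
                w1=${rest%%;*}; rest=${rest#*;}
                w2=${rest%%;*}; w3=${rest#*;}
                type="$w1 $w2 $w3" ;;
            SNPA*)
                rest=${line#*;}
                snpa=${rest%%;*} ;;            # first field after SNPA:
            OPTIONS*)
                IFS= read -r options           # the option list is the next line
                options=${options%"$cr"}
                printf '%s;%s;%s;%s\n' "$number" "$type" "$snpa" "$options" ;;
        esac
    done < "$1"
}

result=$(extract "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Like the version above, it relies on OPTIONS being the last field of interest in each block, since the record is printed as soon as the option list has been read.
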
M. Nejat Aydin