Scenario:

  • I am trying to write an Awk script.
  • I have two files: File1 (tab-delimited) and File2 (strings).
  • In File1, I combine Field4+Field3+Field2 of the 01 line to make a reference key to Field1 of the strings in File2.
  • I am able to match and extract the information, but not in a good format.

Requirement

  • I want to print the information to File3.txt, in a format that depends on whether the reference key is matched:
  • KEY MATCHED: Lines 01-07 from File1, followed by the matching string from File2 with a prefix of 77, and so on for each block.
  • KEY NOT MATCHED: Print only the unmatched records from File2, each with a prefix of 99.

Script:

awk -F'\t' -v OFS='\t' 'FNR==NR{a[substr($0,1,8)]=$4$3$2}
     {if ($4$3$2 in a) printf ("77""\t"); else printf ("99""\t");print $0}' \
     File2.txt File1.txt > File3.txt

File1 :

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428

File2:

50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///

51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///

51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

File 3: Current Output

99  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
99
99  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
99
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///
77  01  89  68  5000
99  02  89  11
99  03  89  00
99  06  89  00
99  07  89  19  RT
77  01  87  23  5100
99  02  87  11
99  04  87  9   02
99  03  87  00
99  06  87  00
99  07  87  11  RT  0428    
99  01  83  23  4900
99  83  11
99  83  9   02
99  83  00
99  83  00
99  83  11  RT  0428

Desired Output:

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///

01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///

99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

Actually, for the 77 and 99 lines I need the whole string, with 77 or 99 only at the beginning of the line that holds the matched key. Currently, if a string is long and carries on to a second line, the script puts 77 or 99 in front of the second line as well. I want 77 or 99 in front of the matched code only.

For example, the following is the output of the corrected awk code by Jonathan:

$ awk -f awk.script File2.txt File1.txt

        01  89  68  5000
        02  89  11
        03  89  00
        06  89  00
        07  89  19  RT  0428
        77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ///
        01  87  23  5100
        02  87  11
        04  87  9   02
        03  87  00
        06  87  00
        07  87  11  RT  0428
        77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //      
77          ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ///
        01  83  23  4900
        02  83  11
        04  83  9   02
        03  83  00
        06  83  00
        07  83  11  RT  0428
        99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV    
    99        /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N09
    99  78898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///
        $
HighTech
  • What do you mean by 'please disregard the bullets'? If you want data lines to appear as code/data, you write them in the edit box as you want them to appear, then select them, and use the **`{}`** button above the edit box to indent the lines by 4 spaces so that the output appears as code. Please edit your question so that there's nothing we need to disregard. – Jonathan Leffler Jan 21 '15 at 03:12
  • Hi @JonathanLeffler, I have added an answer which is actually a question. It is related to the format that this script creates when printing strings from File2. Is this a quick thing, or should I post a new question? Thanks again. – HighTech Jan 21 '15 at 18:26
  • In general, please don't do that. Either update the current question or ask a new question. Such non-answers get flagged 'not an answer' and deleted. – Jonathan Leffler Jan 21 '15 at 18:29

1 Answer

You correctly read File2.txt before you read File1.txt. You need to ignore blank lines in File2.txt, though.

FNR == NR { if (!/^[[:space:]]*$/) { key = substr($1, 1, 8); a[key] = $0 } next }

This uses the first 8 characters of the first field as the key, and the whole line as the value. The test inside the block skips blank lines, and the next statement ensures that no line of File2.txt, blank or not, falls through to the rules that follow.
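To see what ends up in the array, here is a minimal sketch that only builds the lookup and lists it. The sample records are shortened, made-up versions of the File2 lines, and the temporary file and the variable names f2, keys and k are assumptions of this sketch, not part of the original script.

```shell
# Made-up sample records; the temporary file is an assumption of this sketch.
f2=$(mktemp)
printf '%s\n' '50006889 CCARD /3010' '' '51002390800666 CCARD /3000' > "$f2"

# Store each non-blank line under the first 8 characters of field 1,
# then list every key together with the line stored for it.
keys=$(awk '
    !/^[[:space:]]*$/ { a[substr($1, 1, 8)] = $0 }
    END { for (k in a) print k " -> " a[k] }
' "$f2" | sort)
printf '%s\n' "$keys"
```

The blank line creates no entry, and the 14-character first field of the last record is cut down to the 8-character key 51002390. The sort is only there because awk does not guarantee any order for "for (k in a)".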

The next part is fiddly. You need to spot the lines with 01 in $1, and build a key from that. When you next get an 01 line, you need to print out the 77-prefixed line from a (and delete the entry from a).

At the end, you need to print the 77-prefixed line from a (and delete the entry from a). Then you need to process any entries left in a and give them the 99-prefix.

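# An 01 line starts a new block: first finish the previous block, if any.
$1 == "01" { if (code != 0)
             {
                 if (code in a)
                 {
                     printf("77\t%s\n", a[code])
                     delete a[code]
                 }
             }
             code = $4$3$2    # key for the block that starts here
           }
# Echo each line that reaches this rule unchanged.
{ print }
END {
         # Finish the last block.
         if (code in a)
         {
             printf("77\t%s\n", a[code])
             delete a[code]
         }
         # Whatever is left in a was never matched.
         for (code in a)
             printf("99\t%s\n", a[code])
    }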

Clearly, you can use less white space than I just did, though you might need to add some semicolons too. For testing purposes, I put the code above into a file awk.script and ran:

$ awk -f awk.script File2.txt File1.txt
01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///
$

That looks rather similar to what you wanted. If you want a blank line after the previous block of output, add printf("\n") after the if blocks that print the 77-prefixed lines. You can write it to File3.txt if you like with I/O redirection. You can embed the script in single quotes and add it to the command line in place of -f awk.script. You can squish the whole script onto one humongous line if you want to, too — but please don't; it is too big to make a good one-liner, and this program's name is awk, not apl.
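As a rough sketch of those variations combined (blank line after each 77-prefixed line, script in single quotes on the command line, output redirected to File3.txt), something like the following should work. The sample data is made up and shortened, the temporary directory is an assumption, and the helper function flush77 is a name introduced here for illustration; it is not part of the script above.

```shell
# Made-up sample data; the temporary directory is an assumption of this sketch.
dir=$(mktemp -d)
printf '%s\n' '50006889 CCARD /3010' '51002390800666 CCARD /3000' > "$dir/File2.txt"
printf '%s\n' '01 89 68 5000' '02 89 11' '01 83 23 4900' '02 83 11' > "$dir/File1.txt"

awk '
    # Hypothetical helper: print the pending File2 line with a 77 prefix,
    # followed by a blank line, and drop it from the array.
    function flush77() {
        if (code in a) { printf("77\t%s\n\n", a[code]); delete a[code] }
    }
    # File2.txt: remember each non-blank line under its 8-character key.
    FNR == NR { if (!/^[[:space:]]*$/) a[substr($1, 1, 8)] = $0; next }
    # File1.txt: an 01 line closes the previous block and sets the new key.
    $1 == "01" { flush77(); code = $4 $3 $2 }
    { print }
    END {
        flush77()
        # Anything still in the array was never matched.
        for (code in a) printf("99\t%s\n", a[code])
    }
' "$dir/File2.txt" "$dir/File1.txt" > "$dir/File3.txt"

cat "$dir/File3.txt"
```

With this sample, the 89 block is followed by the 77-prefixed record for key 50006889 and a blank line, the 83 block has no match, and the remaining File2 record is written last with the 99 prefix.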

Jonathan Leffler
  • Thanks a lot @Jonathan Leffler. I really appreciate your help. It resolved my issue. This is exactly what I was looking for. Merci Beaucoup. – HighTech Jan 21 '15 at 15:21