
Scenario:

  • I have two files: File1 (tab-delimited) and File2 (strings). In File1, I concatenate Field4 + Field3 + Field2 of each 01 line to build a reference key, which is matched against Field1 of the strings in File2.
  • I am able to match and extract the information, but the output is not in the format I need.
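
To make the key construction concrete, here is a minimal sketch using the first 01 record from File1.txt and the first string from File2.txt. The variable names `key` and `ref` are illustrative only, and the 8-character prefix mirrors the `substr($1, 1, 8)` call in the script below.

```shell
# Build the lookup key from the "01" record: Field4, Field3, Field2 concatenated.
key=$(printf '01\t89\t68\t5000\n02\t89\t11\n' |
      awk -F '\t' '$1 == "01" { print $4 $3 $2 }')
echo "$key"   # prints 50006889

# The first whitespace-separated field of a File2 line is what the key is compared against.
ref=$(printf '50006889 CCARD /3010  /E\n' | awk '{ print substr($1, 1, 8) }')
[ "$key" = "$ref" ] && echo "match"
```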

Requirement: In Output #1, I need the 7 unmatched lines from File1.txt followed by a 99 line. I need the same lines copied to Output #2, but without the 99 line. Please let me know if you need further details.

Awk script (the one I am using)

awk 'FNR == NR && ! /^[[:space:]]*$/ { key = substr($1, 1, 8); a[key] = $0; next }
$1 == "01" { if (code != 0)
             {
                 if (code in a)
                 {
                     printf("77\t%s\n", a[code])
                     delete a[code]
                 }
             }
             code = $4$3$2
           }
{ print }
END {
         if (code in a)
         {
             printf("77\t%s\n", a[code])
             delete a[code]
         }
         for (code in a)
             printf("99\t%s\n",  a[code])
}' \
     File2.txt File1.txt > File3.txt

awk -F '\t' '/^99/' File3.txt > File4.txt

File1.txt (INPUT)

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428

File2.txt (INPUT)

50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///

51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///

51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

File3.txt (OUTPUT #1)

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428
99  
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

File4.txt (OUTPUT #2)

99  
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

File3.txt (DESIRED OUTPUT #1)

        01  89  68  5000
        02  89  11
        03  89  00
        06  89  00
        07  89  19  RT  0428
        77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
        01  87  23  5100
        02  87  11
        04  87  9   02
        03  87  00
        06  87  00
        07  87  11  RT  0428
        77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
        01  83  23  4900
        02  83  11
        04  83  9   02
        03  83  00
        06  83  00
        07  83  11  RT  0428
        99  
        01  44  73  8800
        02  44  73
        04  44  73   02
        03  44  73
        06  44  73
        07  44  11  RT  0789
        99  
(When there is no match, there is only one line, 99 <tab> <date>, at the end of the 7 lines; then come the next 7 lines in case of another mismatch, followed by 99 <tab> <date>, and so on.)

File4.txt (DESIRED OUTPUT #2)

    01  83  23  4900
    02  83  11
    04  83  9   02
    03  83  00
    06  83  00
    07  83  11  RT  0428

(The current input files contain only one mismatch. I want to keep appending other mismatched lines, without the 99 line, to this file, so that it has a structure like the following.)

    01  83  23  4900
    02  83  11
    04  83  9   02
    03  83  00
    06  83  00
    07  83  11  RT  0428
    01  38  66  7000
    02  38  66
    04  38  66   02
    03  38  66
    06  38  66
    07  38  66  RT  0428
    01  44  73  8800
    02  44  73
    04  44  73   02
    03  44  73
    06  44  73
    07  44  11  RT  0789
HighTech
  • What seems to be the problem? – danfuzz Jan 21 '15 at 22:20
  • @danfuzz In output # 1, I need the 7 No Matched lines from file1.txt and then a line 99 after. The same lines I need to copy over to Output #2 but without 99 . Please let me know if you need further details. – HighTech Jan 21 '15 at 22:28
  • I recommend that you (a) move details of the actual problem into your question text, and (b) simplify the code to more clearly demonstrate said problem. – danfuzz Jan 21 '15 at 22:32
  • Wish I had checked your other question before I wasted my time here. – glenn jackman Jan 21 '15 at 22:43
  • It is not actually a duplicate. This is a different question; on my other question, Jonathan suggested that I post a new one. – HighTech Jan 21 '15 at 22:45

1 Answer

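gawk '
  BEGIN {
    OFS="\t"
    date = strftime("%Y-%m-%d", systime())   # gawk-only time functions
    out = "File3.txt"
    err = "File4.txt"
  }
  # First file (File2.txt): index each non-blank line by its first field.
  # Blank lines are skipped here so they never reach the print rule below.
  NR==FNR {if (NF) line[$1]=$0; next}
  # Close the current group: 77 + matched string, or 99 + date on no match.
  function print_77_99() {
    if (key in line) 
      print "77", line[key] > out
    else {
      print "99", date > out
      printf "%s", lines >> err   # appends, so old File4.txt content is kept
    }
  }
  # An 01 line starts a new group; finish the previous group first.
  $1 == "01" {
    if (FNR > 1) print_77_99()
    key = $4 $3 $2
    lines = ""
  }
  # Copy every File1.txt line to Output #1 and buffer it for Output #2.
  {
    print > out
    lines = lines $0 "\n"
  }
  END {print_77_99()}
' File2.txt File1.txt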
glenn jackman
  • Note specific use of `gawk`, due to time functions – glenn jackman Jan 21 '15 at 22:38
  • Thanks a lot for your feedback, @glenn jackman, but I am looking for an awk script. Gawk won't work for me, but I appreciate your help. – HighTech Jan 21 '15 at 22:41
  • Is it possible to write an awk script if we can disregard the time requirement in the output? – HighTech Jan 21 '15 at 22:43
  • Yes: `awk -v date="$(date +%F)" '...'` -- use the same awk body, except drop the declaration of `date` from the BEGIN block. – glenn jackman Jan 21 '15 at 22:45
  • AWESOME!!! Output #1 is perfect now, but Output #2 contains the 99 lines and the strings from File2. I don't want any data from File2 in Output #2. Thanks a lot for the first resolution. – HighTech Jan 21 '15 at 22:50
  • This is the output # 2 : 99 99 51002390800666 CCARD /3000 /E /S N0978898IV / // ///11 /// 01 83 23 4900 02 83 11 04 83 9 02 03 83 00 06 83 00 07 83 11 RT 0428 01 83 23 4900 02 83 11 04 83 9 02 03 83 00 06 83 00 07 83 11 RT 0428 – HighTech Jan 21 '15 at 22:56
  • Forgot to copy you on my comments. Thanks. IT WORKED!!!!.... THANKS A LOT.. you are a pro! – HighTech Jan 21 '15 at 22:59
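
For reference, the portable variant discussed in the comments could look roughly like the sketch below. It is not the answerer's exact code. The sample data is a trimmed-down stand-in for the real files, the scratch directory and the function name `flush_group` are illustrative, and `date +%Y-%m-%d` is used in place of `%F`. File4.txt is emptied first because the script appends to it with `>>`. Leftover content from the earlier `awk '/^99/'` run is one plausible reason the 99 lines showed up in Output #2.

```shell
# Work in a scratch directory so existing files are untouched (illustrative setup).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample inputs: one matched group and one unmatched group (File1 is tab-delimited).
printf '01\t89\t68\t5000\n02\t89\t11\n01\t83\t23\t4900\n02\t83\t11\n' > File1.txt
printf '50006889 CCARD /3010\n\n51002390800666 CCARD /3000\n' > File2.txt

# Start from an empty File4.txt; the awk script appends to it with ">>".
: > File4.txt

# The date comes from the shell, so no gawk-specific time functions are needed.
awk -v date="$(date +%Y-%m-%d)" '
  BEGIN { OFS = "\t"; out = "File3.txt"; err = "File4.txt" }
  # First file: index non-blank File2 lines by their first field.
  NR == FNR { if (NF) line[$1] = $0; next }
  # Close a group: 77 + matched string, or 99 + date and copy the group to err.
  function flush_group() {
    if (key in line)
      print "77", line[key] > out
    else {
      print "99", date > out
      printf "%s", lines >> err
    }
  }
  # An 01 line starts a new group; finish the previous one first.
  $1 == "01" {
    if (FNR > 1) flush_group()
    key = $4 $3 $2
    lines = ""
  }
  { print > out; lines = lines $0 "\n" }
  END { flush_group() }
' File2.txt File1.txt
```

With this sample data, File3.txt should contain both groups, the first followed by a 77 line and the second by a 99 line, while File4.txt should contain only the two lines of the unmatched group.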