
I have a flat file of records, each 33 lines long. I need to format this file to specs in a template. The template is in DOS format while the source file is in NIX format. The template has specific indenting and spacing which must be adhered to. I've thought of a few options:

  • BASH with classic nix tools: sed, awk, grep etc...
  • BASH with template toolkit
  • Perl with template toolkit
  • Perl

These are in order of my familiarity. Here's a sample source record ( NIX format ): I've reduced the number of newlines to save space ( normally 33 lines ):

JACKSON HOLE SANITARIUM AND REPTILE ZOO
45 GREASY HOLLER LN
JACKSON HOLE, AK   99999


Change Service Requested


BUBBA HOTEP
3 DELIVERANCE RD
MINNEAPOLIS, MN   99998


BUBBA HOTEP 09090909090909

You have a hold available for pickup as of 2012-01-04:

Title: Banjo for Fun and Profit
Author: Williams, Billy Dee
Price: $10 

Here's the template ( DOS format -- lines reduced - 66 lines normally):

     <%BRANCH-NAME%>
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


<%CUST-NAME%> <%BARCODE%>
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             <%CUST-NAME%>
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file

It actually does say "end of file" at the end of each record.

Thoughts? I tend to over-complicate things.

UPDATE2

Figured it out.

My answer is below. Feel free to suggest improvements.

Bubnoff
  • Maybe use PHP from the command line? If done right, it gives you a reusable component for a future web interface for free ... – Eugen Rieck Jan 06 '12 at 21:16
  • Thanks for the suggestion Eugen, but I won't be webify-ing this stuff and I don't know PHP. – Bubnoff Jan 06 '12 at 21:38
  • When you say `format a file to specs in template` what exactly do you wish to do with your file containing records? – jaypal singh Jan 06 '12 at 21:51
  • I need to format the records to match the template record. My idea was to first create the template ( done ), second pull each record out of the file and feed it to the template, and last concatenate the formatted records into a new file. – Bubnoff Jan 06 '12 at 22:33
  • There may be a simpler method to do this. I don't want to lead people to my convoluted method. If someone can format it in place in a simple way ... – Bubnoff Jan 06 '12 at 22:36
  • Will post code later this afternoon. Thanks. – Bubnoff Jan 06 '12 at 22:53
  • Perl *is* a classic unix tool. – ikegami Jan 06 '12 at 23:33
  • You're right ...but late 80's. Sed, awk and grep pre-date. A technicality though, you're right. If you have a better way than my bash above ...in perl ...I'll thank you. My perl sucks eggs or I'd try perl. – Bubnoff Jan 06 '12 at 23:36
  • Does your record have a starting identifier that we can use to mark the start of a record? Ending won't be a problem since every record is 33 lines. Will that be a safe assumption? – jaypal singh Jan 07 '12 at 01:11
  • @JaypalSingh. Starting identifier would be a regex on LOCATION. Ending identifier is /Price:/. – Bubnoff Jan 07 '12 at 01:42
  • Thanks and your record file shows customer name, address etc before the customer name, barcode and Title, author, price. Is that how it should be? – jaypal singh Jan 07 '12 at 01:46
  • Actually, reverse that. Customer name and barcode first. Customer address last. Location info is first. – Bubnoff Jan 07 '12 at 01:51

2 Answers


As a starter, here is a hint: Perl HERE-documents (showing just a few substitutions as a demo):

#!/usr/bin/perl
use strict;
use warnings;

my @lines = qw/branchname cust_name barcode bogus whatever/; # (<>);

my ($branchname, $cust_name, $barcode, undef, $whatever) = @lines;

print <<TEMPLATE;
     $branchname
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


$cust_name $barcode
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             $cust_name
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file
TEMPLATE

Replace the dummy input array with the lines read from stdin with (<>) if you like. (Use a loop that reads n lines and pushes them onto the array if that is more efficient.) I have only shown the gist: add more variables as required, and skip input lines by specifying undef for the 'capture' variable (as shown).

Now, simply interpolate these variables into your text.

If line endings are giving you any grief, consider using chomp, e.g.:

my @lines = (<>); # just read em all...
chomp(my @cleaned = @lines); # copy, then strip newlines (map { chomp } would return chomp's counts, not the lines)
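Note that chomp only removes the trailing newline, so a carriage return from a DOS-format file stays on the line. One option is to normalise the line endings in the shell before Perl reads the data. This is a minimal sketch, not part of the code above; the function names strip_cr and add_cr and the file names in the usage note are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: convert between DOS (CRLF) and Unix (LF) line endings.
# Both helpers read stdin and write stdout. The names are hypothetical.

# Remove every carriage return from the input.
strip_cr() {
    tr -d '\r'
}

# Append a carriage return before each newline (assumes LF-only input).
add_cr() {
    awk '{ printf "%s\r\n", $0 }'
}
```

Usage would look something like `strip_cr < template.dos > template.unix` before filling in the template, and `add_cr < filled.unix > filled.dos` afterwards if the result has to be in DOS format again.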
sehe
  • added a little more hints on how to go forward from this – sehe Jan 07 '12 at 00:59
  • The template part would need to be removed from the code to make it reusable for other templates. Am going to work with this. – Bubnoff Jan 07 '12 at 01:45

This is what I am using for this project. Feel free to suggest improvements or submit better solutions.

cp "$FILE" "$WORKING" # we won't mess with original

NUM_RECORDS=$( grep -c "^Price:" "$FILE" ) # need to know how many records we have 
                                              # counting occurrences of end of record r

TMP=record.txt # holds single record, used as temp storage in loop below

# Sanity
# Make sure temp storage exists and is empty: create it if missing, truncate it otherwise.
: > "$TMP"

# functions
function make_template () {
    local _file="$1"
    mapfile -t filecontent < "$_file"
    _loc_name="${filecontent[0]}"
    _loc_strt="${filecontent[1]}"
    _loc_city="${filecontent[2]}"
    _pat_name="${filecontent[14]}"
    _pat_addr="${filecontent[15]}"
    _pat_city="${filecontent[16]}"
    _barcode=${filecontent[27]:(-14)} # pull barcode from end of string
    _date=${filecontent[29]:(-11)}    # pull date from end of string
    # Test title length - truncate if necessary - 70 chars.
    _title=$(grep -E "^Title:" "$_file")
    MAXLEN=70
    [ "${#_title}" -gt "$MAXLEN" ] && _title="${filecontent[31]:0:MAXLEN}" || :
    _auth=$(grep -E "^Author:" "$_file")
    _price=$(grep -E "^Price:" "$_file")
    sed "
        s@<%BRANCH-NAME%>@${_loc_name}@g
        s@<%BRANCH-ADDR%>@${_loc_strt}@g
        s@<%BRANCH-CTY%>@${_loc_city}@g
        s@<%CUST-NAME%>@${_pat_name}@g
        s@<%CUST-ADDR%>@${_pat_addr}@
        s@<%CUST-CTY%>@${_pat_city}@
        s@<%BARCODE%>@${_barcode}@g
        s@<%DATE%>@${_date}@
        s@<%TITLE%>@${_title}@
        s@<%AUTHOR%>@${_auth}@
        s@<%PRICE%>@${_price}@" "$TEMPLATE"
}

####################################
#  MAIN
####################################

for (( i = 1; i <= NUM_RECORDS; i++ ))
do
    sed -n '1,/^Price:/{p;}' "$WORKING" >"$TMP"  # copy first record with end of record
                                                # and copy to temp storage.

    sed -i '1,/^Price:/d' "$WORKING"             # delete first record using EOR regex.
    make_template "$TMP"                        # send temp file/record to template fu
done

# cleanup
exit 0
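
One possible improvement: the sed script in make_template pastes the record values straight into the replacement side of `s@...@...@`, so a value containing `&`, `@` or a backslash would be misread by sed (`&` expands to the matched placeholder, `@` ends the command early). A minimal sketch of an escaping helper follows. The name sed_escape_replacement is made up for illustration, and it assumes each value is a single line.

```shell
#!/usr/bin/env bash
# Sketch: make a string safe for the replacement half of  s@pattern@replacement@
# Backslash, ampersand and the @ delimiter each get a leading backslash.
# Assumes the value contains no newline. The function name is hypothetical.
sed_escape_replacement() {
    printf '%s' "$1" | sed -e 's/[\\&@]/\\&/g'
}

# Demo with a value that would otherwise break the substitution.
title='Rock & Roll @ Home'
safe=$(sed_escape_replacement "$title")
printf '%s\n' 'Title: <%TITLE%>' | sed "s@<%TITLE%>@${safe}@"
# prints: Title: Rock & Roll @ Home
```

In make_template this would mean wrapping each value once before the sed call, for example `_title=$(sed_escape_replacement "$_title")`.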
Bubnoff