1

I'm trying to generate an HTML file from a text file called index.db. Contents of index.db:

file="test.html"
    date="2013-01-07"
    title="Example title"

file="test2.html"
    date="2014-02-04"
    title="Second example title"

The command I'm trying:

sed '/^$/d;H;1h;$!d;g;s/\n\t\+/ /g' input/index.db |
    while read -r line; do
        awk '{
        print "<h1>"title"</h1>"
        print "<b>"date"</b>"
        print "<a href=\""file"\">"file"</a>"
    }' $line
done

And it returns:

awk: fatal: cannot open file `title"' for reading: No such file or directory
awk: fatal: cannot open file `example' for reading: No such file or directory

But if I try the following command, it runs perfectly:

sed '/^$/d;H;1h;$!d;g;s/\n\t\+/ /g' input/index.db |
    while read -r line; do
    echo $line
        awk '{
        print "<h1>"title"</h1>"
        print "<b>"date"</b>"
        print "<a href=\""file"\">"file"</a>"
    }' file="test.html" date="2013-01-07" title="Example title"
done
S9oXavyF
  • `echo $line` is _itself_ buggy, and for the same reason, so is `awk ... $line`. Always, _always_ quote your expansions: `echo "$line"` and `awk ... "$line"` -- see [I just assigned a variable, but `echo $variable` shows something else](https://stackoverflow.com/questions/29378566/i-just-assigned-a-variable-but-echo-variable-shows-something-else) – Charles Duffy Feb 19 '21 at 16:32
  • ...that said, starting awk in a loop is a "code smell" -- an indication that you're doing something wrong. It's almost always better to run awk only _once_, and let it do the looping itself. – Charles Duffy Feb 19 '21 at 16:33
  • @CharlesDuffy `echo $line` was just to see if the command ran the expected amount of times. That said commenting `$line` doesn't give an error but still doesn't give the desired output. – S9oXavyF Feb 19 '21 at 17:33
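
To make the quoting point from the comments concrete, here is a minimal sketch. The `line` value is an assumption about what the sed pipeline emits for the first record, and `count_args` is a hypothetical helper that only reports how many arguments it received.

```shell
# Assumed sample: one joined record, as the sed pipeline would emit it.
line='file="test.html" date="2013-01-07" title="Example title"'

# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

count_args $line     # unquoted: split on whitespace -> prints 4
count_args "$line"   # quoted: passed as a single word -> prints 1
```

The unquoted expansion yields four words: `file="test.html"`, `date="2013-01-07"`, `title="Example`, and `title"`. The last one is not a `var=value` assignment, so awk treats it as a file name, which matches the first error message. The double quotes inside the data are literal characters at this point; the shell does not re-parse them, so quoting `"$line"` alone does not turn the string into three separate assignments either.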

3 Answers

2

A single awk call in paragraph mode (`RS=`), with a reusable function to wrap the HTML tags:

$ awk -F'[="]' -v RS= -v OFS='\n' -v ORS='\n\n' '
      function h(t,r,v) {return "<" t (r?" href=\"" r "\"":"")  ">"v "</"t">"}

      {print h("h1","",$9), h("b","",$6), h("a",$3,$3)}' file


<h1>Example title</h1>
<b>2013-01-07</b>
<a href="test.html">test.html</a>

<h1>Second example title</h1>
<b>2014-02-04</b>
<a href="test2.html">test2.html</a>
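
The field numbers above (`$3`, `$6`, `$9`) depend on how a given awk splits fields in paragraph mode. As a hedged alternative, not part of the answer itself, the sketch below looks values up by key name instead. `html_index` is a hypothetical wrapper name, the parsing assumes every value is written as `key="value"` with no embedded double quotes, and no HTML escaping is attempted.

```shell
# Hypothetical helper: print one HTML block per blank-line-separated record.
html_index() {
  awk -v RS= '
    {
      split("", kv)                       # clear values left from the previous record
      rest = $0
      # Pull out every key="value" pair in the record, one at a time.
      while (match(rest, /[A-Za-z_]+="[^"]*"/)) {
        pair = substr(rest, RSTART, RLENGTH)
        eq   = index(pair, "=")
        key  = substr(pair, 1, eq - 1)
        kv[key] = substr(pair, eq + 2, length(pair) - eq - 2)   # strip the quotes
        rest = substr(rest, RSTART + RLENGTH)
      }
      printf "<h1>%s</h1>\n<b>%s</b>\n<a href=\"%s\">%s</a>\n\n",
             kv["title"], kv["date"], kv["file"], kv["file"]
    }' "$@"
}

# Usage (path taken from the question):
# html_index input/index.db
```

Because the lookup is by name, the order of `file`, `date`, and `title` within a record no longer matters.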
karakfa
1

Awk is designed to process files, so you shouldn't need to process them line by line in a shell loop. Also, awk and sed are often interchangeable but rarely need to be used together. You can do what you need with a complete awk solution. Using GNU awk:

awk '/file=/ { lne=gensub(/(^.*=")(.*)(\".*$)/,"<a href=\"\\2\">\\2</a>",1);print lne} /date=/ {lne=gensub(/(^.*=")(.*)(\".*$)/,"<b>\\2</b>",1);print lne} /title=/ {lne=gensub(/(^.*=")(.*)(\".*$)/,"<h1>\\2</h1>",1);print lne}' input/index.db

Explanation:

 awk '/file=/ {
                # gensub splits the line into three groups; group 2 is the text between the quotes.
                # Wrap group 2 in the required HTML and store the result in lne.
                # The third argument (1) replaces the first match; the target defaults to $0.
                lne=gensub(/(^.*=")(.*)(\".*$)/,"<a href=\"\\2\">\\2</a>",1);
                print lne
              }
      /date=/ {                                                 # Same logic for lines with date.
                lne=gensub(/(^.*=")(.*)(\".*$)/,"<b>\\2</b>",1);
                print lne
              }
      /title=/ {                                                # Same logic for lines with title.
                lne=gensub(/(^.*=")(.*)(\".*$)/,"<h1>\\2</h1>",1);
                print lne
              }' input/index.db

Output:

<a href="test.html">test.html</a>
<b>2013-01-07</b>
<h1>Example title</h1>
<a href="test2.html">test2.html</a>
<b>2014-02-04</b>
<h1>Second example title</h1>

This approach can also be used in a very similar manner with sed:

sed -r '/file=/s@(^.*=")(.*)(\".*$)@<a href=\"\2\">\2</a>@;/date=/s@(^.*=")(.*)(\".*$)@<b>\2</b>@;/title=/s@(^.*=")(.*)(\".*$)@<h1>\2</h1>@' input/index.db
Raman Sailopal
1

With the samples shown, try the following. It generates a complete HTML file with the html, title, and body tags.

awk '
BEGIN{
  print "<html>"ORS"<title>Your title here..</title>"ORS"<body>"
}
!NF{ val="" }
# Capture the text between the double quotes on the current line.
match($0,/"[^"]*/){
  val=substr($0,RSTART+1,RLENGTH-1)
}
/^file=/{
  # Close the opening tag and use the file name as the link text.
  print "<a href=\"" val "\">" val "</a>"
  next
}
/date=/{
  print "<b>" val "</b>"
  next
}
/title/{
  print "<h1>"val"</h1>"
}
END{
  print "</body>" ORS "</html>"
}
'  Input_file

The above generates the following HTML file (for the samples shown):

<html>
<title>Your title here..</title>
<body>
<a href="test.html">test.html</a>
<b>2013-01-07</b>
<h1>Example title</h1>
<a href="test2.html">test2.html</a>
<b>2014-02-04</b>
<h1>Second example title</h1>
</body>
</html>
RavinderSingh13