I have to insert a text block into more than 200 HTML files, and I decided to do it with a bash script.

The text block to insert spans multiple lines:

textblockvar='
 <body class="article toc2 toc-right">
 <div id="readspeaker-button1" class="rs_skip rsbtn rs_preserve"> 
 <a rel="nofollow" class="rsbtn_play" accesskey="L" title="Read 
 out with ReadSpeaker" href="//app-eu.readspeaker.com/cgi-bin
 /rsent?customerid=5&lang=nl_nl&amp;voice=Ilse&re$ 
 <span class="rsbtn_left rsimg rspart"><span 
 class="rsbtn_text"><span>Lees voor</span></span></span> 
 <span class="rsbtn_right rsimg rsplay rspart"></span> 
 </a> 
 </div>
 '

The contents of textblockvar should replace the <body class="article toc2 toc-right"> tag, which appears to be identical in all 200+ HTML files.

I tried various techniques to execute the substitution:

FILES=leerstof/*/*.html
for f in $FILES
do

 sed -e "s|'<body class="article toc2 toc-right"'|${textblockvar}|g" $f

done

However, the script always ends up with a fatal error: "unterminated `s' command".

I tried swapping the quote styles and dropping the quotes altogether, but the error persists.

Maybe my solution is not the best option. Are there any workarounds available?

kzpm
  • Don't parse HTML with regular expressions! Use an HTML parser instead – Ruslan Osmanov Jan 06 '17 at 03:14
  • Are you supposed to do this assignment in bash, or using an external scripting tool like sed or awk? (Also, [don't parse HTML with regular expressions](http://stackoverflow.com/a/1732454/1072112), like the man said!) – ghoti Jan 06 '17 at 03:17
  • @ghoti I use bash scripting for this task. In the script I use sed for html replacement – kzpm Jan 06 '17 at 09:55
  • I suspect your problem here is the use of double quotes (`"`). This character has a special meaning for shell. Then the arguments of your `sed` command are actually very different than expected. Use back slash to escape double quotes. – Jdamian Jan 06 '17 at 13:04
  • kzpm, okay then, if your question is "How do I solve this problem the wrong way", I'll just move on to the next question. Good luck! :-) – ghoti Jan 06 '17 at 13:23
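
To illustrate the quoting point raised in the comments, here is a small sketch that shows how the shell splits the sed argument from the question. The `show_args` helper is hypothetical (it is not part of the question), and the replacement text is shortened to `X` so that only the quoting matters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every argument in brackets on its own line,
# which makes the word boundaries chosen by the shell visible.
show_args() {
  printf '[%s]\n' "$@"
}

# Same quoting as in the question (replacement shortened to X).
# The inner double quotes end the string, so the unquoted spaces
# split the sed script into several separate arguments.
broken=$(show_args "s|'<body class="article toc2 toc-right"'|X|g")

# Single quotes around the whole script keep the inner double quotes literal,
# so sed would receive exactly one argument.
fixed=$(show_args 's|<body class="article toc2 toc-right">|X|g')

printf 'broken:\n%s\n\nfixed:\n%s\n' "$broken" "$fixed"
```

In the broken case the first argument stops right after `class=article`, so sed would see an `s` command with no closing delimiters, which matches the "unterminated `s' command" message.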

1 Answer

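Here is an approach that might work:

  1. Quote your sed command with ', not ", so the shell leaves the double quotes inside it alone.
  2. Replace the newlines in your string variable with \n (GNU sed turns \n in the replacement into a newline).
  3. Escape any ampersands in your replacement string, e.g., ...id=5\&lang=nl_nl\&amp;voice=Ilse\&re... (an unescaped & in a sed replacement stands for the matched text).
  4. Use sed -i to change files in place.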

For example:

textblockvar='<body class="article toc2 toc-right">\n<div id...'
...
sed -i 's|<body class="article toc2 toc-right">|'"$textblockvar"'|g' "$f"

Note: run it without -i first, so sed only prints what it would do, and check that the output is what you intended. If it is, add -i to actually change the files.
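
The steps above can also be automated, so that the text block does not have to be escaped by hand. The following is only a sketch under some assumptions: `block` is a shortened stand-in for the real markup (the `example.invalid` URL is a placeholder, not the ReadSpeaker address), `insert_block` is a hypothetical helper name, and newlines are written as backslash-newline pairs instead of \n so the replacement does not rely on GNU-specific behaviour.

```shell
#!/usr/bin/env bash
set -eu

# The tag to replace, taken from the question.
pattern='<body class="article toc2 toc-right">'

# Shortened stand-in for the real text block (placeholder URL); the full
# markup from the question would go here unchanged and unescaped.
block='<body class="article toc2 toc-right">
<div id="readspeaker-button1" class="rs_skip rsbtn rs_preserve">
<a href="//example.invalid/read?customerid=5&lang=nl_nl">Lees voor</a>
</div>'

# Escape backslashes, ampersands and the | delimiter, then end every line
# except the last with a backslash so sed accepts the embedded newlines.
escaped=$(printf '%s\n' "$block" | sed -e 's/[\\&|]/\\&/g' -e '$!s/$/\\/')

# insert_block FILE: print FILE with the tag replaced (dry run, no -i).
insert_block() {
  sed "s|${pattern}|${escaped}|" "$1"
}

# Demo on a throwaway file rather than on leerstof/*/*.html.
demo=$(mktemp)
printf '%s\n' '<html>' "$pattern" '<p>tekst</p>' '</body>' '</html>' > "$demo"
result=$(insert_block "$demo")
rm -f "$demo"

printf '%s\n' "$result"
```

Once the printed output looks right, the same sed expression could be run with -i inside the `for f in leerstof/*/*.html` loop. Note that GNU sed accepts a bare -i, while BSD/macOS sed expects an explicit (possibly empty) backup suffix after it.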

webb