I'm using iTextSharp to convert some ePubs to PDF and the conversion is working fine except for the TOC of the ePubs.

The targets that the TOC links point to are generally in the format `<a id`, but I need them in the format `<a name`, so I've been doing a string replace to get them into the correct form. However, I have now found an ePub that does its internal links differently: the href of the TOC links points to an element like this:

<h2 class="c007" id="chapII.">CHAPTER II.</h2>

This document also uses `<a id` in a different, unrelated scenario, so replacing `<a id` with `<a name` is no longer an option.

Is there a better way to fix the internal links so that they work?

xiimoss
  • Instead of replacing something, which is a destructive change, can you just duplicate the value of `id` into a `name` attribute? – Chris Haas Nov 25 '15 at 14:13
  • That's a good point actually. But now I need to figure out a way to find all instances of id=" " in the HTML string and make a copy, edit it to be name= and insert it back into the HTML string. – xiimoss Nov 25 '15 at 14:28
  • If your HTML is simple, you can just use a regex pattern like `id="([^"]+)"`, replacing with `id="$1" name="$1"`. However, if you ask about it here you'll be told it is [impossible](http://stackoverflow.com/a/1732454/231316) and that you should use something like the [HTML Agility Pack](http://stackoverflow.com/questions/846994/how-to-use-html-agility-pack) – Chris Haas Nov 25 '15 at 14:35
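
The non-destructive duplication suggested in the comments can be sketched roughly as follows. This is only an illustration in Python, not iTextSharp code: the helper name `add_name_attributes` is made up for the example, and it assumes double-quoted attribute values and no `>` characters inside attributes. The same pattern should carry over to .NET's `Regex.Replace`, where the replacement string uses `$1` instead of `\1`. Whether iTextSharp treats a `name` attribute on a non-anchor element such as `<h2>` as a link target has not been verified here.

```python
import re

# Any opening tag, e.g. <h2 class="c007" id="chapII.">.
# Assumption: attribute values never contain "<" or ">".
OPEN_TAG = re.compile(r'<[A-Za-z][^<>]*>')

# An id attribute preceded by whitespace, so "data-id" is not matched.
# Assumption: the value is double-quoted.
ID_ATTR = re.compile(r'\sid="([^"]+)"')


def add_name_attributes(html: str) -> str:
    """Copy each id="..." into a name="..." on the same tag, leaving id in place."""

    def fix_tag(match: re.Match) -> str:
        tag = match.group(0)
        # Leave tags alone if they already carry a name attribute.
        if re.search(r'\sname\s*=', tag):
            return tag
        # Keep the original id (\g<0>) and append a name with the same value.
        return ID_ATTR.sub(r'\g<0> name="\1"', tag, count=1)

    return OPEN_TAG.sub(fix_tag, html)


if __name__ == "__main__":
    sample = '<h2 class="c007" id="chapII.">CHAPTER II.</h2>'
    print(add_name_attributes(sample))
```

Because the `id` is kept, any other use of `<a id` in the document is left intact, which avoids the problem with the plain string replace. For irregular markup (single quotes, unquoted values, attributes spanning odd constructs), an HTML parser such as the HTML Agility Pack mentioned above is likely the safer route.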
