I want to convert the Java specification documentation to an easily editable format (Markdown or AsciiDoc), upload it to a GitHub Gist, and customize it by adding my own coding notes and experiences. I want to convert it to something like this
I use a tool called pandoc, which can convert HTML to Markdown.
I tried the following:
Technique 1: I tried to convert the whole table of contents of the Java specification from index.html.
pandoc -f html -t markdown -o test2.md https://docs.oracle.com/javase/specs/jls/se10/html/index.html
I got this: test2.md (I did not post it here because the file is too long).
Problem 1: This Markdown file does not contain the contents of the Java specification documentation. I expected to get both the Markdown TOC (table of contents) and the specification contents in the Markdown file, like this.
Problem 2: When I click the links in this Markdown file, I get a 404 error page.
Technique 2 (better than Technique 1): I downloaded all the HTML files listed in the TOC with HTTrack and tried to convert each file separately.
pandoc -f html-native_divs-native_spans -i jls-1.html -t markdown -o test2.md
Problem 1: I got the following Markdown file (test3.md), whose table of contents links do not jump to another section of the same document. When I click these links, they lead to an external GitHub page such as https://gist.github.com/lostdinar2/jls-1.html#jls-1.1, which does not exist.
A demonstration of problem 1:
1) I want to convert this HTML internal id link (#) to a Markdown internal link that jumps to another section of the same document:
<dt><span class="section"><a href="jls-2.html#jls-2.2">2.2. The Lexical Grammar</a></span></dt>
[link text](#abcd)
2) But pandoc does not convert these links to Markdown internal links. Pandoc creates an external link like this: https://gist.github.com/lostdinar2/jls-1.html#jls-1.1
Is there a pandoc parameter to fix this? I searched the pandoc documentation but could not find such a feature.
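To make the transformation I am after concrete, here is a minimal Python sketch of a preprocessing step that could run on the downloaded HTML before pandoc sees it. It is not a pandoc feature: the helper name `to_internal_links` is hypothetical, and it assumes that all chapters end up in one combined document and that every cross-chapter link has the form `jls-N.html#id`. Whether the target anchors survive the conversion is a separate question that this sketch does not address.

```python
import re

# Hypothetical helper, not part of pandoc.
# Assumption: cross-chapter links always look like href="jls-<number>.html#<id>".
_CHAPTER_LINK = re.compile(r'href="jls-\d+\.html#([^"]+)"')


def to_internal_links(html: str) -> str:
    """Drop the file name from chapter links so only the '#id' fragment remains."""
    return _CHAPTER_LINK.sub(r'href="#\1"', html)


sample = (
    '<dt><span class="section">'
    '<a href="jls-2.html#jls-2.2">2.2. The Lexical Grammar</a>'
    '</span></dt>'
)
print(to_internal_links(sample))
```

Links that do not match the assumed pattern (for example absolute URLs) are left unchanged.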