
HTML Javadoc generated by the javadoc tool from Java 10 and newer uses parentheses ( ) and commas , in method anchors (URL fragments), for example: https://docs.oracle.com/javase/10/docs/api/java/lang/Object.html#wait(long,int).
Older versions, however, replace these characters with a dash -, for example: https://docs.oracle.com/javase/9/docs/api/java/lang/Object.html#wait-long-int-.
(kudos to this answer for explaining that format depends on javadoc version)
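
To make the difference between the two formats concrete, here is a minimal sketch of the mapping from the newer fragment style to the older one. The class and method names (AnchorStyle, toDashStyle) are hypothetical and not part of javadoc or Maven, and the sketch only covers the parentheses and commas described above; other characters, such as array brackets, are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of javadoc or Maven. */
class AnchorStyle {

    // A member fragment such as "wait(long,int)": a name followed by a parenthesized parameter list.
    private static final Pattern PAREN_FRAGMENT = Pattern.compile("([^()#]+)\\(([^()]*)\\)");

    /** Converts e.g. "wait(long,int)" to "wait-long-int-"; any other input is returned unchanged. */
    static String toDashStyle(String fragment) {
        Matcher m = PAREN_FRAGMENT.matcher(fragment);
        if (!m.matches()) {
            return fragment; // no parameter list (e.g. a field anchor): nothing to convert
        }
        // "(" and ")" each become one dash, and so does every comma between parameters.
        return m.group(1) + "-" + m.group(2).replace(',', '-') + "-";
    }

    public static void main(String[] args) {
        System.out.println(toDashStyle("wait(long,int)"));
        System.out.println(toDashStyle("equals(java.lang.Object)"));
    }
}
```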

Now, when building a project with Java 10+, how can I make maven-javadoc-plugin render correct links to methods of projects whose HTML Javadoc was generated with an older version? (I.e. when one of the <link> tags in maven-javadoc-plugin's configuration section in pom.xml points to a set of HTML Javadoc documents that use dashes instead of parentheses and commas.)
By default, parentheses and commas are used, which results in links leading to the top of the given class's page instead of the desired method section.

Using an older javadoc tool to generate HTML for a project that uses Java 10+ is not a solution, because links to methods from the standard library at docs.oracle.com (or from any other external project built with Java 10+) would then be broken. A definitive solution must be applicable to a specific <link> section only.

morgwai
  • What do you mean by “work properly”? Are the links within the generated javadoc tree broken? – VGR Sep 19 '21 at 15:21
  • @VGR yes, they are broken, as by default javadoc uses parentheses in method links (by broken, I mean they point to the right class HTML page, but the fragment does not resolve and you land at the top of the page instead of the specific method section) – morgwai Sep 19 '21 at 23:40
  • Can you provide an example of a broken URL? https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/lang/System.html#currentTimeMillis() seems to work quite well. – VGR Sep 20 '21 at 00:01
  • @VGR as stated in the OP only "some projects" make this replacement. gRPC is an example of such project: javadoc links to gRPC methods in your project will not work by default because of this. For example rendering javadoc `{@link ServerCallStreamObserver#setOnCancelHandler(Runnable)}` will create a link to https://javadoc.io/doc/io.grpc/grpc-all/1.40.0/io/grpc/stub/ServerCallStreamObserver.html#setOnCancelHandler(java.lang.Runnable) which will take you to the top of `ServerCallStreamObserver`'s page as they replace parentheses with dashes somehow: see the properly working link in the OP. – morgwai Sep 20 '21 at 00:14
  • @EricAnderson could you help with this one? (not sure if this shoutout will generate a notification) – morgwai Sep 20 '21 at 00:25
  • I gather you are referring to the use of javadoc’s `-link` or `-linkoffline` option when generating javadoc? – VGR Sep 20 '21 at 00:30
  • @VGR the question is about maven (I've just added a tag to make this more clear). To be honest, I've never used `javadoc` from command line directly, but I'm guessing that the options you mention correspond to maven's `<link>https://javadoc.io/doc/io.grpc/grpc-all/${grpc.version}</link>` in this example. – morgwai Sep 20 '21 at 00:42
  • Yes, thank you. That greatly clarifies the problem you’re having. – VGR Sep 20 '21 at 00:46
  • @VGR nevertheless if you know how to make `javadoc` command line tool generate links to external projects using dashes instead of parentheses, that would be probably very helpful too :) (as far as I recall there's a way to pass additional arguments in `maven-javadoc-plugin`'s xml config) – morgwai Sep 20 '21 at 00:52
  • @VGR I've further clarified the OP based on your comments: thanks! – morgwai Sep 20 '21 at 01:06
  • I’m fairly sure it’s possible to execute a text search-and-replace on the generated javadoc files, and the regex for it is certainly feasible, but I don’t know enough about Maven lifecycles to provide a full answer. – VGR Sep 20 '21 at 01:25
  • I know Maven very well. I'm preparing an answer right now. Be patient! Stay tuned! ;) – Gerold Broser Sep 20 '21 at 01:39
  • @VGR i cannot say for sure, but since I've seen it in few other projects, I'd rather bet that there's some built-in mechanism for this (if not in `javadoc` itself then maybe gradle or maven or some plugin to them). – morgwai Sep 20 '21 at 01:57
  • What do you mean by "_when using `<link>` tag in xml config_"? Which XML config? Links in Javadoc comments are usually `{@link ...}`. – Gerold Broser Sep 20 '21 at 12:00
  • @GeroldBroser I meant part of `maven-javadoc-plugin`'s configuration in maven `pom.xml` in which you specify where to look for external javadoc of your dependencies. For example like in this: https://github.com/morgwai/grpc-scopes/blob/master/pom.xml#L110 – morgwai Sep 20 '21 at 12:04

2 Answers


According to RFC 3986 both variants are valid.

For instance, URL fragments for methods with parameters for Java <=9 look like:

https://docs.oracle.com/javase/9/docs/api/java/lang/Object.html#equals-java.lang.Object-

and for Java 10–17 they look like:

https://docs.oracle.com/javase/10/docs/api/java/lang/Object.html#equals(java.lang.Object)

If you {@link ...} to the Javadoc of a library that used a different tool to create its Javadoc than you do (or used a different version of the same tool) you're out of luck for the moment.

If we really can't find a suitable option in the Javadoc tool(s) (and I think we won't, because why should external links be handled differently from internal ones), the first thing that comes to my mind is the Maven Resources Plugin. It has a resource filtering feature (a poor name, since it is in fact string interpolation), and perhaps this could be used to replace the characters accordingly.

If that doesn't work there are other options, like running an external program, e.g. sed, during a build. Let me think a while and try some things. I'm confident that I can come up with a working solution. However, please be patient. It's 4:30 a.m. here now and I think I need some hours of sleep soon. If someone comes up with a solution in the meantime, the better (though I don't think so but who knows... :)

Approach #1 – --release option

There's javadoc's --release option:

The following core javadoc options are equivalent to corresponding javac options. See Standard Options for the detailed descriptions of using these options:

  • ...
  • --release

javac's --release:

[Well, that's funny... well, no, it's embarrassing: the deep links to --release and its Note: ... don't work in the end, because centiseconds after jumping to them apparently some JS (AJAX?) comes into play and the page finally lands at its top. I want Sun Microsystems back!]

--release release

Compiles against the public, supported and documented API for a specific VM version. Supported release targets are 6, 7, 8, 9, 10, and 11.

If this javadoc --release solves your problem, we don't have to think further about a handmade solution.

Update: The --release/<release> option doesn't solve the problem. It's just to specify the link target version as in https://docs.oracle.com/javase/<version>/docs/api/.... The docs above aren't too helpful in this regard and the maven-javadoc-plugin doc isn't either: "<release> Provide source compatibility with specified release". At least it's documented here now. ;)

Approach #2 – Maven resource filtering

Maven's resource filtering doesn't help either since in Javadoc comments method references for methods with just one parameter can look like:

/**
 * <p>Link to {@link Logger#info}</p>
 * <p>Link to {@link Object#equals}</p>
 */

and for string interpolation we would need ${...} (or the not well-known and unusual @...@) definitions.

It would work (in theory) with the explicit form:

/**
 * <p>"${(}" and "${)}" replaced by '-', if the additional '{' and '}' don't conflict with Javadoc comment's tags – but it seems they do</p>
 * <p>Link to {@link Logger#info${(}String${)}}</p>
 * <p>Link to {@link Object#equals${(}Object${)}}</p>
 *
 * <p>"@(@" and "@)@" replaced by '-', if the additional '@'s don't conflict with Javadoc comment's tags – but it seems they do</p>
 * <p>Link to {@link Logger#info@(@String@)@}</p>
 * <p>Link to {@link Object#equals@(@Object@)@}</p>
 */

I don't know (yet) whether these "reserved characters" can be escaped and if yes, how this could be done. I found How do you escape curly braces in javadoc inline tags, such as the {@code} tag but nothing from there works with {@link ...} (yet).

UPDATE

Approach #3 – Maven XML Plugin's xml:transform

Doesn't work since the Javadoc HTMLs contain non-X(HT)ML-conformant unclosed <meta ... >s and <link ... >s.
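
As an illustration of why strict XML tooling rejects such pages, here is a minimal sketch that feeds a tiny HTML-style snippet to the JDK's standard DOM parser. The class name StrictXmlCheck and the sample markup are made up for this example; real Javadoc pages are much larger, but an unclosed <link ...> fails in the same way.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical demo class; it only checks whether markup is well-formed XML. */
class StrictXmlCheck {

    /** Returns true if the JDK's XML parser accepts the markup, false if it reports a parse error. */
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // e.g. an element that is never closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // HTML-style void element: legal HTML, but not well-formed XML.
        String htmlStyle = "<html><head><link rel=\"stylesheet\" href=\"stylesheet.css\"></head><body/></html>";
        // The same document with the element explicitly closed.
        String xmlStyle = "<html><head><link rel=\"stylesheet\" href=\"stylesheet.css\"/></head><body/></html>";
        System.out.println(isWellFormedXml(htmlStyle));
        System.out.println(isWellFormedXml(xmlStyle));
    }
}
```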

Approach #4 a) – Groovy script via the GMavenPlus Plugin

Using FileVisitor, XPath – if that works with the non-X(HT)ML-conformant HTML – or whatever works then.

XPath doesn't work: [Fatal Error] :18:3: The element type "link" must be terminated by the matching end-tag "</link>".
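
A rough sketch of the FileVisitor part of this idea, written in plain Java rather than Groovy. It treats the pages as text, so it does not depend on XPath or on the HTML being valid XML. The class name HtmlTreeRewriter, the method rewriteTree, and the assumption that the generated pages are UTF-8 are all mine; the actual rewrite rule is passed in as a function.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.function.UnaryOperator;

/** Hypothetical helper: walks a generated apidocs tree and rewrites each HTML page as text. */
class HtmlTreeRewriter {

    /** Applies the rewrite to every *.html file under root and returns how many files changed. */
    static int rewriteTree(Path root, UnaryOperator<String> rewrite) throws IOException {
        int[] changed = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                if (file.getFileName().toString().endsWith(".html")) {
                    // Assumption: the javadoc output is encoded as UTF-8.
                    String before = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                    String after = rewrite.apply(before);
                    if (!after.equals(before)) {
                        Files.write(file, after.getBytes(StandardCharsets.UTF_8));
                        changed[0]++;
                    }
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return changed[0];
    }

    public static void main(String[] args) throws IOException {
        // Demo on a throwaway directory so that no real documentation is modified.
        Path root = Files.createTempDirectory("apidocs-demo");
        Path page = root.resolve("Example.html");
        Files.write(page, "<a href=\"Foo.html#bar(int)\">bar</a>".getBytes(StandardCharsets.UTF_8));
        int changed = rewriteTree(root, html -> html.replace("#bar(int)", "#bar-int-"));
        System.out.println(changed + " file(s) changed: "
                + new String(Files.readAllBytes(page), StandardCharsets.UTF_8));
    }
}
```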

Approach #4 b) – Revive one of maven-javascript-plugin or Maven Javascript Plugin, ...

... add a goal javascript:execute and use a JS script with its CSS selectors and DOM manipulation.

Gerold Broser
  • ah, so it depends on the **version** of javadoc! how nice of oracle to change it and not provide any way of backward compatibility ;-] Thanks for the info and have a good sleep! :) (I'm definitely not a maven wizard, so I will lazily and cowardly wait until you have some time to play with it ;-) ) – morgwai Sep 20 '21 at 02:41
  • to make things even more messed up, I have a project that links to both variants: https://github.com/morgwai/grpc-scopes/blob/master/pom.xml#L110 (guice uses parentheses and grpc dashes) ;-] – morgwai Sep 20 '21 at 02:46
  • Probably it's due to the `javadoc` version. It's just a guess. But I think an obvious one if we look at the differences at the Java API doc versions. Why did they change that? Don't ask me. I'm the creator of [JANITOR – Java API Navigation Is The Only Rescue](https://gitlab.com/gerib/userscripts/-/wikis/JANITOR-%E2%80%93-Java-API-Navigation-Is-The-Only-Rescue) and I had to consider different URLs and DOMs for Java <=10, 11 to 14, 15 and 16 (and I just found that I have to adapt it for 17 again). Perhaps they found my userscript, didn't like it and try to make my life harder until I give up. :D – Gerold Broser Sep 20 '21 at 03:00
  • @user207421 I've just tested it: javadoc from java-1.8 on ubuntu **does** produce links/labels with dashes instead of parentheses. – morgwai Sep 20 '21 at 09:42
  • @user207421 You mean the Java API docs prior to 10 haven't been created with [the `javadoc` tool](https://www.oracle.com/java/technologies/javase/javadoc-tool.html)? Allow me to doubt that. Which tool do you think Sun/Oracle used then? And, BTW, did you downvote because of that? – Gerold Broser Sep 20 '21 at 10:16
  • @GeroldBroser please bear with my ignorance: I have no idea where to put this `--release` flag in my `pom.xml`... :( some hint would be very appreciated... I hope this does not mean that I would need to change `maven.compiler.source` or `maven.compiler.target` because I do use java-11 features. Even putting it somewhere in `maven-javadoc-plugin`'s config section (to affect only `javadoc` but not `javac`) will probably not be useful, as then links to oracle's java-11 API docs would probably be broken... For it to work I would need to be able to associate it with specific `<link>` sections. – morgwai Sep 20 '21 at 12:24
  • @morgwai NP, I'm already in the process of preparing a sample project here since I wanted to know myself whether this first approach works or not. Stay tuned! – Gerold Broser Sep 20 '21 at 13:13
  • @morgwai I made some progress with **Approach #4 a)** (and with some parts of it [i.e. HTML to XHTML transformation] **Approach #3** probably would become possible, too). The question is now: Shall I develop it completely until it works or shall I tell you what I found out so far and you develop it on your own (with my help and the help of all other contributors at SO if needed)? The former would be interesting to implement, of course, but I also have other things to do. – Gerold Broser Sep 25 '21 at 14:35
  • sorry for the late reply. Yeah, please just point me in the right direction and I will try to take it over from there. Thanks! – morgwai Sep 30 '21 at 04:27

After much searching, I’m inclined to believe that it’s easier to just use Ant inside your pom.xml to change parentheses to hyphens, than to try to look for a dedicated plugin:

<plugin>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <phase>prepare-package</phase>
      <configuration>
        <target>
          <replaceregexp
              match='(&lt;a href="[^"]*grpc[^"]*)[(]([^)"]*)[)]("[^&gt;]* class="external-link")'
              replace="\1-\2-\3"
              flags="g">
            <fileset dir="${project.reporting.outputDirectory}/apidocs"
                     includes="**/*.html"/>
          </replaceregexp>
        </target>
      </configuration>
      <goals>
        <goal>run</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Unfortunately, I don’t have a way to test it.
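
Since the Ant snippet above is untested, the substitution itself can be tried outside of Maven first. The following is a minimal sketch, not the code from the answer: it assumes javadoc writes the parentheses and commas literally into the href attribute, it selects links by a URL prefix (the value of the corresponding <link> entry) instead of by the class="external-link" attribute, whose exact name and position may differ between javadoc versions, and it also turns commas into dashes. ExternalLinkRewriter and rewrite are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for trying out the href rewrite on a plain string. */
class ExternalLinkRewriter {

    /**
     * Rewrites href="PREFIX...#name(params)" to href="PREFIX...#name-params-",
     * but only for links whose URL starts with linkPrefix.
     */
    static String rewrite(String html, String linkPrefix) {
        Pattern href = Pattern.compile(
                "(href=\"" + Pattern.quote(linkPrefix) + "[^\"#]*#[^\"()]*)\\(([^\"()]*)\\)\"");
        Matcher m = href.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Both parentheses and every comma in the parameter list become a dash.
            String replacement = m.group(1) + "-" + m.group(2).replace(',', '-') + "-\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"https://javadoc.io/doc/io.grpc/grpc-all/1.40.0/io/grpc/stub/"
                + "ServerCallStreamObserver.html#setOnCancelHandler(java.lang.Runnable)\">link</a>";
        System.out.println(rewrite(html, "https://javadoc.io/doc/io.grpc/grpc-all/"));
    }
}
```

Keying on the URL prefix keeps links to docs.oracle.com and other parenthesis-style sites untouched, which is the per-<link> behaviour the question asks for.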

VGR
  • Well, [Don't Parse HTML With Regex](https://stackoverflow.com/a/1732454/1744774). – Gerold Broser Sep 21 '21 at 00:14
  • I had to remove the initial `<` from the regexp because maven was complaining _[FATAL] Non-parseable POM /home/morgwai/projects/grpc-utils/pom.xml: markup not allowed inside attribute value - illegal <_ the regexp should be accurate enough without it. Unfortunately, it didn't have any effect: all links are still rendered using parentheses :( – morgwai Sep 21 '21 at 00:51
  • @GeroldBroser VGR is not parsing a whole html document: he is finding fragments matching a regular expressions, which is totally fine. See https://stackoverflow.com/a/1733489/1220560 in the same question you linked. – morgwai Sep 21 '21 at 00:54
  • @morgwai Oops, my mistake, `<` and `>` can’t appear in XML attribute values; I’ve edited my answer and replaced them with `&lt;` and `&gt;` respectively. – VGR Sep 21 '21 at 01:07
  • @VGR still no effect whatsoever unfortunately :( – morgwai Sep 21 '21 at 01:36
  • @morgwai I knew that answer. It was more to honor this masterpiece of computer and writing arts once again. :) Don't you think that Ant's `replaceregexp` performs HTML parsing behind the scenes? However, if this solution is fine for you it's fine for me as well. The solutions I tried so far didn't work anyway and the next ideas I have in mind takes a day or so of developing and then I have to ask whether this is for your company and whereto I can send my invoice. :) – Gerold Broser Sep 21 '21 at 09:24
  • @GeroldBroser at the moment VGR's solution doesn't work: it's probably some detail in the regex or something: now that thanks to VGR I know the maven spells to do such manual post-processing, I will try to play with it myself later. I've stumbled on this issue when working on some open-source project (the before linked https://github.com/morgwai/grpc-scopes ) so no good candidate to invoice unfortunately ;-] Honestly I thought that many ppl have stumbled on it before, so there should be a well-known solution. Anyway many thanks to you and VGR for help! :) – morgwai Sep 21 '21 at 09:40
  • @morgwai You're welcome. [`javadoc:javadoc`](https://maven.apache.org/plugins/maven-javadoc-plugin/javadoc-mojo.html) "_• Invokes the execution of the lifecycle phase `generate-sources` prior to executing itself._". I'd use the very next `<phase>` [`process-sources`](https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html#Default_Lifecycle) instead of `prepare-package`. – Gerold Broser Sep 21 '21 at 10:06