When using the HTML4 library from the SML/NJ Library, how do I convert the Standard ML representation of HTML4 into a string?

For example, if I have the HTML representation below, what function can I use to get a string similar to `<html><head><title>Example</title></head><body><h1>Hello!</h1></body></html>`?

(* CM.make "$/html4-lib.cm"; *)
open HTML4;
val myHTML = HTML {
  version=NONE,
  head=[Head_TITLE ([], [PCDATA "Example"])],
  content=BodyOrFrameset_BODY (BODY ([], [
    BlockOrScript_BLOCK (H1 ([], [CDATA [PCDATA "Hello!"]]))]))
};

(SML/NJ version: 110.99.2)

Flux
  • I was not able to find any documentation on this library, but [this](https://smlnj-gforge.cs.uchicago.edu/scm/viewvc.php/smlnj-lib/trunk/HTML4/html4-printer.sml?view=markup&revision=3985&root=smlnj&sortby=author) _might_ be useful. – Chris May 12 '22 at 06:38
  • @Chris I don't think the `HTML4Printer` structure is useful because it is not listed in [`html4-lib.cm`](https://smlnj-gforge.cs.uchicago.edu/scm/viewvc.php/smlnj-lib/trunk/HTML4/html4-lib.cm?view=markup&root=smlnj&sortby=author), which means that I am not able to access it from the code in the question. – Flux May 12 '22 at 07:56

2 Answers

According to the SML/NJ bug tracker, the following function can be used to convert an `HTML4.html` value to a string:

fun toString html =
  let
    val buf = CharBuffer.new 1024
  in
    HTML4Print.prHTML {
      putc = fn c => CharBuffer.add1 (buf, c),
      puts = fn s => CharBuffer.addVec (buf, s)
    } html;
    CharBuffer.contents buf
  end

To use `HTML4Print.prHTML` in the SML/NJ REPL, start the REPL with `sml '$/html4-lib.cm'`. Alternatively, enter `CM.make "$/html4-lib.cm";` after starting the REPL.

The REPL reports the function as `val toString = fn : HTML4.html -> CharBuffer.vector`. `CharBuffer` is an extension to the Basis Library (reference: proposal 2018-001, "Addition of monomorphic buffers"). `CharBuffer.vector` is the same type as `CharVector.vector`, which is the same type as `String.string`, which is the same type as `string`, so the result can be used anywhere a `string` is expected.
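If `CharBuffer` is not available in your SML/NJ version, the same idea can be sketched with nothing but a `ref` holding a list of string chunks. This is only a minimal sketch: the name `toStringViaList` is hypothetical (it is not part of the library), and it assumes that `HTML4Print.prHTML` takes a record of `putc : char -> unit` and `puts : string -> unit` followed by the `HTML4.html` value.

```sml
(* Hypothetical helper, not part of html4-lib.
   Requires the library to be loaded first, e.g.
   CM.make "$/html4-lib.cm"; *)
fun toStringViaList (html : HTML4.html) : string =
  let
    (* Chunks are accumulated in reverse order of emission. *)
    val chunks : string list ref = ref []
    fun puts s = chunks := s :: !chunks
    fun putc c = puts (String.str c)
  in
    HTML4Print.prHTML { putc = putc, puts = puts } html;
    (* Restore emission order and join once at the end. *)
    String.concat (List.rev (!chunks))
  end
```

Calling `toStringViaList myHTML` should yield the same text that `prHTML` would otherwise print, including the DOCTYPE line and the line breaks the printer inserts.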

Flux

It seems you could use the HTML4Print structure (which appears in the export list in the CM file):

$ sml '$/html4-lib.cm'
Standard ML of New Jersey (64-bit) v110.99.2 [built: Thu Sep 23 13:44:44 2021]
[library $/html4-lib.cm is stable]
- open HTML4Print;
[autoloading]
[library $SMLNJ-LIB/Util/smlnj-lib.cm is stable]
[autoloading done]
opening HTML4Print
  val prHTML : {putc:char -> unit, puts:string -> unit} -> HTML4.html -> unit
  val prBODY : {putc:char -> unit, puts:string -> unit} -> HTML4.body -> unit

So, with your value, it produces:

- HTML4Print.prHTML { putc = print o String.str, puts = print } myHTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>
Example
</TITLE>
</HEAD>
<BODY>
<H1>Hello!</H1>
</BODY>
</HTML>
val it = () : unit
Ionuț G. Stan
  • Note that I do not want to print the HTML. I only want to get it as a string. I am looking for `HTML4.html -> string`, not `HTML4.html -> unit`. – Flux May 12 '22 at 10:30
  • @Flux I see. Indeed, it doesn't seem to expose anything that would let you do that, so I'd just go ahead and provide custom `putc` and `puts` functions above, functions that would write to some string buffer. Not ideal, but it's possible at least. – Ionuț G. Stan May 12 '22 at 11:03