Theoretically, a version of the Internet using TeX or Markdown would be possible, right? (Okay, MD websites probably aren't that advanced, but never mind).

So my question is twofold:

  1. Has this been proposed as an idea before, and
  2. Did anyone actually try to implement it (maybe in the early days of the web)?

Thank you in advance.

zx485

2 Answers

The idea of extending browsers to support vocabularies other than HTML, by re-specifying HTML itself in terms of a markup meta-language, was the original stated goal of XML. As the abstract of the XML 1.0 specification (from 1998) states:

The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.

Notably, the SVG and MathML vocabularies were created using XML as the meta-language (e.g. for defining the elements and attributes of SVG and MathML, respectively).

However, while XML was successful in many applications outside browsers, XHTML2 (HTML re-specified as an XML vocabulary with additional features such as XForms) wasn't adopted by browsers. Instead, browser vendors, led by Ian Hickson (later of Google), formed the WHATWG (Web Hypertext Application Technology Working Group) in 2004 to start the specification process for what became HTML 5 as we know it today. HTML 5 made it possible to use the SVG and MathML vocabularies (which were specified using XML) directly in HTML by allowing, e.g., XML-style empty elements such as <g/> in those foreign vocabularies.

A major feature of HTML 5 is that it's backward-compatible with the huge existing base of HTML content out there, whereas XHTML would have required adoption of the much more limited XML parsing rules. For example, HTML allows tag inference/tag omission, "void" elements (SGML-style empty elements with no end-element tag), and various forms of attribute minimization.
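To make the difference in parsing rules concrete, here is a minimal Python sketch using only the standard library. The snippet and the `TagLogger` class are made up for illustration, and note that `html.parser` is only a lenient tokenizer, not a full HTML 5 tree builder, so it shows the tolerance for omitted end tags and void elements rather than the actual tag inference a browser performs.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical fragment: the </li> end tags are omitted and <br> is a void element.
snippet = "<ul><li>one<li>two<br></ul>"


class TagLogger(HTMLParser):
    """Records start/end tag events without enforcing well-formedness."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def is_well_formed_xml(text):
    """Return True only if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


logger = TagLogger()
logger.feed(snippet)
logger.close()

html_events = logger.events          # the lenient HTML tokenizer accepts the fragment
xml_ok = is_well_formed_xml(snippet)  # the XML parser rejects it
print(html_events)
print("well-formed XML:", xml_ok)
```

The same markup that an HTML parser accepts is rejected outright by an XML parser, which is the adoption barrier described above.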

HTML (up until version 4) was originally specified using SGML as its markup meta-language, and SGML remains the only markup meta-language capable of describing HTML parsing rules, including those for HTML 5 (see my paper/talk at http://sgmljs.net/blog/blog1701.html for details). Though browsers have never supported full SGML natively (being limited, as an SGML application, to handling only the hard-coded HTML vocabulary), the idea of using more SGML features than browsers support directly was implemented in browser plugins in the 1990s, such as SoftQuad's Panorama SGML/HyTime browser (linked from http://www.hytime.org/tools/index.html).

Custom Wiki syntaxes such as Markdown are as old as digital text processing itself. SGML (since at least 1986) lets you define context-specific token replacement rules (short references) for this purpose. For example, to make SGML format a simplistic Markdown fragment into HTML, you could use an SGML prolog like this:

<!DOCTYPE p [
  <!ELEMENT p - - ANY>
  <!ELEMENT em - - (#PCDATA)>
  <!ENTITY start-em '<em>'>
  <!ENTITY end-em '</em>'>
  <!SHORTREF in-p '*' start-em>
  <!SHORTREF in-em '*' end-em>
  <!USEMAP in-p p>
  <!USEMAP in-em em>
]>
<p>The following text:
   *this*
   will be put into EM
   element tags</p>
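
For readers without an SGML parser at hand, the effect of the two short reference maps above can be approximated in a few lines of Python. This is only a rough sketch of the context switch (a `*` opens `em` when in the `p` context and closes it when in the `em` context), not a model of how an SGML parser works; the function name `expand_emphasis` is hypothetical.

```python
def expand_emphasis(text):
    """Replace '*' with <em> or </em> depending on the current context."""
    out = []
    in_em = False  # False: 'in-p' map is active; True: 'in-em' map is active
    for ch in text:
        if ch == "*":
            out.append("</em>" if in_em else "<em>")
            in_em = not in_em  # switch maps, as USEMAP does per element
        else:
            out.append(ch)
    return "".join(out)


result = expand_emphasis("The following text: *this* will be put into EM element tags")
print(result)
```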
imhotap

Even with today's technology:

(a) browsers understand a variety of content types beyond just HTML, and will render the content provided it is properly identified in the HTTP Content-Type header. (Remember Flash?)

(b) in particular, they recognise XML which can use any vocabulary of your own choosing, and will invoke a server-supplied XSLT (or CSS) stylesheet to render the XML content.
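
As a rough illustration of (b), a document in a custom vocabulary typically points to its stylesheet with an `xml-stylesheet` processing instruction. The sketch below uses Python's standard `xml.dom.minidom` to parse such a document and read that instruction back; the `recipe` vocabulary and the `recipe.xsl` file name are invented for the example.

```python
from xml.dom import minidom

# Hypothetical document in a made-up "recipe" vocabulary.
doc_text = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="recipe.xsl"?>
<recipe><title>Pancakes</title></recipe>"""

dom = minidom.parseString(doc_text)

# Collect processing instructions that appear before the root element.
pis = [
    node for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

pi_target = pis[0].target                 # name of the processing instruction
pi_data = pis[0].data                     # its pseudo-attributes as raw text
root_name = dom.documentElement.tagName   # root element of the custom vocabulary
print(pi_target, pi_data, root_name)
```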

So yes, the idea of the web supporting multiple content types is not at all new.

Michael Kay