12

I write a good amount of documentation, so MediaWiki syntax is easy for me to understand, and it saves me a lot of time compared with writing traditional HTML. However, I also write a blog, and switching from keyboard to mouse all the time to insert the correct HTML tags adds a lot of time. I'd like to write my articles in MediaWiki syntax and then convert them to HTML for use on my blog.

I've tried Googling, but I must need better search terms, because surprisingly I haven't been able to find anything.

I use Linux and would prefer to do this from the command line.

Does anyone have any thoughts or ideas?

Nemo
Todd Partridge 'Gen2ly'
  • see also [lexers / parsers for (un) structured text documents](http://stackoverflow.com/q/2087699/33499) for alternative formats – wimh Feb 18 '12 at 19:03

3 Answers

14

The best option is to use the MediaWiki parser itself. The good news is that MediaWiki 1.19 will provide a command line tool just for that!

Disclaimer: I wrote that tool.

The script is maintenance/parse.php. Here are some usage examples, straight from the source code:

Entering text yourself, ending it with Control + D:

$ php maintenance/parse.php --title foo
''[[foo]]''^D
<p><i><strong class="selflink">foo</strong></i>
</p>
$

The usual file input method:

$ echo "'''bold'''" > /tmp/foo.txt
$ php maintenance/parse.php /tmp/foo.txt
<p><b>bold</b>
</p>$

And of course piping to stdin:

$ cat /tmp/foo.txt | php maintenance/parse.php
<p><b>bold</b>
</p>$

As of today, you can get the script from http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3/maintenance/parse.php and place it in your maintenance directory. It should work with MediaWiki 1.18.

The script will be made available with MediaWiki 1.19.0.
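
If you have several articles to convert, the file-input form above can be wrapped in a loop. This is only a rough sketch, not part of MediaWiki: the function name `convert_wiki_dir`, the `MW_DIR` variable with its default path, and the `.wiki` file extension are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: batch-convert wikitext files to HTML with maintenance/parse.php.
# MW_DIR is a hypothetical variable; the default path below is only a guess,
# so point it at your own MediaWiki installation.
MW_DIR="${MW_DIR:-/var/www/mediawiki}"

# convert_wiki_dir SRC_DIR OUT_DIR
# Converts every SRC_DIR/*.wiki file into OUT_DIR/<name>.html.
convert_wiki_dir() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.wiki; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=$(basename "$f" .wiki)
        # parse.php reads the file named on the command line and
        # prints the rendered HTML on stdout.
        php "$MW_DIR/maintenance/parse.php" "$f" > "$out_dir/$name.html" || return 1
    done
}
```

Called as `convert_wiki_dir ~/articles ~/articles/html`, it should leave one HTML fragment per article in the output directory.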

  • 1
    Actually this is pretty useful, and just what I need. Appreciate the info Antoine. – Todd Partridge 'Gen2ly' Apr 06 '12 at 16:59
  • 3
    I get the error "PHP Fatal error: Call to undefined function mysql_error() in /scratch4/dhruv/mediawiki-1.20.2/includes/db/DatabaseMysql.php on line 326" when I try to run the above. Any idea how I can fix it? – dhruvbird Dec 18 '12 at 01:04
  • 1
    Also, why does this tool take in --dbuser and --dbpass? – dhruvbird Dec 18 '12 at 01:07
  • 1
    @dhruvbird The eval.php script is an old script that has not been migrated to recognize --dbuser and --dbpass :( I have filed bug https://bugzilla.wikimedia.org/45254 to track this, though it is not a high-priority item =) – Antoine 'hashar' Musso Feb 21 '13 at 22:15
  • I had some permissions problems with CDB files, so the lazy way to get around them was to use `sudo`. Then it worked. – Sridhar Sarnobat Oct 19 '16 at 23:43
  • I wish it didn't generate "edit" hyperlinks that point to nowhere. I guess some regex manipulation can take care of that. I'm happy it generates a table of contents. – Sridhar Sarnobat Oct 19 '16 at 23:45
  • the best would be running a mediawiki instance, so it also has access to templates. – milahu Jul 01 '23 at 21:03
  • That is what `maintenance/parse.php` is doing! It requires a MediaWiki instance and has full access to templates, it is just that the interface is a command line utility rather than editing a page with a web browser. And I think that question triggered me to write the command line tool :] – Antoine 'hashar' Musso Jul 11 '23 at 16:40
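
On the "edit" links mentioned in the comments above, a sed filter is one possible way to drop them. This is a sketch under assumptions, not a tested recipe for every MediaWiki version: it assumes each link is wrapped in `<span class="mw-editsection">` with nested bracket spans and sits on the same line as its heading. Check the class name and nesting in your own output first. The function name `strip_edit_links` is made up for this example.

```shell
# Sketch: remove section "edit" links from parse.php output.
# Assumption: each link looks like
#   <span class="mw-editsection">...<a ...>edit</a>...</span></span>
# and is on the same line as its heading, so a greedy match up to the
# last "</span></span>" on that line covers exactly the edit link.
strip_edit_links() {
    sed 's#<span class="mw-editsection">.*</span></span>##'
}
```

Usage would be along the lines of `php maintenance/parse.php article.wiki | strip_edit_links > article.html`. Lines without an edit link pass through unchanged.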
9

I looked into this a bit and think a good route here would be to learn a general markup language like reStructuredText or Markdown and then convert from there. I discovered a program called pandoc that can convert either of these to HTML and MediaWiki. Appreciate the help.

Example:

pandoc -f mediawiki -t html -s myfile.mediawiki -o myfile.html
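
For pasting into a blog editor you may want only the body markup rather than a complete page. A minimal sketch, assuming pandoc is on your PATH: leaving out `-s` (standalone) makes pandoc emit an HTML fragment, and with no file arguments it reads stdin and writes stdout. The function name `wiki_fragment` is made up for this example.

```shell
# Sketch: wikitext on stdin, HTML fragment on stdout.
# Without -s, pandoc omits the surrounding <html>/<head>/<body> wrapper.
wiki_fragment() {
    pandoc -f mediawiki -t html
}
```

For example, `wiki_fragment < myfile.mediawiki > myfile.fragment.html`, or use it in the middle of a pipe.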
Community
Todd Partridge 'Gen2ly'
  • 3
    Please don't. [Alternative parsers](https://www.mediawiki.org/wiki/Alternative_parsers) for wikitext are always very fragile, due to how wikitext has (not) been designed. – Nemo May 03 '15 at 07:35
  • I just tried pandoc as a result of this answer for converting mediawiki to tex and HTML and am very pleased with the results. I can't speak for its fragility but if you're just using the basics like headings, lists etc it looks perfectly fine. It plays nicely with other UNIX commands since it supports stdin/stdout IO which is great for pipes. – Sridhar Sarnobat Dec 19 '15 at 23:40
  • 5
    Pandoc does not recognize the full wiki markup. Therefore there will be a lot of articles which cannot be properly parsed. I tried this myself. – Waschbaer Feb 16 '16 at 17:08
  • 1
    @Waschbaer - if you remember, what kind of mediawiki syntax does it fail on? Readers considering using it may not need unsupported features that are catered mainly for Wikipedia-like use cases. – Sridhar Sarnobat Apr 20 '17 at 22:52
  • pandoc fails to translate templates in wikitext. – milahu Jul 01 '23 at 21:02
5

This page lists tons of MediaWiki parsers that you could try.

Thomas