Questions tagged [wikitext]

Wikitext is a document written in a wiki markup language, used on MediaWiki sites such as wikipedia.org.

From MediaWiki.org:

Wikitext is a document written in a wiki markup language, such as the current one explained in Help:Editing (see also Markup spec/DTD). It is a mixture of content, markup, and metadata. The current and old versions of all pages of a wiki are stored in the database in the text table, in the form of wikitext.

73 questions
12 votes · 3 answers

Convert MediaWiki wikitext format to HTML using command line

I tend to write a good amount of documentation, so the MediaWiki format is easy for me to understand, plus it saves me a lot of time compared to writing traditional HTML. I, however, also write a blog and find that switching from keyboard to…
Todd Partridge 'Gen2ly'
11 votes · 4 answers

Getting the first paragraph (text only) of a Wikipedia article returns an undesired result

I'm trying to retrieve the first paragraph of text for a Wikipedia article, UNIX in this example, but it returns undesired output. From what I've been reading about the Wikipedia API and here on Stack Overflow, this is the request URL to make…
udexter
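For this kind of request, the MediaWiki API's TextExtracts extension can return just the plain-text intro section. A minimal sketch of building such a request URL; the endpoint shown is en.wikipedia.org's, and the parameter set is the standard `extracts` one:

```python
from urllib.parse import urlencode

# Build a MediaWiki API request that returns only the plain-text intro
# (first section) of an article, via the TextExtracts extension.
def intro_extract_url(title: str) -> str:
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,        # intro section only
        "explaintext": 1,    # plain text, no HTML
        "format": "json",
        "titles": title,
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

print(intro_extract_url("Unix"))
```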
9 votes · 4 answers

Parsing Wikimedia markup - are EBNF-based parsers poorly suited?

I am attempting to parse (in Java) Wikimedia markup as found on Wikipedia. There are a number of existing packages out there for this task, but I have not found any to fit my needs particularly well. The best package I have worked with is the…
toluju
4 votes · 4 answers

Wikitext editor for OS X

Could anyone suggest a Wikitext editor for OS X? My company uses MediaWiki extensively and I am looking for more of an IDE-like text editor I can use offline.
Ethan Whitt
4 votes · 1 answer

How does one parse simple inline markup (i.e. *bold*), in Python?

How does one implement a parser (in Python) for a subset of wikitext that modifies text, namely: *bold*, /italics/, _underline_ I'm converting it to LaTeX, so the conversion is from: Hello, *world*! Let's /go/. to: Hello \textbf{world}! Let's…
Brian M. Hunt
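For a subset this small, a parser can be a few ordered regex substitutions. A minimal sketch of the conversion described in the question; it ignores escaping and nesting, which a real converter would need:

```python
import re

# Minimal inline-markup converter for the subset in the question:
# *bold* -> \textbf{...}, /italics/ -> \textit{...}, _underline_ -> \underline{...}
RULES = [
    (re.compile(r"\*(.+?)\*"), r"\\textbf{\1}"),
    (re.compile(r"/(.+?)/"), r"\\textit{\1}"),
    (re.compile(r"_(.+?)_"), r"\\underline{\1}"),
]

def to_latex(text: str) -> str:
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(to_latex("Hello, *world*! Let's /go/."))
```

The non-greedy `(.+?)` keeps each rule from swallowing everything between the first and last marker on a line.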
3 votes · 2 answers

Parsing wikiText with regex in Java

Given a wikiText string such as: {{ValueDescription |key=highway |value=secondary |image=Image:Meyenburg-L134.jpg |description=A highway linking large towns. |onNode=no |onWay=yes |onArea=no |combination= *…
Mulone
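For flat templates like the one quoted, a regex plus a split on `|` is often enough. A minimal sketch; nested templates or pipes inside values would defeat it, at which point a real wikitext parser is the better tool:

```python
import re

# Pull |key=value fields out of a single flat {{Template ...}} block.
def template_params(wikitext: str) -> dict:
    body = re.search(r"\{\{(.*?)\}\}", wikitext, re.DOTALL)
    params = {}
    if body:
        # First split element is the template name; the rest are |key=value parts.
        for part in body.group(1).split("|")[1:]:
            if "=" in part:
                key, _, value = part.partition("=")
                params[key.strip()] = value.strip()
    return params

sample = """{{ValueDescription
|key=highway
|value=secondary
|description=A highway linking large towns.
|onWay=yes
}}"""
print(template_params(sample))
```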
3 votes · 0 answers

How to use Parsoid to convert wikitext to HTML (instead of a full HTML page with extra info)

Both parsoid and parsoid-jsapi give you a .parse(... function to parse wikitext to html, but I'm having trouble getting a clean html string. Say I want to parse This is [[it]] I do this: var parsoid = require('parsoid-jsapi') ||…
01AutoMonkey
3 votes · 1 answer

How to use MediaWiki::DumpFile to convert Wikipedia XML dump to HTML?

On page MediaWiki::DumpFile following code is present: use MediaWiki::DumpFile; $mw = MediaWiki::DumpFile->new; $sql = $mw->sql($filename); $sql = $mw->sql(\*FH); $pages = $mw->pages($filename); $pages = $mw->pages(\*FH); …
DSblizzard
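Independent of the Perl module, the dump format itself is plain XML: `<page>` elements holding a `<title>` and a `<revision><text>`. A minimal sketch with Python's stdlib ElementTree, using an inline sample in place of a real dump; real dumps also declare an `xmlns`, so tags would then need a namespace prefix:

```python
import xml.etree.ElementTree as ET

# Iterate over <page> elements of a MediaWiki XML dump, yielding
# (title, wikitext) pairs.
def iter_pages(xml_text: str):
    root = ET.fromstring(xml_text)
    for page in root.iter("page"):
        yield page.findtext("title"), page.findtext("revision/text")

dump = """<mediawiki>
  <page>
    <title>Example</title>
    <revision><text>'''Example''' is a page.</text></revision>
  </page>
</mediawiki>"""

for title, text in iter_pages(dump):
    print(title, "->", text)
```

For multi-gigabyte dumps, `ET.iterparse` (clearing each `<page>` element after use) avoids loading the whole file into memory.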
3 votes · 1 answer

Convert wikitext to HTML in Python for display on a website

I know there are many questions on this topic, but after 6 hours of try-this-and-try-that-tool, I still can't find a single tool that takes wikitext of the form 'Welcome to the world's foremost open content '''Organic…
prongs
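For just the `''italic''`/`'''bold'''` quote syntax in samples like the one quoted, the conversion can be sketched with two substitutions; note the three-quote (bold) rule must run before the two-quote one, and a full converter would need the rest of the syntax (links, headings, lists):

```python
import re

# Sketch of the ''italic''/'''bold''' subset of wikitext -> HTML.
def quotes_to_html(text: str) -> str:
    text = re.sub(r"'''(.+?)'''", r"<strong>\1</strong>", text)  # bold first
    text = re.sub(r"''(.+?)''", r"<em>\1</em>", text)
    return text

print(quotes_to_html("Welcome to the world's foremost open content '''Organic''' wiki"))
```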
2 votes · 0 answers

Bulk convert MediaWiki pages to HTML (either via API or locally from saved wikitext)

I'm trying to get the actual HTML of about 600-700 pages hosted on a MediaWiki instance. I have thought of/tried the following: Option 1: the Action API with action=parse works well and takes about 0.75 seconds per page. I haven't been able to do this for multiple…
Ash
2 votes · 1 answer

Is it possible to add CSS for mobile in a Fandom wiki?

The mobile version of Fandom wikis is very restricted and actively strips all style tags from HTML elements. I tried defining styles in MediaWiki:Common.css, which works on desktop, but they do not apply on mobile. I have also tried to…
awe
2 votes · 0 answers

How to train GPT-2 with the Hugging Face Trainer

I am trying to fine-tune GPT-2 with Hugging Face's Trainer class. from datasets import load_dataset import torch from torch.utils.data import Dataset, DataLoader from transformers import GPT2TokenizerFast, GPT2LMHeadModel, Trainer,…
2 votes · 2 answers

What is the best way to expand the wikitexts of a full Wikipedia dump?

It is easy to download dumps of Wikipedia in XML format. However, the content of the articles is written in wikitext, which has a template system. To extract clean full texts from these dumps, it is necessary to expand these templates. Wikipedia…
Robin
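One option for template expansion is to let a MediaWiki server do it via action=expandtemplates. A minimal sketch of building that request; for a full dump, per-article API calls would be slow, so batching or a local MediaWiki install is the more practical route:

```python
from urllib.parse import urlencode

# Build an API request that expands all templates in a wikitext snippet,
# returning the resulting wikitext.
def expandtemplates_url(wikitext: str) -> str:
    params = {
        "action": "expandtemplates",
        "prop": "wikitext",
        "format": "json",
        "text": wikitext,
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

print(expandtemplates_url("{{convert|1|km|mi}}"))
```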
2 votes · 1 answer

Parsoid - parse wikitext locally

Is that even possible? I am not sure if I understand the project properly. I am trying to parse a large amount of wikitext into HTML using the Parsoid-JSAPI project. Parsing works fine, but it is still calling the Wikimedia API. I have run the…
prespic
2 votes · 1 answer

Working example of wikitext-to-HTML in ANTLR 3

I'm trying to flesh out a wikitext-to-HTML translator in ANTLR 3, but I keep getting stuck. Do you know of a working example that I can inspect? I tried the MediaWiki ANTLR grammar and the Wiki Creole grammar, but I can't get them to generate the…
Dan