
I want to get the text of a Wikipedia page before and after a given edit. I have this URL:

https://en.wikipedia.org/w/index.php?diff=328391582&oldid=328391343

However, I want the text in JSON format so that I can use it directly in my program. Does MediaWiki provide an API that returns the old and new text of an edit, or do I have to parse the HTML page myself?

Hellboy

3 Answers


Try this: https://www.mediawiki.org/wiki/API:Revisions

There are a few options which may be of use, such as:

  1. rvparse: Parse revision content. For performance reasons if this option is used, rvlimit is enforced to 1.

  2. rvdifftotext: Text to diff each revision to.

If those don't work, there is still:

  1. rvprop / ids: Get the revid and, from 1.16 onward, the parentid

Once you have the parent ID, you can fetch the text of both revisions and compare them.
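
The approach above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the `formatversion=2` response layout and `rvslots=main`, and the helper names (`buildRevisionsUrl`, `extractRevisions`, `fetchOldAndNew`) are made up for illustration.

```javascript
// Sketch: fetch two revisions by ID and return their wikitext.
// Assumptions: formatversion=2 response layout, rvslots=main;
// all function names here are illustrative, not part of MediaWiki.
const API = "https://en.wikipedia.org/w/api.php";

// Build the query URL for a list of revision IDs.
function buildRevisionsUrl(revIds) {
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    revids: revIds.join("|"),
    rvprop: "ids|content",
    rvslots: "main",
    format: "json",
    formatversion: "2",
    origin: "*", // allows anonymous cross-origin requests from a browser
  });
  return `${API}?${params}`;
}

// Map each revid to { parentid, text } from a formatversion=2 response.
function extractRevisions(data) {
  const result = {};
  for (const page of data.query?.pages ?? []) {
    for (const rev of page.revisions ?? []) {
      result[rev.revid] = { parentid: rev.parentid, text: rev.slots.main.content };
    }
  }
  return result;
}

// Fetch the text before and after an edit, given the two revision IDs.
async function fetchOldAndNew(oldId, newId) {
  const response = await fetch(buildRevisionsUrl([oldId, newId]));
  const revisions = extractRevisions(await response.json());
  return { before: revisions[oldId]?.text, after: revisions[newId]?.text };
}
```

For the URL in the question, this would be called as `fetchOldAndNew(328391343, 328391582)`. If only the newer revision ID is known, the `parentid` returned for it can be used as the older ID in a second request.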

wwl

Here is a note on how to query the Wikipedia API from JavaScript to get the recent edits.

In some cases the article is locked, and the recent edits can't be seen on the page:

This article is semi-protected due to vandalism

Querying the API as follows lets you read the edits.

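// rvlimit=5 requests the five most recent revisions; without it, only the latest one is returned.
const url = "https://en.wikipedia.org/w/api.php?action=query&origin=*&prop=revisions&format=json&titles=Timeline_of_the_2020_United_States_presidential_election&rvslots=*&rvprop=timestamp|user|comment|content&rvlimit=5";
fetch(url)
  .then(response => response.json())
  .then(data => {
    // textContent shows the JSON as plain text instead of parsing it as HTML.
    document.getElementById("main").textContent = JSON.stringify(data, null, 2);
  })
  .catch(error => console.error(error));
<pre id="main" style="white-space: pre-wrap"></pre>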
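
To work with the individual edits rather than dumping the raw JSON, the response from a query like the one above could be flattened along these lines. This is only a sketch: it assumes the default `format=json` layout (no `formatversion=2`), where `pages` is an object keyed by page ID and the wikitext sits under the `"*"` key of each slot. The function name `listEdits` is made up for illustration.

```javascript
// Sketch: flatten a default-format (formatversion=1) response into a list of edits.
// listEdits is an illustrative helper name, not part of any library.
function listEdits(data) {
  const edits = [];
  // In this format, pages is an object keyed by page ID.
  for (const page of Object.values(data.query?.pages ?? {})) {
    for (const rev of page.revisions ?? []) {
      edits.push({
        timestamp: rev.timestamp,
        user: rev.user,
        comment: rev.comment,
        // Wikitext sits under the "*" key of each slot in this format.
        text: rev.slots?.main?.["*"],
      });
    }
  }
  return edits;
}
```

It could be called inside the last `.then(...)` callback, for example `listEdits(data).forEach(edit => console.log(edit.timestamp, edit.user, edit.comment))`.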

See also How to get Wikipedia content as text by API?

NVRM

You can try WikiWho. It tracks every single token written in Wikipedia (with 95% accuracy). In a nutshell, it assigns IDs to every token, and it tracks them based on the context. You just need to check for the existence (or not) of the ID between two revisions (it works even if the revisions are not consecutive).
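
The ID check described above amounts to a set difference between the token IDs of two revisions. Here is a minimal sketch of that idea. The `{ id, text }` token shape and the sample data are hypothetical and are not WikiWho's actual output format.

```javascript
// Sketch: compare two revisions by token ID.
// The { id, text } token shape is hypothetical, not WikiWho's real output format.
function diffTokens(oldTokens, newTokens) {
  const oldIds = new Set(oldTokens.map(t => t.id));
  const newIds = new Set(newTokens.map(t => t.id));
  return {
    // Tokens whose ID exists only in the newer revision were added.
    added: newTokens.filter(t => !oldIds.has(t.id)),
    // Tokens whose ID exists only in the older revision were removed.
    removed: oldTokens.filter(t => !newIds.has(t.id)),
  };
}

// Made-up example data for two revisions of the same sentence.
const before = [{ id: 1, text: "the" }, { id: 2, text: "quick" }, { id: 3, text: "fox" }];
const after = [{ id: 1, text: "the" }, { id: 4, text: "slow" }, { id: 3, text: "fox" }];
const { added, removed } = diffTokens(before, after);
console.log(added.map(t => t.text), removed.map(t => t.text));
```

In this example, "slow" is reported as added and "quick" as removed. Because only IDs are compared, the same check works for revisions that are not consecutive.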

There is a wrapper and a tutorial. The tutorial has a bug because the name of the article changed: instead of "bioglass", you should look for "Bioglass_45S5".

You can (sometimes) access the tutorial online: Binder

toto_tico