In my app I'm parsing JSON from the Wikipedia API using Volley requests without any problems, except for the following one: I need to parse math expressions along with the text.

I'm using this URL (for example):

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=&titles=%20Partition%20function%20(statistical%20mechanics)

This is a problematic part of the article: [image]

The parsing works just fine, but when it comes to a math expression, it looks like this in the API:

[image]

In my app, I get the "{displaystyle" part along with the text. I don't get the "\n" or anything like it. In my app, it looks like this:

[image]

[image]

I get a lot of spaces and this "{displaystyle" fragment, and the text comes out garbled. Is there something I can do to overcome this? I couldn't find an API query for this, but maybe I can do something with the JSON response itself.

Thanks.

Tal Barda
  • I guess you are out of luck here. It seems that what you get via the interface has just some html tags removed from the original text. It is hard to make sense out of what remains. – Henry Jul 09 '17 at 08:20

1 Answer

formatversion=2

API:Data formats#JSON parameters

Specify formatversion=2 to get json (and php) format responses in a cleaner format. This also encodes most non-ASCII characters as UTF-8. MW 1.25+

So: https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&titles=Partition%20function%20(statistical%20mechanics)&formatversion=2 returns JSON containing <math> markup like:

<annotation encoding=\"application/x-tex\">{\\displaystyle \\beta }</annotation>

which might be more useful.
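As a rough illustration of why that markup is easier to work with, here is a minimal Java sketch that pulls the TeX source out of the `<annotation encoding="application/x-tex">` elements. It assumes the extract string has already been read from the parsed JSON (so `\"` and `\\` are unescaped), and that the whole expression is wrapped in a single `{\displaystyle ...}` group as in the sample above. The class and method names (`TexAnnotations`, `extractTex`, `unwrapDisplayStyle`) are made up for this example, and a regex is only a shortcut here; a real HTML/XML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Wikipedia or Android API.
final class TexAnnotations {
    // Matches <annotation encoding="application/x-tex">...</annotation>,
    // capturing the TeX source non-greedily (DOTALL so it may span lines).
    private static final Pattern ANNOTATION = Pattern.compile(
            "<annotation\\s+encoding=\"application/x-tex\"\\s*>(.*?)</annotation>",
            Pattern.DOTALL);

    // Returns the TeX source of every annotation found in the extract HTML.
    static List<String> extractTex(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = ANNOTATION.matcher(html);
        while (m.find()) {
            result.add(unwrapDisplayStyle(m.group(1).trim()));
        }
        return result;
    }

    // Strips an outer "{\displaystyle ... }" wrapper if present.
    // Assumption: the wrapper encloses the whole expression.
    static String unwrapDisplayStyle(String tex) {
        String prefix = "{\\displaystyle";
        if (tex.startsWith(prefix) && tex.endsWith("}")) {
            return tex.substring(prefix.length(), tex.length() - 1).trim();
        }
        return tex;
    }
}
```

The resulting TeX strings could then be handed to whatever math renderer the app uses. Note that the TeX inside the annotation may still contain HTML entities such as `&lt;`, which this sketch does not decode.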

More information about formatversion=2 can be found at API:JSON version 2

format=json suffers from a number of shortcomings that make it more difficult to use than necessary. Many of these arise because XML was the original output format and the underlying data structure of API responses was designed around this.

To address this, after discussion MediaWiki 1.25 introduces a new JSON response format. It is not the default, you only get results in the new format if you specify formatversion=2, and it's only for the json and php formats (and their human-readable jsonfm and phpfm variants).
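Since the new format is opt-in, the parameter has to be part of every request. A minimal sketch of assembling such a URL in Java, using only the query parameters shown in this answer; the class name `WikiExtractUrl` and the method `build` are invented for the example:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building the extracts query used in this answer.
final class WikiExtractUrl {
    // Builds the request URL for the given article title,
    // opting in to the newer JSON layout with formatversion=2.
    static String build(String title) {
        try {
            return "https://en.wikipedia.org/w/api.php"
                    + "?format=json"
                    + "&formatversion=2"
                    + "&action=query"
                    + "&prop=extracts"
                    // URLEncoder writes spaces as '+', which is valid in a query string.
                    + "&titles=" + URLEncoder.encode(title, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

The returned string can be passed to a Volley request in the same way as the original URL.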

Fred Gandt
  • Can you suggest a good way to parse this "MathML" code in Java? I have found some libraries for parsing math expressions, but I am not sure they will suit this case. – Tal Barda Jul 09 '17 at 14:58
  • @TalBarda - I've never played with Java and am unfamiliar with it, but [this question about parsing MathML with Java under Android](https://stackoverflow.com/questions/1784786/mathml-and-java) has an informative top answer and several under it that might prove useful. – Fred Gandt Jul 09 '17 at 17:33