
I'm trying to make a batch call to Wikipedia's query API to retrieve the pageImages (main image) and extract. For some reason my API call only returns an image for the first result, not the second:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=pageimages|extracts&exintro=&explaintext=&titles=Stack%20Overflow%7CMicrosoft&exlimit=2&pithumbsize=500&pilimit=2

This was happening with the extracts too, but adding the exlimit parameter solved it. As far as I understand, the equivalent parameter for pageimages is pilimit, but unfortunately adding it doesn't fix the problem.

How can I change the API call above to return a batch collection of Wikipedia results, with each result having an extract and pageImages?

Termininja
Dol
  • The StackOverflow article does not have a page image, that's all. – Tgr Feb 28 '17 at 09:33
  • Which was what I found confusing, because it seems to have a 'main image' on its Wikipedia page, but it's just that it's not labelled as the `pageImage`. – Dol Feb 28 '17 at 10:09

2 Answers


First, there is probably a bug with pageimages, so you can use images instead. Second, the images for the second title are returned only after all images for the first title have been returned, and so on for any further titles, so you need to use imlimit=max:

https://en.wikipedia.org/w/api.php?action=query&prop=images|extracts&exintro=&explaintext=&titles=Stack%20Overflow|Microsoft&exlimit=2&imlimit=max
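If the image list is still cut off even with imlimit=max, the response carries a `continue` object whose values have to be sent back with the next request. Here is a minimal Python sketch of that loop, assuming the default JSON layout where `query.pages` is keyed by page id. The names `build_params`, `collect_pages`, `http_fetch` and the User-Agent string are made up for illustration and are not part of the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(titles):
    """Base parameters for the images + extracts query shown above."""
    return {
        "action": "query",
        "format": "json",
        "prop": "images|extracts",
        "exintro": "",
        "explaintext": "",
        "exlimit": str(len(titles)),
        "imlimit": "max",
        "titles": "|".join(titles),
    }


def http_fetch(params):
    """Hypothetical helper: perform the GET request and parse the JSON body."""
    request = Request(
        API_URL + "?" + urlencode(params),
        headers={"User-Agent": "example-script/0.1"},  # placeholder value
    )
    with urlopen(request) as response:
        return json.load(response)


def collect_pages(titles, fetch=http_fetch):
    """Repeat the query until no 'continue' block is left, merging the results."""
    params = build_params(titles)
    pages = {}
    while True:
        data = fetch(params)
        for page_id, page in data.get("query", {}).get("pages", {}).items():
            merged = pages.setdefault(
                page_id, {"title": page.get("title"), "images": []}
            )
            # Each batch may hold only part of a page's image list.
            merged["images"].extend(img["title"] for img in page.get("images", []))
            if "extract" in page:
                merged["extract"] = page["extract"]
        if "continue" not in data:
            return pages
        # Send the continuation values back together with the original parameters.
        params = {**build_params(titles), **data["continue"]}
```

Calling `collect_pages(["Stack Overflow", "Microsoft"])` would hit the live API. Passing a different `fetch` callable makes it possible to try the merging logic without network access.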
Termininja
  • I'm glad you agree that there could be a bug! I thought there was something wrong with my implementation. Do you have any suggestions of how to find the main image from the images? Thank you! – Dol Feb 27 '17 at 16:35
  • Hm, for the main image it is better to use `pageimages`, because `images` sorts them, but you have to decide whether you need all images or only the main one. Did you check http://stackoverflow.com/questions/35663229/how-to-get-a-short-snippet-of-text-and-the-main-image-of-wikipedia-articles-by-a – Termininja Feb 27 '17 at 18:16
  • WOW, imlimit=max is exactly what I needed. Most answers in other threads don't highlight this enough, and I have overlooked it countless times. I was getting something like 50 'continue' requests and almost gave up. This fixed it! – JavaBeast Aug 23 '18 at 20:48

Add pilicense=any if you want to get non-free images as well (example, sandbox). Due to various limitations on how they can be used, these tend to be not so great (e.g. lower resolution).
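As a rough illustration, the query from the question could be assembled with pilicense=any and the thumbnails read back as in the Python sketch below. It assumes the default JSON layout, where each entry under `query.pages` may carry a `thumbnail` object with a `source` URL. The function names `pageimages_url` and `thumbnails_by_title` are invented for this example.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def pageimages_url(titles, thumb_size=500, include_nonfree=True):
    """Build the pageimages + extracts query URL for a batch of titles."""
    params = {
        "format": "json",
        "action": "query",
        "prop": "pageimages|extracts",
        "exintro": "",
        "explaintext": "",
        "titles": "|".join(titles),
        "exlimit": len(titles),
        "pithumbsize": thumb_size,
        "pilimit": len(titles),
    }
    if include_nonfree:
        # Without this parameter only freely licensed page images are returned.
        params["pilicense"] = "any"
    return API_URL + "?" + urlencode(params)


def thumbnails_by_title(data):
    """Map each page title to its thumbnail URL, or None if it has no page image."""
    result = {}
    for page in data.get("query", {}).get("pages", {}).values():
        thumb = page.get("thumbnail")
        result[page.get("title")] = thumb["source"] if thumb else None
    return result
```

A title that maps to `None` simply has no page image under the chosen licence setting, which is what happened to the Stack Overflow article in the original query.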

Tgr
  • Thanks! It worked, it returns the exact image I was expecting and previously lacking. – Dol Mar 01 '17 at 15:48