I'm using the Bing Web Search API v7 and sending the following requests (a selection):

/bing/v7.0/search?q=mate%C5%99sk%C3%A1%20%C5%A1kola&count=50&offset=0&responseFilter=Webpages
/bing/v7.0/search?q=mate%C5%99sk%C3%A1%20%C5%A1kola&count=50&offset=50&responseFilter=Webpages
/bing/v7.0/search?q=mate%C5%99sk%C3%A1%20%C5%A1kola&count=50&offset=950&responseFilter=Webpages
/bing/v7.0/search?q=mate%C5%99sk%C3%A1%20%C5%A1kola&count=50&offset=1000&responseFilter=Webpages
/bing/v7.0/search?q=mate%C5%99sk%C3%A1%20%C5%A1kola&count=50&offset=1050&responseFilter=Webpages
  • The first request, with offset=0, returns 50 records; the value of totalEstimatedMatches is above 50000.

  • The second request, with offset=50, returns another 50 records; the value of totalEstimatedMatches is different, but still above 50000.

  • And so on with increasing offsets (not all shown above).

  • However, a request with offset=1000, or with any offset >= 1000, returns records identical to those returned by the request with offset=950.
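
The observations above can be reproduced with a small paging loop. The following is only a minimal sketch: `build_path`, `collect_until_repeat`, and the `fetch_page` callable are hypothetical names that are not part of the Bing API, and `fetch_page(offset, count)` is assumed to perform the HTTP request and return the list of result URLs for that page. The loop stops as soon as a page is identical to the previous one, which is how the cap shows up in practice.

```python
from typing import Callable, List, Optional
from urllib.parse import quote, urlencode


def build_path(query: str, count: int, offset: int) -> str:
    """Build a request path in the same shape as the requests listed above."""
    params = {
        "q": query,
        "count": count,
        "offset": offset,
        "responseFilter": "Webpages",
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return "/bing/v7.0/search?" + urlencode(params, quote_via=quote)


def collect_until_repeat(
    fetch_page: Callable[[int, int], List[str]],
    count: int = 50,
    max_offset: int = 2000,
) -> List[str]:
    """Page through results until a page is empty or repeats the previous page.

    fetch_page is a hypothetical callable supplied by the caller; it is
    assumed to return the result URLs for the given offset and count.
    """
    collected: List[str] = []
    previous: Optional[List[str]] = None
    offset = 0
    while offset <= max_offset:
        page = fetch_page(offset, count)
        if not page or page == previous:
            break  # nothing new: the service has stopped advancing
        collected.extend(page)
        previous = page
        offset += count
    return collected
```

With a `fetch_page` that behaves as described in the last bullet, the loop ends after the page at offset=950 and returns 1000 records.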

This behavior matches the Bing web search site itself: when I click on page 101 with offset 1001 (or any higher page), I actually get page 96 with offset 951.

So I can't find any way to access more than 1000 results, even though there should be more than 50000 of them (I'm aware that totalEstimatedMatches is only an estimate and that the real value can differ).

Does anyone know how to get more than 1000 webpage results (that is, more than 100 pages of 10 records, or more than 20 pages of 50 records)?

1 Answer

Search engines optimize their index and return fewer results than totalEstimatedMatches in order to 1) stop serving repetitive pages and 2) focus on the relevance of the top pages only. The bulk of users (if not 99.x%) alter their query if they don't find what they need on the first 2-3 pages, so it may not be worth it for a search engine to store an index of billions of pages for a given query. Note that this behavior is common to all search engines, not only Bing.

Ronak
  • 1
    You are right that for the usual use case (ordinary user searches) 2-3 pages of results suffice. The use case here was to search for a generic term (to keep it simple, e.g. "school") and, by processing many results, build a register of occurrences (i.e. a "list of schools"). The puzzle is why the API reports tens of thousands to millions of results but provides access to only the first thousand of them. – Petr Stupka Jan 17 '18 at 10:47
  • 1
    For this purpose, you could explore the Bing Entity Search API, which should give you a ready-made list. If that doesn't suit your requirements, you may need to run a set of different queries and mine those, for example "schools", "public schools", "government schools", "private schools", etc. – Ronak Jan 17 '18 at 18:44
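
The "set of different queries" idea from the comment above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested client: `merge_unique` and the `search` callable are hypothetical names, and `search(query)` is assumed to return the result URLs that the API makes reachable for one query (at most about 1000, per the question). Results are merged in order and de-duplicated by URL, since related queries are likely to overlap.

```python
from typing import Callable, Iterable, List


def merge_unique(
    queries: Iterable[str],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Run several related queries and merge their results without duplicates.

    search is a hypothetical callable supplied by the caller; it is assumed
    to return the list of result URLs for a single query.
    """
    seen = set()
    merged: List[str] = []
    for query in queries:
        for url in search(query):
            if url not in seen:  # keep only the first occurrence of each URL
                seen.add(url)
                merged.append(url)
    return merged
```

For example, `merge_unique(["schools", "public schools", "private schools"], search)` would return the combined list, with each URL appearing once in first-seen order.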