
I'm trying to build a query that fetches, by name, entities that are instances of an abstract class such as "human" (Q5) or of any of its subclasses. However, the query fails with a timeout, probably because it has to traverse too many nodes in the graph.

  1. Are there any better methods to query this? The best I could come up with is to call the Wikidata API search entities endpoint with the element name, then filter the results in a SPARQL query, so that the query only has to cover those candidates instead of the whole graph.
  2. I'm a little worried about using this method in a production environment, since the Wikidata SPARQL service is in beta. Are there any best practices for migrating knowledge graph use cases from Freebase? Is there any update on the migration of data from Freebase to Wikidata?
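The workaround from item 1 can be sketched roughly as follows. This is a minimal illustration rather than a tested production client: the helper names (`search_url`, `build_filter_query`, `find_instances_by_name`) and the `USER_AGENT` string are made up for this example, and the property path `wdt:P31/wdt:P279*` is assumed to be the intended meaning of "instance of a class or any of its subclasses".

```python
import json
import re
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical identifier; replace with something that describes your client.
USER_AGENT = "example-entity-lookup/0.1"

QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def search_url(name, language="en", limit=20):
    """Build the wbsearchentities URL that looks up items by label or alias."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def build_filter_query(candidate_ids, class_id="Q5"):
    """Build a SPARQL query that keeps only candidates belonging to class_id.

    The VALUES clause restricts the query to the candidate items, so the
    subclass traversal starts from a handful of nodes, not the whole graph.
    """
    ids = list(candidate_ids)
    if not ids:
        raise ValueError("at least one candidate id is required")
    for qid in ids + [class_id]:
        if not QID_PATTERN.match(qid):
            raise ValueError(f"not a valid item id: {qid!r}")
    values = " ".join("wd:" + qid for qid in ids)
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT DISTINCT ?item WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        f"  ?item wdt:P31/wdt:P279* wd:{class_id} .\n"
        "}"
    )


def fetch_json(url):
    """GET a URL and decode the JSON response (performs a network call)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_instances_by_name(name, class_id="Q5"):
    """Search by name first, then filter the hits by class via SPARQL."""
    hits = fetch_json(search_url(name)).get("search", [])
    candidate_ids = [hit["id"] for hit in hits]
    if not candidate_ids:
        return []
    query = build_filter_query(candidate_ids, class_id)
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    bindings = fetch_json(SPARQL_ENDPOINT + "?" + params)["results"]["bindings"]
    # Each binding holds a full entity URI; keep only the trailing item id.
    return [b["item"]["value"].rsplit("/", 1)[-1] for b in bindings]
```

Calling `find_instances_by_name("Douglas Adams")` would issue the two requests in sequence; only the query-building functions run without network access.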

Finally, are there any other mature alternatives to the deprecated Freebase service?

Shlomi Uziel
  • In a production environment, use your own SPARQL endpoint and load the Wikidata data into it. Anything else doesn't make sense, as you do not have any control over its availability. – UninformedUser Jul 27 '16 at 05:08
  • Isn't there an external production service alternative? Maintaining an in-house Wikidata mirror also doesn't make sense for the extent to which I need the service. – Shlomi Uziel Jul 27 '16 at 08:07
  • What do you mean by "external" production? Indeed, you can use the public SPARQL endpoint, but it's hosted for free and you do not pay anything for what you get. Thus, you cannot make any claims on it. Hosting such a service costs money, and you know that you're not the only one using it. In addition, you can't make it faster, as you have to rely on the hardware that they use for hosting the service. – UninformedUser Jul 27 '16 at 08:58
  • I mean a paid alternative. Would Google Knowledge Graph be better in terms of reliability and the extent of the data? Are there any other suggestions? – Shlomi Uziel Jul 27 '16 at 09:17
  • This is a bit of a plug (and therefore not an answer), but perhaps a cloud service like [Ontotext S4](http://s4.ontotext.com/) would fit your needs. They offer cloud instances of GraphDB as well as access to hosted versions of various open datasets. I'm not sure Wikidata is in there, but if it's not, I'm sure you can ask them about it. – Jeen Broekstra Aug 01 '16 at 20:53
  • The answer at https://stackoverflow.com/a/62126802/4494 applies here, as well, as far as I can tell. – Matthias Winkelmann Aug 07 '21 at 09:59

1 Answer


What endpoint are you querying against? Querying against a shared public endpoint with no SLA (beta or not) for a production service is a very risky proposition.

Wikidata offers full database dumps that you can tailor/subset and load into whatever infrastructure you like. That would give you complete control over performance, quality, and any other metrics which are important to you.
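As a rough illustration of subsetting a dump, here is a minimal sketch that streams entities and keeps only those with a direct "instance of" (P31) statement pointing at a wanted class. It assumes the JSON dump layout of one entity per line inside a top-level array; the function names are invented for this example, and it deliberately does not follow subclass (P279) chains, which would need a separate pass to collect the subclass ids first.

```python
import json


def iter_entities(lines):
    """Yield one decoded entity per dump line, skipping the array brackets."""
    for line in lines:
        text = line.strip().rstrip(",")
        if text in ("", "[", "]"):
            continue
        yield json.loads(text)


def instance_of_ids(entity):
    """Return the set of item ids that the entity is a direct instance of."""
    ids = set()
    for statement in entity.get("claims", {}).get("P31", []):
        # "novalue"/"somevalue" snaks carry no datavalue, so default to {}.
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            ids.add(value["id"])
    return ids


def filter_by_class(lines, class_ids):
    """Yield entities whose direct P31 values intersect class_ids."""
    wanted = set(class_ids)
    for entity in iter_entities(lines):
        if instance_of_ids(entity) & wanted:
            yield entity
```

It could be driven with something like `gzip.open(path, "rt", encoding="utf-8")` over a downloaded dump file (the path being whatever you saved the dump as), writing each yielded entity to whatever store you load from.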

As far as migrating from Freebase goes, there is no migration path. The track that train was on has come to an end (at least for external non-Google users). It's not just deprecated; it was shut down completely a while ago. A tiny fraction of the data was imported into Wikidata (and the two already shared a lot due to their common ancestor, Wikipedia), but none of the programmatic features, such as MQL's JSON query-by-example, Freebase Search, Freebase Suggest, or Google-scale performance and availability, are available (yet?) for Wikidata.

If the data is important to you, you should self-host using whatever infrastructure meets your needs.

Tom Morris
  • I'm using the https://query.wikidata.org/sparql endpoint. Maintaining an in-house Wikidata mirror incurs the overhead of constantly updating the data. Is there a better alternative to Wikidata with broader data? Perhaps Google Knowledge Graph is a more comprehensive and more reliable alternative? Also, do you have any insights on the method I used to query all humans with a given name? – Shlomi Uziel Jul 27 '16 at 08:04