I have a list of strings (named entities, names, verbs, etc.), and for each of them I want to find out whether there is a node equivalent to that string in the DBpedia dataset. I am using SPARQL to query DBpedia on my machine. As far as I know, SPARQL is case-sensitive, and DBpedia uses several different naming conventions.

What is the best way to do that? I want to extract all triples that are related to each string in the list.

TallTed
Nilou
  • Are you running Virtuoso, and hosting DBpedia, locally on your machine? Or are you querying the public endpoint? On the latter, you could work with the [FCT interface](http://dbpedia.org/fct), or at least use it to learn about helpful query constructions for use toward your goal. "all triples that are related to each string" may produce some very large result sets, so you'll definitely want to [learn about `OFFSET` and `LIMIT`, and public endpoint usage limits](https://medium.com/virtuoso-blog/dbpedia-usage-report-as-of-2018-01-01-8cae1b81ca71), if you haven't already. – TallTed Apr 09 '19 at 14:27
  • I am running Virtuoso and DBpedia on my machine. By "triples that are related to a string", I meant triples in which the string (exact match) appears as subject or object. Since most of the strings are not very general, I don't think the size of the output will be a problem. – Nilou Apr 09 '19 at 14:45
  • Searching for subjects by string is expensive: you have to scan and cannot use the full-text index, only `regex`, `contains`, or plain string comparison. You should match on the labels of the entities instead and use `bif:contains`. Indeed, there is no support for fuzzy matching here. – UninformedUser Apr 09 '19 at 16:59
  • First, I suggest you install the FCT Browser VAD (`fct_dav.vad`) appropriate to your [Enterprise Edition](http://download3.openlinksw.com/index.html?prefix=uda/vad-packages/) or [Open Source Edition](http://download3.openlinksw.com/index.html?prefix=uda/vad-vos-packages/), and use that to work out at least some preliminary query forms, which you can then adjust for your specific needs. – TallTed Apr 09 '19 at 17:41
  • Second, some pre-processing to determine whether you're matching your string against string literals or IRIs can help you adjust the query to speed the processing, among other things. There's not really enough information here to advise much further. – TallTed Apr 09 '19 at 17:44
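
The comments above suggest matching on entity labels and using Virtuoso's `bif:contains` rather than scanning subjects by string. A minimal sketch of what such queries might look like follows, built as strings in Python. The helper names (`escape_literal`, `exact_label_query`, `fulltext_label_query`), the choice of `rdfs:label`, the `en` language tag, and the `LIMIT` of 100 are all assumptions made for illustration, not something taken from the thread. The case-insensitive `FILTER` is one possible way to address the case-sensitivity concern.

```python
RDFS = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def exact_label_query(text: str, lang: str = "en", limit: int = 100) -> str:
    """Case-sensitive exact label match, then every triple in which the
    matched entity is the subject (?p ?o) or the object (?s ?p)."""
    literal = escape_literal(text)
    return "\n".join([
        RDFS,
        "SELECT DISTINCT ?entity ?s ?p ?o WHERE {",
        f'  ?entity rdfs:label "{literal}"@{lang} .',
        "  { ?entity ?p ?o } UNION { ?s ?p ?entity }",
        "}",
        f"LIMIT {limit}",
    ])


def fulltext_label_query(text: str, limit: int = 100) -> str:
    """Candidate lookup through Virtuoso's full-text predicate bif:contains,
    narrowed to labels equal to the string when case is ignored."""
    # Simplification (assumption): quote characters are dropped from the
    # search phrase instead of being escaped for the full-text syntax.
    phrase = escape_literal(text.replace("'", " ").replace('"', " "))
    lowered = escape_literal(text.lower())
    return "\n".join([
        RDFS,
        "SELECT DISTINCT ?entity ?label WHERE {",
        "  ?entity rdfs:label ?label .",
        f"  ?label bif:contains \"'{phrase}'\" .",
        f'  FILTER (lcase(str(?label)) = "{lowered}")',
        "}",
        f"LIMIT {limit}",
    ])


if __name__ == "__main__":
    for term in ["Berlin", "New York City"]:
        print(exact_label_query(term))
        print(fulltext_label_query(term))
        print()
```

The printed queries could then be pasted into the local Virtuoso SPARQL endpoint or sent with any SPARQL client. Note that `bif:contains` is a Virtuoso extension, so the second query is not portable to other triple stores.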

0 Answers