Good day, stackoverflow,

I need to suggest different contexts for a word to the user, so that they can disambiguate it.

For example, the word "less" can be a Unix program, a CSS framework, or something else. The word "apple" can be a fruit, a corporation, a river, a city in the US (the Big Apple), or a number of other things.

I hope you get the idea.

I searched the internet, and so far I have only come up with this query.

But it's still far from perfect. It often returns too many or too few words, and sometimes nothing at all (for "jquery").

http://www.visualdataweb.org/relfinder/relfinder.php seems to use DBpedia as well, but its results are far better than mine.

How should I change my query to get more relevant results?

Mironor

1 Answer

If you are looking for a Web API, use DBpedia Lookup or DBpedia Spotlight. If you need to do it in SPARQL, you can use the DBpedia Lexicalization Dataset.

For DBpedia Lookup, you give a string and retrieve the DBpedia resources whose labels match that string: lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=apple

For DBpedia Spotlight, you can optionally give more context: spotlight.dbpedia.org/rest/candidates?text=apple+company+macintosh+computer
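
As a rough sketch, the two request URLs above can be assembled programmatically so that the query string is encoded properly. The helper names `lookup_url` and `spotlight_candidates_url` are made up for illustration, the `http://` scheme is an assumption (the hosts are given above without one), and the endpoints are taken as quoted in this answer, so they may have changed since.

```python
from urllib.parse import urlencode

# Endpoints as quoted in the answer; the http:// scheme is an assumption.
LOOKUP_ENDPOINT = "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch"
SPOTLIGHT_ENDPOINT = "http://spotlight.dbpedia.org/rest/candidates"


def lookup_url(keyword):
    """Build a DBpedia Lookup keyword-search URL (hypothetical helper)."""
    return LOOKUP_ENDPOINT + "?" + urlencode({"QueryString": keyword})


def spotlight_candidates_url(word, context=()):
    """Build a Spotlight candidates URL; extra context words are optional."""
    text = " ".join([word, *context])
    # urlencode turns the spaces between words into "+".
    return SPOTLIGHT_ENDPOINT + "?" + urlencode({"text": text})


if __name__ == "__main__":
    print(lookup_url("apple"))
    print(spotlight_candidates_url("apple", ["company", "macintosh", "computer"]))
```

The resulting strings can be passed to any HTTP client; nothing is sent over the network by the helpers themselves.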

For the Lexicalization Dataset, there is no SPARQL endpoint available yet. You will need to download it, load it into your own RDF store, and run a query like this:

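PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Find resources with an alternative label containing "apple" (case-insensitive).
SELECT ?resource ?score WHERE {
  GRAPH ?g {
    ?resource skos:altLabel ?label .
  }
  # The score is attached to the named graph ?g that holds the label triple.
  ?g <http://dbpedia.org/spotlight/score> ?score .
  FILTER (REGEX(?label, "apple", "i"))
}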
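
One thing to keep in mind is that an unanchored `REGEX(?label, "apple", "i")` filter matches any label that contains "apple", not only labels equal to it. The sketch below mimics that behaviour in Python with made-up labels (they are illustrative and not taken from the dataset) and shows how anchoring the pattern as `^apple$` narrows the match.

```python
import re

# Illustrative labels only; not real rows from the Lexicalization Dataset.
labels = ["Apple", "apple", "Pineapple", "Apple Inc.", "Big Apple"]


def matching(pattern, candidates):
    """Return labels matched case-insensitively, like REGEX(?label, pattern, "i")."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [label for label in candidates if regex.search(label)]


substring_hits = matching("apple", labels)   # unanchored, as in the query
exact_hits = matching("^apple$", labels)     # anchored to the whole label

print(substring_hits)
print(exact_hits)
```

Whether substring matching is desirable depends on how broad the list of candidate contexts should be.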
Chris Salij
Pablo Mendes
  • I asked the same question on http://answers.semanticweb.com/questions/16309/query-dbpedia-to-find-possible-contexts-to-disambiguate-a-word . Your http://spotlight.dbpedia.org/rest/candidates?text=adobe seems to give better results for "adobe" (not in JSON, though), but no results for "less". Anyway, thanks for the help, I'll try the "lexicalization" solution soon enough – Mironor May 22 '12 at 12:54
  • In order to get the desired output format, you need to do content negotiation. Add an accept header to your request. Example: `curl -H "Accept: application/json" http://spotlight.dbpedia.org/rest/candidates?text=adobe` – Pablo Mendes Jun 03 '12 at 12:17
  • The phrase "less" does not return results because the current implementation is filtering out surface forms containing only stopwords. This limitation will be removed in the near future. – Pablo Mendes Jun 03 '12 at 12:18
  • @PabloMendes I was trying to download the dbpedia Lexicalization dataset. But it's not working. – Hani Goc Apr 22 '15 at 08:54
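
For reference, the content negotiation described in the comments above could be sketched in Python roughly as follows. The function name `json_candidates_request` is hypothetical, and the request is only prepared, not sent, since the service may no longer be reachable at this address.

```python
from urllib.parse import urlencode
from urllib.request import Request


def json_candidates_request(text):
    """Prepare (but do not send) a Spotlight candidates request asking for JSON."""
    url = "http://spotlight.dbpedia.org/rest/candidates?" + urlencode({"text": text})
    # The Accept header asks the server for JSON, mirroring the curl example.
    return Request(url, headers={"Accept": "application/json"})


req = json_candidates_request("adobe")
print(req.full_url)
print(req.get_header("Accept"))
# Sending it would be urllib.request.urlopen(req), assuming the service is up.
```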