Calculating the distance between two resources of DBPedia in JENA

Question

I'm trying to calculate the distance (the number of edges in the DBpedia graph) between two resources that I receive after querying DBpedia. For example:

     Query query = QueryFactory.create(sparqlQueryString);
     QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);

     try {
         ResultSet results = qexec.execSelect();
         while (results.hasNext()) {
             QuerySolution soln = results.nextSolution();
             // solutionConcept is an array of variable names to print (defined elsewhere)
             for (int i = 0; i < solutionConcept.length; i++) {
                 System.out.print(solutionConcept[i] + ":" + soln.get(solutionConcept[i]).toString() + ";  ");
             }
             System.out.println("\n");
         }
     } finally {
         // always release the remote query execution
         qexec.close();
     }

and this is the query string:

     String s6 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
                 "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
                 "PREFIX dbpedia: <http://dbpedia.org/resource/> " +
                 "PREFIX o: <http://dbpedia.org/ontology/> " +
                 "PREFIX dbprop: <http://dbpedia.org/property/> " +
                 "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
                 "select ?Player ?nation ?club " +
                 "where {?Player rdf:type o:SoccerPlayer; dbprop:birthPlace ?nation; dbprop:currentclub ?club.} LIMIT 10";

Is there a way to calculate the number of edges (the path length) between two players, two nations, or two clubs? Thank you very much.

Do you mean a path of certain properties, of any properties, or something else? If you're looking for a path all of whose edges are drawn from a set of properties, you may be able to do this, but if you're just looking for arbitrary paths, I don't think you'll be able to. Can you give a (manually constructed) example of the kind of result that you're looking for, based on the data that you have (or can browse on DBpedia)? — Joshua Taylor, Oct 02 '13 at 12:09
Essentially I would like to calculate the similarity between two resources. For example, suppose that this is my result set (PLAYER / NATION / CLUB): Arturo Vidal / Chile / Juventus and Claudio Pizarro / Chile / Fiorentina. I want to calculate the similarity between Vidal and Pizarro, between Chile and Chile, and between Juventus and Fiorentina, and then calculate the total similarity. — user2837896, Oct 02 '13 at 12:47
I gathered that that is your intent. However, you need to specify _how_ you want to compute this before anyone can tell you whether you _can_ compute this. In OWL, everything is an instance of `owl:Thing`, so there's always a path of length two between any two resources: `rdf:type/^rdf:type`. But there will be other paths too. And some relationships indicate dissimilarity rather than similarity. Will you attempt to exclude those relations? To answer this question, more information is needed about _what_ you're trying to compute. — Joshua Taylor, Oct 02 '13 at 12:53
Thank you for your attention. I want to compute a similarity measure between the tuples of my result set because I want to cluster the results. I will then use this measure in the K-means algorithm. — user2837896, Oct 02 '13 at 13:03
After thinking about it a while longer, this occurred to me: If you're looking to compute the similarity and distance just based on the data that you're retrieving now, you could, using just one SPARQL query, get, for each pair of players, the number of birthplaces and clubs that they have in common, and combine those in some fashion to provide a distance measure. You could then use that for clustering. Would that be along the lines of what you're looking for? — Joshua Taylor, Oct 02 '13 at 13:33
Part of the reason I've been paying attention to this is because the title, as it's currently written, makes it sound like this question is a possible duplicate of [Calculate length of path between nodes?](http://stackoverflow.com/q/5198889/1281433), but I think you're actually trying to do something a bit different. — Joshua Taylor, Oct 02 '13 at 13:44
The reason for clustering is that I want to "extend" the GROUP BY operator. That operator performs a syntactic grouping. If I exploit a knowledge base instead, I will be able to group the results semantically. For example, if my result set is composed of PLAYER/CLUB/NATION, I will be able to group the results by "Continent". — user2837896, Oct 02 '13 at 13:54
I would need to extract, from an instance of the result set, for example from one player, all properties and all objects. Is that possible? — user2837896, Oct 02 '13 at 15:07
The only information you have in the result set is that which you queried for. You don't have any triples, just a table of variable bindings. If you wanted to compute some metric of similarity based on _that_ data, it could be done with SPARQL in the query. If you need to refer to additional data from DBpedia, then you'll need to get more data from DBpedia; it's not in the result set. RDF is a graph format, but just because there's a relation between two resources doesn't mean they're semantically similar. Looking at the graph structure without limiting it in some way probably won't help. — Joshua Taylor, Oct 02 '13 at 15:41
Thanks, maybe I found a solution, but I would like to know how I could extract other information about a resource after a query. I downloaded these four files from DBpedia: dbpedia.org/Ontology. Then I tried to load these files locally following this: stackoverflow.com/questions/16832862/… . But this approach is very heavy. Am I doing something wrong? — user2837896, Oct 02 '13 at 16:10
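
The idea raised in the comments above, counting for each pair of players how many birthplaces and clubs they share in a single SPARQL query, might look roughly like the sketch below. It only builds the query text, so it runs without Jena. The class name `SharedValuesQuery`, the method `build`, the result variables `?sharedBirthPlaces` and `?sharedClubs`, and the choice to restrict the pairs with `VALUES` are all assumptions made for illustration; the two properties come from the query in the question.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Jena): builds the pairwise-overlap query as plain text.
class SharedValuesQuery {

    /** Builds a SPARQL 1.1 query counting shared birthplaces and clubs for each pair of the given players. */
    static String build(List<String> playerIris) {
        // Render the IRIs as the body of a VALUES clause: <iri1> <iri2> ...
        String values = playerIris.stream()
                .map(iri -> "<" + iri + ">")
                .collect(Collectors.joining(" "));

        return String.join("\n",
            "PREFIX dbprop: <http://dbpedia.org/property/>",
            "SELECT ?p1 ?p2",
            "       (COUNT(DISTINCT ?nation) AS ?sharedBirthPlaces)",
            "       (COUNT(DISTINCT ?club) AS ?sharedClubs)",
            "WHERE {",
            "  VALUES ?p1 { " + values + " }",
            "  VALUES ?p2 { " + values + " }",
            // Keep each unordered pair once and drop self-pairs.
            "  FILTER (STR(?p1) < STR(?p2))",
            // OPTIONAL keeps pairs that share nothing; their counts are 0.
            "  OPTIONAL { ?p1 dbprop:birthPlace ?nation . ?p2 dbprop:birthPlace ?nation }",
            "  OPTIONAL { ?p1 dbprop:currentclub ?club . ?p2 dbprop:currentclub ?club }",
            "}",
            "GROUP BY ?p1 ?p2");
    }
}
```

The returned string could be passed to `QueryFactory.create(...)` and run against the endpoint in the same way as the query in the question. How the two counts are combined into one distance is left open, as it is in the comment.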

score 1 · Accepted Answer · answered Oct 02 '13 at 09:11

You won't be able to calculate that distance from the QuerySolution objects returned by your current query.

If you want to compute such a semantic distance, you have to do it either through another SPARQL query, or by updating your query to retrieve the needed information and then computing the distance locally.
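
As a rough illustration of the "compute it locally" option, the sketch below measures how many columns differ between two rows that have already been copied out of the result set. The answer does not prescribe any particular metric; the mismatch fraction, the class name `TupleDistance`, and the representation of a row as a list of strings are assumptions made here.

```java
import java.util.List;

// Hypothetical helper: a simple mismatch (Hamming-style) distance over result rows.
class TupleDistance {

    /** Fraction of columns in which two rows differ: 0.0 means identical, 1.0 means nothing in common. */
    static double distance(List<String> rowA, List<String> rowB) {
        if (rowA.isEmpty() || rowA.size() != rowB.size()) {
            throw new IllegalArgumentException("rows must be non-empty and have the same number of columns");
        }
        int mismatches = 0;
        for (int i = 0; i < rowA.size(); i++) {
            // Columns are compared by exact string equality of the bound values.
            if (!rowA.get(i).equals(rowB.get(i))) {
                mismatches++;
            }
        }
        return (double) mismatches / rowA.size();
    }
}
```

For the two rows mentioned in the comments, (Arturo Vidal, Chile, Juventus) and (Claudio Pizarro, Chile, Fiorentina), only the nation matches, so `TupleDistance.distance` would return 2/3. Exact equality treats Juventus and Fiorentina as completely different; capturing that both are Italian clubs would need the extra data from DBpedia that the answer refers to.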

answered Oct 02 '13 at 09:11

YMomb

I downloaded these four files from DBpedia: http://dbpedia.org/Ontology. Then I tried to load these files locally following this: http://stackoverflow.com/questions/16832862/load-dbpedia-locally-using-jena-tdb . But this approach is very heavy; the PC is about to explode. Is it a correct way? Could you help me? – user2837896 Oct 02 '13 at 09:58
