I'm a bit confused here. I have the following SPARQL query that works brilliantly against the LinkedMDB explorer.

 PREFIX mdb: <http://data.linkedmdb.org/resource/movie/film>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX dc: <http://purl.org/dc/terms/>

 SELECT ?label ?resource WHERE {
    ?resource mdb:id ?uri .
    ?resource dc:title ?label . 
    FILTER regex(?label,'^Batman')
}

This returns all the Batman movies, like this (I've trimmed the results and am only showing five here):

|----------------------------------------------|
| Label                           | Resource   |
|----------------------------------------------|
| Batman                          | db:film/2  |
| Batman                          | db:film/3  |
| Batman & Robin                  | db:film/4  |
| Batman: Mask of the Phantasm    | db:film/737|
| Batman: Mystery of the Batwoman | db:film/974|
|----------------------------------------------|

But, here comes the question. If I write "Forrest Gump" instead of "Batman", the query can't find any result.

However, if I change the last line to

    ?resource dc:title "Forrest Gump". 

it finds the movie in the LinkedMDB database, so I know it's hiding there somewhere. But it's not returned when I use the FILTER regex solution.

I've noticed that if I search without the filter and just print all the movies in the database, LinkedMDB seems to apply some sort of LIMIT of 2557 results so that the webpage won't crash. And it looks like the FILTER is only applied to those 2557 movies. Is there a way to retrieve more movies?

  • Your problem is reproducible, and doesn't look like it's a problem on your part. I don't see any support mailing list on their website, but you might try emailing the admin contact address, admin@linkedmdb.org. – Joshua Taylor Sep 13 '13 at 14:28

1 Answer

SPARQL 1.1 introduces more string functions, such as contains, strstarts, and strends, which are much more specialized and could be much faster than using a full-blown regular expression. However, it doesn't look like the LinkedMDB explorer supports SPARQL 1.1 yet, so those aren't useful here.
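
For reference, on an endpoint that does support SPARQL 1.1, a prefix search with strstarts might look roughly like the sketch below. It reuses the property names from the question; the `movie:` prefix and the `str()` wrapper (added in case titles are stored as language-tagged literals) are assumptions, and this has not been run against LinkedMDB.

```sparql
# Sketch only: assumes a SPARQL 1.1 endpoint, which the LinkedMDB explorer does not appear to be.
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX dc: <http://purl.org/dc/terms/>

SELECT ?label ?resource WHERE {
    ?resource movie:filmid ?uri .
    ?resource dc:title ?label .
    # strstarts does a plain prefix comparison, so no regular expression is involved
    FILTER strstarts(str(?label), "Forrest")
}
```

Unlike regex, strstarts treats its second argument as a literal string, so characters such as `.` or `(` typed into a search box would not need regex escaping.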

If you know the exact name of the movie, it will be much more efficient to simply ask for it instead of using regular expressions. E.g.,

PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX dc: <http://purl.org/dc/terms/>

SELECT ?resource WHERE {
    ?resource movie:filmid ?uri .
    ?resource dc:title "Forrest Gump" .
}

This returns the film db:film/38179.

Joshua Taylor
  • Thanks for the reply. The problem is that we don't know the exact name. The application we are developing is using a "search-box" where the user can search for a movie. We can't expect that the user knows the full name of the movie. – user2775969 Sep 13 '13 at 13:46
  • Hm, yes, that would complicate things. And I can reproduce the kind of problem you're talking about. It really does seem like something strange is happening on their end. If you use `filter regex(?title,"^For")`, you'll even get a number of movies starting with `For`, but not Forrest Gump. – Joshua Taylor Sep 13 '13 at 14:22
  • Yes. I bet the movies you get are the ones that appear in the list if you just print "all" (2557) movies in the database. I've tried with several movies, and it finds every one that's listed among the 2557, but none of the others. Let's hope someone knows a solution, or maybe it's just an awful bug! :) – user2775969 Sep 13 '13 at 21:53
  • @user2775969 Well, as I said in a comment on the question proper, you should probably send admin@linkedmdb.org an email asking about this (and you can link to this question, too, to show exactly what's happening). It might be something very easy to fix, but only on their end. – Joshua Taylor Sep 13 '13 at 22:32