
I am trying to write a small program which would read the content of a txt file and send it to the Spotlight web service in order to get semantic annotations (in text/html output format). Unfortunately, only a fraction of the entities are "recognised" compared to the Spotlight demo. For example, for a text like "Ridley Scott directed a number of movies including Alien, Terminator and Blade Runner", my program gets a response where "movies" and "Alien" are not annotated, while in the demo they are. The same happens for larger texts. I was having a similar problem with the OpenCalais web service, but that was because I was trying to encode the input text using the command

     input = URLEncoder.encode(input, "UTF-8");

Once I commented this out, the problem was solved. Unfortunately, that is not the case here.

user1479847

1 Answer


It would help if you posted the results obtained with the demo and with your program, alongside the parameters used in the demo interface, so that we could try to understand what is happening. Without more info I am not sure I can help.

But let's do some guessing. I think it is one of these things:

  1. you did not set the parameters confidence and support, so it could be that the results of your web service call are being filtered out by higher threshold values than the demo interface uses. Try adding the parameters "&confidence=0.0" and "&support=0". This should show everything (even some clearly incorrect annotations). You can set those parameters higher in order to get higher precision (at the cost of lower recall). See this other answer for help with adding POST parameters: How to add parameters to HttpURLConnection using POST
  2. you also did not explicitly set which spotter to use, so it is possible that the demo is using a different spotter from your web service call. More about spotting in DBpedia Spotlight: https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Spotting (the same parameters apply to /annotate)
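To make point 1 concrete, here is a minimal sketch of building the form-encoded POST body with the two extra parameters. The parameter names (`text`, `confidence`, `support`) are from the Spotlight wiki; the endpoint URL in the comment is an assumption, and only the text value should go through `URLEncoder` (encoding the whole query string is what caused the OpenCalais problem):

```java
import java.net.URLEncoder;

// Sketch: assemble the POST body for a DBpedia Spotlight /rest/annotate call.
// Parameter names come from the Spotlight docs; the endpoint URL below is an
// assumption for illustration.
public class SpotlightRequest {
    static String buildBody(String text, double confidence, int support) throws Exception {
        // Encode ONLY the text value, not the full query string.
        return "text=" + URLEncoder.encode(text, "UTF-8")
             + "&confidence=" + confidence
             + "&support=" + support;
    }

    public static void main(String[] args) throws Exception {
        String body = buildBody("Ridley Scott directed Alien and Blade Runner", 0.0, 0);
        System.out.println(body);
        // Then send it roughly like this (see the linked answer for details):
        //   HttpURLConnection conn = (HttpURLConnection) new URL(
        //       "http://spotlight.dbpedia.org/rest/annotate").openConnection();
        //   conn.setRequestMethod("POST");
        //   conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        //   conn.setRequestProperty("Accept", "text/html");
        //   conn.setDoOutput(true);
        //   conn.getOutputStream().write(body.getBytes("UTF-8"));
    }
}
```

With confidence 0.0 and support 0 nothing is filtered, which should match the demo when it is set to annotate all spots.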
Pablo Mendes
  • Thanks! By adding the two parameters I am now getting the same results as the demo interface (if I select to annotate all spots). I am still quite confused about spotting. The documentation says that the LingPipe spotter is the most reliable. In order to use it, do I pass it as another parameter, just like confidence and support? Thanks again. – user1479847 Jun 28 '12 at 10:28
  • I forgot to ask: can you explain what the support parameter actually is (or just point me to the right place)? I cannot find it anywhere in the documentation. – user1479847 Jun 28 '12 at 10:44