I would like to configure something like this:
- RDF dataset of metadata about books;
- Books stored separately as XHTML files, with paragraphs that have unique IDs;
- Every book’s metadata includes something like dc:source, a link to the file (absolute? as a proper URI? what about scaling?).
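To make the layout above concrete, here is a minimal sketch of what one metadata record could look like. The ex: namespace, the title, and the file URI are placeholders chosen for illustration only, not the real data:

```turtle
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/books/> .    # hypothetical namespace

# One book record; dc:source points at the separately stored text file.
ex:book1
    dc:title  "Example Book" ;                          # placeholder title
    dc:source <http://example.org/files/book1.txt> .    # placeholder absolute URI
```

An absolute URI is used here only as one possible answer to the "absolute?" question; whether it scales is exactly what I am unsure about.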
I know this is probably trivial, but I can’t quite get it working. To start with, I am trying to index just a few tiny plain-text (TXT) files, each linked via dc:source in the metadata file. As I understand it, this should be enough to get everything indexed. I am following the approach described in this post here. Unlike its author, I want to index the RDF dataset as well as the external files. In particular, these two commands log no errors (on the contrary, the first one reports that 57 triples were loaded):
java -cp /home/honza/.apache-jena-fuseki-2.3.0/fuseki-server.jar tdb.tdbloader --tdb=run/configuration/service2.ttl testDir/test_dataset.ttl
INFO -- Start triples data phase
INFO ** Load into triples table with existing data
INFO -- Start quads data phase
INFO ** Load empty quads table
INFO Load: testDir/test_dataset.ttl -- 2015/11/13 12:46:22 CET
INFO -- Finish triples data phase
INFO ** Data: 57 triples loaded in 0,29 seconds [Rate: 193,22 per second]
INFO -- Finish quads data phase
INFO -- Start triples index phase
INFO -- Finish triples index phase
INFO -- Finish triples load
INFO ** Completed: 57 triples loaded in 0,33 seconds [Rate: 172,21 per second]
INFO -- Finish quads load
and
java -cp /home/honza/.apache-jena-fuseki-2.3.0/fuseki-server.jar jena.textindexer --desc=run/configuration/service2.ttl
WARN Values stored but langField not set. Returned values will not have language tag or datatype.
After that, the server starts properly and I can see the graph, but it contains no data.
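The WARN line from jena.textindexer mentions langField, which is a setting on the text index's entity map. Judging by its wording, it concerns the language tags of returned values rather than whether any data is present. A minimal sketch of an entity map that sets it, assuming a hypothetical resource name <#entMap> and rdfs:label as the indexed predicate (both are illustrative, not taken from my actual config):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix text: <http://jena.apache.org/text#> .

# Hypothetical entity map; the names and the indexed predicate are assumptions.
<#entMap> a text:EntityMap ;
    text:entityField  "uri" ;     # index field holding the subject URI
    text:defaultField "text" ;    # field searched when none is named
    text:langField    "lang" ;    # field holding the literal's language tag
    text:map (
        [ text:field "text" ; text:predicate rdfs:label ]
    ) .
```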
My config for this service is below. I don’t know whether it is right to keep the service and database config in one file; at the moment it works better for me, because splitting them throws errors:
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix : <#> .
[] rdf:type fuseki:Server
   .
<#service2> rdf:type fuseki:Service ;
    rdfs:label "TDB/text service" ;
    fuseki:name "test" ; # http://host:port/ds
    fuseki:serviceQuery "sparql" ; # SPARQL query service
    fuseki:serviceQuery "query" ; # SPARQL query service (alt name)
    fuseki:serviceUpdate "update" ; # SPARQL update service
    fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)
    # A separate read-only graph store endpoint:
    fuseki:serviceReadGraphStore "get" ; # SPARQL Graph store protocol (read only)
    fuseki:dataset :text_dataset
    .
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
:text_dataset rdf:type text:TextDataset ;
    text:dataset <#test> ;
    text:index <#indexLucene> .