I am trying to build a federated RDF application based on rdf4j and FedX. What I need is to be able to:

  1. Optimize the query plan and join strategies.
  2. Expose different, heterogeneous databases (a time-series or a relational DB, for example) in a federated fashion.

I have read through some of the rdf4j documentation and have a basic grasp of it, which leaves me with a few little questions:

  1. Is there any documentation that explains how to implement the SAIL API? I tried to debug and follow the flow of execution of an example query using an RDF memory store and I got lost.
  2. Suppose I want to expose a relational database in my datacenter. Should I implement a SPARQL repository or an HTTP repository? Should I in any way implement the SAIL API?
  3. Concerning FedX, how can I make it possible to use the SERVICE and VALUES keywords as proposed in SPARQL 1.1 federated queries? How can I change the joining strategies? The query plan?

I know that I could answer this by diving deeply into the code, but I wonder whether someone has already exposed some kind of database using the rdf4j API, or has worked with and tuned RDF4J.

Thanks to you all!

M.Taki_Eddine
  • There’s a lot of acronyms I’m not familiar with and haven’t dealt with in a while. Maybe include some links to background info to help clarify? – vol7ron Mar 23 '20 at 00:08
  • To be frank: these aren't "little" questions. They are very broad design inquiries that have no clear answer unless you provide a lot more clarification on what you intend to do, and you're probably better off seeking support on the [RDF4J support channels](https://rdf4j.org/support/) than here on StackOverflow. That said, I'll try and provide some pointers in an answer. – Jeen Broekstra Mar 23 '20 at 01:02

1 Answer

Is there any documentation that explains how to implement the SAIL API? I tried to debug and follow the flow of execution of an example query using an RDF memory store and I got lost.

There is a basic design draft but it's incomplete. A more comprehensive HowTo has been in the planning for a while but it never quite gets the priority it needs.

That said, I don't think you need to implement your own SAIL for what you have in mind. There are plenty of existing implementations that can do what you need.

Suppose I want to expose a relational database in my datacenter. Should I implement a SPARQL repository or an HTTP repository?

I don't understand the question. HTTPRepository is a client-side proxy for an RDF4J Server. SPARQLRepository is a client-side proxy for a (non-RDF4J) SPARQL endpoint. Neither has anything to do with relational databases.

Should I in any way implement the SAIL API?

It depends on your use case, but I doubt it, at least not right at the outset. I'd probably use an existing R2RML library that is compatible with RDF4J, such as the R2RML API or CARML. Either a live mapping or an offline batch mapping between the relational data and your triplestore may solve your problem.

Concerning FedX, how can I make it possible to use the SERVICE and VALUES keywords as proposed in SPARQL 1.1 federated queries?

You don't need to "make it possible" to do that; FedX supports this out of the box.
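For illustration only, here is a minimal sketch of what a query using both keywords could look like, built as a plain Java string with no RDF4J dependency. The endpoint URL, the `ex:` vocabulary, and the `buildQuery` helper are all made up for the example; they are not part of FedX or RDF4J.

```java
import java.util.List;

public class FederatedQuerySketch {

    // Hypothetical helper: builds a SPARQL 1.1 query that restricts ?sensor
    // with VALUES and delegates one graph pattern to a remote endpoint with
    // SERVICE. The IRIs are inserted as-is, without validation or escaping.
    public static String buildQuery(String serviceEndpoint, List<String> sensorIris) {
        StringBuilder values = new StringBuilder();
        for (String iri : sensorIris) {
            values.append(" <").append(iri).append(">");
        }
        return "PREFIX ex: <http://example.org/ns#>\n"   // assumed vocabulary
             + "SELECT ?sensor ?reading WHERE {\n"
             + "  VALUES ?sensor {" + values + " }\n"
             + "  ?sensor a ex:Sensor .\n"
             + "  SERVICE <" + serviceEndpoint + "> {\n"
             + "    ?sensor ex:reading ?reading .\n"
             + "  }\n"
             + "}";
    }

    public static void main(String[] args) {
        // Assumed endpoint and sensor IRIs, purely for demonstration.
        System.out.println(buildQuery(
                "http://example.org/timeseries/sparql",
                List.of("http://example.org/sensor/1",
                        "http://example.org/sensor/2")));
    }
}
```

The resulting string is ordinary SPARQL 1.1, so it should be usable like any other query, for instance by preparing it as a tuple query on a connection to the federation repository.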

How can I change the joining strategies? The query plan?

You can't (at least not easily), nor should you want to. Quite a lot of research and development went into RDF4J's and FedX's query planning strategies. I'm not saying either is perfect, but you're unlikely to do better.

Jeen Broekstra
  • Thanks for the answer. I could not reply properly because of the limit on comment length. Can you please join me in https://groups.google.com/g/rdf4j-users/c/HsoFLhBPUt0 :) – M.Taki_Eddine Mar 23 '20 at 08:19