3

The db schema contains of tables AUTHOR and BOOK with columns given below:

AUTHOR: author_id, name
BOOK: book_id, name, author_id

I use JOOQ to fetch the data to java.

I use POJOs Book and Author:

public class Author { int author_id; String name; }
public class Book { int book_id; String name; }

And code:

Result<Record> records = ctx.select()
            .from(AUTHOR)
            .leftOuterJoin(BOOK)
            .on(AUTHOR.AUTHOR_ID.eq(BOOK.AUTHOR_ID))
            .fetch();
Author author = records.get(0).into(Author.class);
Book book = records.get(0).into(Book.class);

The problem is that bookobject is filled with default values when there is no book for an author. How to detect that the left join resulted with no objects of the right table? One way to do it would be checking if book primary key field is null. Is there any other way?

Marcin Król
  • 1,555
  • 2
  • 16
  • 31
  • There are always many ways to do the same thing. What's bothering you with this solution? – Lukas Eder Oct 08 '15 at 13:19
  • It didn't seem elegant to me at first but JOOQ is about avoiding too distant abstraction from SQL so maybe that's how you should do that. – Marcin Król Oct 08 '15 at 13:35
  • 1
    Well, the SQL way to create nested collections [is by using `MULTISET`](http://stackoverflow.com/questions/27038707/oracle-cast-and-multiset-avaliable-in-postgres). Unfortunately, few databases really support this... `LEFT OUTER JOIN` denormalises your relationships as soon as you want to "join" more than one child relation, e.g. books *and* addresses to an author. I'll write up a more elaborate answer soon – Lukas Eder Oct 08 '15 at 21:10

1 Answers1

2

There's a conceptual mismatch between what the object model would like to express, and what SQL is able to produce. In the object model, people like to model relationships as nested collections, such as:

Author {
    List<Book> books;
}

Whereas in an RDBMS, people model relationships the other way round, having foreign keys on child entities:

CREATE TABLE book (
    author_id bigint REFERENCES author(id)
)

The most straightforward way to produce a nested collection (as in the object model) from SQL would be to use the MULTISET operator, as specified in the SQL standard:

SELECT
  author.*,
  MULTISET(
    SELECT * FROM book WHERE book.author_id = author.id
  ) AS books
FROM author;

Unfortunately, this operator is only supported in few databases, including Cubrid, Informix, Oracle, and it can be emulated in PostgreSQL. Note that future versions of jOOQ will also support, and possibly emulate it.

Lacking support for this feature, people often resort to using tricks, such as:

  • Lazy loading
  • Running a new query per entity
  • Running a single query with LEFT JOIN, the way you did

All of these approaches are tricks employed by ORMs like Hibernate, each with their individual drawback, which is documented in this blog post. ORMs don't always make the "right" choice of trick (or rather: users don't configure ORMs correctly), which is why they run a lot of queries causing performance issues.

One of the best approaches among the above is probably the "new query per entity" approach, where you first fetch all the authors, and then all the books for each author in 2 queries (not in N+1 queries!):

SELECT * FROM author WHERE [ some predicate ]
SELECT * FROM book WHERE author_id IN (SELECT id FROM author WHERE [ some predicate ])

You can then materialise each result with jOOQ, and use hashmaps to match authors and books in Java memory.

Community
  • 1
  • 1
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • That last query pair is what I do in LLBLGen Pro: I fetch the child nodes in the eager load graph using a filter on the parent nodes, and have an optimization where I switch to an IN(v1, v2, .. vn) instead of a correlated subquery like you have above. I then merge the sets in-memory using hashes. – Frans Bouma Oct 09 '15 at 09:18
  • That `IN` predicate is not always an optimisation :) You'll generate tons of cursor cache contention in Oracle that way. The interesting question is: How do you write the fetch from the client side? – Lukas Eder Oct 09 '15 at 09:24