In the very early days of making jOOQ, I had considered both options, of accessing JDBC ResultSet
values by index or by name. I chose accessing things by index for these reasons:
RDBMS support
Not all JDBC drivers actually support accessing columns by name. I forgot which ones didn't, and if they still don't, because I never touched that part of JDBC's API again in 13 years. But some didn't and that was already a show stopper for me.
Semantics of the name
Furthermore, among those that do support column names, there are different semantics to a column name, mainly two, what JDBC calls:
There is a lot of ambiguity with respect to implementations of the above two, although I think the intent is quite clear:
- The column name is supposed to produce the name of the column irrespective of aliasing, e.g.
TITLE
if the projected expression is BOOK.TITLE AS X
- The column label is supposed to produce the label (or alias) of the column, or the name if no alias is available, e.g.
X
if the projected expression is BOOK.TITLE AS X
So, this ambiguity of what a name/label is is already very confusing and concerning. It doesn't seem something an ORM should rely on in general, although, in Hibernate's case, one can argue that Hibernate is in control of most SQL being generated, at least the SQL that is produced to fetch entities. But if a user writes an HQL or native SQL query, I would be reluctant to rely on the name/label - at least without looking things up in ResultSetMetaData
, first.
Ambiguities
In SQL, it's perfectly fine to have ambiguous column names at the top level, e.g.:
SELECT id, id, not_the_id AS id
FROM book
This is perfectly valid SQL. You can't nest this query as a derived table, where ambiguities aren't allowed, but in top level SELECT
you can. Now, what are you going to do with those duplicate ID
labels at the top level? You can't know for sure which one you'll get when accessing things by name. The first two may be identical, but the third one is very different.
The only way to clearly distinguish between the columns is by index, which is unique: 1
, 2
, 3
.
Performance
I had also tried performance at the time. I don't have the benchmark results anymore, but it's easy to write another benchmark quickly. In the below benchmark, I'm running a simple query on an H2 in-memory instance, and consume the ResultSet
accessing things:
The results are staggering:
Benchmark Mode Cnt Score Error Units
JDBCResultSetBenchmark.indexAccess thrpt 7 1130734.076 ± 9035.404 ops/s
JDBCResultSetBenchmark.nameAccess thrpt 7 600540.553 ± 13217.954 ops/s
Despite the benchmark running an entire query on each invocation, the access by index is almost twice as fast! You can look at H2's code, it's open source. It does this (version 2.1.212):
private int getColumnIndex(String columnLabel) {
checkClosed();
if (columnLabel == null) {
throw DbException.getInvalidValueException("columnLabel", null);
}
if (columnCount >= 3) {
// use a hash table if more than 2 columns
if (columnLabelMap == null) {
HashMap<String, Integer> map = new HashMap<>();
// [ ... ]
columnLabelMap = map;
if (preparedStatement != null) {
preparedStatement.setCachedColumnLabelMap(columnLabelMap);
}
}
Integer index = columnLabelMap.get(StringUtils.toUpperEnglish(columnLabel));
if (index == null) {
throw DbException.get(ErrorCode.COLUMN_NOT_FOUND_1, columnLabel);
}
return index + 1;
}
// [ ... ]
So. there's a hashmap with upper casing, and each lookup also performs upper casing. At least, it caches the map in the prepared statement, so:
- You can reuse it on every row
- You can reuse it on multiple executions of the statement (at least that's how I interpret the code)
So, for very large result sets, it might not matter as much anymore, but for small ones, it definitely does.
Conclusion for ORMs
An ORM like Hibernate or jOOQ is in control of a lot of SQL and the result set. It knows exactly what column is at what position, this work has already been done when generating the SQL query. So, there's absolutely no reason to rely on the column name any further when the result set comes back from the database server. Every value will be at the expected position.
Using column names must have been some historic thing in Hibernate. It's probably also why they used to generate these not so readable column aliases, to make sure that each alias is non-ambiguous.
It seems like an obvious improvement, irrespective of the actual gains in a real world (non-benchmark) query. Even if the improvement had been only 2%, it would have been worth it, because it affects every query execution by every Hibernate based application.
Benchmark code below, for reproduction
package org.jooq.test.benchmarks.local;
import java.io.*;
import java.sql.*;
import java.util.Properties;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.*;
@Fork(value = 1)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 7, time = 3)
public class JDBCResultSetBenchmark {
@State(Scope.Benchmark)
public static class BenchmarkState {
Connection connection;
@Setup(Level.Trial)
public void setup() throws Exception {
try (InputStream is = BenchmarkState.class.getResourceAsStream("/config.properties")) {
Properties p = new Properties();
p.load(is);
connection = DriverManager.getConnection(
p.getProperty("db.url"),
p.getProperty("db.username"),
p.getProperty("db.password")
);
}
}
@TearDown(Level.Trial)
public void teardown() throws Exception {
connection.close();
}
}
@FunctionalInterface
interface ThrowingConsumer<T> {
void accept(T t) throws SQLException;
}
private void run(BenchmarkState state, ThrowingConsumer<ResultSet> c) throws SQLException {
try (Statement s = state.connection.createStatement();
ResultSet rs = s.executeQuery("select c as c1, c as c2, c as c3, c as c4 from system_range(1, 10) as t(c);")) {
c.accept(rs);
}
}
@Benchmark
public void indexAccess(Blackhole blackhole, BenchmarkState state) throws SQLException {
run(state, rs -> {
while (rs.next()) {
blackhole.consume(rs.getInt(1));
blackhole.consume(rs.getInt(2));
blackhole.consume(rs.getInt(3));
blackhole.consume(rs.getInt(4));
}
});
}
@Benchmark
public void nameAccess(Blackhole blackhole, BenchmarkState state) throws SQLException {
run(state, rs -> {
while (rs.next()) {
blackhole.consume(rs.getInt("C1"));
blackhole.consume(rs.getInt("C2"));
blackhole.consume(rs.getInt("C3"));
blackhole.consume(rs.getInt("C4"));
}
});
}
}