2

I'm trying to use JOOQ for quering Hive. Hive SQL dialect is pretty clode to MySQL dialect. Right now I've met these problems:

  • Hive supports LIMIT N, it doesn't support LIMIT N OFFSET K. Dummy solution - override select.limit(limit);

What are best practices resolving such problems in JOOQ?

Capacytron
  • 3,425
  • 6
  • 47
  • 80

2 Answers2

3

Unfortunately, extending jOOQ to thoroughly support a new SQL dialect isn't very straightforward. jOOQ's API has grown extensive over time, supporting a great set of standard and vendor-specific SQL syntax variants. While the Apache Hive dialect may appear similar to MySQL, there are probably lots of subtle differences that would need to be implemented in jOOQ's internals. The different implementation of the LIMIT .. OFFSET clause just being one issue. That said, it is generally not a good idea to use jOOQ with an "unknown" or "unsupported" dialect.

Solution: In the short run

In the short run, you will probably have to patch jOOQ's rendered SQL. The best technique for this is to use an ExecuteListener as documented here:

Upon receiving a "renderEnd()" event, you will be able to access the rendered SQL and modify it using regular expressions or whatever technique you may prefer.

Solution: In the long run

In the long run, there may be a better solution if / when #2337 is implemented (but we probably won't implement that)

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • Hi, I've spent a lot of time, inspecting the code. Unfortunately, there is no "normal" way to contribute new dialect. I'll look for listeners, I didn't read about them. THank you. – Capacytron Mar 16 '13 at 06:39
  • 1
    http://www.jooq.org/doc/3.0/manual/sql-execution/execute-listeners/ is not valid "final class DefaultConfiguration implements Configuration" is package visible iand there is no way to instantiate it (only creating the same package in my project). What do I do wrong? – Capacytron May 02 '13 at 13:48
  • 1
    Whoops, you're right. That constructor isn't visible. Should be [fixed](https://github.com/jOOQ/jOOQ/issues/2432) – Lukas Eder May 02 '13 at 14:38
  • @Sergey: Yeah, there might be a couple of blunders like this, due to changed API in jOOQ 3.0... Thanks again for reporting – Lukas Eder May 03 '13 at 11:20
1

Here is the dirtiest solution :) JOOQ user group didn't answer unfortunately :(

public class CountRatingQueryBuilder {

    private static final String SCORING_TABLE_NAME = "web_resource_rating";

    private final Connection connection;
    private final ScoringMetadata scoringMetadata;

    private final SelectSelectStep select;
    private final Factory create;

    public CountRatingQueryBuilder(Connection connection, ScoringMetadata scoringMetadata){
        this.connection = connection;
        this.scoringMetadata = scoringMetadata;

        create = new Factory(this.connection, SQLDialect.MYSQL);
        select = create.select();

        withSelectFieldsClause();
    }

    public CountRatingQueryBuilder withLimit(int limit){
        select.limit(limit);
        return this;
    }

    public CountRatingQueryBuilder withRegionId(Integer regionId){
        select.where(REGION_ID.field().equal(regionId));
        return this;
    }

    public CountRatingQueryBuilder withResourceTypeId(int resourceTypeId){
        select.where(RESOURCE_TYPE_ID.field().equal(resourceTypeId));
        return this;
    }

    public CountRatingQueryBuilder withRequestTimeBetween(long beginTimestamp, long endTimestamp){
        select.where(REQUEST_TIME.field().between(beginTimestamp, endTimestamp));
        return this;
    }

    public CountRatingQueryBuilder withResourceId(int resourceId){
        select.where(RESOURCE_ID.field().equal(resourceId));
        return this;
    }



    protected void withGroupByClause(){
        select.groupBy(REGION_ID.field());
        select.groupBy(RESOURCE_TYPE_ID.field());
        select.groupBy(RESOURCE_ID.field());
        select.groupBy(CONTENT_ID.field());
    }

    protected void withSelectFieldsClause(){
        select.select(REGION_ID.field());
        select.select(RESOURCE_TYPE_ID.field());
        select.select(CONTENT_ID.field());
        select.select(RESOURCE_ID.field());
        select.select(Factory.count(HIT_COUNT.field()).as(SUM_HIT_COUNT.fieldName()));
    }

    protected void withFromClause(){
        select.from(SCORING_TABLE_NAME);
    }

    protected void withOrderByClause(){
        select.orderBy(SUM_HIT_COUNT.field().desc());
    }

    public String build(){
        withGroupByClause();
        withOrderByClause();
        withFromClause();
        return select.getSQL().replace("offset ?","");//dirty hack for MySQL dialect. TODO: we can try to implement our own SQL dialect for Hive :)

    }

    public List<ResultRow> buildAndFetch(){
        String sqlWithPlaceholders = build();

        List<ResultRow> scoringResults = new ArrayList<ResultRow>(100);
        List<Record> recordResults = create.fetch(sqlWithPlaceholders, ArrayUtils.subarray(select.getBindValues().toArray(new Object[select.getBindValues().size()]),0, select.getBindValues().size()-1));//select.fetch();
        for(Record record : recordResults){
            ResultRowBuilder resultRowBuilder = ResultRowBuilder.create();

            resultRowBuilder.withContentType(scoringMetadata.getResourceType(record.getValue(RESOURCE_TYPE_ID.fieldName(), Integer.class)));
            resultRowBuilder.withHitCount(record.getValue(SUM_HIT_COUNT.fieldName(), Long.class));
            resultRowBuilder.withUrl(record.getValue(CONTENT_ID.fieldName(), String.class));
            scoringResults.add(resultRowBuilder.build());
        }
        return scoringResults;
    }

}
Capacytron
  • 3,425
  • 6
  • 47
  • 80