4

This is not the rare question on the net, but I does a few optimization work with MySQL server for solve this problem and did not get results. So at first I use maven's package mysql:mysql-connector-java:6.0.6. I try just to run this code:

try {
            mysqlConnection = DriverManager.getConnection(DatabaseUtils.mysqlUrl, DatabaseUtils.mysqlUser, DatabaseUtils.mysqlPassword);
            PreparedStatement valuesStatement = "SELECT * FROM `test` ORDER BY `id`"
            ResultSet cursor = valuesStatement.executeQuery();
            double value = 0;
            if (cursor.next())
                value = cursor.getDouble("value");
        } catch (SQLException sqlEx) {
            sqlEx.printStackTrace();
        } finally {
            cursor.close();
            pricesStatement.close();
        }

I have a lot records in the table. It's about million but every day add about thousand records. So I was very surprised when this simple example executed 30 seconds. I googled my problem and I find only "using pool", "tune mysql server", "try to EXPLAIN SELECT". But I've noticed that execution time related with rows count. So I looked into driver's code and found that:

TextResultsetReader::read():

        while(true) {
            if(row == null) {
                rows = new ResultsetRowsStatic(rowList, cdef);
                break;
            }

            if(maxRows == -1 || rowList.size() < maxRows) {
                rowList.add(row);
            }

            row = (ResultsetRow)this.protocol.read(ResultsetRow.class, trf);
        }

This means that even if I want to fetch only one row driver fetches all queried rows and get me first of it. Manuals suggest to use "setFetchSize" for fetching only n records. But it doesn't work. Driver code fetching all data anyway. So then I found that there is two recordsets: ResultRowsStatic and ResultSetStreaming. Second seems to be fetching data only when I need query it. How to use ResultRowsStreaming? I found it only into code. Parameter "fetchSize" must to equal -2147483648. I did try and it worked! Execution time of "executeQuery()" now if about 0.0007 sec. It's very fast for me. But wait.. My script anyway takes 30 seconds. Why? I debugged code after executing query. There's only two "close" methods after that. What's can go wrong? And that's true, "cursor.close()" takes the rest of time. I looked into library code again and reached ResultsetRowsStreaming::close():

boolean hadMore = false;
int howMuchMore = 0;
synchronized(mutex) {
    while(this.next() != null) {
        hadMore = true;
        ++howMuchMore;
        if(howMuchMore % 100 == 0) {
            Thread.yield();
        }
    }

    if(conn != null) {
        if(!((Boolean)this.protocol.getPropertySet().getBooleanReadableProperty("clobberStreamingResults").getValue()).booleanValue() && ((Integer)this.protocol.getPropertySet().getIntegerReadableProperty("netTimeoutForStreamingResults").getValue()).intValue() > 0) {
            int oldValue = this.protocol.getServerSession().getServerVariable("net_write_timeout", 60);
            this.protocol.clearInputStream();

            try {
                this.protocol.sqlQueryDirect((StatementImpl)null, "SET net_write_timeout=" + oldValue, (String)this.protocol.getPropertySet().getStringReadableProperty("characterEncoding").getValue(), (PacketPayload)null, -1, false, (String)null, (ColumnDefinition)null, (GetProfilerEventHandlerInstanceFunction)null, this.resultSetFactory);
            } catch (Exception var9) {
                throw ExceptionFactory.createException(var9.getMessage(), var9, this.exceptionInterceptor);
            }
        }

        if(((Boolean)this.protocol.getPropertySet().getBooleanReadableProperty("useUsageAdvisor").getValue()).booleanValue() && hadMore) {
            ProfilerEventHandler eventSink = ProfilerEventHandlerFactory.getInstance(conn.getSession());
            eventSink.consumeEvent(new ProfilerEventImpl(0, "", this.owner.getCurrentCatalog(), this.owner.getConnectionId(), this.owner.getOwningStatementId(), -1, System.currentTimeMillis(), 0L, Constants.MILLIS_I18N, (String)null, (String)null, Messages.getString("RowDataDynamic.2") + howMuchMore + Messages.getString("RowDataDynamic.3") + Messages.getString("RowDataDynamic.4") + Messages.getString("RowDataDynamic.5") + Messages.getString("RowDataDynamic.6") + this.owner.getPointOfOrigin()));
        }
    }
}

This code unconditionally fetching all the rest of data only for logging how many records I did not fetched. Really weird. And it would be justified if logger was attached. But in my case this code counting unfetched rows and in 30 seconds and... do nothing with it. And this proble I cannot fix because there's not parameter which can tell code not to count rows.

Now I don't know what to do next. Query time is very slow for me. For example mysql driver for php execute this query in 0.0004-0.001 seconds.

So people who using mysql-connector for Java, tell me please have you got these problems? If not, could you post any examples what should I do to bypass the above problems? Maybe you use another connectors. So tell me please, what to do?

N00b Pr0grammer
  • 4,503
  • 5
  • 32
  • 46
abr_stackoverflow
  • 691
  • 2
  • 10
  • 26
  • If I'm understanding right - you get the results you want but the statement continues to execute the select before it closes out? Have you tried `valuesStatement.cancel()`? [See here](http://stackoverflow.com/questions/295920/how-can-i-abort-a-running-jdbc-transaction) – Michael May 19 '17 at 12:23
  • I tried to add cancel() before close() but it doesn't help too. All the rest data is fetching by driver. I'm using jar, so I can't to edit driver's code. – abr_stackoverflow May 19 '17 at 12:26
  • If you only want one row, tell the server to only send one row, using [`LIMIT 1`](https://dev.mysql.com/doc/refman/5.7/en/select.html#idm140545627869536). – Andreas May 19 '17 at 12:29
  • In my case I don't know what limit I should set. My purpose is more complicated than this example. I SELECT all data, make a few arrays by non linear algorithm and when I got amount of arrays I need I just stop to fetching data. So I can't tell how many rows I need: one or ten thousands. – abr_stackoverflow May 19 '17 at 12:32
  • 1
    Respecting your comment about not knowing how many rows you need: Please [edit] your question to tell us something about your logic for deciding which rows you need. You'll be using `WHERE`, `ORDER BY`, and `LIMIT` clauses in SQL to implement that logic when you have solved this problem. Fetching all rows in a large table is generally an incorrect solution to this kind of problem. It's the kind of mistake that gets your work mentioned in http://thedailywtf.com/ . – O. Jones May 19 '17 at 13:01
  • @O.Jones, you get me good idea and probably I will refactor my code according to your advice. I can't to say my purpose, but thank's for helping. Anyway, I think it's very weird to leave unusable code in the public library, which take many resources (in this case time) just for nothing. – abr_stackoverflow May 19 '17 at 13:14
  • The source to this is available so you could fix it (maven, in particular, makes this fairly easy without having to jump through hoops) but at the same time, the other suggestions are fundamentally right - you should not be executing a million+ row query when you don't need a million+ rows. – pvg May 20 '17 at 02:11

1 Answers1

1

Your SQL query says

SELECT * FROM test ORDER BY id

You are, with that query, instructing your MySQL server to serialize every column of every row of your test table and send it to your Java program. So, MySQL obeys. You have a large table. So your instruction to MySQL takes time. And yes, the more rows in your table the longer it takes. This is not a problem with JDBC or the driver; it's a problem with the SQL you're using.

It seems from your sample code that you want one column -- named value -- from one row -- the first one -- in your table. You could accomplish that using this SQL statement:

 SELECT value FROM test ORDER BY id LIMIT 1

If your id column is your table's primary key, this will be fast.

The whole point of SQL is to allow your tables to contain so many rows that it's unreasonable to fetch them all into your Java (or other) program in a short amount of time. That's why SQL has WHERE and LIMIT clauses.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • As I wrote, another MySQL drivers in particular PHP and native mysql query from bash execute this query very fast. And I have indexes for column `id`, so MySQL got to give me first record faster than ever. I insist that this is not MySQL problem and I described why I think so in the question. – abr_stackoverflow May 19 '17 at 13:03
  • 1
    Time to first row is not a valid indicator of the burden you're placing on your scarce resource -- your MySQL server. – O. Jones May 19 '17 at 13:04
  • @abr_stackoverflow i absolutely agree to this answer. If you need 100 rows, put Limit 100. If you dont know how may rows you are going to need, page over the results. – marstato May 20 '17 at 00:41