3

There is a table phonenumbers with two columns: id, and number. There are about half a million entries in the table. Database is MySQL.

The requirement is to develop a simple Java EE application, connected to that database, that allows a user to download all numbervalues in comma separated style by following a specific URL.

If we get all the values in a huge String array and then concatenate them (with comma in between all the values) in a String and then send it down to the user, does it sound a proper solution?

The application is not public and will be used by a limited no. of people.

Arjan Tijms
  • 37,782
  • 12
  • 108
  • 140
craftsman
  • 15,133
  • 17
  • 70
  • 86
  • Having half a million comma separated list of values does not sound good to me. May be newline separated is okay so that user can easily open this file in text editor if required. But this depends on how users wants to use this list of values. Can you please elaborate on user requirements? – Gladwin Burboz Feb 02 '10 at 01:40
  • CSV is a user requirement because it is supported by MS Excel. I am not sure how they are going to use it, probably they would use it for reporting purpose. – craftsman Feb 02 '10 at 05:43
  • CSV's map to columns in excel. If you have half a million CSV entries, then how is going to open correctly in excel? Excel does not have half a million columns. You need to as users how are they going to use it. Most of the time users are not sure of their requirements. – Gladwin Burboz Feb 02 '10 at 16:34

3 Answers3

9

Your best bet is to not store the data in Java's memory in any way, but just write the obtained data to the response immediately as the data comes in. You also need to configure the MySQL JDBC driver to serve the resultset row-by-row by Statement#setFetchSize() as per the MySQL JDBC driver documentation, otherwise it will cache the whole thing in memory.

Assuming you're familiar with Servlets, here's a kickoff example which takes that all into account:

protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    response.setContentType("text/plain");
    response.setHeader("Content-Disposition", "attachment;filename=numbers.txt"); // Force download popup.

    Connection connection = null;
    Statement statement = null;
    ResultSet resultSet = null;
    Writer writer = response.getWriter();

    try {
        connection = database.getConnection();
        statement = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        statement.setFetchSize(Integer.MIN_VALUE);
        resultSet = statement.executeQuery("SELECT number FROM phonenumbers");

        while (resultSet.next()) {
            writer.write(resultSet.getString("number"));
            if (!resultSet.isLast()) {
                writer.write(",");
            }
        }
    } catch (SQLException e) {
        throw new ServletException("Query failed!", e);
    } finally { 
        if (resultSet != null) try { resultSet.close; } catch (SQLException logOrIgnore) {}
        if (statement != null) try { statement.close; } catch (SQLException logOrIgnore) {}
        if (connection != null) try { connection.close; } catch (SQLException logOrIgnore) {}
    }
}
BalusC
  • 1,082,665
  • 372
  • 3,610
  • 3,555
1

There's a bit more to properly formatting CSV output. It would be easiest to use an existing library such as this one to generate the output file.

You can generate output to a file on disk (on the web server) and then redirect the browser to that file (with a cron job or whatever to clean up old data) or just stream the result directly back to the user.

If you are streaming directly be sure and set the MIME type to something that will trigger a download in the user's browser (e.g. text/csv or text/comma-separated-values)

Eric J.
  • 147,927
  • 63
  • 340
  • 553
  • I agree but just want to add http://supercsv.sourceforge.net/ as a free alternative CSV reader/writer library – fvu Feb 01 '10 at 23:45
  • I never worked with CSV before. Can you please elaborate why do we need a full library for this purpose, as apparently it only seems to be writing values separated by commas with a Writer (as can be seen in BalusC's example)? – craftsman Feb 02 '10 at 05:50
  • In the general case your data can contain commas, which throw off the field order if not properly handled since fields are separated by commas. To handle that, you quote your fields that might contain commas. Then again, your data might contain quote characters, which throws off the quoting if not properly handled. You have to escape the quotes. Rather than do this by hand, I find it easier to use a pre-existing library. If you are 100% confident your data will never contain a quote or comma (including typos) you can just write your fields separated by commas. – Eric J. Feb 02 '10 at 17:16
0

If using Mysql 5.1+, I would simply use the proprietary syntax to dump the file somewhere and stream it in a Servlet response.

SELECT a,b,a+b INTO OUTFILE '/tmp/result.txt'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  FROM test_table;

http://dev.mysql.com/doc/refman/5.1/en/select.html

For so many records, if you still want to use JDBC, you may try the following:

  • fetch the number of records fetch few records( using a query limit ) and write them
  • if you reach the number of records in a chunk, you fetch another one until you reach the maximum number of records
John Doe
  • 869
  • 5
  • 10
  • Assuming the web server shares a file system with the DB. In many deployment scenarios, it does not. – Eric J. Feb 02 '10 at 00:50
  • In my case as well, DB and application are running on different machines. – craftsman Feb 02 '10 at 05:47
  • I believe my JDBC suggestion is among the same lines as the fetch size suggestion for the query to avoid caching the all thing in memory and avoid the filesystem sharing issue. – John Doe Feb 03 '10 at 10:02