
I have a problem regarding the ResultSet of a large database (MySQL, Java 1.7).

The task is to transform all the entries of one column and write them into another database (e.g. divide every number by three and write the results into the other database).

As the database contains about 70 columns and a few million rows, my first approach would have been to do a SELECT * and parse the ResultSet by columns.

Unfortunately I found no way to parse it column by column, as the designated way is to go through it row by row (while (rs.next()) { ... } etc.).

I don't like this approach, as it would create 70 large arrays at once; I wanted to hold only one at a time to reduce memory usage.

So here are my main questions:

  1. Is there a way to parse the ResultSet column by column?
  2. Should I either create a query for every column and parse them (one array at a time but 70 queries) or
  3. Should I just get the whole ResultSet and parse it row by row, writing them into 70 arrays?

Greetings and thanks in advance!

  • Do you really mean _other database_ or just _other table_? – jlordo Dec 28 '12 at 09:43
  • is it relevant? but you are right, it is, in fact, just another table – Maltram Dec 28 '12 at 09:47
  • it's very relevant, because if it's just another table, you can use the database to do the job. It will be very efficient and you don't need any java code at all (except if you want java to execute the sql statements). – jlordo Dec 28 '12 at 09:53

2 Answers


Why not just page your queries? Pull out n rows at a time, perform the transformation, and then write them into the new database.

This means you don't pull everything up in one query/iteration and then write the whole lot in one go, and you don't have the inefficiencies of working row-by-row.

My other comment is that perhaps this is premature optimisation. Have you tried loading the whole dataset and seeing how much memory it would take? If it's of the order of tens or even hundreds of megabytes, I would expect the JVM to handle that easily.

I'm assuming your transformation needs to be done in Java. If you can possibly do it in SQL, then doing it entirely within the database is likely to be even more efficient.
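In JDBC, the paging approach above could be sketched roughly as follows. The table and column names (`source_table`, `target_table`, `id`, `amount`), the page size, and the divide-by-three transform are all placeholders for the real schema and logic:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PagedTransform {
    // Tune this: larger pages mean fewer queries but more memory per page.
    static final int PAGE_SIZE = 500;

    // Example transformation; replace with the real logic.
    static double transform(double value) {
        return value / 3.0;
    }

    // Reads PAGE_SIZE rows at a time from source_table (ordered by a key
    // so paging is stable) and batch-writes transformed values into
    // target_table. One executeBatch() round-trip per page.
    static void migrate(Connection con) throws Exception {
        try (PreparedStatement read = con.prepareStatement(
                 "SELECT id, amount FROM source_table ORDER BY id LIMIT ? OFFSET ?");
             PreparedStatement write = con.prepareStatement(
                 "INSERT INTO target_table (id, amount) VALUES (?, ?)")) {
            for (int offset = 0; ; offset += PAGE_SIZE) {
                read.setInt(1, PAGE_SIZE);
                read.setInt(2, offset);
                int rows = 0;
                try (ResultSet rs = read.executeQuery()) {
                    while (rs.next()) {
                        write.setLong(1, rs.getLong("id"));
                        write.setDouble(2, transform(rs.getDouble("amount")));
                        write.addBatch();
                        rows++;
                    }
                }
                write.executeBatch(); // one write round-trip per page
                if (rows < PAGE_SIZE) break; // last (partial or empty) page
            }
        }
    }
}
```

Only one page of rows is ever in memory at a time, and the batched insert keeps the write traffic down.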

  • I was not worried about the size of the result set but rather the size of the 70 arrays (which I assume have to be smaller than the result set). But your approach suggests a lot of queries, which results in higher traffic. Right now I am not certain which resource is the bottleneck (traffic or memory) – Maltram Dec 28 '12 at 09:54
  • My approach suggests a tuneable number of queries. If you page, then you can select the size of the page (e.g. 500 rows?) such that you load a given amount of data in, and then write each set back in one batch operation. Changing the page size determines a) the amount of memory consumed per query b) the number of writes back into the db – Brian Agnew Dec 28 '12 at 11:18
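To make the trade-off concrete, the number of paged queries is simply ceil(totalRows / pageSize); the 3-million-row figure below is hypothetical:

```java
public class PageMath {
    // Number of paged SELECTs needed to cover totalRows at pageSize rows each.
    static long queryCount(long totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize; // integer ceil division
    }
}
```

For example, a hypothetical 3,000,000 rows at 500 rows per page means 6,000 queries; raising the page size to 5,000 cuts that to 600 at the cost of ten times the memory per page.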

Why don't you do it with MySQL only?

Use a query like this:

    create table <new_table> as
    select <column_name> / 3 from <source_table>;
  • Because "/3" was just an example and the transformation is much more complex... I thought about it, but I'm not sure if it is possible. I will give it another try, though – Maltram Dec 28 '12 at 09:51
  • OK. Do let us know if you still face any problems. – Sushil Jain Jan 09 '13 at 09:32