Java program should support multiple Databases and variable input

Question

I'm writing a Java Program right now, which reads files and writes the content of these files (after some modifications) into an relational database.

My problem right now is that the program should support a wide range of databases and not only one.

So in my program I create SQL statements and commit them to the DB - no problem. (SAP HANA) Now I want so add another DB (MySQL) and have to slightly change the SQL syntax of the query before committing.

My solution right now is copying the code block, that creates the statements and make the DB specific changes to it. But that obviously can't be it.(to many databases -> 80% code never used) I probably need some kind of mapper, that converts my sql to a dialect, that the chosen DB understands.

Now, I found out about Hibernate and other mappers, but I don't think they fit my needs. The Problem is that they expect an java object (pojo) and convert them. But since I don't know what kind of data my Program is gonna load, I can not create static objects for each column for example.

Sometimes I need to create 4 columns, sometimes 10. sometimes they are Integer, sometimes Strings / varchar. And all of the time they have different names. So all tutorials I found on hibernate are starting from a point where the program is certain what kind of data is going to be inserted into the db which my program is not. Moreover I need to insert a large number of lines per table (like a billion+) and I think it might be slow to create a object for each insert.

I hope anyone understands my problem and can give me some hints. maybe a mapper, that just converts sql without the need to create a object before.

thank you very much! : )

edit: to make it more clear: the purpose of the programm is to fill up a relational db with data that is stored / discribes in files like csv and xml ). so the db is not used as a tool to store the data but storing the data there is the main aim. I need a realtional db filled up with data that the user provides. and not only one db, but different kinds of rdbs

Well, you could use reflection to create objects at runtime; if you're dealing with a billion+ rows, the extra overhead there isn't going to be a problem. The question I'd have at that point, though, is: is your backend db dynamic too? Why do you think you need this functionality? Usually, by the time you get to full production-sized dbs, the base design has been locked in (for a number of reasons). I fully expect that there _is_ a library to do this for you, probably even one that'll talk to/use Hibernate (or one of the other mappers). — Clockwork-Muse, Feb 20 '14 at 13:37
The dbs are plain. I create the tables on the fly. if that is what you mean by dynamic. I'm quite new to this... — user3332816, Feb 20 '14 at 13:50
Hey, at a billion rows you don't just want to blindly create a new db, that's going to kill you/your system. When a database gets to that size, the location/way the db _actually physically_ stores the data becomes important (SQL itself largely abstracts this from queries, but it may be important on the backend). Most systems also have (RDBMS-specific) `LOAD` utilities - this skips `INSERT`s, but usually requires the table already be created. — Clockwork-Muse, Feb 20 '14 at 14:02
Besides the writing into the db throug JDBC the programm also supports the export of csv Files to load into the db. So in that case the programm only creates the table, but the inserts are done by the bulk loader of the rdbms . sadly that is not an option for all rdbms, so the JDBC way is also needed : ( — user3332816, Feb 20 '14 at 14:16

light_303 · Answer 1 · 2014-02-20T13:34:36.157

1

I think you are describing a perfect use for a file system. Or if you want to go with a filesystem abstraction:

have a look at the apache jackrabbit project

edited Feb 20 '14 at 13:34

answered Feb 20 '14 at 13:24

light_303

2,101
2
18
35

My read of the question says: There is no business side. – SebastianH Feb 20 '14 at 13:31
thanks for the reply. I think I did a terrible job on discribing my programm's purpose: I want to insert data from some files(csv , xml) into several relational databases. So jackrabit in my understanding does not help me here. because the aim is not to store the data somehow, but so fill relational databases with it. So filling up the DB is the purpose not a tool . I hope that helps understanding – user3332816 Feb 20 '14 at 13:41

score 0 · Accepted Answer · answered Feb 20 '14 at 13:38

So basically you want to write a tool that writes a arbitrary text file (some kind of csv I assume) into an arbitrary database system? Creating tables and content on the fly, depending on the structure of the text tile?

Using a high level abstraction layer like hibernate is not gonna take you anywhere soon. What you want to do is low level database interaction. As long as you dont need any specific DBMS dependent features you should go a long way with ANSI sql. If that is not enough, I dont see an easy way out of this. Maybe it is an option to write your own abstraction layer that handles DBMS specific formating of the SQL statments. Doesn't sound nice though.

A different thing to think about is the large number of lines per table (like a billion+). Using single row INSERT statements is not a good idea. You have to make use of efficient mass data interfaces - which are strongly DBMS dependent! Prepared statements is the least measure here.

thanks ! you did understand what I'm trying to do. I create the tables and fill them on the fly. Dependent on the files the programm reads. (csv , xml mostly) I'm using batch insert right now which is way faster than commiting every insert seperatly. I will check out ansi sql. — user3332816, Feb 20 '14 at 13:49
Maybe [this question](http://stackoverflow.com/questions/1714461/ansi-sql-manual) holds some helpful hints to ANSI SQL. — SebastianH, Feb 20 '14 at 15:19

Java program should support multiple Databases and variable input

2 Answers2