-3

Here the prepared statement is supposed to insert a line of data from a CSV file into a database row using arrays. The problem is that the lines from the CSV file change a lot so I end up with the array out of bounds error since the prepared statement may call an array index that the line reader does not have.

I've tried different prepared statement methods but none can help here

Path p = Paths.get("ppMonthly.csv");
BufferedReader b;

b = Files.newBufferedReader(p);
String line = "";
while((line = b.readLine()) !=null){
line = b.readLine();
String [] tokens = line.split(",");


PreparedStatement pstm = connection.prepareStatement("INSERT INTO        PropertyPrice             (SaleID,Price,Date,Post,Property,OldNew,Duration,PAON,SAON,Street,Locality,TownCity,District,County,PPD,Records) " +
        "VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?);");
pstm.set(1, tokens.length);
pstm.setString(2, tokens[1]);
pstm.setString(3, tokens[2]);
pstm.setString(4, tokens[3]);
pstm.setString(5, tokens[4]);
pstm.setString(6, tokens[5]);
pstm.setString(7, tokens[6]);
pstm.setString(8, tokens[7]);
pstm.setString(9, tokens[8]);
pstm.setString(10, tokens[9]);
pstm.setString(11, tokens[10]);
pstm.setString(12, tokens[11]);  
pstm.setString(13, tokens[12]);
pstm.setString(14, tokens[13]);
pstm.setString(15, tokens[14]);
pstm.setString(16, tokens[16]);
java.lang.ArrayIndexOutOfBoundsException: 16
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Trill
  • 7
  • 2
  • By the error, one line in the csv has only 15 elements, the 16 elements is missing – Butiri Dan Aug 17 '19 at 21:17
  • Maybe one idea is to get the tokens.length to build the query by its value – Aviza Aug 17 '19 at 21:19
  • Do you have control over the CSV input? If so, it would probably be best to put a null/blank/default value for any missing fields. – KellyM Aug 17 '19 at 21:20
  • Arrays start at index 0. – AndiCover Aug 17 '19 at 21:20
  • 1
    Why do you set as the 1st parameter of the PreparedStatement the length of the array? – forpas Aug 17 '19 at 21:21
  • Do you mean the number of fields vary by line within a single file? Or do you mean your files vary, some files having *x* fields in all their rows with other files having *y* fields in all their rows? – Basil Bourque Aug 18 '19 at 01:11
  • You have an off-by-one error. Java arrays are 0-based, while JDBC parameters are 1-based. Make sure you really shouldn't start with `tokens[0]` instead of `tokens[1]`. In addition, you jump from `tokens[14]` to `tokens[16]`. – Mark Rotteveel Aug 18 '19 at 07:10

2 Answers2

0

The answer depends in what do you want when the lines are different.

You could check the length of obtained tokens array and if it is different of what you want you could skip the insertion of that line or place empty values in fields that you couldn't obtained.

Juan Ramos
  • 557
  • 4
  • 14
-1

Field count varies within files

If you mean a single file can vary in the number of fields, that is a violation of the CSV specification, RFC 4180.

Section 2.4:

… Each line should contain the same number of fields throughout the file. …

You should kick that data back to whomever is publishing it. And educate that publisher about the standard. CSV is such a simple idea, it is amazing how many ways people have found to screw it up.

Field count varies between files

If you mean files vary in their number of fields, then I assume there is a pattern. If some files has 12 fields, and other files have 16 fields, then I would hope you know which fields are in each. If so, read the first line of each file, count the number of fields, and then branch your code. Have one routine in your code that knows the specific fields of the 12-field file. Have another routine in your code that knows the specific fields of the 16-field file.

If the input files are predictable this way, I suggest asking the publisher of the data to include wording in the files’ title to indicate which sort of data it contains. This way you could avoid having to sample and count the fields.

If your files are unpredictable, then no magic solution. You could look at sample values within the file and guess their type, thereby making a guess about the meaning of the columns, but I would consider doing so to be unreasonable. I would reject this faulty data feed back to its publisher.

CSV-processing libraries

Lastly, I have two suggestions:

  • Use a library for reading/writing CSV rather than roll-your-own. You have a choice of a few good libraries. I have successfully made use of Apache Commons CSV.
  • Consider separating the tasks of (a) reading and validating your CSV data, from (b) writing to the database.
Community
  • 1
  • 1
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154