
I am quite new to Java, and my code is below. When I run this program I get the following exception:

java.lang.IndexOutOfBoundsException: Index: 4, Size: 4

I have already found out why I am getting the error.

When I open this CSV file in a normal text editor, I don't see any problem with the data. But when I open the file in the vi editor on Ubuntu, I can see a ^M character, and this is what causes the exception. When I edit the file, remove the ^M, and run the program again, it works fine and inserts the data into the table.

^M is how Vim-based editors display the Windows line break (carriage return). I receive this file from a Windows machine and read it on Ubuntu.

Here is the screenshot where I can see the ^M; it is at index 4.

I have seen the replaceAll function in Java, but I don't know how to use it or where exactly I need to call it. I only need to remove the ^M and read the file. Please help.

I tried String line = line1.replaceAll("^M",""); but I still get the same exception. I am not sure whether there is another way to handle this, either by catching the exception or with different logic.

Symonds
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/203899/discussion-on-question-by-symonds-how-to-fix-arrayindexoutofboundsexception-issu). – Samuel Liew Dec 09 '19 at 10:18
  • ok is it possible just to handle this exception and gave a print message that 'file is corrupt' when this issue occured ? – Symonds Dec 10 '19 at 07:41
  • Please read "How to create a [mcve]". Then use the [edit] link to improve your question (do not add more information via comments). Otherwise we are not able to answer your question and help you. You should include all relevant pieces, so that people can easily reproduce your issue. Dont explain what your code is doing, give us a few lines of code, and ideally input data, so that it can be reproduced easily. – GhostCat Dec 10 '19 at 13:51
  • Beyond that, we cant tell you what your requirements are. There are plenty of different solutions here. You could write **robust** parsing code, that understands such subtle issues around control characters, or you could yes, just flat out throw an exception and go "failed to parse file because X". But nobody here can tell you what you *should* do in order to resolve *your* assignment or task. – GhostCat Dec 10 '19 at 13:52

1 Answer


Edited based on the comments and the stack trace.


So here's what's happening: you have a file with a certain number of entries in each row, and you perform some action on each of them.

But due to the Control-M character (^M), the Java code is not behaving as expected.

Each line that contains a Control-M character is effectively split into two separate lines when you read it with the bufferedReader.readLine() method, because readLine() treats a lone carriage return as a line terminator, just like a line feed.
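To illustrate the split, here is a minimal sketch. The class name, the sample data, and the `;` delimiter are all assumptions made for the demonstration, since the question does not show the real file. It feeds `BufferedReader.readLine()` a row with a carriage return in the middle and prints how many fields each resulting line has.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the asker's program.
public class CarriageReturnSplitDemo {

    // Reads all lines from the given text the same way a file would be read.
    static List<String> readLines(String content) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample: five headings, then one data row with a stray \r after the 4th field.
        String content = "a;b;c;d;e\n1;2;3;4\r;5\n";
        for (String line : readLines(content)) {
            System.out.println(line.split(";").length + " columns: " + line);
        }
    }
}
```

With this sample the data row comes back as two lines, `1;2;3;4` (4 fields) and `;5` (2 fields), so reading index 4 of the first one fails in the same way as `Index: 4, Size: 4`.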

Ideally, every row of your file would have the number of columns that is already known to you.
But for the lines with a Control-M character, the columns have now been split across two lines (as per the explanation above).

In my opinion, you can do either of these two things:

  1. Remove the Control-M characters from your file, either manually or with a Linux command (refer to: remove ^M characters from file using sed).
  2. Change the for loop so that its limit is based on the columns list instead of the heading list, since the columns list more accurately represents the dynamically split line of the file:
for (int i = 0; i < columns.size(); i++) {

If you go for Option 2, you may also need to change the logic in your loop. Since I am not aware of your DB model and file, I guess you are better equipped to do so.
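As a rough sketch of the kind of range check Option 2 leads to: compare the number of columns with the number of headings before indexing, and report the bad line instead of letting the exception propagate. The class name, the method name, the `;` delimiter, and the sample rows are assumptions for illustration only; your real code would call its own insert logic where the sketch just collects rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; adapt the names and the delimiter to your own code.
public class RowValidationDemo {

    // Keeps only the rows whose column count matches the heading count.
    static List<List<String>> keepWellFormedRows(List<String> headings,
                                                 List<String> lines,
                                                 String delimiter) {
        List<List<String>> rows = new ArrayList<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            // Limit -1 keeps trailing empty fields, so "a;b;" counts as three columns.
            List<String> columns = Arrays.asList(line.split(Pattern.quote(delimiter), -1));
            if (columns.size() != headings.size()) {
                // Report and skip instead of hitting IndexOutOfBoundsException later.
                System.err.println("Skipping malformed line " + lineNumber
                        + ": expected " + headings.size()
                        + " columns but found " + columns.size());
                continue;
            }
            rows.add(columns);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> headings = Arrays.asList("a", "b", "c", "d", "e");
        // Assumed input: one good row, then the two halves of a row broken by a carriage return.
        List<String> lines = Arrays.asList("1;2;3;4;5", "1;2;3;4", ";5");
        List<List<String>> rows = keepWellFormedRows(headings, lines, ";");
        System.out.println("Well-formed rows: " + rows.size());
    }
}
```

Whether to skip a malformed row, stop with a "file is corrupt" message, or try to rejoin the two halves depends on your requirements; the check itself is the same in each case.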

Shivam Puri
  • so i dont need to change in for loop and if i add this in while loop does it fix without specifying for which column field or index it should fix ^M? i thought of putting this condition in for loop ...will it nort work ? value = value.replaceAll("\\s{2,}", " "); – Symonds Dec 07 '19 at 16:21
  • The place where I am asking you to add the code will change all occurrences of the ^M string in that line of the file, irrespective of the "column field". But if you think adding this code in the for loop, to be repeatedly called for all column fields is more efficient for some reason, then please, go ahead! – Shivam Puri Dec 07 '19 at 16:28
  • can you please tell if this condition value = value.replaceAll("\\s{2,}", " "); works and remove ^M in for loop ? and i will also try your code change and check which gives better perforamance.thanks – Symonds Dec 07 '19 at 16:31
  • Your regex \\s{2,} does not seem valid. What pattern are you trying to find here? – Shivam Puri Dec 07 '19 at 16:41
  • actually i found on google and i thought of it will check the index position...what pattern i i should mention if i i want to use this condition in for loop..can you please tell how to edit this value condition...i just want to verfiy both in for loop and while loop performance...thanks – Symonds Dec 07 '19 at 16:55
  • Just write in double quotes whatever exact string you need to be replaced with an empty string. No need to write a regex. If I understand the question correctly, "^M" is sufficient. – Shivam Puri Dec 07 '19 at 16:57
  • i tried in my code both in for loop and while loop the changes which you mentioned and getting the same error again java.lang.IndexOutOfBoundsException ....i think we might need to handle in exception or any other way this line carriage return..do u have any idea – Symonds Dec 09 '19 at 08:19
  • There are no ^M characters in a string resulting from `readLine()`. Your answer makes no sense. – user207421 Dec 09 '19 at 08:30
  • Thanks @user207421 for pointing this out. Changed my answer based on latest updates to the question. – Shivam Puri Dec 09 '19 at 10:53
  • Hi Shivam, so i think its not special character problem..i tried with all possibility with replaceALL() function but getting same error..i think we have more headings (5) than columns (4)..may be we need to perform own range-check like you said ...i tried to change i < columns.size() but still getting same error..do u know where else i need to make change ? or i think can we handle this in exception ? My DB table has simply 12 columns and no constraints defined and this file also normally has 12 columns but need to delete this special ^M character – Symonds Dec 09 '19 at 11:43
  • I have also mentioned the db table structure now except the last 3 columns the other column name are exactly the same from file name – Symonds Dec 09 '19 at 11:59
  • Your best bet is to understand why it's failing. Do that and then you'll be able to come up with how you can fix this. Read my answer and let me know if you are not getting the logical cause of the array index out of bound exception. Once that is done, you try to come up with a solution. Let's discuss further if that doesn't work! – Shivam Puri Dec 09 '19 at 16:00
  • You can always go with the non programmatic solution as explained in option 1 of my answer. – Shivam Puri Dec 09 '19 at 16:01
  • once i gave up and not able to fix with the code then definately like you said i will choose option 1...i understand that because of ^M its spliting the column, adding additional heading and not matching the exact size of the columns and for this reason array index out of bounds issue coming but really dont understand how to fix it ...i understand that i want to make the spliting dynamic and instead heading.size might need to use columns.size but where else i need to change that not understanding – Symonds Dec 09 '19 at 16:24
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/203960/discussion-between-shivam-puri-and-symonds). – Shivam Puri Dec 10 '19 at 07:54