I know this error has been asked about many times, but I can't find where my problem is. The error shown is the following:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 393, Size: 393
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at scraping.complementos_juegos.main(complementos_juegos.java:305)

There are several things I don't understand. First, in the line that shows Index: 393, Size: 393, what does that mean? Are those the index and the size of the array?
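For reference, here is a minimal sketch of what that message describes. It is not taken from the program in question: the class name `IndexVersusSize` and the helper `isValidIndex` are made up for illustration. A list with 393 elements has valid indexes 0 to 392, so asking for index 393 fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the scraper.
class IndexVersusSize {

    // Returns true when list.get(index) would succeed.
    static boolean isValidIndex(List<?> list, int index) {
        return index >= 0 && index < list.size();
    }

    public static void main(String[] args) {
        List<String> column = new ArrayList<>();
        for (int i = 0; i < 393; i++) {
            column.add("row " + i);
        }

        System.out.println(column.size());              // 393
        System.out.println(isValidIndex(column, 392));  // true: the last valid index
        System.out.println(isValidIndex(column, 393));  // false: one past the end

        try {
            column.get(393);
        } catch (IndexOutOfBoundsException e) {
            // The exact wording of the message depends on the Java version.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

So the two numbers are the index that was requested and the number of elements in the list that `get` was called on.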

Let's go to the code:

1.- I scrape more than 2,700 links, which are saved in an array called all_links. Since I want to store a lot of information, I am using a two-dimensional ArrayList called listaEmpresaA:

ArrayList<ArrayList<String>> listaEmpresaA = new ArrayList<ArrayList<String>>();

    String [] paises = {"USA"};

    int total_columnas = 2 + (paises.length*3);

    //CREATING THE COLUMNS
     for(int i =0; i< total_columnas; i++){
                listaEmpresaA.add(new ArrayList<String>());
            }

     //DEFINITION OF THE ROWS


     //<--------------- START OF THE HEADER DEFINITION

     listaEmpresaA.get(0).add("Juego");
     listaEmpresaA.get(1).add("URL");



     for (z=0 ; z<paises.length; z++) {
         for (int j=2; j<total_columnas ; j=j+3 ) {
             listaEmpresaA.get(j).add(paises[z]);
             listaEmpresaA.get(j+1).add(paises[z] + " Gold");
             listaEmpresaA.get(j+2).add(paises[z] + " sin Gold");
         }
     }

    int filas = 1; //JUST TO KNOW THE AMOUNT OF ROWS I HAVE

     //<--------------- FINISH OF THE HEADER DEFINITION


    //<--------------- STARTING OF THE SCRAPING FOR EACH LINK

    int contador_juegos = 1;

    for (String link : all_links) {

     String urlPage = "https://www.microsoft.com" + link;
     System.out.println(contador_juegos + ".- Comprobando entradas de: "+urlPage);

     if (getStatusConnectionCode(urlPage) == 200) {

         Document document = getHtmlDocument(urlPage);

         Elements entradas = document.select("div.page-header div.m-product-detail-hero-product-placement div.context-product-placement-data");

         for (Element elem : entradas) {
             String titulo = elem.getElementsByClass("c-heading-2").text();

             System.out.println(titulo+"\n");
             listaEmpresaA.get(0).add(titulo);
             listaEmpresaA.get(1).add(urlPage);

         }

         entradas = document.select("div.price-info");

         for (Element elem : entradas) {
             String titulo = elem.getElementsByTag("s").text();

             System.out.println("Precio base: " + titulo+"\n");
             listaEmpresaA.get(2).add(titulo);

         }

         entradas = document.select("div.price-info");

         for (Element elem : entradas) {
             String titulo = elem.getElementsByClass("price-disclaimer").text();

             System.out.println("Precio para los miembros sin GOLD: " + titulo+"\n");
             listaEmpresaA.get(3).add(titulo);


         }

         entradas = document.select("dd.cli_upsell-options div.cli_upsell-option");

         // Iterate over each of the entries
         for (Element elem : entradas) {
             String titulo = elem.getElementsByClass("price-disclaimer").text();

             System.out.println("Precio para los miembros GOLD: " + titulo+"\n");
             listaEmpresaA.get(4).add(titulo);

         }

         filas++;

     }

     contador_juegos++;
    }


    //<--------------- FINISH OF THE SCRAPING FOR EACH LINK BAZAR USA   

2.- Create the Excel file and copy the information from the listaEmpresaA ArrayList into it.

try {
         //create .xls and create a worksheet.
         FileOutputStream fos = new FileOutputStream("D:\\mierda.xls");
         HSSFWorkbook workbook = new HSSFWorkbook();
         HSSFSheet worksheet = workbook.createSheet("XboxOne");

            int l=0;

                //CREATING EXCEL ROWS
             for (int f=0; f< filas ; f++) {
                HSSFRow fila = worksheet.createRow(f);

                //CREATING EXCEL COLUMNS
                for(int c=0;c<total_columnas;c++){
                       HSSFCell celda = fila.createCell(c);
                       celda.setCellValue(listaEmpresaA.get(c).get(f)); //<----- THIS IS THE LINE 305 WHERE I HAVE THE ERROR
                       l++;

                }      
             }

        //Save the workbook in .xls file
         workbook.write(fos);
         fos.flush();
         fos.close();
     } catch (FileNotFoundException e) {
         e.printStackTrace();
     } catch (IOException e) {
         e.printStackTrace();
     }
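One way to narrow this down is to compare the size of every inner list with `filas` before writing any cell. The following is only a sketch under assumptions: `ColumnSizeCheck` and `shortColumns` are hypothetical names, and the sample data stands in for a page where the price selector matched nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic, not part of the original program.
class ColumnSizeCheck {

    // Returns the indexes of the columns that hold fewer than expectedRows entries.
    static List<Integer> shortColumns(List<? extends List<String>> columns, int expectedRows) {
        List<Integer> result = new ArrayList<>();
        for (int c = 0; c < columns.size(); c++) {
            if (columns.get(c).size() < expectedRows) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> columns = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            columns.add(new ArrayList<>());
        }
        // Header row.
        columns.get(0).add("Juego");
        columns.get(1).add("URL");
        columns.get(2).add("USA");
        // First page: every selector matched once.
        columns.get(0).add("Game A");
        columns.get(1).add("/a");
        columns.get(2).add("$10");
        // Second page: the price selector matched nothing, so column 2 gets no entry.
        columns.get(0).add("Game B");
        columns.get(1).add("/b");

        int filas = 3; // header + two pages
        System.out.println(shortColumns(columns, filas)); // [2]
    }
}
```

Any column reported here would throw in `listaEmpresaA.get(c).get(f)` as soon as `f` reaches that column's size.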

I have several questions, and I would really appreciate some tips so that I can find the solution myself:

1.- I don't understand why the error reports Index: 393, Size: 393 when the program had already run up to link 2,730 of the 2,751 links in total. The last output shown on the console is:

2730.- Comprobando entradas de: https://www.microsoft.com/en-us/store/p/star-wars-pinball-season-1-bundle/brz3mqfjnlmw
Star Wars™ Pinball Season 1 Bundle

Precio base:

Precio para los miembros sin GOLD:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 393, Size: 393
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at scraping.complementos_juegos.main(complementos_juegos.java:305)

2.- When I use the for-each loop, I have noticed that it does not exactly follow the order of the array. I don't know why.

3.- The program takes 1 hour just to scrape the information from one store, and I want to cover more than 50. Is there a way to reduce this time? I have read something about "HashMap", but I don't know how to use it. If it is a better solution, I will take a look.
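On the third point, a `HashMap` changes how data is looked up in memory, so it is unlikely to help if most of the hour is spent waiting on network requests (an assumption, since no timings are given). A common alternative is to fetch several pages at once with a fixed thread pool. The sketch below uses only `java.util.concurrent`; `ParallelFetch`, `fetchAll`, and the stub fetcher are hypothetical, and the real download and parsing code would go where the stub is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of the original program.
class ParallelFetch {

    // Applies fetcher to every URL on a pool of worker threads and
    // returns the results in the same order as the input list.
    static <T> List<T> fetchAll(List<String> urls, Function<String, T> fetcher, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fetcher.apply(url));
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the task order.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("/en-us/store/p/a", "/en-us/store/p/b");
        // Stub fetcher: a real one would download and parse the page.
        List<String> pages = fetchAll(links, link -> "parsed " + link, 4);
        System.out.println(pages);
    }
}
```

Because the results come back in input order, each one can be turned into one spreadsheet row afterwards on a single thread. The pool size should stay small so the site is not flooded with requests.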

Thanks in advance!

Have a good day.

JetLagFox
  • `Index: 393, Size: 393` means you tried to access index 393 of an array that has 393 elements in total. A size of 393 means the last accessible index is *392*, because arrays are zero-indexed in Java. – JonK Jan 09 '17 at 11:45
  • `listaEmpresaA.get(j+1).add(paises[z] + " Gold");` => but you are looping to the size of the list. So don't use `j+1` and `j+2` or loop up to `j < list.size() - 2`... – assylias Jan 09 '17 at 11:47
  • @JonK It's strange, because I don't have any array with that length. The length of `all_links` is 2,751 and the two-dimensional ArrayList should be 5 columns * 2,751 rows. I'm totally lost with this. – JetLagFox Jan 09 '17 at 12:20
  • @assylias Thanks for your response. I have run that part of the code myself, but I don't see why the code is wrong. The result is this [link](https://fotos.subefotos.com/6f2b8b2422a69e65cd586149ae8c1051o.png), which creates 5 columns. – JetLagFox Jan 09 '17 at 12:25
  • Instead of using `j, j+1, j+2` you maybe wanted `j-2, j-1, j`... – assylias Jan 09 '17 at 12:42
  • @assylias Do you mean this way? [link](https://fotos.subefotos.com/c88d9a5779df982d2a1cdd3fd59dbeeao.png). – JetLagFox Jan 09 '17 at 12:52
  • @JetLagFox yes that should at least fix one "out of bounds exception" - but then I don't know what your code is supposed to do so maybe that's not what you want. – assylias Jan 09 '17 at 12:53
  • @assylias Let's see, thanks anyway. I'm running the code to see if it doesn't show me that error. – JetLagFox Jan 09 '17 at 12:59
  • @assylias It keeps throwing the same error. – JetLagFox Jan 09 '17 at 14:03
  • I hadn't realised that you had indicated where line 305 is. `listaEmpresaA.get(c).get(f)`: either c is too large or f is. just add a print statement before that line with the values of c and f and you will see when the exception is thrown - are you 100% sure that all the inner lists have the same "filas" number of elements? By the way, the changes I suggested in a previous comment are required anyway. – assylias Jan 09 '17 at 14:19
  • @assylias Hi again. I have tried what you asked, and here are the values of "c" and "f". I reduced the for condition to only 10 elements; otherwise I would still be waiting for the program to finish. Here are the results: [link](https://fotos.subefotos.com/184c2dbc6633e57e69e15c50fd4e80c3o.png). As you can see, I still have the same problem even with the reduced for condition, so I suppose the integer type of the c and f variables is not the problem. I am still doing something wrong when I create the Excel file, but I can't see where; to me it looks correct. – JetLagFox Jan 09 '17 at 15:18
  • OK so your problem is obviously that some of the inner lists are smaller than your `filas` variable. – assylias Jan 09 '17 at 15:31
  • @assylias To be honest, I don't see where the problem is. I keep trying different values and it makes no sense. I'm really annoyed: even after running the code mentally and writing the values down on paper, I don't see anything wrong. I don't understand what's happening. – JetLagFox Jan 09 '17 at 16:36
  • You can try to create a [mcve] and ask a separate question with that new, shorter code. Or you can try to follow the steps on [this blog post](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). – assylias Jan 09 '17 at 17:25
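Following up on the point that some inner lists are shorter than `filas`: one possible restructuring, sketched here under assumptions, is to store one list per game instead of one list per column, so every row always has the same number of cells. `RowPerGame` and `buildRow` are hypothetical names, and a `null` argument stands for a value the scraper did not find on the page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical alternative layout, not the original code.
class RowPerGame {

    // Builds one row; a null value (nothing found on the page) becomes an empty cell.
    static List<String> buildRow(String... values) {
        List<String> row = new ArrayList<>();
        for (String value : values) {
            row.add(value == null ? "" : value);
        }
        return row;
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(buildRow("Juego", "URL", "USA", "USA Gold", "USA sin Gold"));
        // A page where all prices were found.
        rows.add(buildRow("Game A", "/a", "$10", "$8", "$9"));
        // A page where no price was found: the row still has five cells.
        rows.add(buildRow("Game B", "/b", null, null, null));

        // Every (row, column) pair is valid, so this loop cannot run past a list's end.
        for (List<String> row : rows) {
            for (int c = 0; c < row.size(); c++) {
                System.out.print("[" + row.get(c) + "]");
            }
            System.out.println();
        }
    }
}
```

With this layout the number of spreadsheet rows is simply `rows.size()`, so a separate `filas` counter that can drift out of sync is no longer needed.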

0 Answers