
I have some data for statistical calculations:

        item1, item2, sim 
    ...
        9,471,0.9889886
        19,591,0.98890734
        453,15,0.98842293
        10,20,0.98759043
        68,713,0.9847893
        71,582,0.9836043
        95,13,0.98339003
        42,190,0.9832126
        1,52,0.9828053
        102,275,0.981985
        110,1115,0.9810662
        203,116,0.98054993
        1098,14,0.98008114
        13,56,0.9965508
        7,22,0.9963229
        69,96,0.9959787
        896,79,0.9959084
    ...

There are nearly 20k rows.

In Java, I want to populate an array like the one below from this data:

array[col1][col2] = value

The problem is that the 'col1' and 'col2' values are neither sequential nor ordered, and there are gaps between the min and max values.

As I expected, when I try to fill a Java array with this data, I get an 'IndexOutOfBoundsException'.

I tried ArrayList and Matrix, but their indices start from zero, so the data loses its meaning. I want to compare this data with itself in an iteration and run some calculations on it (distances, etc.).

Which Java Collection API would you recommend for this?

Edit: I have pasted my code, as requested in the comments:

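import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class CSVFileReader {

  public static double[][] readItemSimilarityFromFile(File f)
      throws IOException {

    // we have 20k lines; a dense array only works if every id is in [0, 20000)
    // note: a 20000 x 20000 double array needs about 3.2 GB of heap
    final double[][] matrix = new double[20000][20000];

    // try-with-resources closes the reader, so no explicit close() is needed
    try (final BufferedReader br = new BufferedReader(new FileReader(f))) {
        String line;
        while ((line = br.readLine()) != null) {
            final String[] values = line.split(",", -1);
            final int item1 = Integer.parseInt(values[0].trim());
            final int item2 = Integer.parseInt(values[1].trim());
            matrix[item1][item2] = Double.parseDouble(values[2].trim());
        }
    }

    return matrix;

  }

}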
Yilmazerhakan
  • "Questions seeking debugging help ("**why isn't this code working**?") must include the desired behavior, *a specific problem or error* and *the shortest code necessary to reproduce it in the question itself*. Questions without **a clear problem statement** are not useful to other readers. See: [How to create a Minimal, Complete, and Verifiable Example](https://stackoverflow.com/help/mcve)" – gparyani Dec 14 '17 at 07:40
  • @gparyani I searched the web and tried many different snippets and solutions, so I couldn't share all of them. This is a common problem for me as a newbie Java coder. In PHP you can fill an array with non-sequential keys, so I want to ask how this can be done in Java. – Yilmazerhakan Dec 14 '17 at 07:46
  • What got you the exception in the question? – gparyani Dec 14 '17 at 07:47
  • @gparyani As I explained above, if I use col1 and col2 as array indices I get an 'IndexOutOfBoundsException', because in Java you must define the array size, and array indices must start from zero and be sequential. So I asked which Collection or method I should use. – Yilmazerhakan Dec 14 '17 at 07:50
  • @Yilmazerhakan It is roughly a prerequisite for this type of question that you at least post some code. You say that you got an `IndexOutOfBoundsException`, but this exception won't occur without some runnable code. So, show your attempt, at least one of them. – MC Emperor Dec 14 '17 at 07:50
  • Start with solution at https://stackoverflow.com/questions/44416151/java-most-efficient-way-to-read-in-a-csv-file-with-various-data-types, then paste your code when you get stuck – tkruse Dec 14 '17 at 07:51
  • I added code below the question. I asked for suggestions, not a solution. Why the downvotes? – Yilmazerhakan Dec 14 '17 at 08:02
  • If @OldCurmudgeon's answer addresses this question in the way I want, then the question I am asking is not wrong or incomplete. – Yilmazerhakan Dec 14 '17 at 08:21

1 Answer

You could use a sparse matrix.

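import java.util.HashMap;
import java.util.Map;

// Sparse matrix: only the cells that are actually set take up memory.
class Sparse<T> {
    // row index -> (column index -> value)
    Map<Integer, Map<Integer, T>> matrix = new HashMap<>();

    // Returns the value at (row, col), or null if that cell was never set.
    public T get(int row, int col) {
        Map<Integer, T> r = matrix.get(row);
        return r != null ? r.get(col) : null;
    }

    // Stores t at (row, col), creating the row map on first use.
    public void set(int row, int col, T t) {
        Map<Integer, T> theRow = matrix.get(row);
        if (theRow == null) {
            theRow = new HashMap<Integer, T>();
            matrix.put(row, theRow);
        }
        theRow.put(col, t);
    }
}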

public void test(String[] args) {
    Sparse<Double> s = new Sparse<>();
    s.set(9, 471, 0.9889886);
    s.set(19, 591, 0.98890734);
    s.set(453, 15, 0.98842293);
    s.set(10, 20, 0.98759043);
    s.set(68, 713, 0.9847893);
    s.set(71, 582, 0.9836043);
    s.set(95, 13, 0.98339003);
    s.set(42, 190, 0.9832126);
    s.set(1, 52, 0.9828053);
    s.set(102, 275, 0.981985);
    s.set(110, 1115, 0.9810662);
    s.set(203, 116, 0.98054993);
    s.set(1098, 14, 0.98008114);
    s.set(13, 56, 0.9965508);
    s.set(7, 22, 0.9963229);
    s.set(69, 96, 0.9959787);
    s.set(896, 79, 0.9959084);
}
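
To connect the sparse idea to the CSV data in the question, a minimal sketch might look like the one below. It is only an illustration under stated assumptions: the class name `SparseCsvDemo` and the helpers `load`, `parse` and `get` are hypothetical, the nested map is re-declared so the example is self-contained, and any line that does not parse as `int,int,double` (such as the `item1, item2, sim` header) is assumed to be safe to skip.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class SparseCsvDemo {

    // item1 -> (item2 -> similarity); only pairs present in the input are stored
    static Map<Integer, Map<Integer, Double>> load(BufferedReader br) throws IOException {
        Map<Integer, Map<Integer, Double>> matrix = new HashMap<>();
        String line;
        while ((line = br.readLine()) != null) {
            String[] parts = line.split(",", -1);
            if (parts.length != 3) {
                continue; // assumption: blank or malformed lines are skipped
            }
            try {
                int item1 = Integer.parseInt(parts[0].trim());
                int item2 = Integer.parseInt(parts[1].trim());
                double sim = Double.parseDouble(parts[2].trim());
                matrix.computeIfAbsent(item1, k -> new HashMap<>()).put(item2, sim);
            } catch (NumberFormatException e) {
                // assumption: non-numeric lines (e.g. a header row) are skipped
            }
        }
        return matrix;
    }

    // Hypothetical convenience wrapper so the sketch can be tried without a file.
    static Map<Integer, Map<Integer, Double>> parse(String csv) {
        try (BufferedReader br = new BufferedReader(new StringReader(csv))) {
            return load(br);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Null-safe lookup: returns null when the pair was never stored.
    static Double get(Map<Integer, Map<Integer, Double>> matrix, int item1, int item2) {
        Map<Integer, Double> row = matrix.get(item1);
        return row != null ? row.get(item2) : null;
    }

    public static void main(String[] args) {
        // rows copied from the sample data in the question
        String csv = "item1, item2, sim\n"
                + "9,471,0.9889886\n"
                + "19,591,0.98890734\n"
                + "453,15,0.98842293\n";
        Map<Integer, Map<Integer, Double>> matrix = parse(csv);

        // Walk every stored pair; HashMap iteration order is not guaranteed.
        for (Map.Entry<Integer, Map<Integer, Double>> row : matrix.entrySet()) {
            for (Map.Entry<Integer, Double> cell : row.getValue().entrySet()) {
                System.out.println(row.getKey() + "," + cell.getKey() + " -> " + cell.getValue());
            }
        }
    }
}
```

For a real file, `load` could be called with `new BufferedReader(new FileReader(f))` inside a try-with-resources block. Because the original item ids are used directly as map keys, the gaps and ordering of the ids do not matter, and iterating over the entry sets visits only the pairs that exist.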
OldCurmudgeon