5

In Octave I used a matrix to store the data from a data set. How can I do the same in Java? Assume I have 10-20 columns and a large amount of data. I don't think

int[][] data;

would be the best option. Is a nested map the only solution?

Alba Mendez
tnaser
  • A Map is the best-fitting solution and the easiest to use – Adel Boutros Jan 23 '12 at 17:25
  • Why not `int[][]`? When you say _store_, do you mean **writing this data to disk** or just **storing/handling it in memory**? – Alba Mendez Jan 23 '12 at 17:25
  • @AdelBoutros really? IMHO, if you want a simple matrix, a 2D array is the way to go. If you use a `Map`, you'll have to check that _all nested maps have the same size_, which is not necessary with 2D arrays. – Alba Mendez Jan 23 '12 at 17:29
  • Not sure I agree with the nested-array solution; my reasons are the adaptiveness of the size, and that I don't really know how large the data set is gonna be or how nested arrays are handled in main memory. That's probably not much of an issue, since I doubt he'll build a nested array bigger than the 4GB of main memory that's probably somewhat commonplace by now. Nevertheless, it might be a performance issue if the code is supposed to run on a machine with relevant memory restrictions, an embedded environment for example. I'd go for SJuan76's option either way. – G. Bach Jan 23 '12 at 17:42
  • **1)** As I said in [my answer](http://stackoverflow.com/a/8976303/710951) (which combines those two approaches), SJuan76's option is good when you have dispersed “points”; if you have continuous data (e.g. every coordinate of the matrix has data), it's definitely better to store it ordered in a 2D array. **2)** He doesn't say whether he wants the matrix to be adaptive or fixed, so the correct thing IMHO is to give him both ways. **Don't you think so?** – Alba Mendez Jan 23 '12 at 18:02

7 Answers

5

You could create a class Coordinate that takes X and Y values and properly implements hashCode and equals.

Then create a HashMap<Coordinate, Data> and work with it.
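A minimal sketch of that idea, in case it helps. The class name `Coordinate`, its fields `x` and `y`, and the demo class are only illustrative, and `Integer` stands in for whatever `Data` type you actually store:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable key class; names are illustrative, not from any library.
final class Coordinate {
    final int x;
    final int y;

    Coordinate(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two coordinates are equal when both components match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Coordinate)) return false;
        Coordinate c = (Coordinate) other;
        return x == c.x && y == c.y;
    }

    // Must be consistent with equals: equal coordinates give equal hash codes.
    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

class CoordinateMapDemo {
    public static void main(String[] args) {
        // Integer is an assumption here; use your own value type instead.
        Map<Coordinate, Integer> matrix = new HashMap<>();
        matrix.put(new Coordinate(2, 3), 42);

        // A fresh Coordinate with the same x and y finds the stored value.
        System.out.println(matrix.get(new Coordinate(2, 3)));
        // Cells that were never set can be treated as 0.
        System.out.println(matrix.getOrDefault(new Coordinate(0, 0), 0));
    }
}
```

Keeping the key immutable matters: if `x` or `y` changed after insertion, the entry would sit in the wrong hash bucket and lookups would miss it.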

SJuan76
  • +1 You're proposing a simple, effective and adaptive (you don't need to know the size of the matrix) way to do it. – Alba Mendez Jan 23 '12 at 17:36
  • One thing: why make your own `Coordinate` if you can use [`java.awt.Point`](http://docs.oracle.com/javase/6/docs/api/java/awt/Point.html)? See [my answer](http://stackoverflow.com/a/8976303/710951). – Alba Mendez Jan 23 '12 at 18:13
  • Didn't think of it, but anyway I don't like relying on AWT if I don't need to (possible problems if you launch Java on a system without a GUI; yes, I know they are solvable, but why take the risk?) – SJuan76 Jan 23 '12 at 18:16
  • There isn't any problem. Classes such as `Point` or `Color` are always present, and they can be used only as utilities (to make geometry, store data, ...) even if there's no GUI in your program. The problem comes when you _try to create_ a GUI (i.e. a `Frame`) on a non-graphical system. **I think it's safe to use `Point`.** – Alba Mendez Jan 23 '12 at 18:25
4

Depends on what you need to do. If you know the size of the lists, then an array is definitely ideal, since it gives you constant-time access (read and write) to any position in the array. This is very useful for speed.

Maps are better if you don't know the size and it needs to be able to adapt.

And finally, as I discovered in a previous question, if you have a TON of data and a lot of it will be "0", you might also want to consider using a sparse matrix.

gnomed
2

This answer merges parts of gnomed's answer and SJuan76's answer.

  1. At a quick glance, I'd suggest using two-dimensional arrays such as int[][].
    It's not a huge amount of data (we're talking about ≈500 ints), so it's not a bad idea.

    Advantages: It's the simplest and, from the data-structuring side, the ideal way to go,
    especially if every “slot” of the matrix contains data.

    Drawback: You have to know the size of the matrix before constructing it.
    You can still resize it later by copying it with the java.util.Arrays utilities.

  2. If you want more efficient handling of the data, you can use a single map keyed by points.
    That is, the key of every entry is a java.awt.Point that says where the value is located.

    Advantages: It's more space-efficient than a 2D array,
    especially if part of your matrix doesn't contain data.
    It's also adaptive; you don't need to know any sizes to construct or resize it.

    Drawback: If every “slot” of your matrix contains data,
    you'll lose (a lot of) space and performance. A 2D array is more efficient then.

  3. Want more? If your data is really huge, you can use a sparse matrix.
    See this question for more details.
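To illustrate the resizing mentioned for the 2D array in option 1, here is a rough sketch. The class name `MatrixResize` and the method `resize` are made up for this example; only `Arrays.copyOf` is a real library call. It returns a new array rather than growing the old one in place, since Java arrays have a fixed length:

```java
import java.util.Arrays;

// Hypothetical helper; the class and method names are illustrative.
class MatrixResize {
    // Returns a copy of matrix with the requested dimensions.
    // Cells that did not exist before are 0; cells outside the new size are dropped.
    static int[][] resize(int[][] matrix, int rows, int cols) {
        int[][] result = new int[rows][];
        for (int i = 0; i < rows; i++) {
            result[i] = (i < matrix.length)
                    ? Arrays.copyOf(matrix[i], cols) // pads with 0 or truncates the row
                    : new int[cols];                 // brand-new row, all zeros
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] data = {{1, 2}, {3, 4}};
        int[][] bigger = resize(data, 3, 3);
        System.out.println(Arrays.deepToString(bigger));
    }
}
```

Every resize copies the whole matrix, so if the size changes often, the map-based option 2 is probably the more comfortable choice.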

Alba Mendez
0

I would not discard multidimensional arrays just yet: have you tried them? Are you running into specific limitations? IMHO, as long as your data fits in memory, arrays can be good.

If your data is very sparse though, you may want to look at maps indeed.

Related question btw: Making a very large Java array

Savino Sguera
  • I used them, but I was reading about the limitations of arrays and don't know if they apply to maps as well – tnaser Jan 23 '12 at 17:41
0

You can use multidimensional arrays, or you can try a key/value structure such as HashMap.

nidhin
0

I think multidimensional arrays are the best choice! They should serve your purpose. If your data set contains only integers, int[][] is an ideal choice.

Purushottam
0
  • Well, if your indices are small integers, you can certainly use nested arrays.
  • In a matrix class, you may want to use a plain array, like so: (assuming n is the number of columns)
double get(int i, int j) { return data[i*n + j]; }
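
Spelled out a bit more, such a matrix class could look like the sketch below. The name `FlatMatrix`, the `set` method and the bounds check are additions for illustration; `cols` plays the role of `n` in the one-liner above:

```java
// Hypothetical matrix class backed by one flat array in row-major order.
final class FlatMatrix {
    private final int rows;
    private final int cols;      // "n" in the one-line getter above
    private final double[] data; // element (i, j) lives at index i * cols + j

    FlatMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    double get(int i, int j) {
        return data[index(i, j)];
    }

    void set(int i, int j, double value) {
        data[index(i, j)] = value;
    }

    // Maps a (row, column) pair to its position in the flat array.
    // The explicit check is needed because an out-of-range column would
    // otherwise silently land in a neighbouring row.
    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }
}
```

Compared with nested arrays, this keeps all values in one contiguous block and avoids one array object per row.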
pron