I have a data set saved as a text file that basically contains a vectors stored line by line. My vector is 10k in dimensions and I have 250 such vectors. Each vector entry is a double. Here's an example:
Vector 1 -> 0.0 0.0 0.0 0.439367 0.0 .....10k such entries
Vector 2 -> 0.0 0.0 0.0 0.439367 0.0 0.0 0.0 0.0 .....10k such entries
...
...
Vector 250 -> 0.0 1.203973 0.0 0.0 0.0 .....10k such entries
Now if I do the math, this should take up 10k X 16bytes X 250 space (assuming each vector entry is a double taking up 16bytes of space) which is ~40MB of space. However I see that the file size is shown as 9.8MB only. Am I going wrong somewhere?
The thing is I am using this data in my Java code. The space complexity of my algorithm is O(no of entries in the vector X no of entries). Even when I run my code by allocating like 4GB of memory, I still run out of heap space. What am I missing?
Thanks. Andy