
I can't open one specific file, gist_base.fvecs (I got it from here; it is inside gist.tar.gz). I can, however, open the query file, for example, and read it properly. What's wrong? Maybe the file is too big for me? But I thought that in that case I could still open it and would then get a bad_alloc or something similar.

Here is what happens:

samaras@samaras-A15:~/parallel/rkd_forest/code$ ./rkd_sam 
I/O error : Unable to open the file ../Datasets/gist/gist_base.fvecs
samaras@samaras-A15:~/parallel/rkd_forest/code$ cd ../Datasets/gist/
samaras@samaras-A15:~/parallel/rkd_forest/Datasets/gist$ ls
gist_base.fvecs  gist_groundtruth.ivecs  gist_learn.fvecs  gist_query.fvecs

Here is my code (should be OK):

FILE* fid;
fid = fopen(filename, "rb");
if (!fid)
  printf("I/O error : Unable to open the file %s\n", filename);

Here are the permissions of the file: [screenshot not preserved]

Its size is 3.8 GB (3,844,000,000 bytes), and I know that this dataset is too big for this computer.

As a result I moved to another machine, but I am getting the very same problem.

Here is the memory on that machine (it is 64-bit, while my PC is 32-bit):

gsamaras@geomcomp:~/Desktop/code$ free -mt
             total       used       free     shared    buffers     cached
Mem:          3949       3842        106          0        173       3186
-/+ buffers/cache:        483       3466
Swap:        10867         59      10808
Total:       14816       3901      10914

std::cerr << "Error: " << strerror(errno) << std::endl;

gave

Error: Value too large for defined data type


printf("|%s|\n", filename);

gave

|../Datasets/gist/gist_base.fvecs|

The value is taken from the command line, and in the code I am doing this:

readDivisionSpacefvecs<FT>(test, N, D, argv[8]); // in main()

and then

void readDivisionSpacefvecs(Division_Euclidean_space<T>& ds, int& N, int& D, char* filename) {
  FILE* fid;
  fid = fopen(filename, "rb");
  printf("|%s|\n", filename);
  if (!fid) {
    printf("I/O error : Unable to open the file %s\n", filename);
    std::cerr << "Error: " << strerror(errno) << std::endl;
  }
  ...
}

I also tried to move the folder which contains the dataset, but I got the same result!

gsamaras

2 Answers


The error you get is EOVERFLOW, which, according to the open manual page, means:

pathname refers to a regular file that is too large to be opened. The usual scenario here is that an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 tried to open a file whose size exceeds (1<<31)-1 bytes; see also O_LARGEFILE above. This is the error specified by POSIX.1-2001; in kernels before 2.6.24, Linux gave the error EFBIG for this case.

What that means is that your program was built for a 32-bit platform and is trying to open a file that is simply too big to be handled without special considerations.

Either recompile your program with -D_FILE_OFFSET_BITS=64 or use the open call directly with the O_LARGEFILE flag.
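As a rough illustration of the first option, here is a minimal sketch that defines `_FILE_OFFSET_BITS` in the source instead of on the command line. It assumes glibc on Linux and a compiler such as g++ that exposes the POSIX functions `fseeko` and `ftello`; the helper name `file_size_bytes` is made up for this example and is not part of the asker's code.

```cpp
// Must come before any system header so that glibc selects a 64-bit off_t
// and maps fopen/fseeko/ftello to their large-file variants.
// Intended to have the same effect as passing -D_FILE_OFFSET_BITS=64.
#define _FILE_OFFSET_BITS 64

#include <sys/types.h>  // off_t
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Stop the build early if large-file support did not take effect.
static_assert(sizeof(off_t) >= 8, "off_t is not 64-bit; large files will fail");

// Hypothetical helper: returns the size of the file in bytes, or -1 on error.
long long file_size_bytes(const char* path) {
  assert(path != nullptr);
  FILE* fid = std::fopen(path, "rb");
  if (!fid) {
    // Report the reason as well, e.g. "Value too large for defined data type".
    std::fprintf(stderr, "I/O error: unable to open %s: %s\n",
                 path, std::strerror(errno));
    return -1;
  }
  long long size = -1;
  // fseeko/ftello use off_t, so they can address offsets beyond 2 GiB.
  if (fseeko(fid, 0, SEEK_END) == 0) {
    size = static_cast<long long>(ftello(fid));
  }
  std::fclose(fid);
  return size;
}
```

Calling `file_size_bytes("../Datasets/gist/gist_base.fvecs")` from a 32-bit build should then open the file instead of failing with EOVERFLOW. Note that plain `fseek` and `ftell` use `long`, which stays 32-bit on such a platform, so the `off_t` variants are the safer choice once the file is open.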

Some programmer dude
  • It worked, but unfortunately it did not serve my purpose. http://stackoverflow.com/questions/29447736/same-memory-limitation-to-a-bigger-machine – gsamaras Apr 04 '15 at 14:45

It seems that the program tries to open the file in the current directory (`code`), while the file is in another one (`Datasets/gist`). You did not provide the value of the `filename` variable, but it should contain the full path to the file for this to work.

You can try

cd ../Datasets/gist/
../../code/rkd_sam

This should work if filename contains only the basename without any path.

The size of the file does not matter at all as far as fopen() is concerned.

Mike Bessonov
  • Tried that, same result. – gsamaras Apr 04 '15 at 13:47
  • Please provide the code setting the `filename` and print the value of this variable before `fopen()`. – Mike Bessonov Apr 04 '15 at 13:50
  • Edited Mike, see my post. – gsamaras Apr 04 '15 at 13:52
  • Ok, the code assumes that at the moment of `fopen()` call the current working directory is `code`, which is probably not the case for some reason (e.g. a call to `chdir()` somewhere). As a quick fix, you can try specifying the absolute path, like `filename="/home/gsamaras/Desktop/Datasets/gist/gist_base.fvecs";`. This should work; you can find out what goes on with relative paths later. – Mike Bessonov Apr 04 '15 at 14:09