
Language: C++
System: Windows 7 x64
Memory: 8GB RAM

I want to allocate a large one-dimensional array in my 64-bit application, containing 60000*60000 = 3600000000 unsigned short elements.
The purpose is to read a very big picture, 60k*60k pixels, and turn it into a one-dimensional array for further processing. I certainly could split that picture and read the parts separately, but in the production environment I have 128GB and more RAM available.

1. Static array

unsigned short array[3600000000];

This shows the error: size of array 'a' is negative

2. malloc approach:

unsigned long long bytes = 3600000000 * sizeof(unsigned short);  
unsigned short *arr;  
arr = (unsigned short *)malloc(bytes);//almost 6.7GB memory

On my PC with 8GB RAM, the address of arr is 0x0 when debugging at the malloc line.
On another PC with 16GB RAM, the address of arr is valid, but if I assign a value to each item in arr like:

#include <iostream>
#include <string.h>
using namespace std;
int main()
{
    unsigned short *arr;
    arr = (unsigned short *)malloc(3600000000 * sizeof(unsigned short));//memory space can be allocate to arr, about 6.7GB
    if (arr == NULL){
        cout << "failed"<< endl;
    }
    memset(arr, 1, sizeof(arr));
    cout << arr << endl;
}

An interruption occurs at some weird memory location; arr is 0x11103630A52B112.
On my x64 PC with 8GB RAM, it prints "failed".
On another x64 PC with 16GB RAM, the address of arr is valid, but if I assign a value to each item in arr with a for loop, an interruption occurs at some weird memory location; arr is 0x11103630A52B112.
How can I allocate a very, very big one-dimensional array?

1. (static method, 8GB) It shows the error: size of array 'a' is negative
2. (malloc approach, 8GB) Error log with my 8GB machine and the memset statement:

oneDimensionalArray.cpp: In function 'int main()':
oneDimensionalArray.cpp:10:47: warning: unsigned conversion from 'long long int' to 'size_t' {aka 'unsigned int'} changes value from '7200000000' to '2905032704' [-Woverflow]
     arr = (unsigned short *)malloc(3600000000 * sizeof(unsigned short));//memory space can be allocate to arr, about 6.7GB
                                    ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
oneDimensionalArray.cpp:10:35: warning: argument 1 value '2905032704' exceeds maximum object size 2147483647 [-Walloc-size-larger-than=]
     arr = (unsigned short *)malloc(3600000000 * sizeof(unsigned short));//memory space can be allocate to arr, about 6.7GB
                             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\cstdlib:75,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\ext\string_conversions.h:41,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\bits\basic_string.h:6391,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\string:52,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\bits\locale_classes.h:40,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\bits\ios_base.h:41,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\ios:42,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\ostream:38,
                 from folderPath\mingw\installed\lib\gcc\mingw32\8.2.0\include\c++\iostream:39,
                 from oneDimensionalArray.cpp:1:
folderPath\mingw\installed\include\stdlib.h:503:40: note: in a call to allocation function 'void* malloc(size_t)' declared here
 _CRTIMP __cdecl __MINGW_NOTHROW  void *malloc (size_t) __MINGW_ATTRIB_MALLOC;
                                        ^~~~~~
failed

3. (malloc approach, 16GB) An exception was raised: write access violation, arr is 0x11103630A52B112, at the assignment statement arr[i] = 1; in the for loop.

Bueryoung
  • Memory mapping? – Richard Jun 27 '19 at 03:03
  • What's your question? The array doesn't have a big size, it has a negative size, since 3600000000 > INT_MAX and therefore overflows. –  Jun 27 '19 at 03:04
  • `for(long long i=0;i – Borgleader Jun 27 '19 at 03:08
  • `C or C++, Windows 7 x64, 8GB RAM` -- This does not say if your application is 32-bit or 64-bit. This only describes the operating system, not the application running. So is your application 32-bit or 64-bit? If it's 32-bit, all of that RAM you're talking about is going to waste. – PaulMcKenzie Jun 27 '19 at 03:16
  • Also, please choose a language, C or C++ (you are including ``, and that does not exist for `C`). For C++, and assuming your program is 64-bit, `std::vector<unsigned short> arr(3600000000, 1);` will either throw an exception, or just work (but maybe slow on the allocation). No need for all of this checking you're doing now, and no need to write the loop setting all the elements to 1. – PaulMcKenzie Jun 27 '19 at 03:31
  • " at some wired memory location" is likely " at some weird memory location" – chux - Reinstate Monica Jun 27 '19 at 03:33
  • @dyukha: 3600000000 does not have type `int`. It has type `long` or `long long` depending on whether it fits in `long` on this C implementation, which looks like it's a buggy one (probably MSVC). – R.. GitHub STOP HELPING ICE Jun 27 '19 at 03:39
  • @R.., is it starting from C++11? https://en.cppreference.com/w/cpp/language/integer_literal I definitely remember how writing such literals overflows even when the variable has big enough type. But it was before C++11. –  Jun 27 '19 at 03:46
  • @dyukha: This goes at least all the way back to C99 and I think even to C89, but there was no `long long` in C89 so a large literal was still a problem then. Not sure about corresponding C++ versions. – R.. GitHub STOP HELPING ICE Jun 27 '19 at 03:51
  • @PaulMcKenzie How do I determine whether my application is 32-bit or 64-bit? It is supposed to be a 64-bit application. C++ preferred. And does your ```std::vector``` work? From my perspective, I need to fill my container with pixel data later, so I need some way to operate on each element in this array. – Bueryoung Jun 27 '19 at 06:11
  • @Bueryoung Going by the post and the pointer value, it seems to be 64-bit app. But this is something (determining the bit-ness of your application) you **must** be able to determine. Otherwise you're developing an application and not knowing the architecture, and that would be considered unacceptable in a professional (even a non-professional) environment. – PaulMcKenzie Jun 27 '19 at 06:15
  • @PaulMcKenzie ok, 64-bit application. Could we move back to make my 3600000000 array work? – Bueryoung Jun 27 '19 at 06:35
  • [See this](http://www.ideone.com/u7E9R9). There is no issue with using a `vector` of that size. Why not do that, instead of resorting to `C` and `malloc`? All of these issues you had that have been pointed out by the answers given are solved by using C++ (std::vector). No compiler errors, no need for loops to initialize the elements, etc. As I stated before, either `std::vector` will throw an exception, or it will just work correctly. Your list of "methods" in your post totally missed the obvious one -- `std::vector`. – PaulMcKenzie Jun 27 '19 at 06:47
  • @PaulMcKenzie What does this website stand for? I could not open it, or it links to another website full of game advertisements. Could you please provide the contents here? Thanks. – Bueryoung Jun 27 '19 at 07:17
  • @PaulMcKenzie I tried `std::vector` in my VSCode, but it still shows an error. ```vector nVec; for(size_t i = 0; i < 3600000000; ++i){ nVec.push_back(i); } ``` error log: `workspace\oneDimensionalArray terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.` Is it because 3600000000 is beyond size_t? – Bueryoung Jun 27 '19 at 10:46
  • @Bueryoung This is why you should be aware of the bitness of your application. I took the code from the link and made sure the compilation was for 64-bit within Visual Studio. There were no issues (the debug build took a while, but release was ok). You were probably building a 32-bit program, not 64-bit. – PaulMcKenzie Jun 27 '19 at 12:02
  • Also, there is no need for the loop and push_back. Just a single declaration: `std::vector arr(3600000000, 1);` Second, what is `arr.max_size();`? – PaulMcKenzie Jun 27 '19 at 12:20

2 Answers


You have the line:

unsigned long long bytes = 3600000000 * sizeof(unsigned short);

This looks correct for the amount of memory to allocate but you also use the same value to iterate over the array:

for(long long i = 0; i<bytes;i++){
    arr[i] = 1;
}

This is wrong, there are only 3600000000 elements in the array, not 3600000000 * sizeof(unsigned short). You are writing past the end of the allocated memory.

Blastfurnace
  • N.B: This would not have happened with a std::vector – Borgleader Jun 27 '19 at 03:10
  • @Blastfurnace How about my memset statement to assignment value to this array. Is that possible to make that length of 3600000000 array work? – Bueryoung Jun 27 '19 at 06:31
  • @Bueryoung 1) It's not a good idea to change the question. Removing the original code kind of invalidates this answer. 2) No, `memset(arr, 1, sizeof(arr));` is not the solution. The `memset()` function writes `byte` values to memory and `unsigned short` is most likely two-bytes each. In addition `sizeof(arr)` is the size of the **pointer** variable, not the number of elements allocated or size of the allocated memory block. – Blastfurnace Jun 27 '19 at 06:50
  • @Blastfurnace Yes, you are right, sizeof(arr) equals 4, and I need to allocate enough space to contain the elements, e.g. ten unsigned short numbers need 10*sizeof(unsigned short) bytes of memory, is this right? `unsigned short *arr; arr = (unsigned short *)malloc(10 * sizeof(unsigned short)); if (arr == NULL){ cout << "failed"<< endl; } cout << "sizeof(arr): " << sizeof(arr) << endl; memset(arr, 0, sizeof(unsigned short)*1); cout <<"arr[0]: " <<*arr << endl; ` Sorry for the format of the code; I could not get a new line with two spaces surrounded by ' ' – Bueryoung Jun 27 '19 at 07:59

In addition to @Blastfurnace's good observation about using too large an index:

Potential overflow in "other pc with 16GB RAM, the address of arr is valid"

If size_t is 32-bit, malloc(bytes) converts bytes, which is more than SIZE_MAX, to a much smaller value than 3600000000 * sizeof(unsigned short). A valid pointer is returned, but not for the desired size.

Instead of unsigned long long bytes, use size_t bytes, enable all warnings and guard that the math 3600000000 * sizeof(unsigned short) does not overflow.

#if 3600000000 > SIZE_MAX/sizeof(unsigned short)
#error overflow
#endif
size_t bytes = 3600000000 * sizeof(unsigned short);

read a very big picture which is 60k*60k pixels

To solve OP's higher-level problem, do not allocate 3600000000 * sizeof(unsigned short) bytes for one 60k*60k unsigned short array.

Instead, make 60k allocations, one per row of 60k unsigned short.

(OP is using C++ but appears to want to use malloc(); new would be more C++-like.)

size_t rows = 60000;
size_t cols = 60000;
unsigned short initial_value = 1;
unsigned short **pix = (unsigned short **) malloc(sizeof *pix * rows);
assert(pix);

for (size_t r = 0; r < rows; r++) {
  pix[r] = (unsigned short *) malloc(sizeof *pix[r] * cols);
  assert(pix[r]);
  for (size_t c = 0; c < cols; c++) {
    pix[r][c] = initial_value;
  }
}
chux - Reinstate Monica
  • yes, i got 2905032704 from my program in that index of arr. – Bueryoung Jun 27 '19 at 03:51
  • @Bueryoung The index of `arr` found is not the salient issue. Check the size of `size_t` (32 or 64 bit) and use `for(size_t i = 0; i<3600000000;i++){ arr[i] = 1; }` – chux - Reinstate Monica Jun 27 '19 at 03:59
  • I changed the approach to ```memset```, but it still does not work. It seems it is not due to the index; how can I fix that? – Bueryoung Jun 27 '19 at 06:43
  • @chux -- Honestly, your method is not a good one to allocate for a 2D matrix. You're allocating each row separately, [when you could/should do something like this](https://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048) – PaulMcKenzie Jun 27 '19 at 12:22
  • @PaulMcKenzie Your referenced code has `nrows*ncols`, and it is that product that 1) may overflow or 2) may be too large for a single `new` of `short`. These problems are implied in OP's post. Allocating individual rows is _good_ to prevent hitting that limit. Yes, for small 2D arrays, a single allocation is faster. – chux - Reinstate Monica Jun 27 '19 at 12:26