
I have written an external-sorting program based on the book Programming Pearls. Its biggest array is char all_nums[10,000,000];, which needs about 10 MB of stack memory (not really big). But the program does not run correctly (I compiled it with clang 3.5 and gcc 4.8 and ran it on Ubuntu 14.04): it fails with a segmentation fault (core dumped) error. When I decrease the array size to char all_nums[1,000,000];, it runs fine.

The whole code is here: https://gist.github.com/xuefu/9aecc7f2b8ae3ab0ce55.

  1. Can the OS limit a running process's memory? Are the limits the same for stack memory and heap memory?
  2. How can I get rid of this memory limit?
  3. How can I store bits in C, like the bitset class in C++?

The main sort code is the function disk_sort():

void disk_sort()
{
  char all_nums[MAX_SCOPE]; 
  char buf[MAX_BUF];
  char *ch;
  FILE *fp;
  int n, j;

  fp = fopen(FILE_NAME, "r");

  for(n = 0; n < MAX_SCOPE-1; ++n)
  {
    all_nums[n] = '0';
  }

  all_nums[MAX_SCOPE-1] = '\0';

  while(fgets(buf, MAX_BUF, fp) != NULL)
  {
    sscanf(buf, "%d\n", &n);
    all_nums[n]++;
  }

  fclose(fp);
  fp = fopen(FILE_RESULT, "a");

  n = 0;
  while(all_nums[n] != '\0')
  {
    if(all_nums[n] != '0')
    {
      ch = itostr(n, &j);
      ch[j++] = '\n';
      ch[j] = '\0';
      for(int i = 0; i < all_nums[n] - '0'; ++i)
      {
        fwrite(ch, sizeof(char), j, fp);
      }
      free(ch);
      ch = NULL;
    }
    ++n;
  }
  fclose(fp);
}
phuclv
xuefu
  • A stackoverflow question? Funny. I expect you to use google though: "ubuntu process stacksize" will do. – Peter - Reinstate Monica Jul 11 '14 at 15:27
  • Try manually setting your stack and heap sizes when you compile `gcc -Wl,--stack=xxxxx -Wl,--heap=yyyyy ...` When you allocate all_nums you may be exceeding the max size. – Elias Jul 11 '14 at 15:28
  • @Elias Huh? No heap. And no exceeding heap size with 10^7, except on my watch or toaster. – Peter - Reinstate Monica Jul 11 '14 at 15:29
  • “`10M` stack memory (not really big)” er… yes, this is big. 8MB is a common default limit for the stack size (additionally to what Peter Schneider said, you may want to search for the bash built-in `ulimit`). – mafso Jul 11 '14 at 15:29
  • @mafso 8k used to be common. – Peter - Reinstate Monica Jul 11 '14 at 15:31
  • @Elias Ok, thanks. And How to store the bits in C , like the class bitset in C++ ? – xuefu Jul 11 '14 at 15:39
  • @PeterSchneider: Sorry for the spoiler :) And for the 8k… I assumed a machine running Ubuntu 14.04, for which 64-bit systems with 8MB default stack size probably aren't uncommon. But of course, “common” was an oversimplification… xuefu: This sounds unrelated to your original question. – mafso Jul 11 '14 at 15:42
  • Google is your friend, I am sure you can find lots of smart people who have already implemented a version of bitset in C. – Elias Jul 11 '14 at 15:45
  • @mafso 8k was the megamax compiler on an Atari ST iirc. – Peter - Reinstate Monica Jul 11 '14 at 15:48
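
On the bitset question raised in the comments above: C has no built-in equivalent of std::bitset, but an array of unsigned words plus shifts and masks does the same job. The following is only a minimal sketch; the names bitset, bitset_init, bitset_set, bitset_clear, bitset_test and bitset_free are hypothetical and not part of any standard library. Note that one bit per number only records whether a number occurred, not how often, so it matches the question's counting code only if the input contains no duplicates.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Number of bits in one storage word. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical heap-backed bit set; not a standard C type. */
typedef struct {
    unsigned long *words;
    size_t nbits;
} bitset;

/* Allocate a zeroed bit set with room for nbits bits.
   Returns 0 on success, -1 if the allocation fails. */
int bitset_init(bitset *b, size_t nbits)
{
    size_t nwords = (nbits + WORD_BITS - 1) / WORD_BITS;
    b->words = calloc(nwords ? nwords : 1, sizeof *b->words);
    b->nbits = b->words ? nbits : 0;
    return b->words ? 0 : -1;
}

/* Set bit i to 1. The caller must pass i < nbits. */
void bitset_set(bitset *b, size_t i)
{
    assert(i < b->nbits);
    b->words[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* Reset bit i to 0. The caller must pass i < nbits. */
void bitset_clear(bitset *b, size_t i)
{
    assert(i < b->nbits);
    b->words[i / WORD_BITS] &= ~(1UL << (i % WORD_BITS));
}

/* Return 1 if bit i is set, 0 otherwise. */
int bitset_test(const bitset *b, size_t i)
{
    assert(i < b->nbits);
    return (int)((b->words[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}

/* Release the storage. */
void bitset_free(bitset *b)
{
    free(b->words);
    b->words = NULL;
    b->nbits = 0;
}
```

For 10,000,000 numbers this needs roughly 1.25 MB instead of 10 MB, and because it lives on the heap it does not count against the stack limit at all.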

2 Answers

3

Using the ulimit command, you can see the maximum stack size that is currently set.

/sujith>ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 192070
max locked memory       (kbytes, -l) 1024000
max memory size         (kbytes, -m) 20907448
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 1024
cpu time               (seconds, -t) unlimited
max user processes              (-u) 192070
virtual memory          (kbytes, -v) 48826320
file locks                      (-x) unlimited

However, you are not allowed to create a stack variable larger than that value.
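
The same limit can also be read from inside a C program with the POSIX getrlimit call. This is a rough sketch only: the helper names get_stack_soft_limit and fits_under_stack_limit and the headroom parameter are made up for illustration, and the check is deliberately crude because it ignores how much stack the program is already using.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Store the soft stack limit (in bytes) in *soft_out.
   RLIM_INFINITY means "unlimited". Returns 0 on success, -1 on error. */
int get_stack_soft_limit(rlim_t *soft_out)
{
    struct rlimit rl;
    assert(soft_out != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft_out = rl.rlim_cur;
    return 0;
}

/* Hypothetical helper: returns 1 if `bytes` plus `headroom` bytes stay
   within the soft stack limit, 0 if not, -1 if the limit cannot be read.
   `headroom` is an assumed safety margin for the rest of the call stack. */
int fits_under_stack_limit(size_t bytes, size_t headroom)
{
    rlim_t soft;
    if (get_stack_soft_limit(&soft) != 0)
        return -1;
    if (soft == RLIM_INFINITY)
        return 1;
    if ((rlim_t)bytes > soft)
        return 0;
    return (rlim_t)headroom <= soft - (rlim_t)bytes;
}
```

With an 8 MB soft limit, fits_under_stack_limit(10000000, 65536) would return 0, which is consistent with the crash described in the question.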

The fix:

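You can increase the stack size with one of the following commands (the size is given in kilobytes, and the change applies to the current shell and the programs started from it).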

ulimit -s <size>
ulimit -s unlimited

See also: Increase stack size in Linux with setrlimit
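
As a sketch of the setrlimit approach mentioned in that title, a program can try to raise its own soft stack limit up to the hard limit. The function name raise_stack_limit is hypothetical. Whether a raised limit takes effect for the already running main thread is platform dependent; on Linux it usually does, but treat that as an assumption to verify, and raise the limit before calling the function that declares the large array.

```c
#include <assert.h>
#include <sys/resource.h>

/* Try to raise the soft stack limit to at least `wanted` bytes.
   Returns 0 if the limit is already large enough or was raised,
   -1 if the hard limit is too low or a system call fails. */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;

    assert(wanted > 0);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* Nothing to do if the current soft limit already suffices. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process cannot exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        return -1;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A call such as raise_stack_limit(64UL * 1024 * 1024) at the top of main, before disk_sort() is entered, would be one way to use it.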

Community
Sujith Gunawardhane
  • This is great for identifying the problem, but it doesn't solve it. – Elias Jul 11 '14 at 15:29
  • Resolution added Elias. Thanks. – Sujith Gunawardhane Jul 11 '14 at 15:40
  • Even if the stack is unlimited in size, the OS may still kill the program for a large stack allocation. The problem is that when you grow the stack, the program only changes its stack pointer (a register) and does not notify the OS that the stack has grown. When the memory is accessed, the program will fault as there is no memory mapped to the addresses and the OS will determine (in this case) whether the stack should grow to include the faulting address or instead kill the program (i.e., seg fault). So touch large allocations in the direction of stack growth. – Brian Jul 11 '14 at 16:24
2

This could be a very long answer, so I will try to keep it short and give links to more in-depth material.

In short: You do not have enough stack memory to allocate an array that big.
To store something that large you should probably be allocating it on the heap using something along the lines of

char *all_nums = new char[MAX_SCOPE];

or, better, with a smart pointer that frees the array automatically (in C, use malloc and free instead)

std::unique_ptr<char[]> all_nums(new char[MAX_SCOPE]);

To adjust the stack size, take a look at this question: "Change stack size for a C++ application in Linux during compilation with GNU compiler". It offers several ways to accomplish the task, depending on your tool chain.

In long:

The operating system can limit the memory usage of a process, and administrators can also configure limits for users (in both cases see ulimit for *nix). Overall, each process is limited by how much memory it can address, which is related to how the operating system handles virtual memory (see http://en.wikipedia.org/wiki/Virtual_address_space). This is not the specific issue here.

Looking at your code example and questions, you first have to differentiate between stack and heap memory. A longer explanation is available here: http://www.learncpp.com/cpp-tutorial/79-the-stack-and-the-heap/ In general, the heap is used to store larger objects that may not fit on the stack. Heap memory is not managed for you (in C and C++): you allocate it with new or malloc (or others) and must release it yourself with delete or free. Stack memory is used for shorter-term local storage (the array in your example is on the stack). It is more limited in size, is allocated when you declare a variable, and is deallocated automatically at the end of the scope.
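
Since the question's code is C rather than C++, here is a minimal sketch of the same idea with calloc. MAX_SCOPE is assumed to be 10,000,000 as in the question; the helpers alloc_counts and count_value are hypothetical names, and the range check and saturating counter are additions that the original disk_sort() does not have.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAX_SCOPE 10000000 /* assumed to match the question's array size */

/* Allocate n zeroed counters on the heap instead of the stack.
   Returns NULL if the allocation fails. */
unsigned char *alloc_counts(size_t n)
{
    return calloc(n, sizeof(unsigned char));
}

/* Count one occurrence of `value`.
   Returns 0 on success, -1 if value is outside [0, n). */
int count_value(unsigned char *counts, size_t n, long value)
{
    assert(counts != NULL);
    if (value < 0 || (size_t)value >= n)
        return -1;
    if (counts[value] < UCHAR_MAX) /* saturate instead of wrapping around */
        counts[value]++;
    return 0;
}
```

In disk_sort() this would replace the local array: call alloc_counts(MAX_SCOPE) once, check the result for NULL, use count_value() inside the read loop, and call free() on the pointer after the results have been written.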

Hope this helps!

Community
J.W.