
I have created a C program that does everything in a single process. For example, it sequentially reads file after file and outputs something. I had to use a HUGE array called vectors, so I declared it static (as a local it was giving me a segfault): static double vectors[100000][10000].

Now I need to create the same output of the previous program using multiple concurrent processes.

What I have so far:

pid_t pids[argc - 1];
pid_t pid;
for (int e = 1; e < argc; e++)
{
     pid = fork();
     if (pid < 0)
     {
          //error
     }
     else if (pid > 0)
     {
          pids[e-1] = pid;
     }
     else
     {
          printf("The child process of %d is started\n", pids[e-1]);
          printf("The child process of %d is finished\n", pids[e-1]);
     }
}

for (int i = 0; i < argc - 1; i++)
{
     int status;
     waitpid(pids[i], &status, 0);
     printf("Process %d is finished\n", pids[i]);
}

Now I'm just trying to see whether the outputs of the child processes interleave, which would mean they are running concurrently.

So far I'm getting a "Killed" message when I run the above, but once I comment out the static vectors array, it runs fine. Why is that?

Also, the output when it does run is really weird: the pids elements all print as 0.

Any help would be greatly appreciated. Thank you.

M. Averbach
  • Since you call fork, the children and the parent have separate memory. There are now several static double vectors arrays (one per process). You won't get shared memory that way. You should use threads instead, because they share the same memory. – Pierre Emmanuel Lallemant Feb 04 '16 at 22:31
  • I have to use processes, not threads. – M. Averbach Feb 04 '16 at 22:33
  • http://stackoverflow.com/questions/13274786/how-to-share-memory-between-process-fork . You have to use shm functions to share memory. – Pierre Emmanuel Lallemant Feb 04 '16 at 22:34
  • What is argc when you tried running it? The way you have it, if argc is 5, then it'll fork a child. Then *both the parent and the child* will fork again. Then *all of those 4* will fork again. Then *all of those 8* will fork again (see the sketch after these comments). – user253751 Feb 04 '16 at 22:57
  • argc is 4 in my case. I need one child process per file. – M. Averbach Feb 04 '16 at 23:07
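A minimal sketch of the corrected loop (an editor's illustration, assuming each child should only announce itself and exit): the posted code has two related problems. The child never leaves its else branch, so it falls out of the if/else and forks again on the next loop iteration; and it prints pids[e-1], which was never assigned in the child's copy of the array, because the parent stores the PID only after fork() returns, and only in the parent. That is why the elements print as 0. Printing getpid() in the child and calling _exit() addresses both:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
     pid_t pids[argc - 1];

     for (int e = 1; e < argc; e++)
     {
          pid_t pid = fork();
          if (pid < 0)
          {
               perror("fork");
               exit(EXIT_FAILURE);
          }
          else if (pid > 0)
          {
               pids[e-1] = pid;   /* only the parent sees this assignment */
          }
          else
          {
               /* child: report its own PID and exit, so it does not
                  fall through and fork again on the next iteration */
               printf("The child process %d is started\n", (int)getpid());
               printf("The child process %d is finished\n", (int)getpid());
               _exit(EXIT_SUCCESS);
          }
     }

     for (int i = 0; i < argc - 1; i++)
     {
          int status;
          waitpid(pids[i], &status, 0);
          printf("Process %d is finished\n", (int)pids[i]);
     }
     return 0;
}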

1 Answer


Your process gets killed by the OOM killer (out of memory).

static double vectors[100000][10000] needs about 100000*10000*8 bytes of memory, which comes to about 8GB. This memory is not physically allocated until something is written to it (memory overcommitment). If you fork() n times and write to these pages in each process, the memory needed is about n*8GB, which, I assume, quickly exceeds your physical memory plus swap. dmesg should show you a message regarding this.
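To make that last point concrete, here is a hedged illustration (the touch_all_pages helper is hypothetical, not part of the answer): after fork() the array pages are shared copy-on-write, and the first write to a page forces a private copy in the writing process, so if every child writes one byte per page, total demand grows toward n*8GB:

#include <stddef.h>
#include <unistd.h>

static double vectors[100000][10000];   /* ~8 GB of virtual memory */

/* Writing one byte per page makes the kernel back the pages with
   real memory; after fork(), each process that does this gets its
   own private copies of the pages it writes to. */
static void touch_all_pages(void)
{
     char *p = (char *)vectors;
     long page = sysconf(_SC_PAGESIZE);
     for (size_t off = 0; off < sizeof vectors; off += (size_t)page)
          p[off] = 1;
}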

A solution is to create a shared mapping with mmap() before fork()ing and make all processes work on the same array (if that is what you need):

double *vectors = mmap(NULL, 10000*100000*sizeof(double), 
                       PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
if (vectors == MAP_FAILED) { /* handle error */ }

and instead of

vectors[a][b]

access

vectors[a*10000+b]
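To see the whole pattern in one place, here is a minimal self-contained sketch (the ROWS/COLS macros and the per-child row striping are illustrative assumptions, not from the answer). Because the mapping is created before fork() and is MAP_SHARED, all children write into the same physical pages, so the total stays at one ~8GB array instead of one copy per process. Note it still has to fit in RAM plus swap once fully written:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define ROWS 100000
#define COLS 10000

int main(void)
{
     size_t len = (size_t)ROWS * COLS * sizeof(double);

     /* one shared anonymous mapping, created before any fork() */
     double *vectors = mmap(NULL, len, PROT_READ|PROT_WRITE,
                            MAP_SHARED|MAP_ANONYMOUS, -1, 0);
     if (vectors == MAP_FAILED)
     {
          perror("mmap");
          return EXIT_FAILURE;
     }

     for (int e = 0; e < 4; e++)
     {
          pid_t pid = fork();
          if (pid == 0)
          {
               /* each child fills a disjoint stripe of rows; the writes
                  land in the single shared mapping, not in private copies */
               for (size_t a = e; a < ROWS; a += 4)
                    for (size_t b = 0; b < COLS; b++)
                         vectors[a*COLS + b] = (double)e;
               _exit(EXIT_SUCCESS);
          }
     }

     while (wait(NULL) > 0)
          ;   /* wait for all children */

     printf("vectors[0] = %f\n", vectors[0]);
     munmap(vectors, len);
     return 0;
}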
Ctx
  • @immibis Yes, but the OP stated in a comment that he has to use processes, not threads... whatever... – Ctx Feb 04 '16 at 22:59
  • @Ctx, thanks. Yes, I need to use processes, not threads. I basically need a separate process per file, and the output needs to match the output of my original program. – M. Averbach Feb 04 '16 at 23:10
  • @Ctx Since you said the memory doesn't get allocated until something is written to the array, why am I getting the out-of-memory kill? Right now I'm not writing anything to the array; it's simply declared and that's it. – M. Averbach Feb 04 '16 at 23:30
  • @M.Averbach It depends a bit on your overcommit setting. You can try setting `/proc/sys/vm/overcommit_memory` to 1; then the OOM killer should only kick in when you really write to the array. From your description it sounds like your system uses heuristic overcommit handling (setting 0), which allows overcommitting only up to a reasonable point. The fork()s exceed this point, resulting in the killing of processes. – Ctx Feb 04 '16 at 23:35
  • @Ctx honestly, it must be my computer. I just tried running it via my school server, and it ran fine. Thanks for your help. – M. Averbach Feb 04 '16 at 23:39
  • @Ctx can I ask you one more question, please? Instead of using static, I malloc'd vectors instead, so it is still 8GB. But now the "Killed" went away. Why is that? I have 4GB of RAM on my machine. – M. Averbach Feb 05 '16 at 13:50
  • @M.Averbach These are different types of mappings which are handled differently by the kernel with regard to overcommitment. The behaviour should be identical when setting /proc/sys/vm/overcommit_memory to 1 (then with both methods your processes should _not_ be killed until the memory is really used for writing). – Ctx Feb 05 '16 at 14:15