
I have a script that at some point forks several processes to do a task, and the main process waits for all the children to complete.
So far, all OK.
Problem: I am interested in getting the maximum time that each child process spent working on what it had to do.
What I do now is look at the logs, where I print the time spent on each action the child process performed, and try to work out the times more or less.
I know that the only way to get something back from a child process is via some sort of shared memory, but I was wondering whether there is a "ready"/easy solution for this specific problem.
I mean a way to get the times back so that the parent process can print them nicely in one place.
I thought there could be a better way than checking all over the logs...

Update based on comments:
I am not interested in the run times of the child processes, i.e. which child took the most time to finish. Each child process works on X tasks. Each task takes at most Y secs (worst case) to finish. I am looking for Y, i.e. the longest time it took a child process to finish one of its X tasks.

Jim
  • technically the child will execute from the time you call fork() to whenever you wait() on its exit. you could just keep track of those two times. – Marc B Dec 16 '14 at 18:59
  • @MarcB: It is not that I want to know which child was faster. Each child goes over a series of tasks, and I would like to know which task took the longest to finish. Or, to put it differently, what was the max time to finish a task. Right now I can see that in the logs (well, sort of... all over the place) – Jim Dec 16 '14 at 19:01
  • exactly. so for every child you fork, record when you forked it, record when you wait() reaped it, and the difference will be how long the child executed for. whichever child has the largest difference is the one that executed longest. – Marc B Dec 16 '14 at 19:01
  • @MarcB: Maybe I am not explaining this properly. Each child goes over X tasks in a loop. A task may finish in 0.5 secs or in Y secs (worst case). I would like to know that worst case of Y secs for the X tasks, i.e. what Y is – Jim Dec 16 '14 at 19:05
  • gotcha. you could use shared memory, sockets, etc... to communicate parent<->child. plenty of methods for IPC. you just have to choose one. – Marc B Dec 16 '14 at 19:06
  • @MarcB: So that I know exactly the worst-case processing time of a single task across the X tasks – Jim Dec 16 '14 at 19:06
  • @MarcB: Yes, I figured that I would need something like that. I was wondering if such an API was available, since I was hoping this might not be a rare case – Jim Dec 16 '14 at 19:07
  • SysV message queues (`msgctl` and such) – ikegami Dec 16 '14 at 19:08
  • @ikegami: I'll look into that. I am looking for the easiest non-intrusive approach, as I can't afford a solution that needs a lot of time to code and test – Jim Dec 16 '14 at 19:12
  • Don’t rely on child communication. Simply have the master parent keep track of when each child starts and stops. This will only give you wall time, of course, not actual cpu time. – tchrist Dec 16 '14 at 19:17
  • @tchrist: But I don't want the max time among the children per se. I want the max time among the X tasks processed by a child. – Jim Dec 16 '14 at 19:21
  • @Jim Please edit your question to reflect that. Right now it seems like you're asking for the total run time of each child process. – ThisSuitIsBlackNot Dec 16 '14 at 19:51
  • @ThisSuitIsBlackNot:Updated OP – Jim Dec 16 '14 at 22:15
  • Have a look here... http://stackoverflow.com/questions/13274786/how-to-share-memory-between-process-fork – Mark Setchell Dec 16 '14 at 22:35

1 Answer


The biggest limitation of fork() is that it doesn't do IPC as easily as threads do. Aside from trapping when a process starts and exits, what you're trying to do is covered by a whole section of the Perl documentation (perlipc).

What I would suggest is creating a pipe and connecting it to the children.

Something like this (not tested yet, I'm on a Windows box!):

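use strict;
use warnings;

use Parallel::ForkManager;

# Run at most 5 children at a time.
my $manager = Parallel::ForkManager->new(5);

# Create the pipe before forking so every child inherits the write end.
pipe( my $read_handle, my $write_handle ) or die "pipe failed: $!";

for ( 1 .. 10 ) {
    $manager->start and next;    # parent: go on to start the next child

    # Child: it only writes, so close the read end.
    close($read_handle);
    print {$write_handle} "$$ - child says hello!\n";
    $manager->finish;            # exits the child
}

# Parent: close its copy of the write end, or the read loop never sees EOF.
close($write_handle);

# Reads until every child has closed its write end (i.e. has exited).
while (<$read_handle>) { print; }

$manager->wait_all_children();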
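
Applied to the question as asked, the same pipe could carry one number per child: the duration of its slowest task. What follows is only a sketch under a few assumptions: a POSIX system, plain fork() instead of Parallel::ForkManager so that only core modules are needed, and a hypothetical `do_task` standing in for the real work. Each child sends a single short line, which matters because POSIX only guarantees that pipe writes of at most PIPE_BUF bytes (at least 512) are not interleaved with other writers' data.

```perl
use strict;
use warnings;

use List::Util qw(max);
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for one unit of real work.
sub do_task { sleep( 0.01 * $_[0] ) }

pipe( my $read_handle, my $write_handle ) or die "pipe failed: $!";

my @pids;
for my $child ( 1 .. 4 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid) { push @pids, $pid; next; }    # parent: remember the child

    # Child: time every task, keep only the slowest one.
    close($read_handle);
    my $slowest = 0;
    for my $task ( 1 .. 3 ) {
        my $t0 = time();
        do_task($task);
        $slowest = max( $slowest, time() - $t0 );
    }

    # One short line per child: "<pid> <seconds>".
    printf {$write_handle} "%d %.6f\n", $$, $slowest;
    close($write_handle);
    exit 0;
}

# Parent: close the write end so the read loop ends once all children exit.
close($write_handle);

my %max_for;    # pid => slowest task in that child, in seconds
while ( my $line = <$read_handle> ) {
    my ( $pid, $secs ) = split ' ', $line;
    $max_for{$pid} = $secs;
}
waitpid( $_, 0 ) for @pids;

printf "child %d slowest task: %.3fs\n", $_, $max_for{$_}
    for sort { $a <=> $b } keys %max_for;
printf "Y (worst case overall): %.3fs\n", max( values %max_for );
```

Reporting only the per-child maximum keeps the message tiny; if every task time were sent instead, the output of a child could exceed PIPE_BUF and would need one pipe per child or some framing.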
Sobrique
  • But the main process could enter the while loop to read from the pipe, find nothing there because the children have not finished anything yet, and then go on to block in wait_all_children. Right? So I wouldn't be able to get anything back. Am I misunderstanding this? – Jim Dec 16 '14 at 22:18
  • Parallel::ForkManager has provided an automated way to return results for quite a while now: https://metacpan.org/pod/Parallel::ForkManager#RETRIEVING-DATASTRUCTURES-from-child-processes – ysth Dec 16 '14 at 22:53
  • All the children are writing to the same handle, which can lead to junk. Use the mechanism provided by P::FM instead of this. – ikegami Dec 17 '14 at 04:26
  • @Jim, There are problems, but that isn't one. The loop won't end until all `$write_handle` and all dups of it are closed, which won't happen before all children have exited. – ikegami Dec 17 '14 at 04:29
  • Re "The biggest limitation of `fork()` is that it doesn't do IPC as easily as threads." That's also one of its biggest assets. It almost forces you to use rigorous designs. – ikegami Dec 17 '14 at 04:31
  • @ikegami:what are the problems then? – Jim Dec 17 '14 at 23:02
  • @Jim, See my first comment on this answer. – ikegami Dec 18 '14 at 00:51
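
The Parallel::ForkManager mechanism that ysth and ikegami point to could be used for the per-task maximum roughly as follows. This is a hedged sketch, not a tested solution: `do_task`, the `%max_for` hash and the "child N" labels are hypothetical names introduced for illustration. It relies on the documented behaviour that a reference passed as the second argument to `finish` arrives as the sixth argument of the `run_on_finish` callback in the parent.

```perl
use strict;
use warnings;

use List::Util qw(max);
use Time::HiRes qw(time sleep);
use Parallel::ForkManager;

# Hypothetical stand-in for one unit of real work.
sub do_task { sleep( 0.01 * $_[0] ) }

my $manager = Parallel::ForkManager->new(5);
my %max_for;    # child label => slowest task in that child, in seconds

# Runs in the parent each time a child is reaped. $data is the reference
# the child handed to finish(), or undef if it sent nothing.
$manager->run_on_finish( sub {
    my ( $pid, $exit_code, $ident, $signal, $core_dump, $data ) = @_;
    $max_for{$ident} = $data->{max} if defined $data;
} );

for my $child ( 1 .. 10 ) {
    $manager->start("child $child") and next;    # parent: keep forking

    # Child: time every task, keep only the slowest one.
    my $slowest = 0;
    for my $task ( 1 .. 3 ) {
        my $t0 = time();
        do_task($task);
        $slowest = max( $slowest, time() - $t0 );
    }
    $manager->finish( 0, { max => $slowest } );  # exits the child
}
$manager->wait_all_children;

printf "%-9s slowest task: %.3fs\n", $_, $max_for{$_} for sort keys %max_for;
printf "Y (worst case overall): %.3fs\n", max( values %max_for );
```

Because the module serializes each child's data separately, the children never share a write handle, which avoids the interleaving concern raised about the single pipe.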