I'm trying to prototype a way of forking processes while walking a directory tree: each time a new directory or file is encountered, a fork occurs, and the child process either navigates the new directory (forking in turn) or processes the file.
Two problems. First, some of the forked processes seem to do redundant work, going over the same files multiple times (the printf statements in the child branches were added to test this). I suspect this comes from a fundamental misunderstanding I have of how the directory pointers work.
The second problem, where I can't even think of where to start, is that the "Child Processes IDs: " printf statement gets printed multiple times, as if some of the forked processes were somehow backtracking. Why is that happening?
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(int argc, char** argv){
        int i = 0;
        char inputDir[1024];
        getcwd(inputDir, sizeof(inputDir));
        int childPid[1024];
        for(i = 0; i<1024; i++){
            childPid[i]=0;
        }
        DIR *indirPtr = opendir(inputDir);
        struct dirent *dp;
        int pid;
        char path[1024];
        i = 0;
        printf("Initial ID: %d\n", getpid());
        printf("Child Processes IDs: ");
        while((dp = readdir(indirPtr)) != NULL)
        {
            // Subdirectory (skipping "." and ".."): fork a child to navigate it
            if(dp->d_type == DT_DIR && strcmp(dp->d_name, ".") != 0 && strcmp(dp->d_name, "..") != 0)
            {
                //printf("This is a dir: %s\n", dp->d_name);
                pid = fork();
                if(pid ==0){
                    // I am a child
                    //printf("%d, "getID()),
                    //snprintf(path, sizeof(path), "%s/%s", inputDir, dp->d_name);
                    //printf("OPENING PATH %s\n", path);
                    closedir(indirPtr);
                    indirPtr = opendir(dp->d_name);
                }else{ // I am a parent
                    childPid[i++] = pid;
                    printf("%d, ", pid);
                }
            }
            // Regular .csv file: fork a child to process it
            if(dp->d_type == DT_REG && strstr(dp->d_name, ".csv") != NULL)
            {
                printf("This is a file: %s\n", dp->d_name);
                pid = fork();
                if(pid ==0)
                {
                    // I am a child
                    // SORT()
                }else
                {
                    // I am a parent
                    childPid[i++] = pid;
                    printf("%d, ", pid);
                }
            }
        }
        wait(NULL);
        return 0;
    }
Edit:
The code itself is supposed to navigate a given directory (for now it uses the current working directory to make testing quicker), forking a child process whenever a new directory or a desired file (in this case, a .csv) is encountered.
Each process then checks whether it is the child or the parent. A child forked over a directory closes the directory pointer it inherited in its copy of the memory space, opens a pointer to the new path, and navigates that directory in turn. A child forked over a .csv file performs a sorting function on it (for now it just prints that it found the file).
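The design described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the code posted earlier: `walk_dir` and `handle_csv` are hypothetical names, each child is assumed to exit as soon as it has finished its own directory or file (so it never falls back into the loop it was forked from), paths are built as `parent/name` rather than relying on the working directory, and `lstat` is used in place of `d_type`.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the real per-file work (the SORT() step). */
static void handle_csv(const char *path)
{
    printf("[%d] file: %s\n", (int)getpid(), path);
}

/* Walks dir_path, forking one child per subdirectory and per .csv file.
 * Returns the number of children this process forked, or -1 if the
 * directory cannot be opened. */
int walk_dir(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    int forked = 0;
    struct dirent *dp;
    while ((dp = readdir(dir)) != NULL) {
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
            continue;

        /* Build the full path before forking; dp is invalid after closedir(). */
        char path[4096];
        snprintf(path, sizeof(path), "%s/%s", dir_path, dp->d_name);

        struct stat st;
        if (lstat(path, &st) != 0)
            continue;
        int is_dir = S_ISDIR(st.st_mode);
        /* Same substring check as the posted code. */
        int is_csv = S_ISREG(st.st_mode) && strstr(dp->d_name, ".csv") != NULL;
        if (!is_dir && !is_csv)
            continue;

        fflush(stdout);            /* keep buffered output out of the child's copy */
        pid_t pid = fork();
        if (pid < 0)
            continue;              /* fork failed; skip this entry */
        if (pid == 0) {
            closedir(dir);         /* the child does not need the parent's stream */
            if (is_dir)
                walk_dir(path);    /* recurse; waits for its own children */
            else
                handle_csv(path);
            fflush(stdout);
            _exit(0);              /* never return into the parent's loop */
        }
        forked++;
    }
    closedir(dir);

    while (wait(NULL) > 0)         /* reap every child of this process */
        ;
    return forked;
}
```

Calling `walk_dir(inputDir)` from `main` would start the traversal; the return value only counts the direct children of the calling process, not the whole tree.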
I found that the second problem is solved by flushing stdout (thank you to all who mentioned the output buffer problem!). The first problem still lingers: directories are being visited multiple times.
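The buffering effect can be reproduced in isolation. The sketch below is a minimal illustration rather than the code above: `header_copies_after_fork` is a hypothetical helper that writes the header to a buffered temporary file instead of stdout, forks once, and counts how many copies of the header reach the file. It assumes a POSIX system, where the child receives its own copy of any unflushed stdio buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes a header to a buffered stream, forks once,
 * and returns how many copies of the header end up in the file
 * (-1 on error). */
int header_copies_after_fork(int flush_first)
{
    static const char header[] = "Child Processes IDs: ";
    FILE *out = tmpfile();
    if (out == NULL)
        return -1;

    fputs(header, out);        /* sits in the stdio buffer for now */
    if (flush_first)
        fflush(out);           /* empty the buffer before fork() copies it */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(out);
        return -1;
    }
    if (pid == 0) {
        fflush(out);           /* child writes out whatever its copy still holds */
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    fflush(out);               /* parent writes out its own buffer */

    /* Read the file back and count occurrences of the header. */
    char buf[256];
    rewind(out);
    size_t n = fread(buf, 1, sizeof(buf) - 1, out);
    buf[n] = '\0';
    fclose(out);

    int copies = 0;
    for (const char *p = buf; (p = strstr(p, header)) != NULL; p += strlen(header))
        copies++;
    return copies;
}
```

Without the flush, both the parent and the child hold the header in their buffers and each writes it once; with `flush_first` set, the buffer is already empty when the fork happens.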
The test directory is ~project1/I Am A Rock, with the test files movie_metadata.csv in project1 and ACSV.csv in the 'I Am A Rock' folder.
The received terminal output is the following (after adding the fflush(stdout)'s where necessary):