
I have the following files and directories in my current directory. test_folder1 is a directory, and it contains one more directory. My C code is supposed to print all the files and directories under the current directory recursively. However, it only prints the current directory and the subdirectory one level down; it does not go any deeper. Please help.


Current Directory:

a.out    at.c     dt    dt.c    main.c    README    test.c    test_folder1

Contents of test_folder1:

ahmet.txt  mehmet.txt  test_folder2

Contents of test_folder2:

mahmut.txt

This is C code, run in the macOS terminal.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <dirent.h>
#include <sys/stat.h>
#include <limits.h>
#include <string.h>

int isdirectory(char *path);

void depthFirst(DIR *dir){
  struct dirent *sd;
  char path[PATH_MAX];

  if(dir == NULL){
    printf("Error, unable to open\n");
    exit(1);
  }

  while( (sd = readdir(dir)) != NULL){
    if(strcmp(sd->d_name, ".") != 0 && strcmp(sd->d_name, "..") != 0){
      printf("%s\n", sd->d_name);
      realpath(sd->d_name,path);

      if(isdirectory(path)){
        depthFirst(opendir(sd->d_name));
      }               
    }
  }
}

int isdirectory(char *path) {
  struct stat statbuf;
  if (stat(path, &statbuf) == -1)
    return 0;
  else
    return S_ISDIR(statbuf.st_mode);
}

int main(int argc, char *argv[]){
  if(argc<2){
    printf("No arguments");
    DIR *dir;
    dir = opendir(".");
    depthFirst(dir);
    closedir(dir);
  }
  return 0;
}

This is the output:

README
main.c
test.c
test_folder1
ahmet.txt
mehmet.txt
test_folder2
a.out
at.c
dt
dt.c
omer muhit

2 Answers


At the point where you call realpath(sd->d_name, path) for test_folder2, your current working directory is still . rather than test_folder1, so realpath() resolves the name as ./test_folder2 rather than ./test_folder1/test_folder2.

As a result, path is the absolute path to a nonexistent ./test_folder2 rather than ./test_folder1/test_folder2, so your stat() call fails. isdirectory() then returns 0, test_folder2 is treated as not being a directory, and depthFirst() is never called for it.

What you need to do is:

  • Upon entry to depthFirst(), save the current working directory (getcwd()) in some local variable and change directory (chdir()) to the directory you have as a parameter.
  • Before exiting depthFirst(), change directory back to the previous working directory.

You may want to have depthFirst() receive a path as a string and do the opendir() call itself.

root
  • I totally understand what you are saying, and yes, you are right. I tried printing the paths and, as you said, it shows ./test_folder2 instead of ./test_folder1/test_folder2. But I still couldn't figure out how to fix it. Can you please explain the fix in a little more detail? I am not very experienced with Unix. Thank you in advance. – omer muhit Sep 09 '19 at 07:02

Let's suppose you have the following structure (program starts outside of dir_origin with dir_origin as argument):

dir_origin
    README.md
    dir_p0:
        file1.txt
        dir_p1:
            file_1_1.txt
            file_1_2.txt
        dir_p2:
            single.txt
    some_other_files.txt

At this point (the first recursive call):

if(isdirectory(path)){
  depthFirst(opendir(sd->d_name));

You're trying to operate on dir_p0, but the process is still working in the parent of dir_origin (dir_origin/..), so the relative name dir_p0 does not resolve. You first need to enter the parent of the directory you want to process, which is dir_origin. You can do this by calling chdir() on that parent directory before every recursive call to depthFirst(), and then restoring the working directory after the recursive call by calling chdir("..").

Another solution, which avoids changing the working directory, is to build the full path of each subdirectory by joining the current path, a separator ('/'), and the name of the subdirectory to be processed before the recursive call.

XBlueCode
  • Note that if `stat()` says it is a directory, but `lstat()` says it is a symlink, doing `chdir()` into the directory is OK, but doing `chdir("..")` is not guaranteed to get you back again. Look up `cd -P` and `cd -L` under POSIX [`cd`](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cd.html). Also look up [`fchdir()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/fchdir.html). Note that you can also deal with the problem by prefixing the name read by `readdir()` with the path leading to that directory. – Jonathan Leffler Sep 09 '19 at 07:42
  • I agree, then I think it's better to save the absolute path of the current directory before calling `chdir` then returning to that directory using the absolute path instead of the `..` – XBlueCode Sep 09 '19 at 07:48
  • Determining the current directory is an expensive operation. Also, changing directory in a multi-threaded program is a no-no. Granted, this isn't multi-threaded (let us be thankful for small mercies!), but I think that building the path name is actually more reliable — or using `fchdir()` carefully. If you're curious, try it. If you find I'm wrong, let me know. – Jonathan Leffler Sep 09 '19 at 07:51
  • Actually, I tried both approaches before, when I had to rewrite the `ls` command as a school project, and yes, the second approach was significantly better. But I had to combine both because of PATH_MAX, so I used `chdir` only when the relative path got longer than PATH_MAX. – XBlueCode Sep 09 '19 at 08:02
  • Ah — you've been here before, then. Total path name longer than PATH_MAX is an interesting edge case — at that point, it gets tricky! PATH_MAX can vary by file system, and of course you could traverse different file systems as you go (though mount points are usually fairly near the top of the directory hierarchy). Symlinks can also send things haywire; emulating `realpath()` can get quite interesting. – Jonathan Leffler Sep 09 '19 at 08:08
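
The fchdir() idea raised in the comments above might look something like the sketch below. It keeps a file descriptor for the starting directory and returns through that descriptor, so it depends neither on ".." nor on an absolute path from getcwd(). The name depth_first_fchdir and the return convention (entries printed, or -1 on error) are hypothetical, and this variant holds two open descriptors per directory level.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: prints every entry below `name` and returns the
 * number of entries printed, or -1 on error. */
int depth_first_fchdir(const char *name)
{
    struct dirent *sd;
    struct stat sb;
    int count = 0;

    /* Keep a descriptor for the directory we start in. */
    int saved = open(".", O_RDONLY);
    if (saved == -1)
        return -1;

    if (chdir(name) == -1) {
        close(saved);
        return -1;
    }

    DIR *dir = opendir(".");
    if (dir == NULL) {
        count = -1;
    } else {
        while ((sd = readdir(dir)) != NULL) {
            if (strcmp(sd->d_name, ".") == 0 || strcmp(sd->d_name, "..") == 0)
                continue;
            printf("%s\n", sd->d_name);
            count++;
            /* lstat() does not follow symlinks, which avoids loops. */
            if (lstat(sd->d_name, &sb) == 0 && S_ISDIR(sb.st_mode)) {
                int sub = depth_first_fchdir(sd->d_name);
                if (sub > 0)
                    count += sub;
            }
        }
        closedir(dir);
    }

    /* Return to exactly the directory we came from, even if `name`
     * was reached through a symbolic link. */
    if (fchdir(saved) == -1)
        count = -1;
    close(saved);
    return count;
}
```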