13

Good night everyone. I'm having trouble with what is probably a simple recursive function. The problem is to recursively list all files in a given folder and its subfolders.

For the moment, I've managed to list files in a directory using a simple function:

fs.readdirSync(copyFrom).forEach((file) => {
  let fullPath = path.join(copyFrom, file);

  if (fs.lstatSync(fullPath).isDirectory()) {
    console.log(fullPath);
  } else {
    console.log(fullPath);
  }
});

I've tried various approaches, such as do {} ... while() loops, but I can't get it right. As I'm a beginner in JavaScript, I finally decided to ask for help.

Puka
  • 1
    You can use `recursive-readdir` package for this. https://www.npmjs.com/package/recursive-readdir – Kirill May 01 '18 at 18:35
  • I would say Good day to you from another timezone and that you should maybe also read this: https://meta.stackexchange.com/questions/2950/should-hi-thanks-taglines-and-salutations-be-removed-from-posts – Endless May 01 '18 at 18:36
  • Thanks @Kirill, I'll give it a try but as recursive functions seemed to be quite common I would have liked to understand how to do it :) – Puka May 01 '18 at 18:40

4 Answers

30

Just add a recursive call and you are done:

const fs = require('fs');
const path = require('path');

function traverseDir(dir) {
  fs.readdirSync(dir).forEach(file => {
    const fullPath = path.join(dir, file);
    if (fs.lstatSync(fullPath).isDirectory()) {
      console.log(fullPath);
      // Recurse into the subdirectory
      traverseDir(fullPath);
    } else {
      console.log(fullPath);
    }
  });
}
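
If the paths are needed as a value rather than printed, the same recursion can return an array. The following is only a sketch: `listFiles` is a hypothetical name, and the temporary directory tree is created purely so the example can be run anywhere.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: returns every file path under dir (directories themselves are skipped).
function listFiles(dir) {
  const results = [];
  for (const name of fs.readdirSync(dir)) {
    const fullPath = path.join(dir, name);
    if (fs.lstatSync(fullPath).isDirectory()) {
      // Recursive case: append everything found inside the subdirectory.
      results.push(...listFiles(fullPath));
    } else {
      // Base case: anything that is not a directory ends the recursion.
      results.push(fullPath);
    }
  }
  return results;
}

// Build a small throwaway tree (assumption: the OS temp directory is writable).
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'traverse-'));
fs.mkdirSync(path.join(root, 'a', 'b'), { recursive: true });
fs.writeFileSync(path.join(root, 'top.txt'), '');
fs.writeFileSync(path.join(root, 'a', 'one.txt'), '');
fs.writeFileSync(path.join(root, 'a', 'b', 'two.txt'), '');

// Paths are made relative and sorted only to keep the printed result stable.
const found = listFiles(root).map(p => path.relative(root, p)).sort();
console.log(found);
```

The recursion stops on its own because each call only descends into entries that are directories; a folder with no subfolders triggers no further calls.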
Jonas Wilms
4

Using console.log in this way displays the path and that's great, but what if you want to do something more meaningful with the paths? For example, maybe you want to collect all of them in an array and pass them off for processing elsewhere...

This process of beginning with a seed state and expanding a sequence of values while the state changes is called an unfold.

const { join } =
  require ('path')

const { readdirSync, statSync } =
  require ('fs')

const unfold = (f, initState) =>
  f ( (value, nextState) => [ value, ...unfold (f, nextState) ]
    , () => []
    , initState
    )

const None =
  Symbol ()

const relativePaths = (path = '.') =>
  readdirSync (path) .map (p => join (path, p))

const traverseDir = (dir) =>
  unfold
    ( (next, done, [ path = None, ...rest ]) =>
        path === None
          ? done ()
          : next ( path
                 , statSync (path) .isDirectory ()
                     ? relativePaths (path) .concat (rest)
                     : rest
                 )
    , relativePaths (dir)
    )

console.log (traverseDir ('.'))
// [ a, a/1, a/1/1, a/2, a/2/1, a/2/2, b, b/1, ... ]

If this is your first time seeing a program like this, unfold may feel overwhelming. Below is a simplified example of unfold used to generate an array of the lowercase alphabet:

const unfold = (f, init) =>
  f ( (x, next) => [ x, ...unfold (f, next) ]
    , () => []
    , init
    )

const nextLetter = c =>
  String.fromCharCode (c.charCodeAt (0) + 1)

const alphabet =
  unfold
    ( (next, done, c) =>
        c > 'z'
          ? done ()
          : next ( c              // value to add to output
                 , nextLetter (c) // next state
                 )
    , 'a' // initial state
    )

console.log (alphabet)
// [ a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z ]
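
For comparison, the seed-and-state idea behind the directory traversal can also be written as a plain loop. This is a rough sketch, not part of the technique described above: `traverseDirLoop` and `children` are hypothetical names, and the temporary tree exists only to make the example runnable.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical loop-based counterpart of traverseDir:
// `pending` plays the role of the unfold state.
function traverseDirLoop(dir) {
  const children = d => fs.readdirSync(d).map(name => path.join(d, name));
  const output = [];
  let pending = children(dir);            // initial state
  while (pending.length > 0) {            // an empty state corresponds to done ()
    const [current, ...rest] = pending;
    output.push(current);                 // value to add to output
    pending = fs.statSync(current).isDirectory()
      ? children(current).concat(rest)    // next state: descend before the siblings
      : rest;                             // next state: move on to the siblings
  }
  return output;
}

// Throwaway tree (assumption: the OS temp directory is writable).
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'unfold-'));
fs.mkdirSync(path.join(demoRoot, 'a'));
fs.writeFileSync(path.join(demoRoot, 'a', 'x.txt'), '');
fs.writeFileSync(path.join(demoRoot, 'b.txt'), '');

// Sorted only so the result does not depend on the order readdirSync reports entries.
const visited = traverseDirLoop(demoRoot).map(p => path.relative(demoRoot, p)).sort();
console.log(visited);
```

Like the unfold version, this lists directories as well as files, because every path taken from the state is emitted before it is inspected.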

If you're still stuck, the techniques I've demonstrated here are explained in greater detail in answers to similar questions

Generally, it's preferred to use the asynchronous functions in the fs module, as this prevents the program from blocking during long disk reads or network delays. Unfolding plays nicely with asynchrony, as demonstrated in these other Q&As

Mulan
  • Thanks for the really detailed answer, I'll give it a try tomorrow :) – Puka May 01 '18 at 19:57
  • 1
    @YouDeserveThat my pleasure. I've recently learned about [anamorphisms](https://en.wikipedia.org/wiki/Anamorphism) and I see use cases for them everywhere now! Let me know if you have any questions, I'm happy to help :D – Mulan May 01 '18 at 19:58
1

I am using the following getFilesTree function. It recursively lists all files in a directory and its subdirectories, except hidden folders and files (those whose names start with a dot).

import {readdir} from 'node:fs/promises';
import {join, parse, resolve} from 'node:path';

export async function getFilesTree(dir) {
    return await Promise.all(
        (await readdir(dir, {withFileTypes: true}))
            .filter(child => !child.name.startsWith('.')) // skip hidden
            .map(async (child) => {
                const base = parse(child.name).base;
                const path = resolve(dir, child.name);
                return child.isDirectory() ?
                    {base, path, children: await getFilesTree(join(dir, child.name))} :
                    {base, path};
            }),
    );
}

The function itself is very similar to the recursive-readdir library. The result looks something like this:

[
    {
        "base": "file.js",
        "path": "/Volumes/Work/file.js"
    },
    {
        "base": "css",
        "path": "/Volumes/Work/css",
        "children": [
            {
                "base": "index.css",
                "path": "/Volumes/Work/css/index.css"
            },
            {
                "base": "code.css",
                "path": "/Volumes/Work/css/code.css"
            }
        ]
    }
]
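
If a tree like this has already been built and a flat list of files is wanted afterwards, it can be flattened with a second small recursion. A minimal sketch, assuming the {base, path, children} shape shown above; `flattenTree` is a hypothetical helper, and the sample data is copied from the result above.

```javascript
// Hypothetical helper: flattens the {base, path, children} tree into a list of file paths.
function flattenTree(nodes) {
  return nodes.flatMap(node =>
    node.children
      ? flattenTree(node.children) // directory: recurse into its children
      : [node.path]                // file: contributes its own path
  );
}

// Sample data in the same shape as the result above.
const tree = [
  { base: 'file.js', path: '/Volumes/Work/file.js' },
  {
    base: 'css',
    path: '/Volumes/Work/css',
    children: [
      { base: 'index.css', path: '/Volumes/Work/css/index.css' },
      { base: 'code.css', path: '/Volumes/Work/css/code.css' },
    ],
  },
];

const flatPaths = flattenTree(tree);
console.log(flatPaths);
// [ '/Volumes/Work/file.js', '/Volumes/Work/css/index.css', '/Volumes/Work/css/code.css' ]
```

An empty directory has an empty `children` array, so it contributes nothing to the flat list.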

Sometimes structured data is not needed; in that case you can use an async generator instead:

import {readdir} from 'node:fs/promises';
import {resolve} from 'node:path';

async function * getFiles(dir) {
    for (const dirent of await readdir(dir, {withFileTypes: true})) {
        const res = resolve(dir, dirent.name);
        if (dirent.isDirectory()) {
            yield * getFiles(res);
        } else {
            yield res;
        }
    }
}

for await (const file of getFiles('content')) {
    console.log(file);
}
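
A top-level `for await` only works in an ES module. In a CommonJS script the loop has to live inside an async function, which is also a convenient place to gather the results into an array. The sketch below assumes CommonJS; `walk` and `collectFiles` are hypothetical names, and the temporary tree is there only so the example runs anywhere.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { readdir } = require('node:fs/promises');

// Same idea as getFiles above, written for CommonJS.
async function* walk(dir) {
  for (const dirent of await readdir(dir, { withFileTypes: true })) {
    const res = path.resolve(dir, dirent.name);
    if (dirent.isDirectory()) {
      yield* walk(res); // delegate to the generator for the subdirectory
    } else {
      yield res;
    }
  }
}

// Hypothetical helper: drains the generator into an array.
async function collectFiles(dir) {
  const files = [];
  for await (const file of walk(dir)) {
    files.push(file);
  }
  return files;
}

// Throwaway tree (assumption: the OS temp directory is writable).
const rootDir = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-'));
fs.mkdirSync(path.join(rootDir, 'css'));
fs.writeFileSync(path.join(rootDir, 'file.js'), '');
fs.writeFileSync(path.join(rootDir, 'css', 'index.css'), '');

const collected = collectFiles(rootDir);
collected.then(files => console.log(files));
```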
OzzyCzech
0

Piggybacking off Jonas Wilms's answer: that turned out to be really slow for a large number of files. This is faster (though it is still synchronous, since it uses readdirSync).

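import * as fs from 'fs';
import * as path from 'path';

const getFilesnamesRecursive = (dir: string, foundFiles: string[]): void => {
    // withFileTypes: true returns fs.Dirent objects, so no extra stat call is needed per entry
    const files = fs.readdirSync(dir, {withFileTypes: true});
    files.forEach((entry: fs.Dirent) => {
        if (entry.isDirectory()) {
            // Recurse into the subdirectory, accumulating into the same array
            getFilesnamesRecursive(path.join(dir, entry.name), foundFiles);
        } else {
            foundFiles.push(path.join(dir, entry.name));
        }
    });
};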

The other is slower because it uses readdirSync to get each filename and then makes a separate lstatSync call per entry to check whether it is a directory. This way is faster because it uses the option withFileTypes: true to return each entry as a Dirent object, instead of just the filename.

Stephanie