To verify any of this works, we first recreate the directory structure in the original question. I'm using unique file contents so we can verify file contents are properly matched with their corresponding keys -
$ mkdir -p level_1/level_2/level_3_1 level_1/level_2/level_3_2/level_4
$ echo "file_1_1 content" > level_1/file_1_1
$ echo "file_1_2 content" > level_1/file_1_2
$ echo "file_3_1_1 content" > level_1/level_2/level_3_1/file_3_1_1
$ echo "file_3_1_2 content" > level_1/level_2/level_3_1/file_3_1_2
$ echo "file_3_2_1 content" > level_1/level_2/level_3_2/file_3_2_1
$ echo "file_3_2_2 content" > level_1/level_2/level_3_2/file_3_2_2
$ echo "file_4_1 content" > level_1/level_2/level_3_2/level_4/file_4_1
$ echo "file_4_2 content" > level_1/level_2/level_3_2/level_4/file_4_2
Now our function, dir2obj, which makes an object representation of a file system, starting with a root path -
const { readdir, readFile, stat } =
  require ("fs") .promises

const { join } =
  require ("path")

// dir2obj : string -> (string | object) promise
const dir2obj = async (path = ".") =>
  (await stat (path)) .isFile ()
    ? String (await readFile (path))
    : Promise
        .all
          ( (await readdir (path))
              .map
                ( p =>
                    dir2obj (join (path, p))
                      .then (obj => ({ [p]: obj }))
                )
          )
        // the leading {} keeps an empty directory from throwing
        .then (results => Object.assign ({}, ...results))

// run it
dir2obj ("./level_1")
  .then (console.log, console.error)
If your console is truncating the output object, you can JSON.stringify it to see all keys and values -
// run it
dir2obj ("./level_1")
  .then (obj => JSON.stringify (obj, null, 2))
  .then (console.log, console.error)
Here's the output -
{
  "file_1_1": "file_1_1 content\n",
  "file_1_2": "file_1_2 content\n",
  "level_2": {
    "level_3_1": {
      "file_3_1_1": "file_3_1_1 content\n",
      "file_3_1_2": "file_3_1_2 content\n"
    },
    "level_3_2": {
      "file_3_2_1": "file_3_2_1 content\n",
      "file_3_2_2": "file_3_2_2 content\n",
      "level_4": {
        "file_4_1": "file_4_1 content\n",
        "file_4_2": "file_4_2 content\n"
      }
    }
  }
}
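The last step of dir2obj is what turns the array of single-key objects into one object. Here is a minimal sketch of that step in isolation. The merge helper and the sample data are hypothetical names used only for illustration, and the leading {} is an assumption I'm adding so that an empty array (an empty directory) gives {} instead of throwing -

```javascript
// merge : object array -> object
// Hypothetical helper: combines single-key objects into one object.
// The leading {} is the assignment target, so an empty array is safe.
const merge = (results = []) =>
  Object.assign ({}, ...results)

// sample data shaped like the per-entry results inside dir2obj
const results =
  [ { file_4_1: "file_4_1 content\n" }
  , { file_4_2: "file_4_2 content\n" }
  ]

console.log (JSON.stringify (merge (results)))
// {"file_4_1":"file_4_1 content\n","file_4_2":"file_4_2 content\n"}

console.log (JSON.stringify (merge ([])))
// {}
```

Without a target object, Object.assign called with no arguments throws a TypeError, which is why the empty case needs attention.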
Refactor with generics
The program above can be simplified by extracting a common function, parallel -
// parallel : ('a array promise, 'a -> 'b promise) -> 'b array promise
const parallel = async (p, f) =>
  Promise .all ((await p) .map (f))

// dir2obj : string -> (string | object) promise
const dir2obj = async (path = ".") =>
  (await stat (path)) .isFile ()
    ? String (await readFile (path))
    : parallel                        // <-- use generic
        ( readdir (path)              // directory contents of path
        , p =>                        // for each child entry as p ...
            dir2obj (join (path, p))
              .then (obj => ({ [p]: obj }))
        )
        // the leading {} keeps an empty directory from throwing
        .then (results => Object.assign ({}, ...results))
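Because parallel knows nothing about files, it can be tried on its own. The sketch below is a hypothetical usage: the numeric input and the double function are made up for illustration, and it only shows that the results come back in input order -

```javascript
// parallel : ('a array promise, 'a -> 'b promise) -> 'b array promise
const parallel = async (p, f) =>
  Promise .all ((await p) .map (f))

// hypothetical async function, stands in for any per-item work
const double = async x =>
  x * 2

// keep the promise so callers can chain on the result
const doubled =
  parallel (Promise .resolve ([ 1, 2, 3 ]), double)

doubled .then (console.log, console.error)
// [ 2, 4, 6 ]
```

Since await also accepts a non-promise value, passing a plain array as the first argument should work the same way.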
Including the root object
Notice the output does not contain the "root" object, { level_1: ... }. If this is desired, we can change the program like so -
const { basename } =
  require ("path")

const dir2obj = async (path = ".") =>
  ( { [basename (path)]:                      // <-- always wrap in object
        (await stat (path)) .isFile ()
          ? String (await readFile (path))
          : await parallel
              ( readdir (path)
              , p => dir2obj (join (path, p)) // <-- no more wrap
              )
              // the leading {} keeps an empty directory from throwing
              .then (results => Object.assign ({}, ...results))
    }
  )

// level_4 lives under level_1/level_2/level_3_2 in the structure created above
dir2obj ("./level_1/level_2/level_3_2/level_4")
  .then (console.log, console.error)
The root object now contains the original input path -
{
  "level_4": {
    "file_4_1": "file_4_1 content\n",
    "file_4_2": "file_4_2 content\n"
  }
}
This version of the program behaves more consistently. The result is always an object, even if the input path is a file -
dir2obj ("./level_1/level_2/level_3_2/level_4/file_4_2")
.then (obj => JSON.stringify (obj, null, 2))
.then (console.log, console.error)
Still returns an object -
{
  "file_4_2": "file_4_2 content\n"
}
Rewrite using imperative style without async-await
In a comment you remark on the "unreadable" style above, but I find boilerplate syntax and verbose keywords highly unpalatable. Here is the same program in a style I suspect you'll find more familiar. Take notice of all the added characters -
const dir2obj = function (path = ".") {
  // "stats" avoids shadowing the imported stat function
  return stat(path).then(stats => {
    if (stats.isFile()) {
      return readFile(path).then(String)
    }
    else {
      return readdir(path)
        .then(paths => paths.map(p => dir2obj(join(path, p))))
        .then(Promise.all.bind(Promise))
        // the leading {} keeps an empty directory from throwing
        .then(results => Object.assign({}, ...results))
    }
  }).then(value => {
    return { [basename(path)]: value }
  })
}
Our variables are more difficult to see because we have words like "function", "return", "if", "else", and "then" interspersed through the entire program. Countless {} are added just so the keywords can even be used. It costs more to write more; let that sink in for a moment.
It's slightly better with the parallel abstraction, but not much, imo -
const parallel = function (p, f) {
  return p
    .then(a => a.map(f))
    .then(Promise.all.bind(Promise))
}

const dir2obj = function (path = ".") {
  // "stats" avoids shadowing the imported stat function
  return stat(path).then(stats => {
    if (stats.isFile()) {
      return readFile(path).then(String)
    }
    else {
      return parallel
        ( readdir(path)
        , p => dir2obj(join(path, p))
        )
        // the leading {} keeps an empty directory from throwing
        .then(results => Object.assign({}, ...results))
    }
  }).then(value => {
    return { [basename(path)]: value }
  })
}
When we look back at the functional-style program, we see that each character on the screen represents some program semantic. p ? t : f evaluates to t if p is true, otherwise f. We don't need to write if (...) { ... } else { ... } every time. x => a takes x and returns a because that's what arrow functions do, so we don't need function (x) { ... } or "return" every time.
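As a small side-by-side of that point, here is one function written both ways. The names sign and signVerbose are hypothetical and not from the question; the two versions are meant to behave identically -

```javascript
// expression style: the body is a single conditional expression
const sign = x =>
  x < 0 ? -1 : x > 0 ? 1 : 0

// statement style: same behavior, spelled with keywords and braces
const signVerbose = function (x) {
  if (x < 0) {
    return -1
  }
  else if (x > 0) {
    return 1
  }
  else {
    return 0
  }
}

console.log (sign (-5), sign (0), sign (7))
// -1 0 1

console.log (signVerbose (-5), signVerbose (0), signVerbose (7))
// -1 0 1
```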
I originally learned C-style languages, so having {} everywhere was a familiar feeling. Over time, I learned to look at p ? t : f or x => a and instantly understand exactly what they mean, and I've come to appreciate not having all the other words and arcane symbols in my way.
There's an added benefit to writing programs in an expression-based style, too. Expressions are so powerful because they can be composed with one another to create more complex expressions. We begin to blur the lines between program and data, where everything is just pieces that can be combined like Lego. Even functions (sub-programs) become ordinary data values that we manipulate and combine, just like any other data.
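To make "functions are ordinary data" a little more concrete, here is a minimal sketch. The names compose, inc, square, pipeline, and run are all hypothetical, chosen only for illustration -

```javascript
// compose : ('b -> 'c, 'a -> 'b) -> 'a -> 'c
const compose = (f, g) =>
  x => f (g (x))

const inc = x => x + 1
const square = x => x * x

// functions stored in an array, like any other value
const pipeline =
  [ inc, square, inc ]

// combine them left-to-right into a single function
const run =
  pipeline .reduce ((acc, f) => compose (f, acc))

console.log (run (2))
// 10, because ((2 + 1) * (2 + 1)) + 1 = 10
```

Nothing here needed a statement or an intermediate variable for state; the pieces are combined purely by passing values to functions.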
Imperative programs rely on side effects, and imperative statements cannot be combined with one another. Instead, more variables are created to represent intermediate state, which means even more text on the screen and more cognitive load in the programmer's mind. In imperative style, we're forced to think about programs, functions, statements, and data as different kinds of things, and so there is no uniform way to manipulate and combine them.
Related: async and await are not statements
Still, both variants have the exact same behavior as the functional-style program. Ultimately the program's style is left to you, the programmer. Choose any style that you like best.
Similar problem
To gain more intuition on how to solve this kind of problem, please see this related Q&A