Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.
I would have (at least) 2 tables: Files
and Directories
. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.
Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?
size
needs to be a BIGINT
. permissions
, if encoded, might fit in SMALLINT UNSIGNED
, or could be stored as a string. user
and group
could be either the id or the string; you may need to have an id:name table for users and one for groups. For date
, consider TIMESTAMP
or maybe DATETIME
; keep in mind that the OS may be doing something closer to TIMESTAMP
to deal with timezones. (Windows is possibly different.)
If you are storing copies of the files, then I recommend using another table, linked to Files
via file_id
. But, beware, a LONGBLOB
is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)
As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.
(Yes, I have done most of these things in a couple of projects in the past.)