What you are looking for is called recursive directory traversal: you go through all directories and list the subdirectories and files in them. If there is a subdirectory, it is traversed as well, and so on, hence recursive.
As you can imagine, this is a fairly common need when writing software, and PHP supports you with it. It offers a RecursiveDirectoryIterator so that directories can be iterated recursively, and the standard RecursiveIteratorIterator to do the traversal. You can then easily access all files and directories with a simple iteration, for example via foreach:
$rootpath = '.';
$fileinfos = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($rootpath)
);
foreach ($fileinfos as $pathname => $fileinfo) {
    if (!$fileinfo->isFile()) continue;
    var_dump($pathname);
}
This example first specifies the directory you want to traverse. I have used the current one:
$rootpath = '.';
The next statement is a little long. It instantiates the directory iterator and then wraps it in the iterator-iterator, so that the tree-like structure can be traversed in a single, flat loop:
$fileinfos = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($rootpath)
);
These $fileinfos are then iterated with a simple foreach:
foreach($fileinfos as $pathname => $fileinfo) {
Inside the loop, there is a test that skips all directories so they are not output. It uses the SplFileInfo object that is iterated over. That object is provided by the recursive directory iterator and offers a lot of helpful methods for working with files. For example, you can also get the file extension, the basename, information about size and time, and so on.
if (!$fileinfo->isFile()) continue;
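To make the SplFileInfo part a bit more concrete, here is a minimal sketch that collects some of those details for every file. The helper name describeFiles() and the shape of the returned array are my own assumptions for illustration; the SplFileInfo methods themselves (getBasename(), getExtension(), getSize(), getMTime()) are standard.

```php
<?php
// Sketch: gather a few SplFileInfo details for every regular file below $rootpath.
// describeFiles() is a hypothetical helper, not something PHP ships with.
function describeFiles(string $rootpath): array
{
    $fileinfos = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($rootpath)
    );

    $rows = [];
    foreach ($fileinfos as $pathname => $fileinfo) {
        // same test as in the loop above: skip everything that is not a file
        if (!$fileinfo->isFile()) {
            continue;
        }
        $rows[$pathname] = [
            'basename'  => $fileinfo->getBasename(),          // name without the directory part
            'extension' => $fileinfo->getExtension(),         // "php", "xml", ... or "" if none
            'size'      => $fileinfo->getSize(),              // size in bytes
            'modified'  => date('c', $fileinfo->getMTime()),  // last modification time
        ];
    }

    return $rows;
}
```

Calling describeFiles('.') returns an array keyed by pathname, so you can pick whichever detail you need instead of only dumping the path.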
Finally, I just output the pathname, which is the path to the file including the root path:
var_dump($pathname);
Example output looks like this (here on a Windows operating system):
string(12) ".\.buildpath"
string(11) ".\.htaccess"
string(33) ".\dom\xml-attacks\attacks-xml.php"
string(38) ".\dom\xml-attacks\billion-laughs-2.xml"
string(36) ".\dom\xml-attacks\billion-laughs.xml"
string(40) ".\dom\xml-attacks\quadratic-blowup-2.xml"
string(40) ".\dom\xml-attacks\quadratic-blowup-3.xml"
string(38) ".\dom\xml-attacks\quadratic-blowup.xml"
string(22) ".\dom\xmltree-dump.php"
string(25) ".\dom\xpath-list-tags.php"
string(22) ".\dom\xpath-search.php"
string(27) ".\dom\xpath-text-search.php"
string(29) ".\encrypt-decrypt\decrypt.php"
string(29) ".\encrypt-decrypt\encrypt.php"
string(26) ".\encrypt-decrypt\test.php"
string(13) ".\favicon.ico"
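If there is a subdirectory that is not accessible, the code above would throw an exception. This behaviour can be controlled with flags when instantiating the RecursiveIteratorIterator. With CATCH_GET_CHILD, exceptions raised while descending into a subdirectory are caught and the iteration moves on: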
$fileinfos = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('.'),
    RecursiveIteratorIterator::LEAVES_ONLY,
    RecursiveIteratorIterator::CATCH_GET_CHILD
);
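Put together with the loop from before, this could look like the following sketch. The function name listAccessibleFiles() is hypothetical; the flags are the ones shown above.

```php
<?php
// Sketch: list regular files below $rootpath and skip any subdirectory
// that cannot be opened. listAccessibleFiles() is a hypothetical helper.
function listAccessibleFiles(string $rootpath): array
{
    $fileinfos = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($rootpath),
        RecursiveIteratorIterator::LEAVES_ONLY,
        // catch exceptions thrown while descending into a subdirectory
        RecursiveIteratorIterator::CATCH_GET_CHILD
    );

    $files = [];
    foreach ($fileinfos as $pathname => $fileinfo) {
        if ($fileinfo->isFile()) {
            $files[] = $pathname;
        }
    }

    return $files;
}
```

Note that the flag only covers subdirectories. If $rootpath itself cannot be opened, the RecursiveDirectoryIterator constructor still throws an UnexpectedValueException, so you may want a try/catch around the call.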
I hope this was informative. You can also wrap this up into a class of your own, and you can provide a FilterIterator to move the decision whether a file should be listed or not out of the foreach loop.
The power of the RecursiveDirectoryIterator and RecursiveIteratorIterator combination comes from its flexibility. What was not covered above are so-called FilterIterators. So here is another example that uses two self-written ones, nested into each other to combine them:
- One filters out all files and directories that start with a dot (those are considered hidden files on UNIX systems, so you should not give that information to the outside).
- The other one filters the list down to files only. That is the check that previously was inside the foreach.
Another change in this usage example is the use of the getSubPathname() method, which returns the subpath starting from the iteration's rootpath, so the one you're looking for.
Also, I explicitly add the SKIP_DOTS flag, which prevents traversing . and .. (technically not really necessary because the filters would filter those out as well since they are directories, but I think it is more correct), and the UNIX_PATHS flag, so the path strings are always Unix-like regardless of the underlying operating system. That is normally a good idea if those values are requested via HTTP later, as in your case:
$rootpath = '.';
$fileinfos = new RecursiveIteratorIterator(
    new FilesOnlyFilter(
        new VisibleOnlyFilter(
            new RecursiveDirectoryIterator(
                $rootpath,
                FilesystemIterator::SKIP_DOTS
                    | FilesystemIterator::UNIX_PATHS
            )
        )
    ),
    RecursiveIteratorIterator::LEAVES_ONLY,
    RecursiveIteratorIterator::CATCH_GET_CHILD
);
foreach ($fileinfos as $pathname => $fileinfo) {
    echo $fileinfos->getSubPathname(), "\n";
}
This example is similar to the previous one, although $fileinfos is built with a slightly different configuration. The part about the filters in particular is new:
new FilesOnlyFilter(
    new VisibleOnlyFilter(
        new RecursiveDirectoryIterator($rootpath, ...)
    )
),
So the directory iterator is put into a filter and the filter itself is put into another filter. The rest did not change.
The code for these filters is pretty straightforward. They work with the accept method, which returns either true or false, that is, whether to keep the current entry or to filter it out:
class VisibleOnlyFilter extends RecursiveFilterIterator
{
    // the bool return type avoids a deprecation notice on PHP 8.1+
    public function accept(): bool
    {
        $fileName = $this->getInnerIterator()->current()->getFilename();
        $firstChar = $fileName[0];
        return $firstChar !== '.';
    }
}

class FilesOnlyFilter extends RecursiveFilterIterator
{
    public function accept(): bool
    {
        $iterator = $this->getInnerIterator();
        // allow traversal
        if ($iterator->hasChildren()) {
            return true;
        }
        // filter entries, only allow true files
        return $iterator->current()->isFile();
    }
}
And that's it again. Naturally, you can use these filters for other cases too, e.g. if you have another kind of directory listing.
And another example output, with the $rootpath cut away:
test.html
test.rss
tests/test-pad-2.php
tests/test-pad-3.php
tests/test-pad-4.php
tests/test-pad-5.php
tests/test-pad-6.php
tests/test-pad.php
TLD/PSL/C/dkim-regdom.c
TLD/PSL/C/dkim-regdom.h
TLD/PSL/C/Makefile
TLD/PSL/C/punycode.pl
TLD/PSL/C/test-dkim-regdom.c
TLD/PSL/C/test-dkim-regdom.sh
TLD/PSL/C/tld-canon.h
TLD/PSL/generateEffectiveTLDs.php
No more .git or .svn directory traversal, and no listing of files like .buildpath or .project.
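Since the filters are just small classes, adding another one for a different kind of listing is cheap. As a rough sketch in the same style, here is a filter that only keeps PHP files. The class name PhpFilesOnlyFilter and the helper listPhpFiles() are made up for this illustration; the helper returns the full pathnames (the iteration keys).

```php
<?php
// Hypothetical third filter: keep directories (so the traversal can
// descend into them) and only those files whose extension is "php".
class PhpFilesOnlyFilter extends RecursiveFilterIterator
{
    public function accept(): bool
    {
        // allow traversal into subdirectories
        if ($this->hasChildren()) {
            return true;
        }
        $current = $this->current();
        return $current->isFile()
            && strtolower($current->getExtension()) === 'php';
    }
}

// Hypothetical helper: sorted pathnames of all PHP files below $rootpath.
function listPhpFiles(string $rootpath): array
{
    $fileinfos = new RecursiveIteratorIterator(
        new PhpFilesOnlyFilter(
            new RecursiveDirectoryIterator(
                $rootpath,
                FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS
            )
        )
    );

    $paths = [];
    foreach ($fileinfos as $pathname => $fileinfo) {
        $paths[] = $pathname;
    }
    sort($paths);

    return $paths;
}
```

Like the filters above, it lets directories through so that traversal keeps working, and leaves it to LEAVES_ONLY to keep them out of the result.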
Note for FilesOnlyFilter and LEAVES_ONLY:
The filter explicitly rejects directories and links based on the SplFileInfo object (it only accepts regular files that exist). So it is real filtering based on the file system.
Another way to get only non-directory entries ships with RecursiveIteratorIterator itself: the default LEAVES_ONLY flag (also used in the examples here). This flag does not work as a filter and is independent of the underlying iterator. It just specifies that the iteration should not return branches (here: directories, in the case of the directory iterator).
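To see that LEAVES_ONLY is a traversal mode and has nothing to do with the file system, here is a small sketch that swaps the directory iterator for a RecursiveArrayIterator. The nested $tree array and the visitedKeys() helper are made-up stand-ins for a directory tree, purely for illustration.

```php
<?php
// A nested array stands in for a directory tree (hypothetical sample data):
// arrays are branches, strings are leaves.
$tree = [
    'tests' => [
        'test-pad.php' => 'file',
        'fixtures'     => ['data.xml' => 'file'],
    ],
    'test.html' => 'file',
];

// Collect the keys a RecursiveIteratorIterator visits in the given mode.
function visitedKeys(array $tree, int $mode): array
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveArrayIterator($tree),
        $mode
    );

    $keys = [];
    foreach ($iterator as $key => $value) {
        $keys[] = $key;
    }

    return $keys;
}

// leaves only: test-pad.php, data.xml, test.html
print_r(visitedKeys($tree, RecursiveIteratorIterator::LEAVES_ONLY));

// branches as well: tests, test-pad.php, fixtures, data.xml, test.html
print_r(visitedKeys($tree, RecursiveIteratorIterator::SELF_FIRST));
```

With SELF_FIRST the branches show up before their children, which is what you would pick if you wanted the directories in the listing as well.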