Some modules in my project have been renamed, newly created, or copied directly, and I now want to delete the files in the old directories. For this cleanup I want to find, with their paths, all files that share the same name (that is, every file name that occurs more than once). These can be css, tpl, php or js files.

For example:

Main\Games\troy.php
Main\Games\Child Games\troy.php
Main\Games\Sports\troy.php

If the search is run on the Main directory, it should return all 3 files with their paths. How can I find such duplicate files with PHP?

This would also be useful for finding files with the same name on a drive, such as mp3 or 3gp files.

Somnath Muluk

1 Answer

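function find_duplicate_files() {
    // Map of file name => list of paths where a file with that name was found
    $names = scandir_recursive( 'D:\Main' );
    $files = array();
    foreach( $names as $paths ) {
        // Keep only the names that occur in more than one place
        if( count( $paths ) > 1 ) {
            $files[] = $paths;
        }
    }
    print_r( $files );
}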

The function scandir_recursive() walks the specified directory tree recursively and builds an associative array whose keys are the file names found in any subdirectory, and whose values are arrays of the full paths at which each name occurs.

function scandir_recursive( $dir, &$result = array() ) {
    // Normalise the path so that no double separator is produced below
    $dir = rtrim($dir, DIRECTORY_SEPARATOR);

    foreach ( scandir($dir) as $node ) {
        // Skip the current and parent directory entries
        if ($node !== '.' && $node !== '..') {
            $path = $dir . DIRECTORY_SEPARATOR . $node;
            if (is_dir($path)) {
                // Descend into the subdirectory, collecting into the same array
                scandir_recursive($path, $result);
            } else {
                // Group the full path under the bare file name
                $result[$node][] = $path;
            }
        }
    }
    return $result;
}

The output looks like this:

Array
(
    [0] => Array
        (
            [0] => D:\Main\Games\troy.php
            [1] => D:\Main\Games\Child Games\troy.php
            [2] => D:\Main\Games\Sports\troy.php
        )

    [1] => Array
        (
            [0] => D:\Main\index.php
            [1] => D:\Main\Games\index.php
        )
)

From this output we can see which file names are duplicated. This is useful when your code base contains a large number of files. (I have also used it a lot for finding duplicate mp3 music files :P )
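
The same grouping can also be done without a hand-written recursive function. The following is only a minimal sketch, assuming PHP's built-in RecursiveDirectoryIterator and RecursiveIteratorIterator classes are used for the traversal. The function name find_duplicate_names() and its $extensions parameter are hypothetical and are not part of the code above. Unlike find_duplicate_files(), it keeps the file name as the array key and can optionally be restricted to certain extensions.

```php
<?php
// Sketch only: find_duplicate_names() and $extensions are hypothetical names
// introduced for illustration, not part of the original answer.
function find_duplicate_names(string $root, array $extensions = []): array
{
    $byName = [];

    // SKIP_DOTS leaves out the "." and ".." entries; the default
    // LEAVES_ONLY mode of RecursiveIteratorIterator yields no directories.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        // $file is an SplFileInfo object
        if (!$file->isFile()) {
            continue;
        }
        // Optional filter, e.g. ['php', 'js', 'css', 'tpl']
        $ext = strtolower($file->getExtension());
        if ($extensions && !in_array($ext, $extensions, true)) {
            continue;
        }
        // Group the full path under the bare file name
        $byName[$file->getFilename()][] = $file->getPathname();
    }

    // Keep only the names that were found in more than one place
    return array_filter($byName, function (array $paths) {
        return count($paths) > 1;
    });
}
```

It could be called as print_r( find_duplicate_names( 'D:\Main', array( 'php', 'js', 'css', 'tpl' ) ) ); passing no second argument considers every file type. Note that names are compared exactly, so Troy.php and troy.php count as different names in this sketch.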

Somnath Muluk
  • Instead of scandir_recursive, I suggest you take a look at the recursive directory iterator PHP has: http://php.net/class.recursivedirectoryiterator - you can then concentrate more on the counting logic than on the recursive traversal. – hakre Nov 20 '12 at 11:03
  • @hakre: Can you give an example? I am not getting your point. Is there any way to do this with built-in functions? – Somnath Muluk Nov 20 '12 at 11:10
  • Examples are given in the PHP manual link and also in this answer at the bottom: http://stackoverflow.com/a/12236744/367456 – hakre Nov 20 '12 at 11:18