
I've pretty much convinced myself that this is just a leaky abstraction rearing its ugly head, but I figured before filing a bug report I might probe Stack Overflow for a more insightful answer...

I'm writing a class in PHP to assist autoloading. I hate using the ridiculous, path-oriented class names like the Zend framework uses, so I've instead written a class that creates an object to recursively go through a directory and its subdirectories, find all PHP source files, and parse them for class definitions, which are then cached and reused until an autoload fails, which prompts the object to update its index.

I like to use built-in PHP classes wherever possible, so for the indexed paths I've used the SplFileInfo class.

Some of the operations required to update the index of classes require that I search the array of SplFileInfo objects my object holds, which is where I have run into a little trouble with using the comparison operator on the SplFileInfo object.

Simply put, searching for any file always returns true. I was a little baffled by it at first, but I've tried it on two machines and had a friend try it on his, with the same result. Seemingly no matter what, when you compare two SplFileInfo objects using the comparison (==) operator, it returns true, even if they point to different files located in completely different directories. Whether I use a loop to iterate over each element in the array and compare that way, or use in_array() or array_search(), the search always succeeds and gives me a reference to a completely different file.

I poked around a little further and found that the identity operator (===) always returns false, even when two SplFileInfo objects point to exactly the same file and were initialized with the same path string.

For my code, this means that when I go to update the class index to see if there are any new files, even if a file is new, the indexer object thinks it already knows about that file and moves on. I've made this work by comparing the pathname strings, but that sort of defeats the purpose of using the SplFileInfo class to begin with.
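For reference, the string-based workaround can be wrapped in a small helper. This is only a minimal sketch: `isSameFile` is a hypothetical name, and the fallback to the raw pathname for files that don't exist (where `getRealPath()` returns false) is an assumption about what "same" should mean in that case.

```php
<?php
// Hypothetical helper: compares two SplFileInfo objects by the file they resolve to.
function isSameFile(SplFileInfo $a, SplFileInfo $b)
{
    $realA = $a->getRealPath();
    $realB = $b->getRealPath();

    // getRealPath() returns false when the file does not exist.
    // Assumption: in that case, fall back to comparing the raw path strings.
    if ($realA === false || $realB === false) {
        return $a->getPathname() === $b->getPathname();
    }

    // Both files exist: compare the resolved absolute paths as strings.
    return $realA === $realB;
}

// Usage: two spellings of the same path versus a different path.
$here = new SplFileInfo(__FILE__);
$alias = new SplFileInfo(__DIR__ . '/./' . basename(__FILE__));
$other = new SplFileInfo(__DIR__);

var_dump(isSameFile($here, $alias));
var_dump(isSameFile($here, $other));
```

Comparing resolved paths rather than the strings passed to the constructor means `./file` and `file` are treated as the same file, as long as the file actually exists.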

The PHP manual says that extension developers can overload the comparison operators for objects, which is why I sort of assumed that SplFileInfo was smart enough to resolve relative paths and compare them properly, e.g. treating ./file and file as equal. Turns out, it was only returning true in those instances because it always returns true when comparing two SplFileInfo objects!

This isn't linked to the code I'm working on. I tried a scratch source file with some new SplFileInfo objects and compared them. The comparison returns true whether the objects point to different files, the same file, or files that don't exist.
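A scratch file of roughly this shape is enough to try it. The paths are placeholders and don't need to exist, and the `==` results may well depend on the PHP version, so no output is shown for them here. The only results that are certain are the `===` ones: false for two separate instances, true for an object compared with itself.

```php
<?php
// Placeholder paths: none of these need to exist for the comparisons to run.
$a = new SplFileInfo('/tmp/one.php');
$b = new SplFileInfo('/var/other/two.php');
$c = new SplFileInfo('/tmp/one.php');

// Loose comparison (==): the behaviour in question.
$looseDifferent = ($a == $b); // different paths
$looseSame      = ($a == $c); // same path string, separate instances

// Identity comparison (===): true only for the very same instance.
$strictSame = ($a === $c);
$strictSelf = ($a === $a);

var_dump($looseDifferent, $looseSame, $strictSame, $strictSelf);
```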

Since this isn't the default behavior for comparison operators on objects in PHP, there has to be something in the definition of the class in the extension that is causing this.

Does anyone have any idea why it's behaving like this, or have any insight?

Andrew Noyes
  • +1, although I question your autoloading approach, I find the question intriguing nonetheless. – Decent Dabbler Jan 25 '10 at 05:38
  • Yeah I imagine most people would. But this is almost identical to how symfony autoloads its core classes. – Andrew Noyes Jan 25 '10 at 05:44
  • BTW: the strict comparison operator `===` behaviour you mention is simply expected behaviour, because, although they internally may hold the same data, the objects themselves are not references to the same object. They are distinct instances. – Decent Dabbler Jan 25 '10 at 05:55
  • BTW 2: can you point me to the part of the manual about comparison operator overloading? I can't seem to find it, but I'd be very interested in reading it. – Decent Dabbler Jan 25 '10 at 05:56
  • http://us.php.net/manual/en/language.operators.comparison.php It's in the table about comparing different types. – Andrew Noyes Jan 25 '10 at 05:58
  • @fireeyedboy: I know I am question-necromancing, but I stumbled upon this question in a similar pursuit and was wondering why you raised objections to the autoloading approach? – DudeOnRock Feb 03 '13 at 20:59
  • @DudeOnRock To be honest, I'm not entirely sure anymore why I objected to this approach. I think it probably was because I value pseudo-namespaces over having no namespaces at all. But that is not necessarily related to the autoloading. Perhaps I overlooked the cache part that Andrew was talking about, the first time around, and thus concluded that lazy-loading / loading-on-demand would be more efficient. – Decent Dabbler Feb 03 '13 at 21:31
  • @DudeOnRock Or... not knowing the exact caching implementation: if an existing class is modified, the caching mechanism must have some way of knowing this. And I can't think of a mechanism that doesn't involve having to iterate over all files each time to check this, off the top of my head. That makes this a possibly very inefficient approach. – Decent Dabbler Feb 03 '13 at 21:35

1 Answer


Andrew,

I've done some experimenting myself with the SPL classes, and surprisingly SplFileInfo is not the only SPL class that exhibits this behaviour. ArrayIterator, for instance, reacts the same way. I presume more (if not all) SPL classes behave like this.

Although I don't have an answer as to why this behaviour exists, I do have a workaround for your specific case. You may have come up with this yourself already, but I thought I'd share it anyway:

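class MyFileInfo extends SplFileInfo
{
    // Stored so that == has a property value to compare between instances.
    private $_realPath;

    public function __construct( $path )
    {
        parent::__construct( $path );
        // Resolved absolute path; getRealPath() returns false if the file does not exist.
        $this->_realPath = $this->getRealPath();
    }
}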
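As a rough sketch of how this would slot into an index search: the class is repeated below under a hypothetical name (`IndexedFileInfo`) only so that the snippet runs on its own, and the temporary files stand in for real source files. The assumption is that PHP's default `==` on two objects of the same class compares their declared properties, so the stored real path decides the result.

```php
<?php
// Hypothetical name; same idea as the class above, repeated so this runs standalone.
class IndexedFileInfo extends SplFileInfo
{
    private $_realPath;

    public function __construct($path)
    {
        parent::__construct($path);
        $this->_realPath = $this->getRealPath();
    }
}

// Two temporary files stand in for an already indexed file and a new one.
$dir = sys_get_temp_dir();
$known = tempnam($dir, 'idx');
$fresh = tempnam($dir, 'idx');

$index = array(new IndexedFileInfo($known));

// in_array() without the strict flag uses loose (==) comparison.
$knownFound = in_array(new IndexedFileInfo($known), $index);
$freshFound = in_array(new IndexedFileInfo($fresh), $index);

var_dump($knownFound, $freshFound);

// Clean up the temporary files.
unlink($known);
unlink($fresh);
```

With this in place, the already indexed file should be reported as found and the new one as missing. Note that two files that don't exist would both store false and therefore still compare as equal.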
Decent Dabbler
  • Yep, that was how I solved it (rather than comparing the actual pathnames of the object, which could differ for the same file). `$file1->getRealPath() === $file2->getRealPath();` – Andrew Noyes Jan 25 '10 at 06:45
  • I'm just gonna go ahead and accept this as an answer since nobody else is saying anything. – Andrew Noyes Feb 04 '10 at 15:04