
I have been handed a large, undocumented PHP application because the original coder went AWOL. My task is to add new features, but I can't do that without understanding the code. I started poking around and, honestly, I am overwhelmed by the amount of source code. Here is what I have found:

  • It's well written, based on an MVC architecture with DB persistence, templating and OOP
  • It's modular, with URL-based routing and basic templating
  • It uses a custom-written PHP framework that has no documentation. And there is no source control history (oops!)
  • There are over 500 files, each containing hundreds of lines of code. Every file has 3-4 require_once statements that include tons of other files, so it's kind of hard to tell which function/class/method is coming from where

Now I am looking for some techniques that I can use to understand this code. For example, consider the following code snippet:

class SiteController extends Common {

    private $shared;
    private $view;

    protected function init(){
        $this->loadShared();
        $this->loadView();
    }

    private function loadShared(){
        $this->shared = new Home();
    }

    private function loadView(){
        $this->view = new HomeView();
    }
}

I want to know:

  • Where are HomeView() and Home() defined? Where do $this->shared and $this->view come from? I checked the rest of the file, and there is no method named shared or view, so obviously they are coming from one of the hundreds of classes being included via require_once(). But which one? How can I find out?
  • Can I get a list of all the functions or methods that are being executed? If yes, then how?
  • This class SiteController overrides a base Common class, but I am unable to find out where this Common class is located. How can I tell?

Further, please share some techniques that can be used to understand existing code written in PHP.

Mogsdad
CuriousMind
  • Do you know what IDE was used before? You might find it handy to use the same IDE (or any IDE really that might help organize this or provide a good class view). – Brad Apr 06 '11 at 21:23
  • No, I don't know that. I was using vim until now. After these suggestions, I will try Aptana. Thanks – CuriousMind Apr 06 '11 at 23:19

8 Answers

First, in this kind of situation, I try to get an overview of the application, some kind of global idea of:

  • What the application (not the code!) does
  • How the code is globally organized: where the models, the templates, the controllers, ... are
  • How each type of component is structured. Once you know how one Model class works, the others will typically work the same way.


Once you have that global idea, one way to start understanding how the code works, if you have some time ahead of you, is to use a PHP debugger.
For that, Xdebug + Eclipse PDT is one option, but pretty much all modern IDEs support it.

It will allow you to go through the generation of a page step by step, line by line, and see what is called, when, and from where.

Of course, you will not do that for the whole application!
But as your application uses a framework, chances are high that all parts of the application work in much the same way, which means that really understanding one component should make the others easier to understand.


As a couple of tools to understand what calls what, how and where, you might want to take a look at:

  • The inclued extension (quoting): "Allows you to trace through and dump the hierarchy of file inclusions and class inheritance at runtime"
  • Xdebug + KCacheGrind will allow you to generate call graphs; XHProf should do the same kind of thing.
  • Using your IDE (Eclipse PDT, Zend Studio, PhpStorm, NetBeans, ...), ctrl+click on a class/method should bring you to its declaration.
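
When no IDE is at hand, PHP's reflection API can answer the same "where is this declared?" question at runtime. The following is a minimal sketch, assuming the class in question is already loaded at the point where the probe runs. The helper name describeClass is hypothetical, and the empty Common and SiteController classes are stand-ins so that the sketch runs on its own.

```php
<?php
// Hypothetical helper: reports where a class and each of its ancestors are declared.
function describeClass(string $className): array
{
    $chain = [];
    $class = new ReflectionClass($className);
    while ($class !== false) {
        $chain[] = [
            'class' => $class->getName(),
            'file'  => $class->getFileName(),  // false for built-in classes
            'line'  => $class->getStartLine(),
        ];
        $class = $class->getParentClass();     // false once there is no parent left
    }
    return $chain;
}

// Stand-in classes (assumptions); in the real application these already exist.
class Common {}
class SiteController extends Common {}

foreach (describeClass('SiteController') as $info) {
    printf("%s is declared in %s on line %d\n", $info['class'], $info['file'], $info['line']);
}
```

Dropped temporarily into a controller, the same loop would print the file behind Common, Home or HomeView without chasing require_once statements by hand.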


Also note that an application is not only code: I often find it very useful to reverse-engineer the database and generate a diagram of all tables.

If you are lucky, there are foreign keys in your database, so you'll have links between tables, which will help you understand how they relate to each other.

Pascal MARTIN

You need an IDE. I use NetBeans for PHP and it works great. It will let you find out where the HomeView/Home classes are defined by right-clicking and selecting a "find where defined" option or something similar.

You can get a list of the functions being executed. This is called the stack. Setting up a debugger like Xdebug with the IDE will allow you to see it.
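
Before a debugger is set up, the built-in debug_backtrace() function gives a rough view of the same stack. The sketch below is only an illustration: currentStack is a hypothetical helper, and the small SiteController class is a stand-in for the real one.

```php
<?php
// Hypothetical helper: returns the current call stack as "Class->method" strings,
// innermost call first.
function currentStack(): array
{
    $frames = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS);
    array_shift($frames); // drop the frame that records the call to currentStack() itself
    return array_map(
        fn(array $f): string => ($f['class'] ?? '') . ($f['type'] ?? '') . $f['function'],
        $frames
    );
}

// Stand-in controller (an assumption) with the same call shape as in the question.
class SiteController
{
    public function init(): array
    {
        return $this->loadView();
    }

    private function loadView(): array
    {
        return currentStack();
    }
}

print_r((new SiteController())->init());
```

The first entries of the printed array are the methods that led to the probe, which is the same information a debugger shows in its stack pane.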

k to the z

grep is the only thing that makes me survive such codez

Your Common Sense
  • Look inside of the script where you found this code snippet for additional included or required pages that PHP imported into the main script. Those scripts should define those classes that are being instantiated.
  • Sorry, not sure if you can find which functions/methods have been executed. I know you can find if they exist, and you can find the generated output of them... but not sure if they have been executed.
  • It is important to note that SiteController doesn't override the Common class; it extends it, or builds on top of it, like a building is built on a foundation. The Common class is the foundation. Again, check the included and required scripts to see where Common is defined.

Hope that helps,
spryno724

Oliver Spryn

I would start with:

  • throwing an exception at certain points to see a stack trace showing where the call originated
  • grep for "class Common", for example
  • create a directory listing to get a feeling for the organization of the software
  • use get_included_files(); to see what is actually used for a certain call
  • Start documenting what I find out
  • Start working with an IDE, like NetBeans, Eclipse or Zend Studio
  • Figuring out class hierarchies with maybe this "php: determining class hierarchy of an object at runtime" approach
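
Several of these probes fit in a few lines. This is a sketch under assumptions: the Common and SiteController classes are stand-ins, and the exception is only constructed, not thrown, since PHP records the trace when the exception object is created.

```php
<?php
// Stand-in classes (assumptions) mirroring the names in the question.
class Common {}

class SiteController extends Common
{
    public function init(): string
    {
        // Temporary probe: the exception captures the call path at construction time.
        return (new Exception('probe'))->getTraceAsString();
    }
}

$controller = new SiteController();

echo $controller->init(), "\n";                     // how did execution reach init()?
print_r(get_included_files());                      // every file loaded so far in this request
print_r(array_values(class_parents($controller)));  // ancestors, nearest first
```

In the real application the probe lines would go inside an existing method and be removed again once the findings are written down.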
Community
Nick Weaver

You seem to realize that you can't read/digest every file, so you've got to focus on the important ones. Looks like you've started that process with SiteController.

Hopefully, between reading the requires and using your IDE, you can chase down the definitions of Home() and HomeView().

There might be a few key XML files that dictate the mappings from URLs to controller files, so you'll want to figure out how they work also.

I've worked with a poorly documented (but decently working) custom framework before, and your situation seems pretty similar. I found things pretty smooth once I understood the main controller and basically formed an understanding for how URL requests were processed.

jon_darkstar

Start from the entry point of the application (usually index.php) and work your way deeper into what gets called when.

Give PhpStorm a go. It's an IDE with excellent code analysis features: it can go to the definition of any class or variable, show the inheritance hierarchy, find usages and do many other useful things.

I'll also plug my own tool:

http://raveren.github.io/kint/

It works with zero setup and is extremely useful for getting a grip on what's going on where. Use Kint::trace(); to see a pretty execution backtrace and d(get_defined_vars()); to see what is defined in the current context, and eventually you'll get there.

Screenshot: Kint screenshot (source: github.io)

Glorfindel
raveren

1) You can use a search tool such as grep to find code, including definitions. But on a big code base, grep is slow, and it gives a lot of false positives because it has no understanding of the PHP language.

Our Search Engine is a GUI-based tool that indexes your source code to achieve extremely fast lookup, indexing by the language elements (variable names, constants, keywords, strings, ...) and allowing you to formulate queries that honor the language structure (e.g., it ignores whitespace and comments unless you say you want to see them). A query shows hits in a hit window, and a click takes you to the file/line in which the hit occurs. With a tiny bit of additional configuration, you can go from the code window into your favorite editor.
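
A middle ground between plain grep and a full indexer is PHP's built-in tokenizer, which knows the difference between a class declaration and the word "class" inside a comment or string. The sketch below is not the tool described above; declaredClasses is a hypothetical helper and the sample source is invented for illustration.

```php
<?php
// Hypothetical helper: returns the names of classes declared in a PHP source string.
// It relies on the tokenizer, so comments and string literals are never matched.
function declaredClasses(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $names = [];
    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
            continue;
        }
        // Skip whitespace after "class"; a T_STRING that follows is the class name.
        for ($j = $i + 1; $j < $count; $j++) {
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                continue;
            }
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $names[] = $tokens[$j][1];
            }
            break; // anything else (anonymous class, Foo::class) is not a declaration
        }
    }
    return $names;
}

// Invented sample source with two decoys that a plain text search would match.
$sample = <<<'PHP'
<?php
// class Fake is only mentioned in a comment
$text = "class AlsoFake";
class Common {}
class SiteController extends Common {}
PHP;

print_r(declaredClasses($sample));
```

To cover a whole project, the same function could be fed file_get_contents() of every .php file found with RecursiveDirectoryIterator, building a class-to-file map once.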

2) Sometimes you want to know where specific functionality exists, but you have no clue what to search for. Here a test coverage tool can really help. Simply set up test coverage for the (working) application and exercise the functionality manually; what is "covered" is potentially the code you care about. Then exercise something which is NOT the feature; what is covered is NOT the code you want. This is way easier than trying to run a debugger to find the code of interest. Our PHP Test Coverage tool can provide you with this coverage, and not only show you the covered code in a GUI, but also do that "coverage subtraction" so that you can see just the relevant code.
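
The "coverage subtraction" idea itself is simple set arithmetic and can be sketched independently of any particular tool. The example below assumes coverage data in a nested file-to-line-to-flag array, which mirrors the shape reported by Xdebug's xdebug_get_code_coverage(); treat that, the helper name coverageDifference, and the file paths as assumptions made for illustration.

```php
<?php
// Hypothetical helper: keeps only the lines covered while exercising the feature
// and not covered during the baseline run.
// Both arguments use the shape [file => [lineNumber => hitFlag]].
function coverageDifference(array $feature, array $baseline): array
{
    $result = [];
    foreach ($feature as $file => $lines) {
        $onlyInFeature = array_diff_key($lines, $baseline[$file] ?? []);
        if ($onlyInFeature !== []) {
            $result[$file] = array_keys($onlyInFeature);
        }
    }
    return $result;
}

// Invented coverage data: one run exercising the feature, one run avoiding it.
$feature = [
    'controllers/SiteController.php' => [12 => 1, 18 => 1, 31 => 1],
    'lib/Common.php'                 => [5 => 1, 6 => 1],
];
$baseline = [
    'controllers/SiteController.php' => [12 => 1, 18 => 1],
    'lib/Common.php'                 => [5 => 1, 6 => 1],
];

print_r(coverageDifference($feature, $baseline));
```

Lines that both runs touch (bootstrap, routing, shared base classes) drop out, leaving only the lines specific to the feature being hunted.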

Ira Baxter