I want to parse a PHP source file into an AST (preferably as a nested array of instructions).

I basically want to convert things like

f($a, $b + 1)

into something like

array( 'function_call',
    array(
        array( 'var', '$a' ),
        array( 'expression',
            array(
                array( 'binary_operation',
                    '+',
                    array ('var', '$b'),
                    array( 'int', '1' )
                )
            )
        )
    )
)

Is there a built-in PHP library, or are there any third-party libraries (preferably in PHP), that would let me do this?

Dogbert

5 Answers

I implemented a PHP Parser after finding that no existing parser was available. It parses PHP code into a node tree.

NikiC
  • I just tried out your parser, and I think it may be good enough for my use. I have one problem though - there's no license file in your project, and there's no mention of anything related to that, so I'm not sure if I could use this in my project. – Dogbert May 27 '11 at 15:30
  • @Dogbert: Yes, I haven't added one yet ;) It is MIT/BSD. So you should be able to use it in pretty much any project. – NikiC May 27 '11 at 15:37
  • @NikiC: If it doesn't contain the license, it isn't anything. Nobody will pay any attention to what you claim it is, here. – Ira Baxter Sep 02 '14 at 18:54
  • @IraBaxter Thank you for this most insightful of comments. Now look at the dates on the above comments and at the LICENSE file in the linked repository. – NikiC Sep 02 '14 at 19:06
  • @NikiC: Good that you have it. Even better that you have clarified here, after this long time, that it is in place. You're welcome. – Ira Baxter Sep 02 '14 at 19:46
  • Hi, @NikiC, thank you for writing this nice parser. However, I don't know how to use it, should I compile it along with PHP7? Thank you! – naizheng TAN Sep 28 '15 at 21:03

HipHop

You can use Facebook's HHVM to dump the AST.

apt-get install hhvm

# parse specified file and dump the AST
hhvm --parse arg  

This worked for HipHop (the old PHP-to-C++ compiler) back in 2013.


HHVM

Update 2015

--parse is not supported.

You will get an error: HHVM The 'parse' command line option is not supported.

See https://github.com/facebook/hhvm/blob/c494c3a145008f65d349611eb2d09d0c33f1ab23/hphp/runtime/base/program_functions.cpp#L1111

Feature Request to support the CLI option again: https://github.com/facebook/hhvm/issues/4615


PHP 7

PHP 7 will have an AST; see the related RFC.

Two extensions expose the AST generated by PHP 7:

Jens A. Koch
  • This outputs `Error in command line: unrecognised option '--parse'` for me. I'm on Ubuntu 14.04, installed the `hhvm` package from the repo at http://dl.hhvm.com/ubuntu as explained [here](https://github.com/facebook/hhvm/wiki/Prebuilt-packages-on-Ubuntu-14.04), `hhvm --version` outputs HipHop VM 3.6.1. When I call `hhvm --help` there does indeed not seem to be a `--parse` option. – Malte Skoruppa Apr 24 '15 at 10:03
  • Yo! This dates back to HipHop in 2013. Updated my answer to reflect the current situation. – Jens A. Koch Apr 24 '15 at 10:11

You can look at the answers to "Parsing and Printing PHP Code" and "Generating PHP code (from Parser Tokens)". In short, PEAR's PHP_Beautifier package at http://pear.php.net/package/PHP_Beautifier can be extended to do what you want, but it sounds like it requires some heavy lifting.

If you're not constrained to PHP, http://www.eclipse.org/pdt/articles/ast/PHP_AST.html walks you through using the Eclipse PHP module's AST parser.

Femi
  • Some of the links in the answers to [Is there a static code analyzer like Lint for PHP files](http://stackoverflow.com/q/378959) may also be helpful. – Jorik May 27 '11 at 14:36

Pfff is an OCaml library for parsing and manipulating PHP code. See the manual of Pfff for more details.

reprogrammer

No, there is no such feature built in, but you can use the Tokenizer extension (token_get_all) to build one yourself.
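
To give a rough idea of what building on the Tokenizer could look like, here is a minimal sketch, assuming PHP 7.4+ with the tokenizer extension enabled. The `TinyParser` class and its node shapes are hypothetical names made up for this illustration. It only understands variables, integer literals, `+`/`-`, and simple function calls, so treat it as a starting point rather than a real PHP parser.

```php
<?php
// Hypothetical sketch: a tiny recursive-descent parser over token_get_all().
// Supports only: variables, integer literals, '+' / '-', and calls like f(...).

final class TinyParser
{
    private array $tokens = []; // list of [type, text] pairs
    private int $pos = 0;

    public function parse(string $code): array
    {
        $this->tokens = [];
        $this->pos = 0;
        foreach (token_get_all('<?php ' . $code) as $token) {
            // token_get_all() yields [id, text, line] arrays or one-character strings.
            [$type, $text] = is_array($token) ? [$token[0], $token[1]] : [$token, $token];
            if ($type !== T_OPEN_TAG && $type !== T_WHITESPACE) {
                $this->tokens[] = [$type, $text];
            }
        }
        $ast = $this->parseExpression();
        if ($this->pos !== count($this->tokens)) {
            throw new RuntimeException('Unexpected trailing tokens');
        }
        return $ast;
    }

    // Type of the next token (int id or single character), or null at end of input.
    private function peek()
    {
        return $this->tokens[$this->pos][0] ?? null;
    }

    // Consume and return the next [type, text] pair.
    private function next(): array
    {
        if (!isset($this->tokens[$this->pos])) {
            throw new RuntimeException('Unexpected end of input');
        }
        return $this->tokens[$this->pos++];
    }

    private function expect(string $char): void
    {
        [$type] = $this->next();
        if ($type !== $char) {
            throw new RuntimeException("Expected '$char'");
        }
    }

    // expression := primary (('+' | '-') primary)*   (left-associative)
    private function parseExpression(): array
    {
        $left = $this->parsePrimary();
        while (in_array($this->peek(), ['+', '-'], true)) {
            [, $op] = $this->next();
            $left = ['binary_operation', $op, $left, $this->parsePrimary()];
        }
        return $left;
    }

    // primary := variable | integer | name '(' [expression (',' expression)*] ')'
    private function parsePrimary(): array
    {
        [$type, $text] = $this->next();
        if ($type === T_VARIABLE) {
            return ['var', $text];
        }
        if ($type === T_LNUMBER) {
            return ['int', $text];
        }
        if ($type === T_STRING && $this->peek() === '(') {
            $this->next(); // consume '('
            $args = [];
            while ($this->peek() !== ')') {
                $args[] = $this->parseExpression();
                if ($this->peek() !== ',') {
                    break;
                }
                $this->next(); // consume ','
            }
            $this->expect(')');
            return ['function_call', $text, $args];
        }
        throw new RuntimeException("Unexpected token '$text'");
    }
}

print_r((new TinyParser())->parse('f($a, $b + 1)'));
```

For the expression from the question, this yields a `function_call` node for `f` whose arguments are a `var` node for `$a` and a `binary_operation` node combining `$b` and `1`. Unlike the array in the question, the sketch stores the function name in the node and drops the extra `expression` wrapper. Operator precedence, statements, strings, and everything else in the language would still have to be added by hand, which is the bulk of the work.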

KingCrunch