4

I'm trying to create an app to search my company's ColdFusion codebase. I'd like to be able to do intelligent searches, for example: find where a function is defined (and not hit everywhere the function is called). In order to do this, I'd need to parse the ColdFusion code to identify things like function declarations, function calls, database queries, etc.

I've looked into using lex and yacc, but I've never used them before and the learning curve seems very steep. I'm hoping there is something already out there that I could use. My other option is a mess of difficult-to-maintain regex-spaghetti code, which I want to avoid.

Peter Boughton
Kip
  • Kip, this is something I've been interested in too (and something I want to integrate into CFE), so I was wondering if you've made any useful progress? – Peter Boughton Oct 18 '09 at 13:43
  • @Peter Boughton: Actually I was asking this on behalf of a co-worker. See my answer below--he used the parser in the source code to CFEclipse. I don't know if that would be at all legal to redistribute, but we were using it for an internal development tool. – Kip Oct 18 '09 at 18:34
  • Well you'd need to check the precise wording of the license, but if it's derived from EPL code (the CFEclipse source) then it would simply also need to be distributed with an EPL license. However, it is the current CFE parser that I want to create a replacement for, so unless you've done a big overhaul on it then it wouldn't be what I wanted anyway. – Peter Boughton Oct 18 '09 at 19:02

3 Answers

3

I used the source to CFEclipse, since it is open source and has a parser. Not sure about the legality of this if we were selling/redistributing it, but we're only using it for an internal tool.

Kip
  • I believe it uses the [MIT License](https://github.com/cfeclipse/cfeclipse/blob/master/org.cfeclipse.cfml/License.txt). – John Apr 17 '14 at 20:52
2

Writing parsers for real languages is usually difficult because they contain constructs that Lex and Yacc often don't handle well; for example, the language may not be LALR(1). ColdFusion might be easier than some because of its XML-like style.
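To illustrate why the XML-like style helps, here is a minimal sketch in Python that separates function definitions from calls. It is not a real parser and is not taken from CFEclipse, DMS, or any other tool mentioned in this thread. The helper names (`find_definitions`, `find_calls`, `line_of`) and the sample component are hypothetical, and the sketch ignores comments and string literals, so it can report false positives.

```python
import re

# Opening <cffunction ...> tag; group 1 is the raw attribute text.
CFFUNCTION_TAG = re.compile(r"<cffunction\b([^>]*)>", re.IGNORECASE)
# name="..." or name='...' inside the attribute text.
NAME_ATTR = re.compile(r"""\bname\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)
# cfscript-style declaration such as `function getUser(`.
SCRIPT_DECL = re.compile(r"\bfunction\s+([A-Za-z_]\w*)\s*\(", re.IGNORECASE)


def line_of(source, offset):
    """1-based line number of a character offset."""
    return source.count("\n", 0, offset) + 1


def find_definitions(source):
    """Return sorted (line, name) pairs for every declaration found."""
    found = []
    for tag in CFFUNCTION_TAG.finditer(source):
        name = NAME_ATTR.search(tag.group(1))
        if name:
            found.append((line_of(source, tag.start()), name.group(2)))
    for decl in SCRIPT_DECL.finditer(source):
        found.append((line_of(source, decl.start()), decl.group(1)))
    return sorted(found)


def find_calls(source, name):
    """Return the lines where `name(` appears and is not a declaration."""
    declared_at = {m.start(1) for m in SCRIPT_DECL.finditer(source)}
    call = re.compile(r"\b" + re.escape(name) + r"\s*\(", re.IGNORECASE)
    return [line_of(source, m.start())
            for m in call.finditer(source)
            if m.start() not in declared_at]


# Hypothetical sample component, for illustration only.
sample = '''<cfcomponent>
  <cffunction name="getUser" access="public">
    <cfreturn loadUser(arguments.id)>
  </cffunction>
  <cfset user = getUser(42)>
</cfcomponent>
'''

print(find_definitions(sample))       # [(2, 'getUser')]
print(find_calls(sample, "getUser"))  # [5]
```

Because a tag-based definition is always a `<cffunction>` tag with a `name` attribute, it never looks like a call, which is what makes this layer tractable. Anything beyond this (expressions, nested cfscript, queries) is where a proper grammar starts to pay off.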

If you want to build a sophisticated parser quickly, you might consider using our DMS Software Reengineering Toolkit which has GLR parsing support.

If you want to avoid writing your own or hacking all those regexps, you could consider our Source Code Search Engine. It has language-sensitive parsers and can search across very large source code bases very quickly. One of its "language sensitive" parsers is AdhocText, which is designed to handle "generic" programming languages such as those you might find in a random programming book; it even understands XML-like tags such as ColdFusion has. You can download an evaluation version from the link provided to try it.

EDIT 4/3/2010: A recent feature added to the SCSE is the ability to tag definitions and uses separately. That would address the OP's desire to find the function definition rather than all the calls.

Ira Baxter
  • The Reengineering Toolkit may indeed be a good tool to start from, it's just a shame it doesn't list CF support among [its out-of-the-box front ends](http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMSDomains). – CrazyPyro Feb 22 '11 at 19:25
  • 1
    @CrazyPyro: yeah, that's indeed a shame, because then you wouldn't have to build a parser at all. But we can do only so much on a finite budget :-} The point is that if you must build your own parser, this is a good foundation. – Ira Baxter Feb 22 '11 at 20:53
  • 1
    @CrazyPyro: Well, it took us a while to get around to this. Now (July 2020) we have CF support available as an out-of-the-box front end. – Ira Baxter Aug 25 '20 at 03:52
0

None exists. Since ColdFusion is more like scripts than code, I'd imagine it'll be hard to write a parser for it.

ColdFusion Builder can parse CFM/CFC to an outline in Eclipse. Maybe you can do some research on whether a CF Builder plugin can do what you want to do.

Henry
  • 3
    Being script-like doesn't mean it is hard to write a parser for it. Any language is represented by a set of strings. Parsers parse sets of strings described implicitly by the procedural code that comprises the parser, or explicitly by the grammar rules that drive the parser if so designed. Defining ColdFusion to a grammar-driven parser generator is more a matter of getting a good description of ColdFusion than anything else. – Ira Baxter Aug 27 '09 at 03:46
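
The "explicit grammar rules" idea in the comment above can be sketched as follows. This is a toy, table-driven matcher with ordered alternatives, written in Python. It is not LALR or GLR, not a parser generator, and not how any tool in this thread works. The grammar covers only a made-up subset of `<cfset>` and every name in it (`GRAMMAR`, `tokenize`, `parse`) is hypothetical.

```python
import re

# Assumed toy grammar: each rule maps to a list of alternatives,
# and each alternative is a sequence of rule names or terminals.
GRAMMAR = {
    "set_tag": [["<cfset", "IDENT", "=", "value", ">"]],
    "value": [["NUMBER"], ["IDENT", "(", "value", ")"], ["IDENT"]],
}

TOKEN = re.compile(r"\s*(<cfset|[A-Za-z_]\w*|\d+|[=()>])", re.IGNORECASE)


def tokenize(text):
    """Split text into tokens, raising on anything unrecognised."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def matches_terminal(symbol, token):
    """Check one token against a terminal symbol from the grammar."""
    if symbol == "IDENT":
        return re.fullmatch(r"[A-Za-z_]\w*", token) is not None
    if symbol == "NUMBER":
        return token.isdigit()
    return symbol.lower() == token.lower()


def parse(symbol, tokens, pos):
    """Match `symbol` at tokens[pos:]; return (tree, next_pos) or None."""
    if symbol not in GRAMMAR:  # terminal
        if pos < len(tokens) and matches_terminal(symbol, tokens[pos]):
            return tokens[pos], pos + 1
        return None
    for alternative in GRAMMAR[symbol]:  # first alternative that fits wins
        children, cur = [], pos
        for part in alternative:
            result = parse(part, tokens, cur)
            if result is None:
                break
            child, cur = result
            children.append(child)
        else:
            return (symbol, children), cur
    return None


print(parse("set_tag", tokenize("<cfset x = foo(3)>"), 0))
```

The `parse` function knows nothing about ColdFusion; everything language-specific lives in `GRAMMAR`. Supporting more of the language then means extending the rule table rather than writing more procedural code, which is the trade-off the comment describes.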