
I am trying to convert a ColdFusion website to PHP using regex. I haven't found an existing tool that does this, so I am doing it myself.

Luckily, the ColdFusion site is simple in terms of programming, so regex can do the job.

However, I am stuck. How can I extract the variable name and value from tags like these?

  1. <cfset mailfrom="hey">
  2. <cfset mailfrom='hey'>
  3. <cfset mailfrom=hey>
  4. <cfset mailfrom = hey>
  5. <cfset mailfrom="<hey>">

I tried the following pattern:

preg_match_all('/<cfset (.*)( )=(| |\'|")(.*)(|\'|")>/iU', $this->Content, $customTags, PREG_SET_ORDER);

It works sometimes and fails at other times. My ColdFusion code may be on 1 line or spread over 1000 lines, so sometimes you'll see something like this: <cfset mailfrom="hey"><cfset fdsfsd="3333">

Once I know the full string (<cfset mailfrom="hey">) and what to replace it with ($mailfrom = "hey";), I can do the replacement without any problem. This means the regex has to give me at least the variable name and the value.

Thanks.

EDIT :

I use this :

preg_match_all('/<cfparam\s*([^>]*)\s*\/?>/i', $this->Content, $customTags, PREG_SET_ORDER);

This matches <cfparam name="" default="">. I guess I could handle <cfset> the same way and then parse the captured text as (.) = (.) (variable = value).

But the problem is that this regex cannot match < and > inside the value.

David Bélanger
  • Try this **[thread](http://stackoverflow.com/questions/10486704/how-do-i-display-content-grabbed-from-external-websites/10487210#10487210)**. Regex is probably not a good option in this case. – Saurabh Tiwari May 15 '12 at 16:55

2 Answers

Regex does not cope well with hierarchies, and that is exactly the cause of your problem. Consider using DOM: its implementation in PHP is awkward, but this is exactly the kind of task it is meant for.

Alex Andrienko

Something like this? /cfset\s*(?<key>[^=]+)\s*=\s*["'](?<value>[^"']+)["']/i

For example, for the string <cfset mailfrom="hey">, $match["key"] is mailfrom and $match["value"] is hey. But don't forget: you should not parse [X]HTML with regular expressions.
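The pattern above needs quotes around the value, so the unquoted forms in the question would not match. Here is a rough sketch that covers all five forms and also allows < and > inside a quoted value. The names CFSET_PATTERN and convertCfset() are made up for this example. It also assumes every value should become a PHP string literal, which is not true of CFML in general (an unquoted value there is usually an expression), so treat it as a starting point only.

```php
<?php
// Sketch only: handles simple <cfset name=value> tags, not full CFML expressions.
// CFSET_PATTERN and convertCfset() are hypothetical names for this example.
// The (?|...) branch-reset group makes the value land in group 2 whichever
// alternative matched: "double quoted", 'single quoted', or bare.
const CFSET_PATTERN = '/<cfset\s+(\w+)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))\s*>/i';

function convertCfset(string $source): string
{
    $result = preg_replace_callback(
        CFSET_PATTERN,
        static function (array $m): string {
            // $m[1] is the variable name, $m[2] the value without its quotes.
            // Assumption: every value is emitted as a string literal;
            // var_export() produces a safely escaped, single-quoted one.
            return '$' . $m[1] . ' = ' . var_export($m[2], true) . ';';
        },
        $source
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $source;
}
```

Calling convertCfset('<cfset mailfrom="<hey>"><cfset fdsfsd = 3333>') should give $mailfrom = '<hey>';$fdsfsd = '3333'; so two tags on one line are handled separately. If you only need the pieces, the same pattern with preg_match_all and PREG_SET_ORDER gives the full tag in [0], the name in [1] and the value in [2].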

Community
The Mask