I'm trying to write a regex that finds image paths and replaces the existing path with my remote server path while keeping the same file name.

Users upload an HTML file, and I want to replace the paths during the upload, before saving the content to the database.

I wrote a regex based on something I found on the internet (I'm not good with regex), but it doesn't seem to work.

Please find the code snippet below.

For example, this is my HTML content:

<cfsavecontent variable="htmlcont">

<html>
<head>
</head>
<body>
<a href='http://www.w3schools.com'><img src='http://www.w3schools.com/images/w3schools.png' alt='W3Schools.com' class='img-responsive'></a>
<div class="w3-row">
<div class="w3-third w3-center">
<h2>JPG Images</h2>
<img alt="Mountain View" src="pic_mountain.jpg" style="width: 304px; height: auto" class="img-responsive">
</div>
<div class="w3-third w3-center">
<h2>GIF Images</h2>
<img alt="" src="html5.gif" style="width: 128px; height: auto" class="img-responsive">
</div>
<div class="w3-third w3-center">
<h2>PNG Images</h2>
<img alt="Graph" src="pic_graph.png" style="width: 170px; height: auto" class="img-responsive">
</div>
</div>

<table class="lamp"><tr>
<th style="width:34px">
<img src="/images/lamp.jpg" alt="Note" style="height:32px;width:32px"></th>
<td>Always specify the width and height of an image. If width and height are not specified, the page will flicker while the image loads.
</td>
</tr></table>
<hr>

<table class="lamp"><tr>
<th style="width:34px">
<img src="/images/lamp.jpg" alt="Note" style="height:32px;width:32px"></th>
<td>Add &quot;border:0;&quot; to prevent IE9 (and earlier) from displaying a border around the image.</td>
</tr></table>
<hr>
</body>
</html>
</cfsavecontent>

I'm trying to replace

http://www.w3schools.com/images/w3schools.png

TO

http://myremoteServer.com/234001/images/w3schools.png

OR

<img alt="Graph" src="pic_graph.png" style="width: 170px; height: auto" class="img-responsive">

TO

<img alt="Graph" src="http://myremoteServer.com/234001/images/pic_graph.png" style="width: 170px; height: auto" class="img-responsive">

CODE:

<cfset regxv =  'src="\K[^"]*(?=")' />
<cfset resluthtml = REReplace (htmlcont,regxv, "http://myremoteServer.com/234001/images/") />


<cfdump var="#resluthtml#" label="resluthtml" >
Mykola
IBM
  • Please consider using jSoup instead of Regex. See: http://www.bennadel.com/blog/2358-parsing-traversing-and-mutating-html-with-coldfusion-and-jsoup.htm – Henry Jan 14 '16 at 21:12

1 Answer

Capturing with ColdFusion's built-in regex engine (Jakarta ORO, Perl 5 syntax) is a pain, so let's just use some Java magic here (java.util.regex):

<cfset regex    = createObject("java", "java.util.regex.Pattern").compile('<img [^>]*src=["'']([^"'']*)["'']')>

<cfset result   = createObject("java", "java.lang.StringBuilder").init()>
<cfset matcher  = regex.matcher(htmlcont)>
<cfset last     = 0>

<cfloop condition="matcher.find()">

    <!--- copy everything up to the start of the captured src value (group 1), so the rest of the tag is kept --->
    <cfset result.append(
        htmlcont.substring(
            last,
            matcher.start(javaCast("int", 1))
        )
    )>

    <cfset token = matcher.group(
        javaCast("int", 1)
    )>

    <!--- go with your replace logic here, token is the value of the [src] attribute --->
    <cfset token = ("http://myremoteServer.com/234001/images/" & listLast(token, "/"))>

    <cfset result.append(token)>

    <!--- continue after the end of the src value, not after the whole match --->
    <cfset last = matcher.end(javaCast("int", 1))>
</cfloop>

<cfset result.append(
    htmlcont.substring(last)
)>
<cfset result = result.toString()>

<cfdump var="#result#">
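
For reference, here is a rough sketch of the same idea in plain Java, which may make the group offsets easier to follow. The class name `ImgSrcRewriter` and the method `rewrite` are made up for this illustration. It assumes the same pattern as above and that only the text after the last `/` of each `src` should be kept. Only the captured `src` value is replaced; the rest of each `<img>` tag is copied through unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of ColdFusion or any library.
public class ImgSrcRewriter {

    // Same pattern as in the CFML above: group 1 is the value of the src attribute.
    private static final Pattern IMG_SRC =
        Pattern.compile("<img [^>]*src=[\"']([^\"']*)[\"']");

    // Rewrites every <img src> so that it points to baseUrl + original file name.
    public static String rewrite(String html, String baseUrl) {
        Matcher matcher = IMG_SRC.matcher(html);
        StringBuilder result = new StringBuilder();
        int last = 0;

        while (matcher.find()) {
            // Copy everything up to the start of the src value, tag text included.
            result.append(html, last, matcher.start(1));

            // Keep only the part after the last "/" (the file name).
            String src = matcher.group(1);
            String fileName = src.substring(src.lastIndexOf('/') + 1);
            result.append(baseUrl).append(fileName);

            // Resume right after the src value, before its closing quote.
            last = matcher.end(1);
        }

        // Copy whatever follows the last match.
        result.append(html, last, html.length());
        return result.toString();
    }

    public static void main(String[] args) {
        String html = "<img alt=\"Graph\" src=\"pic_graph.png\" class=\"img-responsive\">";
        System.out.println(rewrite(html, "http://myremoteServer.com/234001/images/"));
    }
}
```

Like any regex over HTML, this is fragile (for example with unquoted attributes or a `data-src` attribute in the same tag), so a real parser such as jsoup, as suggested in the comment on the question, is likely the safer choice for untrusted uploads.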
Alex