
I was wondering if ColdFusion has a built-in function to find email addresses in a string.

I am trying to read through query output, e.g. "John Smith jsmith@example.com", and extract only the email address.

I did something like this in the past by counting the spaces in the string: after the second space, I wiped out all the characters to the left, which left only the email address.

Though this can work in my situation, it is not safe. It almost guarantees bugs and mishandled data when the input arrives in a different format, such as "John jsmith@example.com", in which case I would wipe away all the information.

Geo

2 Answers


A regex is probably the easiest way. There is an "ultimate" regex for email addresses, but it is quite large. The simpler pattern below should cover most valid addresses, though it doesn't cover Unicode, for example. Note that the maximum TLD length is 63 characters (see this SO question & answer).

<cfset string="some garbae@.ca garbage@ca.a real@email.com another@garbage whatever another@email.com oh my!">

<cfset results = reMatchNoCase("[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,63}", string)>

<cfdump var="#results#">
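To tie this back to the question's input, here is a minimal sketch that pulls just the first address out of a single value. The variable names `input` and `emails` are only illustrative, and the `arrayLen` check is there because `reMatchNoCase` returns an empty array when nothing matches.

```cfml
<!--- Illustrative input; in practice this would be a query column value. --->
<cfset input = "John Smith jsmith@example.com">

<!--- Same pattern as above; reMatchNoCase returns an array of all matches. --->
<cfset emails = reMatchNoCase("[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,63}", input)>

<cfif arrayLen(emails)>
    <!--- Take the first match only. --->
    <cfoutput>#emails[1]#</cfoutput>
<cfelse>
    <cfoutput>No email address found</cfoutput>
</cfif>
```

Because the match does not depend on how many words come before the address, "John jsmith@example.com" yields the same result as "John Smith jsmith@example.com".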
Community
BKK

You can use this UDF by Ray Camden from cflib.org. It works great for me.

<cfscript>
/**
 * Searches a string for email addresses.
 * Based on the idea by Gerardo Rojas and the isEmail UDF by Jeff Guillaume.
 * New TLDs
 * v3 fix by Jorge Asch
 *
 * @param str    String to search. (Required)
 * @return Returns a list. 
 * @author Raymond Camden 
 * @version 3, June 13, 2011 
 */
function getEmails(str) {
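    // Pattern: local part, "@", domain labels, then a listed long TLD or any 2-3 letter TLD.
    var email = "(['_a-z0-9-]+(\.['_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.((aero|coop|info|museum|name|jobs|travel)|([a-z]{2,3})))";
    var res = "";
    var marker = 1;
    var matches = "";

    // The fourth argument (returnsubexpressions) makes reFindNoCase return a struct of pos/len arrays.
    matches = reFindNoCase(email, str, marker, true);

    // Collect each match, then resume the search just past it.
    while (matches.len[1] gt 0) {
        res = listAppend(res, mid(str, matches.pos[1], matches.len[1]));
        marker = matches.pos[1] + matches.len[1];
        matches = reFindNoCase(email, str, marker, true);
    }
    return res;
}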
</cfscript>
Matt Busche
    Note that this uses the specific TLD list approach. Since the TLD list is now in constant change, it's safest to move to a character-count approach. – BKK Dec 21 '12 at 23:36
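
As a rough illustration of the character-count approach mentioned in the comment, the UDF could be adapted along these lines. This is only a sketch, not part of the cflib.org code: the name `getEmailsByLength` is hypothetical, and the 2 to 63 letter TLD range is borrowed from the other answer.

```cfml
<cfscript>
/**
 * Hypothetical variant of getEmails that accepts any 2-63 letter TLD
 * instead of a fixed TLD list. Returns a comma-delimited list.
 */
function getEmailsByLength(str) {
    // Same local-part and domain pattern as getEmails; only the TLD part differs.
    var pattern = "['_a-z0-9-]+(\.['_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,63}";
    // reMatchNoCase returns an array of every match; convert it to a list.
    return arrayToList(reMatchNoCase(pattern, str));
}
</cfscript>
```

Using `reMatchNoCase` here also removes the need for the manual `marker` loop, since it returns every match in one call.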