I am using Rails 3.1.1 and I would like to recognize (maybe using a regex) if a string "contains"/"is"/"represents" one of the following:

  • an email address
  • a Web site URL
  • a number

I am trying to implement a method that, given a string, returns:

  • email if the string is something like my_nick@email_provider.org
  • website if the string is something like www.web_address.org
  • number if the string is something like 123
  • nil otherwise

Is this possible? How can I do that?

Backo
  • 2
    I think the answer is reasonably obvious. Create a function that tests if input is an email; if so return "email". Then test if a website; if so return website. Then number. Finally return null. You can find individual solutions to each case by googling. – ean5533 Jan 11 '12 at 13:30

2 Answers

Here's some code for you:

def whatami(input)
  return :email   if input =~ EmailRegex 
  return :website if input =~ WebsiteRegex
  return :number  if input =~ NumberRegex
  nil
end

You can look up individual regexes for each case above -- or perhaps you can find non-regex solutions for some cases.
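To make the skeleton above concrete, here is a minimal sketch with the three constants filled in. The patterns and the names `EMAIL_REGEX`, `WEBSITE_REGEX`, `NUMBER_REGEX` and `classify` are my own assumptions for illustration; they are deliberately loose and are not proper validators, so swap in stricter ones if your app needs them.

```ruby
# Illustrative, deliberately loose patterns (assumptions, not validators).
# \A and \z anchor the whole string, so partial matches do not count.
EMAIL_REGEX   = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/
WEBSITE_REGEX = %r{\A(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}(?:/\S*)?\z}i
NUMBER_REGEX  = /\A[+-]?\d+(?:\.\d+)?\z/

# Returns :email, :website, :number, or nil. The first matching test wins,
# so the order of the checks matters.
def classify(input)
  return :email   if input =~ EMAIL_REGEX
  return :website if input =~ WEBSITE_REGEX
  return :number  if input =~ NUMBER_REGEX
  nil
end
```

With these patterns, the three example strings from the question map to `:email`, `:website` and `:number` respectively, and a plain word such as `"hello"` gives `nil`.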

undur_gongor
ean5533

It's definitely possible. But wait, here's a Perl regex for email alone. Are you sure you want to continue down this path? :-)

/(?(DEFINE)
   (?<address>         (?&mailbox) | (?&group))
   (?<mailbox>         (?&name_addr) | (?&addr_spec))
   (?<name_addr>       (?&display_name)? (?&angle_addr))
   (?<angle_addr>      (?&CFWS)? < (?&addr_spec) > (?&CFWS)?)
   (?<group>           (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ;
                                          (?&CFWS)?)
   (?<display_name>    (?&phrase))
   (?<mailbox_list>    (?&mailbox) (?: , (?&mailbox))*)

   (?<addr_spec>       (?&local_part) \@ (?&domain))
   (?<local_part>      (?&dot_atom) | (?&quoted_string))
   (?<domain>          (?&dot_atom) | (?&domain_literal))
   (?<domain_literal>  (?&CFWS)? \[ (?: (?&FWS)? (?&dcontent))* (?&FWS)?
                                 \] (?&CFWS)?)
   (?<dcontent>        (?&dtext) | (?&quoted_pair))
   (?<dtext>           (?&NO_WS_CTL) | [\x21-\x5a\x5e-\x7e])

   (?<atext>           (?&ALPHA) | (?&DIGIT) | [!#\$%&'*+-/=?^_`{|}~])
   (?<atom>            (?&CFWS)? (?&atext)+ (?&CFWS)?)
   (?<dot_atom>        (?&CFWS)? (?&dot_atom_text) (?&CFWS)?)
   (?<dot_atom_text>   (?&atext)+ (?: \. (?&atext)+)*)

   (?<text>            [\x01-\x09\x0b\x0c\x0e-\x7f])
   (?<quoted_pair>     \\ (?&text))

   (?<qtext>           (?&NO_WS_CTL) | [\x21\x23-\x5b\x5d-\x7e])
   (?<qcontent>        (?&qtext) | (?&quoted_pair))
   (?<quoted_string>   (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))*
                        (?&FWS)? (?&DQUOTE) (?&CFWS)?)

   (?<word>            (?&atom) | (?&quoted_string))
   (?<phrase>          (?&word)+)

   # Folding white space
   (?<FWS>             (?: (?&WSP)* (?&CRLF))? (?&WSP)+)
   (?<ctext>           (?&NO_WS_CTL) | [\x21-\x27\x2a-\x5b\x5d-\x7e])
   (?<ccontent>        (?&ctext) | (?&quoted_pair) | (?&comment))
   (?<comment>         \( (?: (?&FWS)? (?&ccontent))* (?&FWS)? \) )
   (?<CFWS>            (?: (?&FWS)? (?&comment))*
                       (?: (?:(?&FWS)? (?&comment)) | (?&FWS)))

   # No whitespace control
   (?<NO_WS_CTL>       [\x01-\x08\x0b\x0c\x0e-\x1f\x7f])

   (?<ALPHA>           [A-Za-z])
   (?<DIGIT>           [0-9])
   (?<CRLF>            \x0d \x0a)
   (?<DQUOTE>          ")
   (?<WSP>             [\x20\x09])
 )

 (?&address)/x

Copied from here.

Community
Sergio Tulentsev
  • I think that is for a *valid* email though. More simply, you could check for an @ sign and you know it's an email address, and if not (as @ean5533 commented) then check for a word, then a number, then nil. – ian Jan 11 '12 at 17:17
  • @Iain: quick and dirty, yes, could work, if his app allows such relaxed checks :-) – Sergio Tulentsev Jan 11 '12 at 17:19
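
The quick and dirty variant discussed in these comments might look roughly like the sketch below. The method name `quick_guess` and the exact checks are assumptions of mine: anything containing an @ is treated as an email, anything with a dot and no whitespace as a website, and a run of digits as a number.

```ruby
# Relaxed classification: no validation, just cheap heuristics.
# quick_guess is a hypothetical name used only for this sketch.
def quick_guess(input)
  text = input.to_s.strip
  return :email   if text.include?("@")      # any @ sign counts as email
  return :website if text =~ /\A\S+\.\S+\z/  # a dot and no whitespace
  return :number  if text =~ /\A\d+\z/       # digits only
  nil
end
```

The trade-off is false positives: a string like `"not an email @ all"` is still reported as `:email`, so this only fits if the app can tolerate such relaxed checks.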