
I need to find email addresses in a block of plain text. The regular expressions I have tried regularly hang. Is there a Java library that can locate email addresses in a text string?

From a later comment: I need to find each email address in a string so that I can replace it with a hyperlink.

Freiheit
Stepan Yakovenko
  • 3
    You are talking about your own emails, right? In times of PRISM I just had to ask... – Najzero Jun 19 '13 at 13:31
  • You are searching text data which happens to be an email. How is this text data formatted or stored? Do you have it in a flat file? In a database? – Freiheit Jun 19 '13 at 13:33
  • 1
    Do you mean finding valid email addresses in plain text or search email text for some string? – Wim Deblauwe Jun 19 '13 at 13:39
  • Are you sure it's the regex handling that is resource-greedy (sorry for the unintended pun here)? You could probably improve performance by using a constant Pattern (see also Anirudh's answer for a Pattern example) and only one instance of Matcher that iterates over the find() method until text is parsed. If your corpus is very large, you could use a buffered reader and reinitialize your Matcher every line... Need to see some code here. – Mena Jun 19 '13 at 14:23
  • I identify email addresses so that I can turn them into href links. – Stepan Yakovenko Jun 19 '13 at 14:39
  • Can you share a sample input and what you want the output to look like? – Freiheit Jun 20 '13 at 14:51

1 Answer


An email address is valid if you can send a message to it.

Use the regex \S+@\S+ to search for candidate email addresses (warning: even a space can be valid in an email address).

Then send a message to each address you find and wait for a response from the user.

If the email address is valid, you will receive a response; if not, you can assume that the address is invalid. This is the only correct way to validate an email address.
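
Since the goal stated in the question is to turn addresses into hyperlinks, here is a minimal sketch of how the \S+@\S+ search could be wired up in Java. The class name `EmailLinker`, the method name `linkify`, and the `mailto:` link format are assumptions made for illustration, not something from the question. The pattern is compiled once and a single Matcher walks the text with find(), as suggested in the comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names and the mailto: format are illustrative assumptions.
public class EmailLinker {

    // Compiled once and reused. \S+@\S+ is the deliberately loose pattern from this answer.
    private static final Pattern CANDIDATE = Pattern.compile("\\S+@\\S+");

    // Wraps every whitespace-delimited token containing '@' in an <a href="mailto:..."> tag.
    public static String linkify(String text) {
        Matcher m = CANDIDATE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String addr = m.group();
            String link = "<a href=\"mailto:" + addr + "\">" + addr + "</a>";
            // quoteReplacement keeps '$' and '\' in the address from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(link));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify("Write to bob@example.com today"));
    }
}
```

Because the pattern matches any run of non-whitespace around an @, trailing punctuation (as in "bob@example.com.") ends up inside the link, and nothing is HTML-escaped. Both would need handling before real use.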


have a look at

Community
Anirudha