If you just want to catch most email addresses, this regex should be good enough.
I got it from the question "How to validate an email address using a regular expression?",
where they also talk about the much more complicated RFC 822 email regex.
#!/usr/bin/env ruby
input = $stdin.readlines # press Ctrl+D after pasting
input.each do |line|
  # print only the lines that look like an email address
  puts line if line[/^[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/]
end
# test input
# foo@bar.com
# www.cnn.com
# test.email@go.com
# turdburgler@mcdo.net
# http://www.google.com
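To check what the regex keeps without pasting anything into a terminal, here is a minimal sketch that runs the same pattern over the test input above. The names EMAIL_PATTERN and extract_emails are made up for this example and are not part of the original script. It also swaps ^ and $ for \A and \z, which in Ruby anchor to the whole string rather than to a single line, so each line is stripped of its trailing newline first.

```ruby
# Hypothetical names for illustration; same character classes as the regex above.
EMAIL_PATTERN = /\A[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+\z/

# Returns the lines that look like email addresses, without surrounding whitespace.
def extract_emails(lines)
  lines.map(&:strip).select { |line| line.match?(EMAIL_PATTERN) }
end

sample = [
  "foo@bar.com\n",
  "www.cnn.com\n",
  "test.email@go.com\n",
  "turdburgler@mcdo.net\n",
  "http://www.google.com\n"
]

puts extract_emails(sample)
# prints:
# foo@bar.com
# test.email@go.com
# turdburgler@mcdo.net
```

The two URLs are dropped because neither contains an @.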
To write emails to a file:
#!/usr/bin/env ruby
input = $stdin.readlines # press Ctrl+D after pasting
# the block form of File.open closes emails.txt automatically
File.open("emails.txt", "w") do |file|
  input.each do |line|
    file.write(line) if line[/^[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/]
  end
end
Just to be clear, this is a Ruby script, which should be run like this.
Save the script as a file, e.g. email_parser.rb, then:
chmod +x email_parser.rb
./email_parser.rb # this will wait for stdin; paste the list into the terminal here
While the terminal is waiting for input, paste in the list of emails, then press Ctrl+D to tell the program it has reached the end of the input (EOF). The program then runs through the pasted lines (emails and URLs) and keeps only the ones that match the regex. The first script prints the matches to the terminal; the second one writes them to a file called emails.txt in the folder you ran the script from.
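If you want to try the read, filter, write flow without the interactive paste, a rough sketch is to pass the input and output in as stream objects. The helper name copy_emails and the constant EMAIL_LINE are made up for this example, and StringIO stands in for both the pasted text and emails.txt; with the real script you would pass $stdin and an opened File instead.

```ruby
require "stringio"

# Hypothetical names for illustration; the pattern is the one from the scripts above.
EMAIL_LINE = /^[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/

# Copies every line of `input` that looks like an email address to `output`
# and returns how many lines were written.
def copy_emails(input, output)
  count = 0
  input.each_line do |line|
    next unless line =~ EMAIL_LINE
    output.write(line)
    count += 1
  end
  count
end

pasted = StringIO.new("foo@bar.com\nwww.cnn.com\ntest.email@go.com\n")
saved = StringIO.new
written = copy_emails(pasted, saved)
puts "wrote #{written} emails"
```

Calling copy_emails($stdin, file) inside a File.open("emails.txt", "w") block should behave like the second script.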