I have data as follows:

emails
my email id: xxx.x@gmail.com
email to: bb_b@yahool.com
mailto: hj-hk@grk.co
you can send email to ghhd@test.co
gggh@gh.tom

I want to extract only the word containing "@", as follows:

email
xxx.x@gmail.com
bb_b@yahool.com
hj-hk@grk.co
ghhd@test.co
gggh@gh.tom

Until now I have been doing this manually for each row using

substring(data[1,1], 14)

But clearly this does not scale when the data is as large as 900k rows. Any help will be highly appreciated. TIA.

user3642360

1 Answer

You could use regexpr together with regmatches. Both are vectorised, so no row-by-row loop is needed.

regmatches(d$emails, regexpr("(\\S*\\@\\S+\\.\\S*)", d$emails))
# [1] "xxxx@gmail.com" "bbb@yahool.com" "hjhk@grk.co"    "ghhd@test.co"  
# [5] "gggh@gh.tom"   
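
One caveat with the call above: regmatches() drops elements that have no match, so the result can be shorter than the input. Below is a minimal sketch, assuming you want exactly one result per row (NA where no address is found). It uses the sample strings from the question plus one hypothetical row without an address. The simpler pattern \\S+@\\S+ is my reading of "the word containing @", not a full email validator.

```r
# Sample strings from the question, plus one hypothetical row with no address
emails <- c("my email id: xxx.x@gmail.com",
            "email to: bb_b@yahool.com",
            "mailto: hj-hk@grk.co",
            "you can send email to ghhd@test.co",
            "gggh@gh.tom",
            "no address here")

# First run of non-space characters that contains an "@"
m <- regexpr("\\S+@\\S+", emails)

# regexpr() returns -1 where nothing matched
hit <- m != -1

# Pre-fill with NA so the result stays aligned with the input rows
extracted <- rep(NA_character_, length(emails))
extracted[hit] <- regmatches(emails, m)

extracted
# [1] "xxx.x@gmail.com" "bb_b@yahool.com" "hj-hk@grk.co"    "ghhd@test.co"
# [5] "gggh@gh.tom"     NA
```

Because the result has the same length as the input, it can be assigned straight back as a new column, e.g. d$email <- extracted.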

Data

d <- structure(list(emails = c("my email id: xxxx@gmail.com", "email to: bbb@yahool.com", 
"mailto: hjhk@grk.co", "you can send email to ghhd@test.co", 
"gggh@gh.tom")), row.names = c(NA, -5L), class = "data.frame")
jay.sf
  • I just found that your code does not work if the email contains special characters such as ., _ or -. Can you please help? I have updated my question with emails containing special characters. – user3642360 Apr 24 '19 at 23:45
  • @user3642360 See update. – jay.sf Apr 25 '19 at 04:37