Extract data from a column and populate extraction as new variable in existing dataframe

Question

I have a dataframe of email addresses I need to split up by address and domain. I found tidyr and its separate command, but when I run separate, I either add a dataframe to my dataframe, called "new_var," or it prints out the correctly separated data into the console.

I need the separated data to be added as new columns to my existing dataframe.

I am using something like

separate(email_data, EMAIL_ADDRESS, into=c("address","domain"), sep="@", remove=FALSE)

I need the result to add two columns to my 'email_data' DF, one named address, and one named domain.

I searched here and elsewhere, and I tried using paste( instead of c(, but that didn't work.

Any help is appreciated.

Thank you!

Are you assigning the resulting data frame back to a variable? Also, you need to supply some anonymized sample data so [your example is reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610), even if just something like `email_df <- data.frame(email = 'name@domain.com', stringsAsFactors = FALSE)` — alistaire, May 14 '18 at 19:01
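To illustrate the point about assignment: separate() returns a new data frame and leaves its input unchanged, so the result has to be assigned back to email_data. A minimal sketch, assuming made-up sample rows and a hypothetical extra column `id` that stands in for the other variables in the real data frame:

```r
library(tidyr)

# Made-up sample data; `id` is a hypothetical stand-in for the other columns.
email_data <- data.frame(
  id = 1:2,
  EMAIL_ADDRESS = c("pk@sss.com", "qwert@tyuu.com"),
  stringsAsFactors = FALSE
)

# separate() returns a modified copy, so assign it back to keep the new columns.
email_data <- separate(email_data, EMAIL_ADDRESS,
                       into = c("address", "domain"),
                       sep = "@", remove = FALSE)

names(email_data)
```

With remove=FALSE the original EMAIL_ADDRESS column is kept alongside the two new ones.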

score 1 · Answer 1 · answered May 14 '18 at 17:51

The two answers supplied were helpful (and appreciated), but neither got me exactly what I needed, which is partially my fault. All I really need is the domain portion of the email address.

I was able to extract it from the email_address field and give it its own column with the following:

email_data$domain1 <- substring(email_data$EMAIL_ADDRESS, 
regexpr("@", email_data$EMAIL_ADDRESS) + 1)

substring(text, first, last)
text = the EMAIL_ADDRESS field
first = one character after the @ symbol
last = omitted, because I want everything after the @ symbol
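The two steps can be seen separately on a small vector. This is only a sketch with made-up addresses; the names `emails`, `at_pos` and `domain` are illustrative and not part of the original data:

```r
# Made-up sample addresses.
emails <- c("pk@sss.com", "qwert@tyuu.com")

# regexpr() gives the position of the first "@" in each string.
at_pos <- regexpr("@", emails, fixed = TRUE)

# substring() with only a start position returns everything up to the end of the string.
domain <- substring(emails, at_pos + 1)
domain
```

Note that regexpr() returns -1 when there is no match, so an address without an @ would come back unchanged rather than as an empty string.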

score 0 · Answer 2 · answered May 14 '18 at 16:39

Here is an example from a former machine learning problem:

merc1 <- merc %>% separate(category_name, into=c("cn1","cn2","cn3"), sep="/", extra="drop")

Is your input column of type character?
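For reference, a self-contained version of that call might look like the sketch below. The contents of `merc` are made up here, since the original data is not shown; the second row has four pieces on purpose to show what extra="drop" does.

```r
library(tidyr)  # tidyr also makes the %>% pipe available

# Hypothetical stand-in for the original `merc` data.
merc <- data.frame(
  category_name = c("Men/Tops/T-shirts", "Electronics/Computers/Laptops/Gaming"),
  stringsAsFactors = FALSE
)

# extra = "drop" silently discards any pieces beyond the three named columns.
merc1 <- merc %>%
  separate(category_name, into = c("cn1", "cn2", "cn3"), sep = "/", extra = "drop")

merc1
```

As in the question, the result is only kept because it is assigned to a variable (merc1).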

Peter Hahn

Prany · Answer 3 · 2018-05-14T17:06:14.883

You can use the code below:

library(stringr)    
email_data <- str_split_fixed(email_data$EMAIL_ADDRESS, "@", 2)
colnames(email_data) <- c("Address","Domain")

I have tested this and this will work.

Edit: Adding an example

Name <- c('testname', 'testname1234')
EMAIL_ADDRESS <- c('pk@sss.com', 'qwert@tyuu.com')
Init_frame <- data.frame(Name, EMAIL_ADDRESS)
Init_frame

# Put the email column in a temporary data frame and split it on "@"
email_data <- data.frame(EMAIL_ADDRESS)
library(stringr)
email_data <- str_split_fixed(email_data$EMAIL_ADDRESS, "@", 2)
colnames(email_data) <- c("Address", "Domain")
email_data

# Rebuild the main data frame with the two new columns
Init_frame <- data.frame(Name, email_data)
Init_frame

edited May 14 '18 at 17:06

answered May 14 '18 at 16:47

Prany


Part of the way there, yes, thank you. This is one variable in a df with many. I need it to look at the email address field, and add two new columns to the DF, one is the address, one is the domain. – Adam_S May 14 '18 at 16:51

In this case, create the email address in a temporary df, split it, and add it to the main dataframe. For example:

Name <- c('testname', 'testname1234')
EMAIL_ADDRESS <- c('pk@sss.com', 'qwert@tyuu.com')
Init_frame <- data.frame(Name, EMAIL_ADDRESS)
Init_frame
email_data <- data.frame(EMAIL_ADDRESS)
library(stringr)
email_data <- str_split_fixed(email_data$EMAIL_ADDRESS, "@", 2)
colnames(email_data) <- c("Address", "Domain")
email_data
Init_frame <- data.frame(Name, email_data)
Init_frame

– Prany May 14 '18 at 17:01
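If the data frame has many other columns, as described in the comment above, an alternative is to assign the split pieces directly as new columns instead of rebuilding the frame. A minimal sketch, reusing the sample values from this answer; the intermediate name `parts` is illustrative:

```r
library(stringr)

# Sample data from the example; any other columns would be left untouched.
Init_frame <- data.frame(
  Name = c("testname", "testname1234"),
  EMAIL_ADDRESS = c("pk@sss.com", "qwert@tyuu.com"),
  stringsAsFactors = FALSE
)

# str_split_fixed() returns a character matrix with one column per piece.
parts <- str_split_fixed(Init_frame$EMAIL_ADDRESS, "@", 2)

# Assign each matrix column to a new data frame column.
Init_frame$Address <- parts[, 1]
Init_frame$Domain <- parts[, 2]

Init_frame
```

This keeps the original EMAIL_ADDRESS column and every other variable in place.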
