-3

Suppose our data frame is as follows:

(1, Mr. John, 20000) (2, Mr. Leo, 50000) (3, Miss Anne, 30000) (4, Mrs. Gerald, 35000)

I want to extract only the titles (Mr., Miss, Mrs.) from the 'names' column and store them in a vector. How can I do this?

  • Hi chinmaya kalo and welcome to SO! When asking a question it is best practice to provide a reproducible example and format the code in your question. You can find all the ways to do so here: [how-to-make-a-great-r-reproducible-example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). (+1 to upvote your question to 0). – cbo Sep 22 '20 at 07:37
  • Hi Chinmaya, if you found any of the answers useful, please mark it so. – Karthik S Sep 22 '20 at 10:32

3 Answers

1

Does this help?

> df <- data.frame(id = c(1,2,3,4), name = c('Mr. John', 'Mr. Leo', 'Miss Anne', 'Mrs. Gerald'), sal = c(20000, 50000, 30000, 35000), stringsAsFactors = FALSE)
> df
  id        name   sal
1  1    Mr. John 20000
2  2     Mr. Leo 50000
3  3   Miss Anne 30000
4  4 Mrs. Gerald 35000
> vec <- gsub('(^M.+)\\s([A-Za-z].+)', '\\1', df$name)
> vec
[1] "Mr."  "Mr."  "Miss" "Mrs."
Karthik S
0

An alternative approach using separate() from tidyr (and the data frame created by Karthik):

library(tidyr)
vec <- separate(df, name, sep = " ", into = "title", extra = "drop")[[2]]

Here df is your data frame and name is whatever your names column is called. sep sets the delimiter used to split the string, and into names the new column (if you were keeping it as a column). extra = "drop" discards the pieces that do not fit into the into columns (here, the surname) without a warning; you would get one otherwise. [[2]] extracts the second column, which is the newly created one, as a plain character vector. Single brackets, [2], would return a one-column data frame instead.

If you wanted to separate the names column into two columns (i.e. title and surname) and keep them inside your data frame, you could do:

df2 <- separate(df, name, sep = " ", into = c("title", "surname"))
BubbleMaus
0

Try this. I don't know whether sapply could also be an option here. Your data frame is not defined correctly in the question, so I am assuming df looks like the one you have:

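df <- data.frame(id = c(1, 2, 3, 4), name = c("Mr. John", "Mr. Leo", "Miss Anne", "Mrs. Gerald"),
                 value = c(20000, 50000, 30000, 35000), stringsAsFactors = FALSE)

# Split each name on spaces, then collect the first piece of every name
splitted_name <- strsplit(df$name, " ")
a <- character(0)
for (i in splitted_name) a <- append(a, i[1])
print(a)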
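
On the sapply question: a loop-free version does seem possible. The following is only a minimal sketch in base R, assuming the same example data as above (the column names id, name and value are assumptions, since the question does not give them). It uses vapply, a stricter sibling of sapply that checks that every result is a single character string.

```r
# Example data taken from the question; the column names are assumptions.
df <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Mr. John", "Mr. Leo", "Miss Anne", "Mrs. Gerald"),
  value = c(20000, 50000, 30000, 35000),
  stringsAsFactors = FALSE
)

# Split each name on single spaces; the result is a list of character vectors.
parts <- strsplit(df$name, " ", fixed = TRUE)

# Take the first piece of every element. vapply() fails loudly if any
# result is not exactly one character string.
titles <- vapply(parts, function(p) p[1], character(1))
print(titles)
```

This avoids growing a vector inside a loop. It still assumes the title is always the first space-separated word of the name.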