1

I have a guest list with the last name in one column and, in another column, the first names or the full names (first space last) of each person in the family. I want that second column to contain just the first names.

gsub(guest.w$Last.Name,"",guest.w$Party.Name.s.)

That would work perfectly if I had just one row, but how do I do it for each row in the data frame? Do I have to write a for loop? Is there a way to do it in parallel, similar to the way pmax() relates to max()?

My problem is similar in a way to a previously asked question by JD Long but that question was a piece of cake compared to mine.

Example:

Smith; Joe Smith, Kevin Smith, Jane Smith
Alter; Robert Alter, Mary Alter, Ronald Alter

Becomes

Smith; Joe, Kevin, Jane
Alter; Robert, Mary, Ronald

Farrel

3 Answers

1

Using Hadley's adply:

library(plyr)
df <- data.frame(rbind(c('Smith', 'Joe Smith, Kevin Smith, Jane Smith'), c('Alter', 'Robert Alter, Mary Alter, Ronald Alter')))
names(df) <- c("last", "name")
adply(df,1,transform, name=gsub(last, '', name))

You will probably need to clean up the spaces in your new vector.
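The leftover spaces appear because only the surname is removed, not the space in front of it. A minimal sketch of one way to avoid them, assuming the same two-row example data and using base R's mapply() in place of adply() (the column name first is made up for illustration):

```r
# Same example data as above; stringsAsFactors = FALSE keeps both columns as character.
df <- data.frame(
  last = c("Smith", "Alter"),
  name = c("Joe Smith, Kevin Smith, Jane Smith",
           "Robert Alter, Mary Alter, Ronald Alter"),
  stringsAsFactors = FALSE
)

# mapply() calls the function once per row, pairing each surname with its party string.
# Prefixing the pattern with a space removes " Smith" rather than "Smith",
# so no stray spaces are left behind. fixed = TRUE treats the surname as literal text.
# "first" is a hypothetical column name.
df$first <- mapply(
  function(last, name) gsub(paste0(" ", last), "", name, fixed = TRUE),
  df$last, df$name,
  USE.NAMES = FALSE
)

print(df$first)
```

Entries that already hold only first names pass through unchanged, since the pattern never matches. A first name that itself begins with the surname would still be clipped, so this is a sketch rather than a complete solution.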

Eduardo Leoni
0

You probably need to do some "wrapping" around your expression in order to get the apply() function working:

  • If you're working on a data.frame, you should use apply() (and not sapply())
  • You must pass apply() a function (here written with an explicit return())
  • Working with a data.frame row as function input is a bit tricky: rows are converted into vectors and lose some properties (you can't use the $ sign to access named fields), so it's better to convert each row into a list first

The final result looks something like this:

df <- rbind(c('Smith', 'Joe Smith, Kevin Smith, Jane Smith'), c('Alter', 'Robert Alter, Mary Alter, Ronald Alter'))
colnames(df) = c('Last.Name', 'Party.Name.s.')
apply(df,1,function(y) {y = as.list(y);return(gsub(y$Last.Name, "", y$Party.Name.s.))}) 
GDP
Izzy
-2

I am not sure it will work on a dataframe, but you could try one of the apply functions:

`y1 <- sapply(dataframe, gsub(guest.w$Last.Name,"",guest.w$Party.Name.s.))`
twolfe18
  • sapply(guest.w,gsub(guest.w$Last.Name,"",guest.w$Party.Name.s.)) No, I tried that. Error in match.fun(FUN) : 'gsub(guest.w$Last.Name, "", guest.w$Party.Name.s.)' is not a function, character or symbol. In addition: Warning message: In gsub(guest.w$Last.Name, "", guest.w$Party.Name.s.) : argument 'pattern' has length > 1 and only the first element will be used – Farrel Jan 16 '10 at 21:59
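
The error quoted in the comment arises because the second argument of sapply() must be a function, while the gsub(...) call there is evaluated first and hands sapply() a character vector. The warning is separate: gsub() uses only the first element of a pattern vector. A sketch of one way around both, using base R's Vectorize(). The two-row guest.w below is a stand-in built for illustration with the question's column names, and pgsub is a hypothetical name:

```r
# Stand-in for the asker's data frame (an assumption; only the column names come from the question).
guest.w <- data.frame(
  Last.Name = c("Smith", "Alter"),
  Party.Name.s. = c("Joe Smith, Kevin Smith, Jane Smith",
                    "Robert Alter, Mary Alter, Ronald Alter"),
  stringsAsFactors = FALSE
)

# gsub() is vectorized over x but not over pattern.
# Vectorize() wraps it with mapply() so pattern and x are walked in parallel,
# roughly the way pmax() relates to max().
pgsub <- Vectorize(gsub, vectorize.args = c("pattern", "x"), USE.NAMES = FALSE)

first.names <- pgsub(guest.w$Last.Name, "", guest.w$Party.Name.s.)

# Tidy the spaces left where each surname used to be.
first.names <- trimws(gsub(" +,", ",", first.names))
print(first.names)
```

The result could then be assigned back to guest.w$Party.Name.s. if overwriting the column is what is wanted.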