0

I've got the following code in Google Spreadsheets: =JOIN(" ",B2:E2)

It merges three columns: 'First Name', 'Middle Initial', 'Last Name' in order to get me a new column, which I call 'Unique Name'

Now, Google Sheets doesn't let me join as an arrayformula (=ARRAYFORMULA(JOIN(" ",$B$2:$E)) returns error: "JOIN range must be a single row or a single column."

So, I'm trying to figure out how to do this in R.

My dataframe (I think I'm using that term correctly) name in R is "NSCH", and the columns are named 'First.Name', 'Middle.Initial', 'Last.name', 'Name.Suffix'

Other search terms: How to do a (google sheets|excel) "join" (function|formula) in R

Josh
  • 311
  • 3
  • 11
  • Bah I also didnt realize but you pointed to another stack overflow answer that addressed your question - **in your own answer**. Don't do this - it already has an answer! – Chris May 26 '16 at 00:03
  • Hi Cris, thanks for your comment. The answer that @thelatemail found is perfect - but I think that one reason I wasn't able to find it was that it uses the word "concatenate" instead of "join". (Meanqhile, the answer that I found myself took me some time to find, and contains extraneous information that was not relevant to the question). Given that, is it still a duplicate question? – Josh May 27 '16 at 21:49
  • Hi Josh - yes it is, concatenate is another (better) name for join, so your answer is a duplicate. The core of your question is "how do I concantenate two strings in r", not "how do I replicate the google join function" (even though the latter is how you asked). I would also say that the answer you linked is one of the simpler ones on the site - it is hard for me to see any extraneous information. – Chris May 31 '16 at 20:47

2 Answers2

1

I've found the following syntax How to merge two columns in R with a specific symbol?

DF$C = paste(DF$A, DF$B, sep="_")

Where DF = Dataframe name, and [A|B|C] = Column name

For my data frame, this would be:

NSCH$Unique.Name = paste(NSCH$First.Name, NSCH$Middle.Initial, NSCH$Last.Name, sep=" ")

In order to see the change in the data frame in my version of RStudio, I have to run the View (note the capital V) command:

View(NSCH)

I also had four possible items instead of three in my dataset, so to get a column with the true UniqueName, which joins all four columns I used the following syntax:

NSCH$Unique.Name = paste(NSCH$First.Name, NSCH$Middle.Initial, NSCH$Last.Name, NSCH$Name.Suffix, sep=" ")
View(NSCH)
Community
  • 1
  • 1
Josh
  • 311
  • 3
  • 11
  • 2
    I don't know why you are referring to yourself in the second person - answering your own questions is fairly encouraged here, no need to pretend it is not what is going on – Chris May 25 '16 at 23:28
  • 1
    You could also save a bit of typing by doing something like `do.call(paste, c(NSCH[c("First.Name", "Middle.Initial", "Last.Name")], sep="_") )` – thelatemail May 25 '16 at 23:56
1

Slightly easier might be unite from tidyr. Something like this:

library(tidyr)

unite(NSCH,
      Unique.Name,
      First.Name, Middle.Initial, Last.name, Name.Suffix,
      sep = ' ')
Mhairi McNeill
  • 1,951
  • 11
  • 20