0

I have a dataframe

  Fruit  Amount  Price
  Apple    2      20
  Banana   NA     59
  Grapes   NA     NA

I want to create a "key" column by concatenating them with ~ between the columns. Output I need -

  1. Apple~2~20
  2. Banana~~59
  3. Grapes~~

Currently using paste(fruit, amount,price, sep="~", collapse=NULL) But I get the outputs with NA-

  1. Banana~NA~59
  2. Grapes~NA~NA Any help on any other function that can be used ?
Rushabh Patel
  • 2,672
  • 13
  • 34

3 Answers3

0

Try this-

output <- paste(fruit, amount,price, sep="~", collapse=NULL)
gsub("NA",'',output)

Input data:

fruit <- c('Apple','Banana','Grapes')
amount <- c(100,20,NA)
price <- c(NA,60,10)
Rushabh Patel
  • 2,672
  • 13
  • 34
  • Rstudio went not responding after I ran this. The real dataset has more than 300,000 rows. Waited for 15 minutes for it torespond back, when it was finally functioning normally, however the NAs were still there. Hadn't been replaced by the blanks. I wonder if this due to the huge size of the dataframe. – Aditya Biradar Oct 10 '19 at 09:15
0

You may replace the NAs with empty strings and then work with this modified data.frame

df <- read.table(text = "Fruit Amount Price
Apple 2 20
Banana NA 59
Grapes NA NA", header = TRUE, stringsAsFactors = FALSE)

df[is.na(df)] = ""
apply(df, 1, paste, collapse = "~")

# "Apple~2~20" "Banana~~59" "Grapes~~"  
Ape
  • 1,159
  • 6
  • 11
0

Here's a solution that doesn't require changing the original dataset or generating an additional one:

data$key<-apply(data,1,function(x) paste(ifelse(is.na(x),"",x),collapse="~"))

   Fruit Amount Price         key
1  Apple      2    20 Apple~ 2~20
2 Banana     NA    59  Banana~~59
3 Grapes     NA    NA    Grapes~~
iod
  • 7,412
  • 2
  • 17
  • 36
  • So the problem I'm facing is, the dataframe I've used in the problem is just a dummy example. The real one consists of more than 15 columns out of which I need to select 7 columns to concatenate this way ( may not be consecutive). The number of rows as well is more than 300,000. Using this function concatenates all the columns rather than specific ones. – Aditya Biradar Oct 10 '19 at 09:12
  • That's easy, just replace `data` inside the `apply` to `data[COLS]` where COLS is a vector of strings of all the column names you want to concatenate. – iod Oct 10 '19 at 12:34