If the two columns in my dataframe are:

species <- c("Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue") 

And

strain <- c(1, NA, 2, NA, NA, 3, 4, 5)

How do I get a column that combines the two to say Dengue 1, etc.?

M--
Martin

2 Answers

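We can use `unite` from tidyr. Note that, by default, a missing strain is pasted as the string "NA" (e.g. Dengue_NA).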

library(dplyr)
library(tidyr)
library(stringr)
df1 %>% 
     unite(species, species, strain)

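If a missing strain should give a real `NA` rather than the string "NA", use `str_c`, which returns `NA` whenever any input is `NA`. The `fill` step then replaces each `NA` with the last non-missing value above it; drop that step to keep the `NA`s.

df1 %>%
   transmute(species = str_c(species, strain, sep="_")) %>%
   fill(species)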

If rows with a missing strain should be dropped instead, filter them out first

df1 %>%
   filter(!is.na(strain)) %>%
   transmute(species = str_c(species, strain, sep="_"))

data

df1 <- data.frame(species, strain)
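
As a rough sketch of the variant asked for in the comments (a new column called agent, no NA values, data frame still named df1), the filter-first approach could be adapted as below. The space separator and the `result` name are assumptions made for illustration, and `mutate` is used in place of `transmute` so the original columns are kept.

```r
library(dplyr)
library(stringr)

# Same data as in the question
df1 <- data.frame(
  species = rep("Dengue", 8),
  strain  = c(1, NA, 2, NA, NA, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Drop rows with a missing strain, then build the combined column.
# `result` is a hypothetical name; assign back to df1 if preferred.
result <- df1 %>%
  filter(!is.na(strain)) %>%
  mutate(agent = str_c(species, strain, sep = " "))

result$agent
```

Because the rows with a missing strain are removed before pasting, `result` has five rows rather than eight.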
akrun
  • Thanks @akrun. I tried and got the following error message.......no applicable method for 'unite_' applied to an object of class "character" Also, what is the name of the dataset in the code and the name of the new variable? – Martin Jan 14 '20 at 20:49
  • @Martin I created the dataset using your vectors only – akrun Jan 14 '20 at 20:53
  • @Martin For the `NA` cases, what do you want to return – akrun Jan 14 '20 at 20:59
  • yes indeed. I tried it on my dataset and got the error message. I see your point. I don't want the NA on the new variable. I would also like the new variable to be called agent and the name of the dataframe can remain as df1. Thanks – Martin Jan 14 '20 at 21:04
  • @Martin Sorry, I can't reproduce the error with `packageVersion('tidyr')# [1] ‘1.0.0’` – akrun Jan 14 '20 at 21:06
  • Thanks @akrun . I have edited the code to suit the one you gave and now got the following error message Error: Evaluation error: could not find function "str_c". My package version is '0.8.2'. I suppose I need to update the version or is it not safe? – Martin Jan 14 '20 at 21:15
  • @Martin sorry, I forgot to add that. It is `library(stringr)` – akrun Jan 14 '20 at 21:20

You can use ifelse to suppress NA in your final output:

paste0(species, ifelse(is.na(strain),"",strain))

 #>  [1] "Dengue1" "Dengue"  "Dengue2" "Dengue"  "Dengue"  "Dengue3" "Dengue4" "Dengue5"
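
A possible variation on the same idea, assuming the vectors live in a data frame named df1 and that the result should read "Dengue 1" with a space, as in the question. The agent column name comes from the comments on the other answer; treat this as a sketch rather than the only way to do it.

```r
# Same data as in the question
df1 <- data.frame(
  species = rep("Dengue", 8),
  strain  = c(1, NA, 2, NA, NA, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Keep every row: use the species alone when the strain is missing,
# otherwise paste species and strain with a space between them.
df1$agent <- ifelse(
  is.na(df1$strain),
  df1$species,
  paste(df1$species, df1$strain)
)

df1$agent
#> [1] "Dengue 1" "Dengue"   "Dengue 2" "Dengue"   "Dengue"   "Dengue 3" "Dengue 4" "Dengue 5"
```

Branching on `is.na` outside `paste` avoids a trailing space on the rows where the strain is missing.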
M--