I tried to `rbind` a row containing a string to a data frame, which didn't work. Code:

df1 = data.frame(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
df2 = data.frame(id=5:1, age=c(11,24,25,67,2))
df3 = merge(df1,df2)
df3 = rbind(df3, c(6, "hannah", 30))
df3
str(df3)

Result:

> df1 = data.frame(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
> df2 = data.frame(id=5:1, age=c(11,24,25,67,2))
> df3 = merge(df1,df2)
> df3 = rbind(df3, c(6, "hannah", 30))
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = "hannah") :
  invalid factor level, NA generated
> df3
  id   name age
1  1  peter   2
2  2   kate  67
3  3   lisa  25
4  4 daniel  24
5  5   paul  11
6  6   <NA>  30
> 
> str(df3)
'data.frame':   6 obs. of  3 variables:
 $ id  : chr  "1" "2" "3" "4" ...
 $ name: Factor w/ 5 levels "daniel","kate",..: 5 2 3 1 4 NA
 $ age : chr  "2" "67" "25" "24" ...

It looks like R made the name column a factor column, which is why it doesn't accept the string value. How do I solve this? Which is more advisable: converting the whole column into a string column (if such a thing exists) or converting the new string into a factor? And how do I do that?

Thanks!

Julian
  • Create the `data.frame` with `stringsAsFactors = FALSE` – akrun May 10 '18 at 18:53
  • If you don't know the complete list of potential values up front, then working with a `character` (string) column will be easier than working with a `factor` column. As akrun says, use `stringsAsFactors = FALSE` inside your `data.frame()` calls to keep all the string columns in the `character` class. – Gregor Thomas May 10 '18 at 19:05
  • Possible duplicate of [rbind char vector to data frame](https://stackoverflow.com/questions/21074284/rbind-char-vector-to-data-frame) – Yannis Vassiliadis May 10 '18 at 19:05
  • Cool, `stringsAsFactors = FALSE` works. Thanks! – Julian May 13 '18 at 11:50
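
The suggestion from the comments can be sketched roughly as follows. It assumes the data from the question; `new_row` is a name introduced here for illustration only. The new row is appended as a one-row data frame rather than as `c(6, "hannah", 30)`, because `c()` would turn all three values into strings.

```r
# Build the data frames with character columns instead of factors.
# (Since R 4.0.0, stringsAsFactors already defaults to FALSE.)
df1 <- data.frame(id = 1:5,
                  name = c("peter", "kate", "lisa", "daniel", "paul"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(id = 5:1, age = c(11, 24, 25, 67, 2))
df3 <- merge(df1, df2)

# Append the new row as a one-row data frame so each column keeps its type.
# new_row is a hypothetical helper name, not from the original post.
new_row <- data.frame(id = 6L, name = "hannah", age = 30,
                      stringsAsFactors = FALSE)
df3 <- rbind(df3, new_row)
str(df3)
```

With this approach `id` should stay an integer column and `age` a numeric one, which the original `rbind(df3, c(...))` call did not preserve.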

1 Answer

A good solution is to use the dplyr package and create tibbles instead of data frames. A tibble is a modern kind of data frame that keeps strings as character columns by default instead of converting them to factors.

library(dplyr)
df1 <- tibble(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
df2 <- tibble(id=5:1, age=c(11,24,25,67,2))
df3 <- left_join(df1, df2, by = "id") # or merge(df1, df2), if you prefer
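# a one-row tibble keeps id and age numeric (c(6, "hannah", 30) is an all-character vector)
df3 <- bind_rows(df3, tibble(id = 6L, name = "hannah", age = 30))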
df3
str(df3)
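
For a data frame that already contains a factor column, the two options raised in the question could look roughly like this in base R. This is only a sketch on a small made-up data frame; `df`, `df_chr` and `df_fct` are illustrative names, not from the original post.

```r
# Small example data frame with an explicit factor column (illustrative data).
df <- data.frame(id = 1:2, name = factor(c("peter", "kate")))

# Option A: convert the factor column to character, then append the new row.
df_chr <- df
df_chr$name <- as.character(df_chr$name)
df_chr <- rbind(df_chr,
                data.frame(id = 3L, name = "hannah", stringsAsFactors = FALSE))

# Option B: keep the factor and add the new level before appending.
df_fct <- df
levels(df_fct$name) <- c(levels(df_fct$name), "hannah")
df_fct <- rbind(df_fct,
                data.frame(id = 3L, name = "hannah", stringsAsFactors = FALSE))

str(df_chr)
str(df_fct)
```

Option A is usually the simpler choice when the set of possible values is not known in advance. Option B makes sense when the column really is categorical and the levels matter later on.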
jhvdz