12

I have three columns of x, y, and z coordinates in a data frame in R that I would like to concatenate into a single xyz value, as shown below. I have tried paste() with collapse = "" and sep = "", but I am having trouble; I think it has something to do with text vs. numeric variables.

I have:
x y z 
1 2 3 
2 3 2 
3 1 4 
4 2 1 

I want:
x y z xyz
1 2 3 123
2 3 2 232
3 1 4 314
4 2 1 421

There has to be some extremely simple way to do this in R, but I have been Googling and looking through Stack Overflow off and on for the past couple of days and nothing has turned up. All I need is for the xyz column to be unique so I can run fixed-effects regressions (x ranges over 1:4, y over 1:4, and z over 1:10, so I have 160 possible combinations). Currently I am raising the x, y, and z values to different exponents and then multiplying them to get unique values. Surely there's a better way! Thanks.

user3745597

3 Answers

11

For example:

transform(df,xyz=paste0(x,y,z))
  x y z xyz
1 1 2 3 123
2 2 3 2 232
3 3 1 4 314
4 4 2 1 421

Or using interaction:

transform(df,xyz=interaction(x,y,z,sep=''))
  x y z xyz
1 1 2 3 123
2 2 3 2 232
3 3 1 4 314
4 4 2 1 421
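
One difference between the two versions is the type of the new column: paste0() builds a character vector, while interaction() builds a factor, which is usually what a fixed-effects term expects. A minimal sketch, assuming the question's four-row data frame (the column names xyz_chr and xyz_fct are made up for illustration):

```r
# Example data from the question
df <- data.frame(x = c(1, 2, 3, 4),
                 y = c(2, 3, 1, 2),
                 z = c(3, 2, 4, 1))

# Character key: plain string concatenation
df$xyz_chr <- paste0(df$x, df$y, df$z)

# Factor key: drop = TRUE discards x/y/z combinations that never occur
df$xyz_fct <- interaction(df$x, df$y, df$z, sep = "", drop = TRUE)

class(df$xyz_chr)
class(df$xyz_fct)

# Sanity check: as many distinct keys as distinct (x, y, z) rows
n_keys <- length(unique(df$xyz_chr))
n_rows <- nrow(unique(df[c("x", "y", "z")]))
n_keys == n_rows
```

Without drop = TRUE, the factor carries a level for every combination of the observed x, y, and z values, including combinations that never appear in the data.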


agstudy
4
df$NewCol <- do.call(paste, c(df[c("x", "y", "z")], sep = ""))
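
To unpack the one-liner: c() appends sep = "" to the list of columns, and do.call() passes that list to paste() as its arguments, so the call is equivalent to writing each column out by hand. A small sketch of that equivalence, assuming the question's data (the cols vector is introduced here only for illustration):

```r
# Example data from the question
df <- data.frame(x = c(1, 2, 3, 4),
                 y = c(2, 3, 1, 2),
                 z = c(3, 2, 4, 1))

cols <- c("x", "y", "z")  # illustrative: the columns to combine

# Programmatic form: works for any number of columns
key_docall <- do.call(paste, c(df[cols], sep = ""))

# Hand-written equivalent for exactly these three columns
key_manual <- paste(df$x, df$y, df$z, sep = "")

identical(key_docall, key_manual)
```

The do.call() form is convenient when the set of columns is only known at run time, since only cols needs to change.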
David Arenburg
Amarjeet
3

Two other options for combining columns are dplyr::mutate() and tidyr::unite():

df <- read.table(text = 'x y z 
                 1 2 3 
                 2 3 2 
                 3 1 4 
                 4 2 1', header = TRUE)

library(dplyr)

df %>%
  mutate(xyz_char = paste0(x, y, z),
         xyz_num = as.numeric(xyz_char))

Note that paste0(), like paste(), converts numeric values to character. Wrap the result in as.numeric() if you need the field to remain numeric.
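
Since the goal in the question is a fixed-effects regression, it may be worth noting that a numeric key would enter a model formula as a single continuous regressor unless it is wrapped in factor(). A base-R sketch of the conversions, assuming the same example data (the key_* names are illustrative):

```r
# Example data from the question
df <- data.frame(x = c(1, 2, 3, 4),
                 y = c(2, 3, 1, 2),
                 z = c(3, 2, 4, 1))

key_chr <- paste0(df$x, df$y, df$z)  # character key
key_num <- as.numeric(key_chr)       # numeric key
key_fct <- factor(key_chr)           # factor: one level per distinct key

str(key_chr)
str(key_num)
levels(key_fct)
```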

library(tidyr)

df %>% 
  unite(xyz, x:z, sep = '', remove = FALSE)

The default in tidyr::unite() is remove = TRUE, which drops the original columns from the data frame; remove = FALSE, as used above, keeps them.

sbha