
How do I create a new ID label that summarizes the information stored in two factors? I would like one factor to be nested (!?) within the other.

As a small example: let's say I surveyed many trees, and now I want to add a label to each examined branch that gives me the number of the tree and the number of the branch on that tree. It would be unnecessary and confusing if all branches just had an ongoing ID.

Example code:

mydata = data.frame(tree   = rep(letters[1:3], each = 20),
                    branch = rep(round(runif(12)*1000, 0), each = 5),
                    values = runif(60))

(Please don't ask me why the branches have such strange numbers; this is just an example!)

Of course, I could just use interaction(mydata$tree, mydata$branch), as this answer suggests. For a unique ID I could also use something like this. But both would give me an ongoing ID that does not discriminate between trees! I could also use a long and complicated for loop, but I'd like a simple answer (since I expect there to be one...).

Expected Output:

The result should look something like the new ID column created at the end of this code:

mydata = data.frame(tree   = rep(letters[1:3], each = 20),
                    branch = rep(round(runif(12)*1000, 0), each = 5),
                    values = runif(60),
                    ID     = rep(rep(1:4, each = 5), times = 3))
mydata

mydata$ID = interaction(mydata$tree, mydata$ID)

EDIT:

The solution in the comments by @suchait works well for the example data. However, I have no knowledge of the data.table package and I cannot get my head around how it works in detail. When I apply the solution to my tibble, it does not work: it again gives me an ongoing ID that ignores one factor. Therefore, I would really like to see a dplyr solution or something similar.

bamphe

1 Answer


A dplyr solution: use group_by to handle the branches of each tree separately, then convert the branch values to a factor within each group and use the factor's integer code as the branch number.

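library(tidyverse)  # loads dplyr (group_by, mutate) and stringr (str_c)

tmp <- mydata %>% 
  group_by(tree) %>%   # number the branches separately within each tree
  # factor codes 1, 2, ... follow the sorted branch values within the tree
  mutate(ID = str_c(tree, as.numeric(as.factor(branch)), sep = ".")) %>% 
  ungroup()            # drop the grouping so later steps are not grouped by tree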
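
If loading the tidyverse is not an option, the same grouping idea can probably be sketched in base R with ave(). The set.seed(1) call and the helper name branch_no are illustrative assumptions added only to make the sketch reproducible and readable.

```r
set.seed(1)  # assumption: fixed seed so the example data is reproducible
mydata <- data.frame(tree   = rep(letters[1:3], each = 20),
                     branch = rep(round(runif(12) * 1000, 0), each = 5),
                     values = runif(60))

# ave() applies FUN to the branch values of each tree separately and
# returns the results in the original row order.
# branch_no is a hypothetical helper name, not part of the question.
branch_no <- ave(mydata$branch, mydata$tree,
                 FUN = function(b) as.numeric(factor(b)))

# Combine tree label and within-tree branch number, e.g. "a.1"
mydata$ID <- paste(mydata$tree, branch_no, sep = ".")
```

As in the dplyr version, the branch numbers follow the sorted branch values within each tree, not the order in which the branches appear.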
GordonShumway
  • Nice, now I understand why I had problems with my actual `data.frame`. The `branch` column was of type `factor` not `character`. Changing that, it now works well for me. Thank you! – bamphe Mar 20 '18 at 17:26
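
The factor issue mentioned in this comment can be reproduced with a small base R sketch. The likely cause is that a subset of a factor keeps all of its levels, so as.factor() changes nothing and the integer codes keep running across trees. The toy vectors and the helper number_within are hypothetical, made up for illustration.

```r
# Toy data: two trees with two branches each (rows are sorted by tree)
tree       <- rep(c("a", "b"), each = 4)
branch_chr <- c("12", "12", "57", "57", "80", "80", "93", "93")
branch_fct <- factor(branch_chr)  # levels: "12" "57" "80" "93"

# Hypothetical helper: factor codes computed separately per group.
# Assumes the rows are already sorted by group, as they are here.
number_within <- function(b, g) {
  unlist(lapply(split(b, g), function(x) as.numeric(as.factor(x))),
         use.names = FALSE)
}

# Factor input: each tree's subset still carries all four levels
running <- number_within(branch_fct, tree)
# 1 1 2 2 3 3 4 4

# Character input: levels are rebuilt within each tree
nested <- number_within(as.character(branch_fct), tree)
# 1 1 2 2 1 1 2 2
```

Calling droplevels() on each subset instead of converting to character should have the same effect.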