
I am trying to assign plots based upon unique combinations of treatments.

The following code will generate the data worksheet I am trying to create:

mom_id = rep(1:20, each=120)

species = c(
  rep("dryoar",1200),
  rep("dryola",1200)
  )

soil = rep(
  c("C","S"), 600
  )  

light = rep(
  c(
    rep("G",2), rep("U",2)
    ),300
  )

soil_light = paste(soil, light, sep="_")

random_numbers = rnorm(2400) #for within plot randomization

master = data.frame(species, mom_id, soil, light, soil_light, random_numbers)

This will create a dataframe that looks like this:

species mom_id  soil    light   soil_light  random_numbers
dryoar  1        C        G        C_G      0.160598163
dryoar  1        S        G        S_G      -0.280779835
dryoar  1        C        U        C_U      0.457491942
dryoar  1        S        U        S_U      0.643139979
dryoar  1        C        G        C_G      -0.763162649
dryoar  1        S        G        S_G      -1.146383360
dryoar  1        C        U        C_U      1.415396249
dryoar  1        S        U        S_U      1.103691681
dryoar  1        C        G        C_G      1.694206627
dryoar  1        S        G        S_G      -0.767433114
dryoar  1        C        U        C_U      -0.570996961

I would like to add a new column, plot, to this dataframe that numbers each appearance of a given level of the soil_light factor (e.g. C_U) sequentially up to a set limit, and then starts the sequence again.

To illustrate

soil_light  plot
    C_U      1
    C_U      2
    C_U      3
    C_U      1
    C_U      2
    C_U      3
    C_G      1
    C_G      2
    C_G      3
    C_G      1
    C_G      2
    C_G      3

The solution I am looking for is similar to the solution found here, but I would like the numbers to stop at a set limit, for instance 8, and then repeat from 1 to 8 for each factor level that appears.
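
A minimal base R sketch of the numbering described above, in case it helps pin down the requirement. The toy `soil_light` vector and the `limit` variable are assumptions made for illustration; `limit` is set to 3 to reproduce the small table, and setting it to 8 would give the 1 to 8 cycle.

```r
# Toy input reproducing the illustration: six C_U rows, then six C_G rows
soil_light <- rep(c("C_U", "C_G"), each = 6)
limit <- 3  # hypothetical cap; use 8 for the real layout

# Running count of each level's appearances (1, 2, 3, ... within a level)
occurrence <- ave(seq_along(soil_light), soil_light, FUN = seq_along)

# Wrap the running count so it restarts at 1 after reaching `limit`
plot <- (occurrence - 1) %% limit + 1

data.frame(soil_light, plot)
```

The modulo step is what makes the sequence restart; the rows of a level do not need to be contiguous, because `ave()` counts appearances per level wherever they occur.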

Bonus: The eventual goal is of course to uniquely identify each treatment combination by its plot numbers. As such, an even better outcome would be:

soil_light  plot
    C_U      1
    C_G      9
    S_U      17
    S_G      25
    C_U      2
    C_G      10
    S_U      18
    S_G      26
     .       .
     .       .
     .       .
    C_U      8
    C_G      16
    S_U      24
    S_G      32

Here each unique factor level is assigned sequential numbers, but from a different block of numbers for each level. In the example given above, 1:8 would be reserved for C_U, 9:16 for C_G, 17:24 for S_U, and 25:32 for S_G.
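
The block layout above amounts to adding an offset of (level index - 1) * 8 to the within-level number. A rough base R sketch, where the `level_order` vector, the `limit` variable, and the toy data are assumptions chosen to match the example rather than part of the real worksheet:

```r
# Explicit level order, taken from the example: C_U first, S_G last
level_order <- c("C_U", "C_G", "S_U", "S_G")
limit <- 8  # plots per treatment combination

# Toy data: the four combinations cycling in the order shown in the table
soil_light <- rep(level_order, times = 10)

# Running count per level, wrapped so it restarts after `limit`
occurrence <- ave(seq_along(soil_light), soil_light, FUN = seq_along)
within_level <- (occurrence - 1) %% limit + 1

# Shift each level into its own block: 1:8, 9:16, 17:24, 25:32
plot <- within_level + (match(soil_light, level_order) - 1) * limit

head(data.frame(soil_light, plot), 8)
```

Spelling out `level_order` controls which combination gets which block; without it, the order would depend on how the levels happen to be sorted or first encountered.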

Rewarp

1 Answer


Using data.table:

library(data.table)
dt = as.data.table(master)

dt[, plot := 1:8, by = soil_light]

data.table will recycle the sequence within each group as necessary, and you will get a warning if the group size is not a multiple of the sequence length.

To get the bonus, use .GRP (which numbers the groups):

dt[, plot := 1:8 + (.GRP - 1) * 8, by = soil_light]
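
A variant of the two calls above that spells out the recycling with `rep_len()` and the group size `.N`, offered as a sketch rather than a replacement. As far as I know, more recent data.table releases are stricter about recycling a length-8 vector inside `:=`, so being explicit avoids relying on that behaviour. The `combos` vector, the `limit` variable, and the `plot_id` column name are assumptions made for illustration; `combos` mimics the row order of `master`.

```r
library(data.table)

# Stand-in for master$soil_light: 2400 rows, 600 per combination
combos <- rep(c("C_G", "S_G", "C_U", "S_U"), times = 600)
dt <- data.table(soil_light = combos)
limit <- 8L

# Repeat 1..limit to exactly the size of each group (.N)
dt[, plot := rep_len(seq_len(limit), .N), by = soil_light]

# Same, shifted by (.GRP - 1) * limit so every group gets its own block
dt[, plot_id := rep_len(seq_len(limit), .N) + (.GRP - 1L) * limit,
   by = soil_light]

dt[, .(first = min(plot_id), last = max(plot_id)), by = soil_light]
```

Note that `.GRP` follows the order in which groups first appear, so with this row order C_G would get 1:8 and C_U would get 17:24. If a particular combination must own a particular block, the offset could be computed from an explicit level order instead.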
eddi
  • That's beautiful. Do you have any ideas though for my bonus objective? Or would it be better to ask a new question since it may not be as trivial to resolve? – Rewarp Jul 15 '14 at 22:29
  • @Rewarp tbh I didn't really understand the "bonus objective" - do you maybe want `dt[, list(plot = 1:8), by = soil_light]`? – eddi Jul 15 '14 at 22:35
  • I want to be able to easily track my treatment combinations from the plot numbers alone. The initial setup allows me to track the plots by pairing the `soil_light` vector with the `plot` vector. If instead I could assign only 1:8 for instance to C_U, and 9:16 for C_G, and so on in increments of 8 until all unique identifiers of `soil_light` have been exhausted, that would be ideal. – Rewarp Jul 15 '14 at 22:43
  • It worked. Words can't describe how happy I am with this solution. Thanks. Would you mind explaining though what the `+ (.GRP - 1) * 8` does? It's not as intuitively understandable as the previous solution. – Rewarp Jul 15 '14 at 23:04
  • @Rewarp great :) `.GRP` numbers the groups starting from 1, so that expression becomes `1:8 + 0` for the first group, `1:8 + 8` for the second group, etc. – eddi Jul 15 '14 at 23:32