
I'm really new to R, so I'm sorry if I don't make total sense.

I have a data frame with data collected in several different areas, each identified by a STAND number. I need to create a sequence for my data running from 1 to 3, but it has to restart whenever it comes to a new STAND number.

Here is some dummy data

    STAND   TREE_SPECIES  DIAMETER
1   101737  Pine          276
2   101737  Spruce        98
3   101737  Spruce        104
4   101737  Leaf          53
5   155897  Spruce        82
6   155897  Spruce        61
7   155897  Leaf          97
8   155897  Spruce        89
9   155897  Spruce        75
10  202568  Spruce        46
11  202568  Spruce        56
12  202568  Pine          204
13  202568  Spruce        132
14  202568  Spruce        93

I want it to look like this:

    STAND   TREE_SPECIES  DIAMETER  SEQ
1   101737  Pine          276       1
2   101737  Spruce        98        2
3   101737  Spruce        104       3
4   101737  Leaf          53        1
5   155897  Spruce        82        1
6   155897  Spruce        61        2
7   155897  Leaf          97        3
8   155897  Spruce        89        1
9   155897  Spruce        75        2
10  202568  Spruce        46        1
11  202568  Spruce        56        2
12  202568  Pine          204       3
13  202568  Spruce        132       1
14  202568  Spruce        93        2

If it is any help, I have a total of 7416 rows in my DF, divided among 90 STANDs.

So far I've tried:

  myDF$SEQ <- seq(1:3)

But that only repeats 1:3 over the whole data frame, without restarting at each new STAND.

Really grateful for your help!

Erik
    Hi and Welcome to stackoverflow! As you are new on SO, please take some time to read [about Stackoverflow](http://stackoverflow.com/about) and [how to ask](http://meta.stackoverflow.com/help/how-to-ask). You are much more likely to receive an answer if you provide a [minimal, reproducible data set](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and show us your attempted solutions, why they didn't work, and the expected results. Thanks! – Henrik Oct 02 '13 at 15:28

1 Answer

Try this:

df$SEQ <- ave(x = df$STAND, df$STAND, FUN = function(y) rep(1:3, length.out = length(y)))

Or shorter, with the same result, if you can live with warning messages. The warnings arise because 1:3 is recycled along each STAND, and the number of rows in a STAND is not necessarily a multiple of 3:

df$SEQ2 <- ave(df$STAND, df$STAND, FUN = function(y) 1:3)

The result:

df
#     STAND TREE_SPECIES DIAMETER SEQ SEQ2
# 1  101737         Pine      276   1    1
# 2  101737       Spruce       98   2    2
# 3  101737       Spruce      104   3    3
# 4  101737         Leaf       53   1    1
# 5  155897       Spruce       82   1    1
# 6  155897       Spruce       61   2    2
# 7  155897         Leaf       97   3    3
# 8  155897       Spruce       89   1    1
# 9  155897       Spruce       75   2    2
# 10 202568       Spruce       46   1    1
# 11 202568       Spruce       56   2    2
# 12 202568         Pine      204   3    3
# 13 202568       Spruce      132   1    1
# 14 202568       Spruce       93   2    2

ave splits the x vector (here: STAND) into pieces defined by the levels of the next (unnamed) argument (here also STAND). The default function FUN that is applied to each piece in ave is mean. Here we replace it with an 'anonymous function', function(y), that we define as rep(1:3, length.out = length(y)).

The 'y' stands for each of the pieces in turn. You could replace 'y' with any name of your choice, e.g. function(chunk) rep(1:3, length.out = length(chunk)). You will quite often see people use function(x), but I didn't want to use 'x' as the name of each piece here because 'x' is also the name of the argument in ave that holds the entire vector.

For each piece, rep repeats the values 1:3 up to the desired length (length.out), i.e. the length of that piece: length(y).
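To make the split-and-apply description concrete, here is a small self-contained sketch. It rebuilds the dummy data from the question (the data.frame() call is my reconstruction, not the asker's actual code) and then does by hand roughly what ave does in one step. The names pieces, seqs and manual are only illustrative.

```r
# Rebuild the dummy data from the question (14 rows, 3 stands).
df <- data.frame(
  STAND = rep(c(101737, 155897, 202568), times = c(4, 5, 5)),
  TREE_SPECIES = c("Pine", "Spruce", "Spruce", "Leaf",
                   "Spruce", "Spruce", "Leaf", "Spruce", "Spruce",
                   "Spruce", "Spruce", "Pine", "Spruce", "Spruce"),
  DIAMETER = c(276, 98, 104, 53, 82, 61, 97, 89, 75, 46, 56, 204, 132, 93)
)

# Within each STAND, repeat 1:3 until the piece is filled.
df$SEQ <- ave(df$STAND, df$STAND,
              FUN = function(y) rep(1:3, length.out = length(y)))

# Roughly the same thing spelled out: split into pieces, apply the
# function to each piece, then put the results back in row order.
pieces <- split(df$STAND, df$STAND)
seqs   <- lapply(pieces, function(y) rep(1:3, length.out = length(y)))
manual <- unsplit(seqs, df$STAND)

df$SEQ
# [1] 1 2 3 1 1 2 3 1 2 1 2 3 1 2
```

Because the function only looks at length(y), it does not matter which column is passed as x; STAND is used simply because it has the right length. Note that SEQ comes back with the same type as x (numeric here), so wrap it in as.integer() if you need integers.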

Henrik