I have the following type of data: a column with tree Species (33 types), one with tree ID (many trees were observed for every species), and columns with a value for each tree of each species, the year, and the month:

Tree.ID, Species, Values, Year, Month

The data set is quite big (> 20,000 rows). For my further analyses I first need to get the unique values of the Species column as characters, because later I will need to run a for loop over every species and analyse its behaviour over many years.

Here is the code I am using to get the unique values of the Species column. It is not working: it returns only one species instead of all 33 with all their values.

for(sp in unique(flower.B$Species)){
  dat.sp <- flower.B[flower.B$Species == sp,]
}

Any thoughts on this?

Gabriela
  • Can you provide a reproducible example showing both the current and desired output? – cdeterman Dec 11 '15 at 14:11
  • [More info on how to give a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Dec 11 '15 at 14:12
  • I think there is no need to do this now. A little bit later you have to split the dataframe by `Species` (and eventually by `Year`) and calculate your results. There are many ways to do that, e.g. `ddply` from the package `plyr` or using the package `data.table` or `by()` from base R. – jogo Dec 11 '15 at 14:15
  • flower.B is the name of the data frame. – Gabriela Dec 11 '15 at 14:25
  • @Gabriela just to be a little more clear about what you're looking for, you want to extract the unique values of Species out of your data frame and then subset each unique Species by ID? – s_scolary Dec 11 '15 at 14:33
  • You could always try the `split()` function. This will create a list of data frames split by the variables you specify. `data = split(flower.B,list(flower.B$Species,flower.B$Tree.ID))` – s_scolary Dec 11 '15 at 14:43
  • @Colin. I have a whole script that removes NAs, interpolates some missing data, and draws some graphics over many years (it analyses the cycles of flowering). So this for loop over the unique values will help to run the other code by species. It will take every species with all its trees, all the years, all the months, and all the values, and will plot the cycles. That is what I want to do. – Gabriela Dec 11 '15 at 16:49

1 Answer

You are overwriting dat.sp in each iteration of your loop, so after the loop finishes it holds only the last species. Is this desired? In any case, the following works fine for me.

data(iris)
for(sp in unique(iris$Species)){
  print(iris[iris$Species %in% sp, ])
}
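
If the per-species subsets need to be kept rather than overwritten, one option is to collect them in a named list. This is only a minimal sketch, using iris as a stand-in for flower.B; the name species_data is made up for illustration.

```r
data(iris)

# Hypothetical container: one data frame per species, keyed by species name.
species_data <- list()
for (sp in unique(as.character(iris$Species))) {
  # Store each subset under its own name instead of overwriting one variable.
  species_data[[sp]] <- iris[iris$Species == sp, ]
}

# Inspect which species were collected and how many rows each one has.
names(species_data)
sapply(species_data, nrow)
```

Each subset can then be accessed later by name, e.g. species_data[["setosa"]].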

I'm not sure what the rest of your loop will consist of, but I would recommend looking at the split command, which is useful in conjunction with the apply family of functions. See the following, for example:

lapply(split(iris, iris$Species), NROW)
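
The same pattern extends to any per-species analysis by passing your own function to lapply. Below is a rough sketch, again with iris standing in for flower.B; analyse_species is a hypothetical placeholder for the real cleaning and plotting code, and here it only computes a column mean.

```r
data(iris)

# Hypothetical per-species analysis: here just the mean of one column.
analyse_species <- function(dat) {
  mean(dat$Sepal.Length, na.rm = TRUE)
}

# split() returns a named list with one data frame per species;
# lapply() then runs the analysis on each of them in turn.
by_species <- split(iris, iris$Species)
results <- lapply(by_species, analyse_species)
str(results)
```

The result is a list named by species, so nothing is overwritten between species.
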
Raad