-1

Just wondering if there is a way to expand rows which have multiple observations, into rows of unique observations using R? I have data in an excel spreadsheet with the variable headings: Lease, Line, Bay, Date, Predators, Food.Index, DD, MM, YY.

On some dates, there have been multiple predators (from 1 to 4) recorded in the same row. Other days just have 0. On a day where there has been 4 predators recorded, I would like to somehow transform the data to show four unique observations (instead of one row with 4 recorded under "Predators").

I have 1669 rows of data and multiple rows need to be expanded

Example of Data set
enter image description here
Many thanks for your help in advance.

enter image description here

Greeny
  • 7
  • 6
  • 1
    It sounds like you're trying to convert your data from wide to long format. Maybe search on those terms, and check out `gather` in `tidyr` for one way to do this. – ulfelder Apr 09 '17 at 11:00
  • Thank you for your reply. I have come across these functions in my search to solve the problem I have, however these options don't appear to do exactly what I require. – Greeny Apr 09 '17 at 11:09
  • 1
    If you're after an exact solution, post a reproducible example that includes representative data and shows what the output you want would look like. – ulfelder Apr 09 '17 at 11:11
  • 3
    Possible duplicate of [Replicate each row of data.frame and specify the number of replications for each row](http://stackoverflow.com/questions/2894775/replicate-each-row-of-data-frame-and-specify-the-number-of-replications-for-each) – Sotos Apr 09 '17 at 11:34
  • In the example dataset attached, row 4, which shows a 2 under predators, I would like to transform this into two rows, each showing 1 predator in each row. Then I would like to apply this to the entire dataset, so each row can only have a maximum of 1 predator. I have 1669 rows of data, so it will be difficult to specify the number of replications for each row, without going through each row and having a really long code. – Greeny Apr 09 '17 at 11:48
  • 3
    images != data, please add a minimal example of your data problem,`data <- read.table(text = "some data")` and check the guidelines http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Nate Apr 09 '17 at 11:53
  • `df[rep(1:nrow(df), df$Predators), ]` – Axeman Apr 09 '17 at 11:58
  • Thank you Axeman. This code also works. – Greeny Apr 09 '17 at 12:36

1 Answers1

1

Assuming you have your data in a data.frame, df, one possible solution would be

df.expanded <- df[rep(row.names(df), df$Predators), ]

EDIT: If you also want to keep the rows with 0 predators, you can use pmax to always return at least one:

df.expanded <- df[rep(row.names(df), pmax(df$Predators, 1)),]

Here the pmax(df$Predators, 1) will return the elementwise maximum of df$Predators and 1 so that it returns a new vector where each element is at least 1 but takes the value of df$Predators if that number is greater than 1.

Eumenedies
  • 1,618
  • 9
  • 13
  • Excellent, thanks so much. This works! Additional rows are recorded which represent each observation. If there were 4 predators, each row shows '4', however I can reassign these to have a value of 1. Many thanks for your help, it is much appreciated. – Greeny Apr 09 '17 at 12:14
  • Just one query, this gets rid of all observations that have a 0 value recorded. I would still like these observations - is there a way to keep these? – Greeny Apr 09 '17 at 12:33
  • I have updated the answer to account for that scenario. – Eumenedies Apr 09 '17 at 12:43
  • Excellent! Thanks so much. You are amazing :) – Greeny Apr 09 '17 at 12:53