
I'm currently running a school project to estimate the number of fish in 200 ponds across the country. My regression model looks like this:

Number of fish = Number of fish species + Age of fish + Surface area of pond + other variables.

I'm running this regression in RStudio.

Currently, I have 5 data sets covering the same 200 ponds and the same variables, one for each year from 2015 to 2019. Is there a way to combine the 5 data sets into one and account for the time factor (the year), instead of running the regression separately for each year?

Thank you in advance for answering my question.

Max Lim
  • Yes, this is possible depending on your data structure, but please post a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), with data, functions you are trying, and libraries you are using. – LMc Mar 24 '21 at 14:59
  • Why did you tag this "logistic regression"? It looks like you should be using Poisson regression? – Roland Mar 24 '21 at 15:10
  • You can combine the data sets into a single one with a new column indicating which data set each row came from using dplyr like this: `bind_rows(lst(df15, df16, df17, df18, df19), .id = "year")` – G. Grothendieck Mar 24 '21 at 15:51
  • Will there be any implications if I just bind the data sets together using `bind_rows(lst(df15, df16, df17, df18, df19), .id = "year")`? I'm assuming this would ignore the year factor? – Max Lim Mar 25 '21 at 01:22
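
To make the stacking idea from the comments concrete, here is a minimal sketch in base R. Stacking by itself does not discard the year: it stays in the data as a column, but it only affects the estimates if it is included in the model formula. Everything about the data here is an assumption for illustration: the five yearly data sets are simulated, and the column names (`pond_id`, `n_species`, `area`, `n_fish`) are hypothetical stand-ins for the real variables. The Poisson family follows the suggestion in the comments for a count outcome; whether it fits the real data would need checking.

```r
# Simulated stand-ins for the five yearly data sets.
# All column names are hypothetical; replace them with the real ones.
set.seed(1)
make_year <- function(year, n_ponds = 200) {
  data.frame(
    pond_id   = seq_len(n_ponds),      # same 200 ponds every year
    year      = year,                  # label recording the source year
    n_species = rpois(n_ponds, 5),     # number of fish species
    area      = runif(n_ponds, 1, 10), # surface area of the pond
    n_fish    = rpois(n_ponds, 50)     # outcome: number of fish
  )
}
yearly <- lapply(2015:2019, make_year)

# Stack the five data sets into one; each row keeps its year label.
ponds <- do.call(rbind, yearly)

# Treat year as categorical so each year gets its own shift.
ponds$year <- factor(ponds$year)

# Poisson regression on the pooled data with year as a predictor.
fit <- glm(n_fish ~ n_species + area + year, family = poisson, data = ponds)
summary(fit)
```

With the `bind_rows(lst(...), .id = "year")` call from the comments, the `year` column would hold the data frame names (`df15`, `df16`, ...) rather than the calendar years, which works just as well as a categorical predictor. Because the same ponds are measured every year, rows from one pond are probably correlated; one possible way to handle that is a mixed model with a random intercept per pond, for example `lme4::glmer(n_fish ~ n_species + area + year + (1 | pond_id), family = poisson, data = ponds)`.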

0 Answers