I have a large dataframe (300k+ rows) of animal GPS positions. Something like this:
idAnimal  date        elevation  sex  Distance  park   Presence
animal1   01-09-2018  2376       M    678       park1  1
animal1   01-09-2018  2402       M    1023      park1  1
animal1   01-09-2018  2366       M    933       park1  1
animal1   02-09-2018  2402       M    239       park1  1
animal1   02-09-2018  2428       M    423       park1  1
animal1   02-09-2018  2376       M    817       park1  1
animal1   02-09-2018  2354       M    1073      park1  1
animal1   03-09-2018  2337       M    210       park1  1
animal1   03-09-2018  2334       M    967       park1  1
animal1   03-09-2018  2406       M    242       park1  1
animal2   04-09-2018  2231       F    547       park1  0
animal2   04-09-2018  2343       F    506       park1  0
animal2   04-09-2018  2306       F    1190      park1  0
animal2   04-09-2018  2177       F    1219      park1  0
animal2   05-09-2018  2206       F    271       park1  0
animal2   05-09-2018  2318       F    142       park1  0
animal3   05-09-2018  2324       F    263       park2  1
animal3   05-09-2018  2259       F    996       park2  1
animal3   06-09-2018  2396       F    54        park2  1
animal3   06-09-2018  2436       F    1129      park2  1
animal3   06-09-2018  2380       F    811       park2  1
To avoid temporal autocorrelation, I created a subset keeping one observation per day per animal:

library(dplyr)
data %>% group_by(idAnimal, date) %>% sample_n(size = 1)
What I want to do is create n subsets (say 100) under the same condition (one observation per day per animal) and then fit a binomial mixed model to each of them, so that I use all of my data rather than only part of it.
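For generating the n subsets, something along these lines is what I have in mind (just a sketch, assuming data is the full dataframe and dplyr is loaded as above):

set.seed(1)  # for reproducibility
# 100 subsets, each with one randomly chosen row per animal and day
subsets <- lapply(1:100, function(i) {
  data %>%
    group_by(idAnimal, date) %>%
    sample_n(size = 1) %>%
    ungroup()
})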
What worries me most is how to run such a model on the 100 dataframes while keeping the estimates together:

library(lme4)
glmer(Presence ~ Distance * sex + elevation * sex + (1 | park), family = binomial, data = data)
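The direction I was thinking of is to loop over the subsets and collect the fixed effects (again only a sketch, using the subsets list from above), but I am not sure this is a sound way to combine the estimates:

# fit the same model to each subset and keep the fixed-effect estimates
fits <- lapply(subsets, function(d) {
  glmer(Presence ~ Distance * sex + elevation * sex + (1 | park),
        family = binomial, data = d)
})
estimates <- t(sapply(fits, fixef))  # one row of coefficients per subset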
Thank you for any help.