I have a large data set with over 10 million entries.
I'm supposed to do any analysis I want on it, so I decided to focus on a subset of the population: families in a certain country. That leaves me with about 150,000 entries. I have 26 variables and would like to fit a logistic regression model to the data, but R says
Error: cannot allocate vector of size 130.3 Gb
I assume this is because I have too many variables. I tried searching for how to pick variables for a model, but functions like step require the full model to be fitted first, so I'm not sure how to proceed.
Am I supposed to eliminate the variables that I don't think will have an effect on my response variable, or is my data set still too big?
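
One thing I could check is whether some of the 26 variables are categorical with many distinct values. R expands each factor or character predictor into one dummy column per level, so the model matrix can end up far wider than 26 columns. Below is a minimal sketch of that check on toy data. The names `families`, `y`, `age`, `region`, and `id` are made up for illustration and are not my real columns, and the size estimate assumes a dense matrix of 8-byte doubles with no interaction terms.

```r
# Toy stand-in for the real data; all column names here are hypothetical.
set.seed(1)
n <- 1000
families <- data.frame(
  y      = rbinom(n, 1, 0.5),                         # binary response
  age    = rnorm(n, 40, 10),                          # numeric predictor
  region = factor(sample(letters[1:5], n, replace = TRUE)),
  id     = as.character(seq_len(n)),                  # unique per row, like an ID column
  stringsAsFactors = FALSE
)

# Count distinct values in every factor or character column.
is_cat   <- vapply(families, function(col) is.factor(col) || is.character(col), logical(1))
n_levels <- vapply(families[is_cat], function(col) length(unique(col)), integer(1))
print(sort(n_levels, decreasing = TRUE))

# Rough width of the design matrix: intercept, numeric predictors,
# and (levels - 1) dummy columns per categorical predictor.
n_numeric <- sum(!is_cat) - 1                         # subtract the response y
n_cols    <- 1 + n_numeric + sum(n_levels - 1)

# Rough memory needed for a dense matrix of doubles, in GiB.
est_gb <- nrow(families) * n_cols * 8 / 2^30
cat("columns:", n_cols, " approx size (GiB):", signif(est_gb, 3), "\n")
```

If a column shows about as many levels as there are rows, it is probably an identifier or free-text field rather than a usable predictor.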