I have read about various big data packages for R. Many seem workable, except that, as I understand the issue, the packages I rely on for common models would not be available alongside them (for instance, I use lme4, VGAM, and other fairly common regression packages that don't seem to play well with big data packages like ff, etc.).
I recently attempted to use VGAM to fit polytomous models with data from the General Social Survey. When I ran models that accounted for the clustering of respondents within years, along with a list of other controls, I started hitting the whole "cannot allocate vector of size yadda yadda..." error. I've tried the usual recommendations, such as clearing out memory and using matrices where possible, to no effect. I am inclined to increase the RAM on my machine (actually, just buy a new machine with more RAM), but I want a good sense of whether that will solve my woes before letting go of $1500 on a new machine, particularly since this is for my personal use and will be funded solely out of my grad student budget.
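For concreteness, the setup is roughly the sketch below; the variable names (gss, outcome, and the controls) are stand-ins for my actual GSS variables, not the exact model:

    # Roughly what I'm running (variable names are placeholders for my actual GSS data):
    library(VGAM)

    rm(list = ls())  # clear the workspace before a big fit
    gc()             # prompt R to release unused memory back to the OS

    # Polytomous (multinomial) model with year entered as a factor to account for
    # clustering of respondents within years, plus a handful of controls;
    # this is where the "cannot allocate vector" error appears.
    fit <- vglm(outcome ~ factor(year) + age + sex + educ + race,
                family = multinomial(),
                data   = gss)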
Currently I am running a Windows 8 machine with 16GB of RAM, R 3.0.2, and all packages I use updated to their most recent versions. The data sets I typically work with max out at under 100,000 individual cases/respondents. As far as analyses go, the matrices and/or data frames can get large: for example, when I use 15 variables with interactions between factors that have several levels, or when I reshape the data so that each of my 100,000 cases contributes one row per category of some DV. That may be a touch large for some social science work, but in the grand scheme of things I feel my requirements are not all that hefty as far as data analysis goes; I'm sure many R users do far more intense analyses on much bigger data.
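To put a rough number on that (purely back-of-the-envelope; the column count is just a guess at what a design matrix with several multi-level factors plus interactions might expand to):

    # Back-of-the-envelope size of one double-precision model matrix.
    # The numbers here are illustrative guesses, not my actual design.
    n_cases      <- 100000   # respondents
    n_categories <- 5        # rows per respondent after reshaping by DV category
    n_columns    <- 300      # dummy-expanded predictors incl. interactions (a guess)

    n_rows <- n_cases * n_categories
    bytes  <- n_rows * n_columns * 8   # 8 bytes per double
    bytes / 1024^3                     # about 1.1 GB for a single copy

And since R tends to make several copies of objects like this during a fit, peak memory use can easily be a multiple of that single-copy figure.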
So, I guess my question is this: given the data size and types of analyses I'm typically working with, what would be a comfortable amount of RAM to avoid memory errors and/or having to use special packages to handle the size of the data and processes I'm running? For instance, I'm eyeballing a machine that sports 32GB of RAM. Will that cut it? Should I go for 64GB? Or do I really need to bite the bullet, so to speak, and start learning to use R with big data packages, or find a different stats package, or learn a more heavy-duty programming language (I'm not even sure what that would be: Python? C++?)? The latter option would be nice in the long run, of course, but would be rather prohibitive for me at the moment. I'm mid-stream on a couple of projects where I'm hitting similar issues and don't have time to build new language skills altogether under deadlines.
To be as specific as possible: what is the maximum memory capability of 64-bit R on a good machine with 16GB, 32GB, or 64GB of RAM? I searched around and didn't find clear answers that I could use to gauge my needs.
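In case it helps, these are the checks I've been running on my current machine (Windows-only functions; gss again stands in for my actual data frame):

    memory.limit()            # current cap in MB; on 64-bit Windows R this defaults to installed RAM
    memory.size(max = TRUE)   # most memory R has obtained from the OS so far, in MB
    object.size(gss)          # footprint of the main data frame
    gc()                      # run a garbage collection and report current allocations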