I created an EC2 instance on AWS to run an R server. The instance type is "t2.micro". Then I used some code from Joshua Ulrich to get stock data from Yahoo with the getSymbols() function.
"nasdaq.symbols.head" contains the first 100 NASDAQ ticker symbols in alphabetical order.
My problem is that executing getSymbols() takes a long time. While it runs, the following message appears in the console:
"pausing 1 second between requests for more than 5 symbols"
I would like to get the data for all NASDAQ stocks, which is more than 3500 ticker symbols. Changing the EC2 instance type to, e.g., t2.2xlarge did not seem to speed things up.
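As a rough estimate of what the pauses alone would cost for the full list, here is a small calculation. It assumes the one-second pause applies once per symbol, which is only my reading of the console message, and it ignores the download time itself. If that reading is right, the wait is not CPU-bound, which might explain why a larger instance type made no difference.

```r
# Back-of-envelope estimate of the time spent only on the pauses.
# Assumption: one pause per symbol, as the console message suggests.
n_symbols <- 3500   # approximate number of NASDAQ tickers I want to load
pause_sec <- 1      # pause per request, taken from the console message

pause_minutes <- n_symbols * pause_sec / 60
pause_minutes       # a bit under one hour of pure waiting
```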
Here is the code I used:
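# load quantmod, which provides getSymbols() and Ad(), and attaches TTR for ROC()
library(quantmod)
# create environment to load data into
Data <- new.env()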
getSymbols(nasdaq.symbols.head, from="2007-01-01", env=Data)
# calculate returns, merge, and create data.frame (eapply loops over all
# objects in an environment, applies a function, and returns a list)
Returns.nasdaq <- eapply(Data, function(s) ROC(Ad(s), type="discrete"))
Returns.nasdaq.DF <- as.data.frame(do.call(merge, Returns.nasdaq))
# adjust column names and re-order columns
colnames(Returns.nasdaq.DF) <- gsub(".Adjusted", "", colnames(Returns.nasdaq.DF), fixed=TRUE)
Returns.nasdaq.DF <- Returns.nasdaq.DF[,nasdaq.symbols.head]
tail(Returns.nasdaq.DF)