I have an S3 bucket that is being updated in real time with API data. The files are saved with a .XXX extension, where XXX is 1...n.
My R script needs to grab the latest files and add them to the analysis dataframe. I've been using the aws.s3 package so far. After setting the secret and access keys as environment variables:
mybucket <- get_bucket("mybucket1")
Returns an s3_bucket object of 1000 elements (presumably there are more), and it looks like each element has a Contents list of 7 fields, one of which is $LastModified. How do I get the name of the most recently modified file?
Mybucket Large s3_bucket (1000 elements, 2.1Mb)
contents:List of 7
..$ Key : chr "folder1"
..$ LastModified: chr "2018-01-16T09:58:47.000Z"
..$ ETag : chr "\" nnnnnnnnnnn\""
etc (.. $Owner, $Storage class, $bucket, $-attr)
contents: List of 7
..$ Key : chr "folder1/file.1"
..$ LastModified: chr "2018....etc"
..$ ETag : chr "...etc..."
etc....
contents: List of 7
etc.....
It's really the number after 'file.' that I need (in this case it would be 1).
After some experimentation, I think a CLI command through RCurl would be a better option.
aws s3 ls s3://mybucket --recursive | grep APIdata@symbol=XXX&interval=5.1*
This gets me really close, but the command seems to drop the '&interval=5.1*' part, so it returns ALL objects matching 'APIdata@symbol=XXX*'.
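A likely explanation for the dropped text: an unquoted & is a shell control operator, so the shell runs everything before it in the background and parses 'interval=5.1*' as a separate command. grep therefore only ever sees 'APIdata@symbol=XXX'. Quoting the pattern should avoid that, and grep -F matches it literally rather than as a regex (in a regex, '.' matches any character and '1*' means "zero or more 1s"). Below is a minimal sketch of that idea, not a tested solution against a real bucket. It assumes keys contain no spaces, and list_objects is a hypothetical stand-in for the real aws s3 ls call, with made-up keys and timestamps, so it can run offline.

```shell
#!/bin/sh
# Hypothetical stand-in for `aws s3 ls s3://mybucket --recursive`.
# Output mimics the CLI's "date time size key" columns; the data is invented.
list_objects() {
  printf '%s\n' \
    '2018-01-16 09:58:47       1024 folder1/APIdata@symbol=XXX&interval=5.1' \
    '2018-01-16 10:03:47       1024 folder1/APIdata@symbol=XXX&interval=5.2' \
    '2018-01-16 10:08:47       1024 folder1/APIdata@symbol=XXX&interval=5.3' \
    '2018-01-16 10:09:12       2048 folder1/APIdata@symbol=XXX&interval=15.1' \
    '2018-01-16 10:10:30       2048 folder1/APIdata@symbol=YYY&interval=5.1'
}

# Single quotes stop the shell from treating & as a control operator.
# No trailing * is needed: grep already matches anywhere in the line.
pattern='APIdata@symbol=XXX&interval=5.'

# Each line starts with "YYYY-MM-DD HH:MM:SS", so a byte-wise sort is
# chronological and the last line is the most recently modified object.
latest_line=$(list_objects | grep -F "$pattern" | LC_ALL=C sort | tail -n 1)

# The key is the fourth whitespace-separated field (assumes no spaces in keys).
latest_key=$(printf '%s\n' "$latest_line" | awk '{print $4}')

# Strip everything up to and including the last dot to get the file number.
latest_number=${latest_key##*.}

echo "$latest_key"
echo "$latest_number"
```

With the sample data above this prints the key ending in .3, followed by 3. Against the real bucket, list_objects would be replaced by the actual aws s3 ls call, and the whole pipeline could be run from R with system(..., intern = TRUE) to capture the output as a character vector.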