Scala 2.11 here, although this concerns the AWS S3 Java client API so it's really a Java question. It would be awesome if someone can provide an answer in Scala, but I'll happily accept any Java answer that works (I can always Scala-ify it on my own time).
I am trying to use the AWS S3 client library to connect to a bucket on S3 which has the following directory structure underneath it:
my-bucket/
    3dj439fj9fj49j/
        data.json
    3eidi04d40d40d/
        data.json
    a874739sjsww93/
        data.json
    ...
Hence every immediate child object under the bucket is a directory with an alphanumeric name. I'll call these the "ID directories". Each ID directory has a single child object named data.json.
I need to accomplish several things:
- I need an array/map/datastruct of strings (Java Array<String> or Scala Array[String]) containing all the alphanumeric IDs of the ID directories (so element 0 is "3dj439fj9fj49j", element 1 is "3eidi04d40d40d", etc.); and
- I need an array/map/datastruct of dates (Java Array<Date> or Scala Array[Date]) containing the Last Modified timestamp of each ID directory's corresponding data.json file. So if my-bucket/3dj439fj9fj49j/data.json had a Last Modified date/timestamp of, say, 2017-05-29 11:19:24T, then that datetime would be the first element of this second array
- These two arrays/maps/datastructs need to be associative, meaning I could access, say, element 4 of the first (ID) array and get the 5th ID directory underneath my-bucket, and I could also access element 4 of the second (date) array and get the Last Modified timestamp of the 5th ID directory's data.json child object
These don't necessarily have to be arrays, they could be maps, tuples, whatever. I just need 1+ data structures to hold this content as described above.
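To make the target shape concrete, here is a minimal Java sketch of the end result I'm after. It assumes the object keys and their Last Modified timestamps have already been fetched somehow (that fetching is exactly the part I can't figure out), and it only shows how an ID could be derived from a key such as "3dj439fj9fj49j/data.json". The class IdTimestampIndex and the methods parentId and byId are names I made up for illustration; they are not part of the AWS SDK. The second timestamp in main is a made-up placeholder.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the AWS SDK.
class IdTimestampIndex {

    // "3dj439fj9fj49j/data.json" -> "3dj439fj9fj49j".
    // Returns null when the key has no parent directory segment.
    static String parentId(String key) {
        int slash = key.indexOf('/');
        return slash > 0 ? key.substring(0, slash) : null;
    }

    // Keeps only keys of the form "<id>/data.json" and maps each ID to its
    // timestamp. LinkedHashMap preserves the order of the input entries, so
    // the key list and the value list stay aligned by position.
    static Map<String, Date> byId(Map<String, Date> lastModifiedByKey) {
        Map<String, Date> result = new LinkedHashMap<>();
        for (Map.Entry<String, Date> e : lastModifiedByKey.entrySet()) {
            String id = parentId(e.getKey());
            if (id != null && e.getKey().equals(id + "/data.json")) {
                result.put(id, e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for whatever the S3 listing would return.
        Map<String, Date> listing = new LinkedHashMap<>();
        listing.put("3dj439fj9fj49j/data.json", new Date(1496056764000L)); // 2017-05-29 11:19:24 UTC
        listing.put("3eidi04d40d40d/data.json", new Date(1496060000000L)); // made-up placeholder

        Map<String, Date> index = byId(listing);

        // The two "associative arrays": same position, same ID directory.
        List<String> ids = new ArrayList<>(index.keySet());
        List<Date> dates = new ArrayList<>(index.values());
        System.out.println(ids.get(0) + " -> " + dates.get(0).getTime());
    }
}
```

A single ordered map seemed simpler to me than two parallel arrays, since the ID list and the date list can both be derived from it and cannot drift out of sync, but I'm not attached to this shape.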
From the lib's Javadocs I see an ObjectMetadata#getLastModified method, but I don't see anything for reading the parent directory path of a given S3Object (meaning the data.json's parent ID directory). All in all, my best attempt is failing pretty spectacularly:
    val s3Client = new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey))
    val bucketRoot : S3Object = s3Client.getObject("myBucket","/")

    // TODO: How to query 'bucketRoot' for all its child ID directories?
    val idDirs : Array[S3Object] = ???

    var dataMap : Map[String,Date] = null
    idDirs.foreach(idDir ->
        // TODO: getName() and getChildSomehow() don't exist...obviously
        dataMap :+ idDir.getName() -> idDir.getChildSomehow("data.json").getObjectMetadata.getLastModified
    )
Any S3 API gurus out there that can spot where I'm going awry, or nudge me in the right direction here? Thanks in advance!