I am running a job on AWS EMR (AMI 5.2). I have large files in S3 that I would like to copy and split into another S3 location using s3-dist-cp. Here is the command I am using:
s3-dist-cp --src=s3://my-bucket/dir1/ --dest=s3://my-bucket/dir2/ --groupBy='(.*)' --targetSize=2
I get no errors, and the grouping seems to work fine (even when using other regexes). However, the target size setting has no effect: the file is simply copied to the destination without being split. The source file in this case is 50 MB.