
Our S3 buckets generally have several levels of sub-directories, so the path to the data in a bucket looks something like s3:functional-group/service/org-tenant-company-id/entity-id/actual-data

We're looking into Athena so that we can query the data at the /actual-data level, but scoped to a given org-tenant-company-id. It seems we need a way to create either a column or a partition for that org-tenant-company-id. Is this possible?

I've read the page on partitions in the Athena docs. It seems we may have to create the partitions manually via the JDBC driver?

John Rotenstein
user26270

2 Answers


Yes, you can create the partitions manually. However, if you set up your folder structure in Hive format, for example (s3:functional-group/service/org-tenant-company-id=xxxx/), then you can simply run an "MSCK REPAIR TABLE" command and Athena will automatically create all partitions for you.
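To illustrate both routes, here is a minimal Python sketch that only builds the DDL text you would then submit to Athena (through the console, the JDBC driver, or any other client). It assumes the table was declared with a partition column; the table name `events`, the bucket `example-bucket`, the tenant ids, and the column name `org_tenant_company_id` (underscores rather than hyphens) are all hypothetical placeholders, not taken from the question.

```python
def add_partition_ddl(table, bucket, prefix, tenant_ids,
                      column="org_tenant_company_id"):
    """Build an ALTER TABLE statement that registers each tenant prefix
    as a partition. Useful when the S3 layout is NOT in Hive format,
    so every partition needs an explicit LOCATION."""
    clauses = []
    for tenant_id in tenant_ids:
        # Hypothetical layout: s3://<bucket>/<prefix>/<tenant_id>/
        location = f"s3://{bucket}/{prefix}/{tenant_id}/"
        clauses.append(
            f"PARTITION ({column} = '{tenant_id}') LOCATION '{location}'"
        )
    return f"ALTER TABLE {table} ADD IF NOT EXISTS\n  " + "\n  ".join(clauses)


def repair_table_ddl(table):
    """Build the statement that discovers Hive-style partitions
    (prefixes named like <column>=<value>/) under the table location."""
    return f"MSCK REPAIR TABLE {table}"


# Example with placeholder names; print the statements instead of running them.
print(add_partition_ddl("events", "example-bucket",
                        "functional-group/service", ["tenant-1", "tenant-2"]))
print(repair_table_ddl("events"))
```

With a Hive-style layout the second statement is enough; the first is the manual fallback when the prefixes cannot be renamed.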

Ted

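You can expose the file path as a column through Athena's "$path" pseudo-column (see "How to get input file name as column in AWS Athena external tables"), extract the org-tenant-company-id from it, and then use CTAS (CREATE TABLE AS SELECT) to write a copy of the data partitioned by that value.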
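As a rough sketch of that idea, the Python below builds a CTAS statement that derives the tenant id from `"$path"` with `regexp_extract` and partitions the new table by it. The table names, the output location, the Parquet format, and the assumption that the tenant id is the third path segment after the bucket are all hypothetical; adjust `depth` to match the real layout.

```python
import re


def tenant_pattern(depth=2):
    """Regex capturing the path segment that follows `depth` leading
    segments after the bucket name (assumed layout, not confirmed)."""
    return f"^s3://[^/]+/(?:[^/]+/){{{depth}}}([^/]+)/"


def extract_tenant(path, depth=2):
    """Local sanity check of the pattern against a sample object path."""
    match = re.match(tenant_pattern(depth), path)
    return match.group(1) if match else None


def ctas_partition_by_path(source_table, target_table, output_location,
                           depth=2, column="org_tenant_company_id"):
    """Build a CTAS statement; the partition column must come last
    in the SELECT list, which is why it follows the `*`."""
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{output_location}',\n"
        f"  partitioned_by = ARRAY['{column}']\n"
        f") AS\n"
        f"SELECT *,\n"
        f"  regexp_extract(\"$path\", '{tenant_pattern(depth)}', 1) AS {column}\n"
        f"FROM {source_table}"
    )


# Example with placeholder names; the statement is printed, not executed.
print(ctas_partition_by_path("raw_events", "events_by_tenant",
                             "s3://example-bucket/athena/events_by_tenant/"))
```

Checking the pattern locally with `extract_tenant` before running the query is a cheap way to confirm that the chosen `depth` picks out the intended segment.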

Cornelius Roemer