Which is faster for querying AWS Athena schemas from a Python script: PyAthena or boto3?

I am currently using PyAthena to query Athena schemas, but it is quite slow. I know boto3 is another option, but before switching I would like some expert advice.

John Rotenstein
L Lawliet
  • What are your requirements? Where do you want to run the query from? Do you just want to issue queries, or do you also want the results fetched back to the client? – Prabhakar Reddy Oct 11 '20 at 03:23
  • The query actually runs on Athena's compute, not on your side. The only things that take time on your side are loading the result data from S3 and the polling interval while waiting for the Athena query to finish. What is the size of the result output? What query are you running? – Guy Oct 12 '20 at 06:33

1 Answer

Looking at the dependencies for PyAthena, you can see that it actually depends on boto3.

Unless PyAthena adds a lot of overhead on top of boto3, which is unlikely, the biggest performance improvements you're likely to see will come from how you're using Athena itself.

There are many performance improvements you can make. Amazon published a blog post named Top 10 Performance Tuning Tips for Amazon Athena, which will help you improve the performance of your queries.
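To check where the time actually goes, you could compare the time Athena reports for the query engine with the wall-clock time seen by your script. The following is only a minimal sketch, not how PyAthena works internally: the function name run_athena_query and its parameters are made up for illustration, and client is assumed to be a boto3 Athena client. It uses the boto3 calls start_query_execution and get_query_execution and reads EngineExecutionTimeInMillis from the returned statistics.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_interval=1.0):
    """Start a query, poll until it finishes, and return timing details.

    `client` is assumed to be a boto3 Athena client, for example
    boto3.client("athena"). The function name and parameters are
    hypothetical and exist only for this illustration.
    """
    started = time.monotonic()
    query_id = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_interval)

    if state != "SUCCEEDED":
        reason = execution["Status"].get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} ended in state {state}: {reason}")

    # Time spent inside Athena, as reported by the service (may be absent).
    engine_ms = execution.get("Statistics", {}).get("EngineExecutionTimeInMillis")
    return {
        "query_id": query_id,
        "engine_ms": engine_ms,
        "wall_clock_s": time.monotonic() - started,
    }
```

You would call it with something like run_athena_query(boto3.client("athena"), "SELECT 1", "default", "s3://your-bucket/prefix/"), where the database name and S3 location are placeholders. If engine_ms accounts for most of wall_clock_s, the time is being spent inside Athena, and switching client libraries is unlikely to help; tuning the query or the data layout is the better bet.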

Chris Williams