I'm using Athena query execution to retrieve data from a Glue table. A crawler updates this table every hour from an S3 bucket that is continuously updated by Kinesis Firehose.

My Node.js server executes basic queries using Athena, but I noticed that some of the requests take so long that my server throws a Server Request Timeout.

I checked the Query History in Athena and saw that some of the latest requests are in the Queued state, which means they are waiting to be executed. They all have a small Run Time, in the range of 1 to 5 seconds. So the timeouts are clearly caused by the time spent in the queue, not by the Run Time.

How can I speed up the execution of these queries? Or how can I increase the concurrent execution limits so that Athena executes them immediately?

Emre Alparslan
  • Have you checked the _official_ performance tuning tips listed [here](https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/)? – Tasos P. Nov 27 '19 at 14:55
  • @Cascader Yes, I built it by following this documentation. Queries execute successfully in 1-5 seconds. I just need them to start executing without waiting in the queue. – Emre Alparslan Nov 27 '19 at 15:23
  • I think you have a problem similar to [this one](https://stackoverflow.com/questions/57145967/aws-athena-concurrency-limits-number-of-submitted-queries-vs-number-of-running). How many queries are actually getting executed in your case? – Ilya Kisil Nov 27 '19 at 15:46
  • @IlyaKisil It's at most about 20 queries per minute. Yes, I saw that question, but honestly I couldn't understand what I need to do to solve it. – Emre Alparslan Nov 27 '19 at 16:08
  • Athena is designed for analytical queries. If you have a use case that needs transaction-like concurrency, run one query that loads the needed data into RDS and then query RDS (or replace RDS with DynamoDB, ElasticSearch, or any other suitable data store). – Guy Nov 28 '19 at 12:24
  • Service limits allow you to concurrently *submit* up to 20 queries to Athena: https://docs.aws.amazon.com/athena/latest/ug/service-limits.html However, according to AWS: "After you submit your queries to Athena, it processes the queries by assigning resources based on the overall service load and the amount of incoming requests. We continuously monitor and make adjustments to the service so that your queries process as fast as possible." https://docs.aws.amazon.com/athena/latest/ug/release-note-2018-05-17.html So there are no guarantees on how many will *execute* concurrently. – Nathan Griffiths Dec 02 '19 at 01:20
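Since the comments above distinguish between queries that are *submitted* and queries that are actually *executing*, one thing a Node.js server can control is how many queries it submits at once. The following is only a minimal sketch of a client-side cap, not something Athena provides: the `ConcurrencyLimiter` class and the `runAthenaQuery` function mentioned in the usage comment are hypothetical names. It does not remove Athena's own queue; it only keeps the server from piling up more submissions than it chooses to allow.

```javascript
// Caps how many tasks run at once; extra tasks wait in a FIFO queue.
// Hypothetical helper for illustration, not part of any AWS SDK.
class ConcurrencyLimiter {
  constructor(maxInFlight) {
    if (!Number.isInteger(maxInFlight) || maxInFlight < 1) {
      throw new RangeError("maxInFlight must be a positive integer");
    }
    this.maxInFlight = maxInFlight;
    this.inFlight = 0; // tasks that currently hold a slot
    this.waiting = []; // start callbacks for tasks without a slot yet
  }

  // Starts task() right away if a slot is free, otherwise when one frees up.
  // task should return a promise; the returned promise settles with its result.
  run(task) {
    return new Promise((resolve, reject) => {
      const start = () => {
        this.inFlight += 1;
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            // Free the slot and hand it to the oldest waiting task, if any.
            this.inFlight -= 1;
            const next = this.waiting.shift();
            if (next) next();
          });
      };
      if (this.inFlight < this.maxInFlight) {
        start();
      } else {
        this.waiting.push(start);
      }
    });
  }
}

// Usage sketch (runAthenaQuery is a hypothetical function that submits a
// query and resolves once its results are available):
//   const limiter = new ConcurrencyLimiter(5);
//   const rows = await limiter.run(() => runAthenaQuery(sql));
```

The cap of 5 in the usage comment is an arbitrary illustration; a suitable value depends on the account's actual limits and traffic.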

1 Answer

You can contact AWS support to increase the concurrent active queries limit, but that will not reduce the time queries spend in the **Queued** state.

By definition, the Queued state indicates that the query has been submitted to the service, and Athena will execute the query as soon as resources are available. "Resources" here refers to Athena's resources, not yours. https://docs.aws.amazon.com/athena/latest/APIReference/API_QueryExecutionStatus.html

I think there is nothing you can do about this Queued state.
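To confirm how much of the latency is queueing rather than execution, you can look at the status and statistics that Athena reports for a query. Below is a minimal sketch in Node.js, assuming you already have the `QueryExecution` object returned by a GetQueryExecution call. The field names (`Status.State`, `Statistics.QueryQueueTimeInMillis`, `Statistics.EngineExecutionTimeInMillis`, `Statistics.TotalExecutionTimeInMillis`) follow the Athena API reference as I understand it, so check them against your SDK version. The `summarizeTimings` function and the sample numbers are made up for illustration.

```javascript
// Splits a GetQueryExecution-style result into queue time vs. engine time.
// summarizeTimings and its return shape are hypothetical; missing fields
// are treated as 0 so the function also works on partial responses.
function summarizeTimings(queryExecution) {
  const status = queryExecution.Status || {};
  const stats = queryExecution.Statistics || {};
  const queuedMs = stats.QueryQueueTimeInMillis ?? 0;
  const engineMs = stats.EngineExecutionTimeInMillis ?? 0;
  const totalMs = stats.TotalExecutionTimeInMillis ?? (queuedMs + engineMs);
  return {
    state: status.State ?? "UNKNOWN",
    queuedMs,
    engineMs,
    totalMs,
    // Fraction of the total time that was spent waiting in the queue.
    queueShare: totalMs > 0 ? queuedMs / totalMs : 0,
  };
}

// Made-up sample values, not real measurements.
const sample = {
  Status: { State: "SUCCEEDED" },
  Statistics: {
    QueryQueueTimeInMillis: 42000,
    EngineExecutionTimeInMillis: 3000,
    TotalExecutionTimeInMillis: 45500,
  },
};

console.log(summarizeTimings(sample));
```

A high `queueShare` would support the conclusion that the wait is on Athena's side, in which case tuning the query itself is unlikely to help with the timeouts.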

  • What happens if I cancel a queued query? Am I still charged for it? – Dror Apr 22 '20 at 06:04
  • @Dror No, cancelled queries are only charged if they've started executing and have scanned some data, and even then you're charged only for the data they've already scanned. https://aws.amazon.com/athena/pricing/ quote: "There are no charges for Data Definition Language (DDL) statements like CREATE/ALTER/DROP TABLE, statements for managing partitions, or failed queries. Cancelled queries are charged based on the amount of data scanned." – Oren Apr 23 '20 at 08:03