I have read a lot of articles and answers here about Google App Engine task queues, but I still have a doubt about the behavior of rate and bucket_size.
I read this documentation: https://cloud.google.com/appengine/docs/standard/java/configyaml/queue
The snippet is:
Configuring the maximum number of concurrent requests
If the default max_concurrent_requests settings are not sufficient, you can change them, as shown in the following example:
If your application queue has a rate of 20/s and a bucket size of 40, tasks in that queue execute at a rate of 20/s and can burst up to 40/s briefly. These settings work fine if task latency is relatively low; however, if latency increases significantly, you'll end up processing significantly more concurrent tasks. This extra processing load can consume extra instances and slow down your application.
For example, let's assume that your normal task latency is 0.3 seconds. At this latency, you'll process at most around 40 tasks simultaneously. But if your task latency increases to 5 seconds, you could easily have over 100 tasks processing at once. This increase forces your application to consume more instances to process the extra tasks, potentially slowing down the entire application and interfering with user requests.
You can avoid this possibility by setting max_concurrent_requests to a lower value. For example, if you set max_concurrent_requests to 10, our example queue maintains about 20 tasks/second when latency is 0.3 seconds. However, when the latency increases over 0.5 seconds, this setting throttles the processing rate to ensure that no more than 10 tasks run simultaneously.
queue:
# Set the max number of concurrent requests to 10
- name: optimize-queue
  rate: 20/s
  bucket_size: 40
  max_concurrent_requests: 10
My understanding was that the queue works like this:
The bucket determines how many tasks can execute at once (at most bucket_size).
The rate is how fast the bucket is refilled per period.
max_concurrent_requests is the maximum number of tasks that can run simultaneously.
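To check this understanding, I wrote a small simulation of my mental model: a token bucket that is refilled continuously at rate, capped at bucket_size, where each dispatched task keeps running for a fixed latency. The model and all names here (QueueModel, simulate, and so on) are my own assumptions, not the actual App Engine implementation:

import java.util.ArrayDeque;
import java.util.Deque;

public class QueueModel {
    public static void main(String[] args) {
        simulate(20.0, 40, 0.3); // doc example: normal latency 0.3 s
        simulate(20.0, 40, 5.0); // doc example: latency grows to 5 s
    }

    // Token bucket refilled continuously at `rate`, capped at `bucketSize`;
    // every dispatched task runs for `latencySec` seconds.
    static void simulate(double rate, int bucketSize, double latencySec) {
        double tokens = bucketSize;              // bucket starts full
        final double dt = 0.01;                  // simulation step in seconds
        Deque<Double> finishTimes = new ArrayDeque<>(); // in-flight tasks
        int maxConcurrent = 0;

        for (double t = 0; t < 30; t += dt) {
            // tasks complete after `latencySec`
            while (!finishTimes.isEmpty() && finishTimes.peekFirst() <= t) {
                finishTimes.pollFirst();
            }
            // refill the bucket at `rate`, never beyond `bucketSize`
            tokens = Math.min(bucketSize, tokens + rate * dt);
            // dispatch one task per whole token, regardless of how many
            // tasks are still running (the assumption I want to verify)
            while (tokens >= 1.0) {
                tokens -= 1.0;
                finishTimes.addLast(t + latencySec);
            }
            maxConcurrent = Math.max(maxConcurrent, finishTimes.size());
        }
        System.out.printf("latency=%.1fs -> max concurrent tasks ~ %d%n",
                latencySec, maxConcurrent);
    }
}

For a latency of 0.3 s this prints a maximum of around 45 concurrent tasks, and for a latency of 5 s around 140 (settling to about 100), which roughly matches the documentation's numbers. But that only holds if my model is right, which is what I want to confirm.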
This snippet seems strange to me:
But if your task latency increases to 5 seconds, you could easily have over 100 tasks processing at once. This increase forces your application to consume more instances to process the extra tasks, potentially slowing down the entire application and interfering with user requests.
Imagine that max_concurrent_requests is not set. To me it seems impossible to execute more than 100 tasks at once, because bucket_size is 40. I would expect slow tasks only to increase the time that waiting tasks spend waiting for a free slot in the bucket.
Why does the documentation say that there can be over 100 tasks processing at once?
If bucket_size is 40, can more than 40 tasks run simultaneously?
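The only way I can reproduce the documentation's figure myself is with something like Little's law, where concurrency is not limited by the bucket at all:

    concurrent tasks ≈ dispatch rate × task latency = 20/s × 5 s = 100

Is that the intended reading, i.e. bucket_size only limits the instantaneous burst of dispatches, not the number of tasks in flight?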
Edit
Is a freed slot in the bucket refilled at the next rate tick, or only after all running tasks have finished? Example: 40 tasks are executing and 1 of them finishes. Suppose each task takes more than 0.5 seconds, and some take more than 1 second. When that one slot is freed, is it refilled in the next second, or does the bucket wait for all running tasks to finish before it is refilled?
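To make it concrete with hypothetical numbers (if rate 20/s means one token every 0.05 s):

t = 0.00 s: bucket full with 40 tokens -> 40 tasks dispatched, 0 tokens left
t = 0.05 s: one token is refilled -> does 1 new task start now, even though the other 40 are still running?

Or does the bucket stay empty until all 40 running tasks have finished?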