0

I have a Laravel job that takes a few seconds to complete and can involve processing a few KBs of json. There's a point in it where there's a loop and several entries have to be inputted consecutively in MariaDB 10.2.36.

At some point, after switching Redis/Horizon from the sync queue driver, while the starting of jobs became more reliable, the jobs have been terminating after a few seconds, I think 10 seconds for the first time, and then 6 seconds for the second and third time (there are three tries). They then resume at exactly 90 seconds later! And then they start from the beginning.

I have looked through the php files in /Jobs, the horizon.php in config, the php.ini, the supervisor configuration, the redis configuration, and all the timeouts are 6000, 10000... I can't seem to figure out what is killing these jobs and making them start all over.

I'm thinking this may be a MySQL/MariaDB connection error. But why would this happen only after switching an async queue? Is it possible that the session variables are different from an apache initiated php process and a horizon initiated one? After running this code:

SHOW SESSION VARIABLES LIKE '%timeout%';

I obtain these values in my datagrip connection. Is this dependent on which user I use to connect to the database? Datagrip after running query

I am wondering if this is related to the connect_timeout value. Here is the error I receive in Horizon:

Illuminate\Queue\MaxAttemptsExceededException: 
app\Jobs\SaveDocuments has been attempted too many times or run too long. The job may have previously timed out. in /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php:632

Update: I've added some more diagnostic info:

Results of SHOW ENGINE INNODB STATUS; right after the job terminates unsuccessfully:


=====================================
2021-04-19 08:23:35 0x7fe174121700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 30 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 145 srv_active, 0 srv_shutdown, 29412 srv_idle
srv_master_thread log flush and writes: 29557
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 11539
OS WAIT ARRAY INFO: signal count 6447
RW-shared spins 254, rounds 7466, OS waits 243
RW-excl spins 122, rounds 1156, OS waits 12
RW-sx spins 4, rounds 102, OS waits 3
Spin rounds per wait: 29.39 RW-shared, 9.48 RW-excl, 25.50 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 1273208
Purge done for trx's n:o < 1273208 undo n:o < 0 state: running
History list length 1
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 422081296630400, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296626216, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296622032, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296609480, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296634584, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296613664, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296617848, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422081296605296, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
3303 OS file reads, 4022 OS file writes, 1875 OS fsyncs
17.17 reads/s, 16384 avg bytes/read, 14.10 writes/s, 6.87 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 6, seg size 8, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 34673, node heap has 3 buffer(s)
Hash table size 34673, node heap has 2 buffer(s)
Hash table size 34673, node heap has 2 buffer(s)
Hash table size 34673, node heap has 1 buffer(s)
Hash table size 34673, node heap has 3 buffer(s)
Hash table size 34673, node heap has 2 buffer(s)
Hash table size 34673, node heap has 3 buffer(s)
Hash table size 34673, node heap has 2 buffer(s)
34.13 hash searches/s, 28.10 non-hash searches/s
---
LOG
---
Log sequence number 5389363988
Log flushed up to   5389363988
Pages flushed up to 5389363988
Last checkpoint at  5389363979
0 pending log flushes, 0 pending chkp writes
1188 log i/o's done, 4.90 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137232384
Dictionary memory allocated 1184256
Buffer pool size   8191
Free buffers       5238
Database pages     2935
Old database pages 1103
Modified db pages  0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 2786, created 149, written 2596
17.17 reads/s, 0.07 creates/s, 8.70 writes/s
Buffer pool hit rate 992 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 2935, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=5069, Main thread ID=140605867353856, state: sleeping
Number of rows inserted 425, updated 214, deleted 5, read 32721201
3.30 inserts/s, 0.53 updates/s, 0.00 deletes/s, 4690.61 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

Stack trace


Illuminate\Queue\MaxAttemptsExceededException: app\Jobs\SaveDocuments has been attempted too many times or run too long. The job may have previously timed out. in /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php:632
Stack trace:
#0 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(446): Illuminate\Queue\Worker->maxAttemptsExceededException(Object(Illuminate\Queue\Jobs\RedisJob))
#1 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(358): Illuminate\Queue\Worker->markJobAsFailedIfAlreadyExceedsMaxAttempts('redis', Object(Illuminate\Queue\Jobs\RedisJob), 3)
#2 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(314): Illuminate\Queue\Worker->process('redis', Object(Illuminate\Queue\Jobs\RedisJob), Object(Illuminate\Queue\WorkerOptions)) 
#3 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(134): Illuminate\Queue\Worker->runJob(Object(Illuminate\Queue\Jobs\RedisJob), 'redis', Object(Illuminate\Queue\WorkerOptions))
#4 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Console/WorkCommand.php(112): Illuminate\Queue\Worker->daemon('redis', 'default', Object(Illuminate\Queue\WorkerOptions))
#5 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Queue/Console/WorkCommand.php(96): Illuminate\Queue\Console\WorkCommand->runWorker('redis', 'default')
#6 /home/my_user/public_html/vendor/laravel/horizon/src/Console/WorkCommand.php(46): Illuminate\Queue\Console\WorkCommand->handle()
#7 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(36): Laravel\Horizon\Console\WorkCommand->handle()
#8 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Container/Util.php(37): Illuminate\Container\BoundMethod::Illuminate\Container\{closure}()
#9 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(93): Illuminate\Container\Util::unwrapIfClosure(Object(Closure))
#10 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(37): Illuminate\Container\BoundMethod::callBoundMethod(Object(app\Application), Array, Object(Closure))
#11 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Container/Container.php(590): Illuminate\Container\BoundMethod::call(Object(app\Application), Array, Array, NULL)
#12 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Console/Command.php(134): Illuminate\Container\Container->call(Array)
#13 /home/my_user/public_html/vendor/symfony/console/Command/Command.php(255): Illuminate\Console\Command->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#14 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Console/Command.php(121): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#15 /home/my_user/public_html/vendor/symfony/console/Application.php(1009): Illuminate\Console\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#16 /home/my_user/public_html/vendor/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand(Object(Laravel\Horizon\Console\WorkCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#17 /home/my_user/public_html/vendor/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#18 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Console/Application.php(93): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#19 /home/my_user/public_html/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php(131): Illuminate\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#20 /home/my_user/public_html/artisan(41): Illuminate\Foundation\Console\Kernel->handle(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
O. Jones
  • 103,626
  • 17
  • 118
  • 172
Jonathan Ma
  • 556
  • 6
  • 20
  • 1
    I think this is anyone's guess until you can share the exact error message with us, not just the high level laravel one. – Shadow Apr 19 '21 at 11:57
  • Got it, thank you for the help, I found out how to get the innodb status, I will post that and the stack trace. – Jonathan Ma Apr 19 '21 at 12:17
  • Ok, I have added the results of SHOW ENGINE INNODB STATUS from PHPMyAdmin, which I ran 5 seconds after the job terminated unsuccessfully, as well as the stack trace. – Jonathan Ma Apr 19 '21 at 12:31
  • Does this answer your question? [Max Attempts Exceeded Exception queue laravel](https://stackoverflow.com/questions/62921745/max-attempts-exceeded-exception-queue-laravel) – Bernd Buffen Apr 19 '21 at 13:20
  • It does explain why the intervals are 90 seconds each, but I'm still not sure why the job is cutting off after 10 seconds. There are two other production servers with exact same configuration, except for using MariaDB 10.3.28 instead of 10.2.37 apparently, and they don't cut off with Laravel Horizon. I'm trying to upgrade to get the server consistent, but it is a challenge. I can confirm that the job works properly with a **sync** driver. – Jonathan Ma Apr 20 '21 at 19:56

0 Answers0