So I am working on a pretty high traffic rails/heroku/postgres app, the backend only, and after running for hours, days, or weeks at times the database will randomly start taking 120 seconds to perform queries that usually take 2-3 seconds, and it clears up as soon as the app is restarted and everyone is essentially "kicked off". What could cause a database to start taking a ridiculously long time to perform all queries? The database is not running out of memory, it is being vacuumed regularly, and it is not running out of connections. There are around 500 users at times, dynos are autoscaling, and the web server is passenger. However this is probably something with PG as it is happening at the query level.
Asked
Active
Viewed 61 times
0
-
Looking the time in your graph I suspect it's when automated backup happens. You can verify does backup time match the time of spike. – Kristo Mägi Mar 23 '17 at 19:16
-
Thanks but no the backup did not run around then, it ran about 6 hours earlier and about 2 hours later. And there were no logical backups taken then either. – TheGmann Mar 23 '17 at 19:20
-
Did you check your logs ? It should say what process is taking so long. – Alexander Luna Mar 23 '17 at 19:35
-
Ok, then alternatively since it does run on shared resources Heroku may have some own operating task happening at the time. To the second topic to optimize the query I can suggest that if actually not needed then you can try to reduce ORDER BY-s in the query and gain some obvious win from it. And there are some other advanced tricks might interest you: http://postgres.cz/wiki/PostgreSQL_SQL_Tricks & http://postgres.cz/wiki/PostgreSQL_SQL_Tricks_III – Kristo Mägi Mar 23 '17 at 19:36
-
@AlexanderLuna my logs are showing many different queries starting to take a long time, but no backups or anything else of interest beginning around that time. I cannot see anything in the logs that stand out, so I am trying to diagnose common db issues that might cause something like this. – TheGmann Mar 23 '17 at 19:41
-
@KristoMägi I will look into this thank you. – TheGmann Mar 23 '17 at 19:41
-
My biggest question is what could cause all of the calls to the DB to increase proportionally to each other in the length of time they took? Since it wasn't a DB backup? – TheGmann Mar 23 '17 at 20:03
-
@TheGmann – as I mentioned then suspect you may be getting classical "Shared resources" hit that you can get at any cloud IaaS/PaaS provider/solution (inc. Heroku). This may give some light into this – http://stackoverflow.com/questions/29088113/heroku-sporadic-high-response-time/32465651#32465651 . It may make sense to contact with Heroku to get a more detailed answer from them. – Kristo Mägi Mar 23 '17 at 20:57