I have reviewed the similar question See and clear Postgres caches/buffers?, but all of the answers focus on the data buffers, and PostgreSQL has changed a lot since 2010.
Unlike the OP of that question, I am not looking for consistent behavior when measuring performance; I am looking for adaptive behavior as the database changes over time.
In my application, the working tables are empty at the beginning of a job execution. Queries run very quickly at first, but as time goes on performance degrades because the prepared statements are not using ideal access paths (they were prepared when the tables were empty - doh!). Since a typical execution of the job will ultimately cover a few hundred million rows, I need to minimize all of the overheads and periodically refresh statistics to get the best access paths.
In SQL Server, one can periodically call `UPDATE STATISTICS` and `DBCC FREEPROCCACHE`, and the prepared statements will automatically be re-prepared to use the new access paths.
Edit: `DBCC FREEPROCCACHE`: in SQL Server, prepared statements are implemented as stored procedures. `DBCC FREEPROCCACHE` wipes the compiled stored procedures so that they are recompiled on the next invocation, and the new access paths take effect immediately.
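For reference, the SQL Server maintenance step described above looks like this (the table name is a placeholder):

```sql
-- Refresh the optimizer statistics for the working table
UPDATE STATISTICS dbo.WorkingTable;

-- Drop all compiled plans from the plan cache; each prepared
-- statement is recompiled on its next invocation and picks up
-- the new statistics
DBCC FREEPROCCACHE;
```

I am looking for an equivalent of this two-step refresh on the PostgreSQL side.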
Edit: details of PostgreSQL's management of prepared statements: PostgreSQL defers the prepare until the first call to `EXECUTE`, and caches the result of the prepare after the fifth execution. Once cached, the plan is fixed until the session ends or the prepared statement is freed with `DEALLOCATE`. Closing JDBC objects does not invoke `DEALLOCATE`; this is an optimization to support the open/read/close pattern that many web apps use for display.
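At the SQL level, the lifecycle described above can be exercised directly (the statement name, table, and parameter are illustrative):

```sql
-- Planning is deferred until the statement is first executed
PREPARE fetch_rows (int) AS
    SELECT * FROM working_table WHERE batch_id = $1;

-- After the fifth execution the plan is cached and reused
EXECUTE fetch_rows (42);

-- Explicitly free the statement; a subsequent PREPARE of the
-- same query replans against the current statistics
DEALLOCATE fetch_rows;
```

The problem is that the JDBC driver, not my code, controls when `PREPARE` and `DEALLOCATE` are issued.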
Is there a way to force a (Edit) JDBC prepared statement to recompile, (Edit) after running `ANALYZE`, so that it will use the latest statistics?
EDIT: I am using the JDBC `PreparedStatement` API to prepare and execute queries against the database, via the PostgreSQL JDBC driver.
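My usage is roughly the following sketch (the class, table, and query are placeholders, not my real schema). Running `ANALYZE` this way refreshes the statistics, but by itself it does not appear to force the cached plan to be rebuilt - that is exactly what I am asking how to do:

```java
import java.sql.*;

public class JobRunner {
    // Placeholder query; the real working set is a few hundred million rows.
    static final String QUERY =
        "SELECT * FROM working_table WHERE batch_id = ?";

    // Builds the statistics-refresh command for one table.
    static String analyzeCommand(String table) {
        return "ANALYZE " + table;
    }

    static void processBatch(Connection conn, int batchId) throws SQLException {
        // Periodically refresh planner statistics as the table grows...
        try (Statement st = conn.createStatement()) {
            st.execute(analyzeCommand("working_table"));
        }
        // ...but this PreparedStatement may still run with the plan that
        // was cached back when the table was empty.
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setInt(1, batchId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process row
                }
            }
        }
    }
}
```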