I have a table with a primary id column (automatically indexed), two sub-id columns (also indexed), and 12 columns of type geometry(Polygon). If I need to update ~2 million rows in this table, is it faster to run ~2 million individual update statements like
update TABLE_NAME set COLUMN_1 = VALUE_1, COLUMN_2 = VALUE_2, ... COLUMN_12 = VALUE_12 where ID_COLUMN = NEXT_ID
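(For the record, what I have in mind for this case is a server-side prepared statement along these lines; the table and column names here are hypothetical, only 2 of the 12 geometry columns are shown, and PostGIS is assumed for the geometry type:)

-- Sketch of the per-row variant as a prepared statement.
-- "parcels", "poly_1", "poly_2" are made-up names for illustration.
PREPARE upd_one (bigint, geometry, geometry) AS
    update parcels
    set poly_1 = $2,
        poly_2 = $3
    where id = $1;

-- One row's update; WKT literals cast to geometry.
EXECUTE upd_one(42,
    'POLYGON((0 0, 1 0, 1 1, 0 0))'::geometry,
    'POLYGON((2 2, 3 2, 3 3, 2 2))'::geometry);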
or is it faster to do some smaller number of larger update statements like in this answer
update TABLE_NAME as update_t set
    COLUMN_1 = new_vals.COLUMN_1,
    /* ... one assignment per updated column ... */
    COLUMN_12 = new_vals.COLUMN_12
from (values
    (id, polygon1val, polygon2val, ... polygon12val), /* row 1 */
    (id, polygon1val, polygon2val, ... polygon12val), /* row 2 */
    ... /* ... */
    (id, polygon1val, polygon2val, ... polygon12val)  /* row N */
) as new_vals ( id, COLUMN_1, ... COLUMN_12 )
where new_vals.id = update_t.id
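For concreteness, here's what one small batch of the latter might look like (same hypothetical names as above, only 2 of the 12 geometry columns, and explicit ::geometry casts so the VALUES columns don't come through as untyped text):

update parcels as update_t
set poly_1 = new_vals.poly_1,
    poly_2 = new_vals.poly_2
from (values
    (41, 'POLYGON((0 0, 1 0, 1 1, 0 0))'::geometry,
         'POLYGON((2 2, 3 2, 3 3, 2 2))'::geometry),  /* row 1 */
    (42, 'POLYGON((5 5, 6 5, 6 6, 5 5))'::geometry,
         'POLYGON((7 7, 8 7, 8 8, 7 7))'::geometry)   /* row 2 */
) as new_vals (id, poly_1, poly_2)
where new_vals.id = update_t.id;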
If the latter, do you have any suggestions on what a good N might be? Is N ~2 million, or some smaller subset (that I would repeat until they're all done)?
EDIT: Obviously, in the former case I would use a prepared statement. But I also wonder: in the latter case, is there any benefit to using a prepared statement?
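What I imagine that would look like is something like this sketch, where the number of parameter placeholders is fixed at PREPARE time, so each batch size would need its own statement (batch size 2 here purely for illustration, same hypothetical names as above):

-- Hypothetical prepared statement for a fixed batch of 2 rows,
-- one geometry column shown; a different batch size needs its own PREPARE.
PREPARE upd_batch_2 (bigint, geometry, bigint, geometry) AS
    update parcels as update_t
    set poly_1 = new_vals.poly_1
    from (values ($1, $2), ($3, $4)) as new_vals (id, poly_1)
    where new_vals.id = update_t.id;

EXECUTE upd_batch_2(
    41, 'POLYGON((0 0, 1 0, 1 1, 0 0))'::geometry,
    42, 'POLYGON((2 2, 3 2, 3 3, 2 2))'::geometry);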
I'm using PostgreSQL 9.2.