To insert bulk data / populate a database in PostgreSQL, the fastest way is to use COPY.
I have to populate a database. Right now I am getting write speeds as low as 100-200 inserts per second, because I am sending many individual INSERT statements through the C++ library libpqxx. I suspect two reasons:
- The data has many repeated records (I parse raw logs and send them), which causes primary-key violations.
- The one-by-one sending of the INSERT statements (see the sketch after this list).
The first one is out of my hands. The second one, however, is what I have been reading about.
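From what I have read so far, simply wrapping a whole batch of rows in one transaction should already help, since it is the per-row commit that is expensive. A minimal sketch of what I mean (Log is my own class, and logs is the table from the code below):

#include <pqxx/pqxx>
#include <string>
#include <vector>

void pushLogsBatched(const std::vector<Log> &logs, pqxx::connection &conn) {
    pqxx::work w(conn);                            // one transaction for the whole batch
    for (const Log &log : logs) {
        w.exec("INSERT INTO logs VALUES("
               + w.quote(log.getDevice()) + ","    // quote() escapes the value and
               + w.quote(log.getUser())   + ","    // adds the surrounding quotes
               + w.quote(log.getDate())   + ","
               + w.quote(log.getLabel())  + ");");
    }
    w.commit();                                    // a single commit amortizes the cost
}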
As far as I know, the tablewriter class was suited to this purpose, but it has apparently been deprecated. I have also read that it is possible to use stdin as a parameter to COPY.
But beyond these clues I am lost. Can someone point me to a solution?
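For reference, here is my best guess at what a COPY-based version should look like. This is an untested sketch, assuming libpqxx 6 or newer, where pqxx::stream_to (which drives COPY ... FROM STDIN under the hood) replaced tablewriter; Log and the logs table are from my code below:

#include <pqxx/pqxx>
#include <tuple>
#include <vector>

void copyLogs(const std::vector<Log> &logs, pqxx::connection &conn) {
    pqxx::work w(conn);
    pqxx::stream_to stream(w, "logs");              // opens COPY logs FROM STDIN
    for (const Log &log : logs) {
        stream << std::make_tuple(log.getDevice(),  // one tuple per row; the stream
                                  log.getUser(),    // handles escaping itself
                                  log.getDate(),
                                  log.getLabel());
    }
    stream.complete();                              // finish the COPY before committing
    w.commit();
}

One thing I am unsure about: if the repeated records from reason one cause a primary-key violation, I believe the whole COPY aborts, so deduplication would still be needed before streaming.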
Edit: Here is the code, with the function that executes the statement:
void pushLog(const Log &log, pqxx::connection *conn) {
    pqxx::work w(*conn);                       // one transaction (and one commit) per row
    std::stringstream stmt;
    stmt << "INSERT INTO logs VALUES("
         << w.quote(log.getDevice()) << ","    // quote() escapes the value and
         << w.quote(log.getUser())   << ","    // adds the surrounding quotes
         << w.quote(log.getDate())   << ","
         << w.quote(log.getLabel())  << ");";
    try {
        w.exec(stmt.str());
        w.commit();
    }
    catch (const std::exception &e) {
        std::cerr << e.what() << std::endl;
        std::cout << "Exception on statement: [" << stmt.str() << "]\n";
    }
}
I establish the connection earlier and pass a pointer to it.
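For completeness, the calling side looks roughly like this; the connection string and parseLogs() are placeholders, not my real code:

#include <pqxx/pqxx>
#include <vector>

int main() {
    pqxx::connection conn("dbname=logdb");         // hypothetical connection string
    std::vector<Log> logs = parseLogs("raw.log");  // hypothetical parser for the raw logs
    for (const Log &log : logs)
        pushLog(log, &conn);
}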
PS: The question might lack some details. If so, please comment and I'll edit them in.