
I try to call the Model.bulkCreate method to insert four million rows into the items table, but I get the error below:

Executing (default): DROP TABLE IF EXISTS "items" CASCADE;
Executing (default): DROP TABLE IF EXISTS "items" CASCADE;
Executing (default): CREATE TABLE IF NOT EXISTS "items" ("id"   SERIAL , "tenant_id" UUID NOT NULL, PRIMARY KEY ("id"));
Executing (default): SELECT i.relname AS name, ix.indisprimary AS primary, ix.indisunique AS unique, ix.indkey AS indkey, array_agg(a.attnum) as column_indexes, array_agg(a.attname) AS column_names, pg_get_indexdef(ix.indexrelid) AS definition FROM pg_class t, pg_class i, pg_index ix, pg_attribute a WHERE t.oid = ix.indrelid AND i.oid = ix.indexrelid AND a.attrelid = t.oid AND t.relkind = 'r' and t.relname = 'items' GROUP BY i.relname, ix.indexrelid, ix.indisprimary, ix.indisunique, ix.indkey ORDER BY i.relname;

<--- Last few GCs --->

[1599:0x103001000]    53198 ms: Mark-sweep 1396.2 (1424.2) -> 1395.6 (1423.2) MB, 1951.8 / 0.0 ms  (average mu = 0.082, current mu = 0.019) allocation failure scavenge might not succeed
[1599:0x103001000]    53210 ms: Scavenge 1396.4 (1423.2) -> 1395.8 (1423.7) MB, 4.2 / 0.0 ms  (average mu = 0.082, current mu = 0.019) allocation failure 
[1599:0x103001000]    53224 ms: Scavenge 1396.6 (1423.7) -> 1396.0 (1424.2) MB, 7.3 / 0.0 ms  (average mu = 0.082, current mu = 0.019) allocation failure 


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x1e2d4c25be3d]
Security context: 0x3a2af6f9e6e9 <JSObject>
    1: /* anonymous */(aka /* anonymous */) [0x3a2aed282249] [/Users/ldu020/workspace/github.com/mrdulin/node-sequelize-examples/node_modules/sequelize/lib/model.js:~2551] [pc=0x1e2d4c8edaa1](this=0x3a2a987026f1 <undefined>,values=0x3a2acd876e81 <Object map = 0x3a2aef0e87a1>)
    2: arguments adaptor frame: 3->1
    3: StubFrame [pc: 0x1e2d4c226699]
    4: m...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0x10003cff5 node::Abort() [/usr/local/bin/node]
 2: 0x10003d1ff node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
 3: 0x1001b8265 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0x1005861c2 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
 5: 0x100588c95 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/local/bin/node]
 6: 0x100584b3f v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/bin/node]
 7: 0x100582d14 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 8: 0x10058f5ac v8::internal::Heap::AllocateRawWithLigthRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 9: 0x10058f62f v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
10: 0x10055ef74 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [/usr/local/bin/node]
11: 0x1007e7254 v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
12: 0x1e2d4c25be3d 
[1]    1599 abort      npx ts-node 

Here is the code:

import { Model, DataTypes } from 'sequelize';
import { sequelize } from '../../db';
import faker from 'faker';

class Item extends Model {}
Item.init(
  {
    tenant_id: {
      type: DataTypes.UUID,
      allowNull: false,
    },
  },
  { sequelize, tableName: 'items', modelName: 'item' },
);

(async function() {
  try {
    await sequelize.sync({ force: true });
    // seed
    const items = Array.from({ length: 4000 * 1000 }).map((_) => ({ tenant_id: faker.random.uuid() }));
    await Item.bulkCreate(items, { benchmark: true });
  } catch (error) {
    console.log(error);
  } finally {
    await sequelize.close();
  }
})();
  • sequelize version: "sequelize": "^5.21.3",
  • PostgreSQL version: PostgreSQL 9.6
  • "ts-node": "^8.10.2",
  • "typescript": "^3.9.6"

The code is written in TypeScript and I execute this script using the npx ts-node index.ts command.

Is this a Sequelize problem or a PostgreSQL problem? Is there any restriction on the amount of data the Model.bulkCreate method can handle?

How should I solve this?

UPDATE

I changed the code to insert the data in chunks of 1000 rows:

for (let i = 0; i < 4000; i++) {
  const items = Array.from({ length: 1000 }).map((_) => ({ tenant_id: faker.random.uuid() }));
  await Item.bulkCreate(items, { benchmark: true, returning: false });
}

It works, but it took 10 minutes to insert four million rows. How should I choose the chunk size to achieve the maximum insertion speed?
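One way to keep memory bounded while experimenting with different chunk sizes is to generate each batch lazily instead of materialising all four million rows up front. Below is a minimal sketch: the `makeRow` factory parameter stands in for the `faker.random.uuid()` call, and each yielded batch would be fed to `Item.bulkCreate`.

```typescript
// Lazily yield batches of up to `size` rows so the full data set
// never exists in memory at once. `makeRow` builds one row object.
function* batches<T>(total: number, size: number, makeRow: () => T): Generator<T[]> {
  for (let produced = 0; produced < total; produced += size) {
    const n = Math.min(size, total - produced);
    yield Array.from({ length: n }, makeRow);
  }
}

// Usage sketch against the model from the question:
// for (const items of batches(4000 * 1000, 1000, () => ({ tenant_id: faker.random.uuid() }))) {
//   await Item.bulkCreate(items, { returning: false });
// }
```

With this in place, tuning becomes a matter of timing the same loop at a few candidate sizes (e.g. 500, 1000, 5000, 10000) and picking the fastest; the optimum depends on row width, network latency, and server resources, so trial and error, as the comment suggests, is the practical approach.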

Lin Du
  • It appears that `bulkCreate` returns the objects it creates (probably as an array) and you don’t have enough memory for them. Sequelize might offer something to deal with it, but it would be much less wasteful on many levels to write your own query that would generate the UUIDs server-side. – Ry- Jul 17 '20 at 07:54
  • You cannot generate 4M UUIDs and store them at once; that's why you're out of memory. Try to chunk them. – Code Spirit Jul 17 '20 at 07:58
  • Eat less, live longer. Feed less data but repeat it, using loop. – Vahid Alimohamadi Jul 17 '20 at 08:03
  • @CodeSpirit Yeah, but how do I know what the size of each chunk should be, so I can run multiple insertions concurrently? – Lin Du Jul 17 '20 at 08:04
  • @slideshowp2 trial and error. – Code Spirit Jul 17 '20 at 08:45

2 Answers


PostgreSQL can generate UUIDs for you (via pgcrypto, uuid-ossp, or natively in PostgreSQL 13), saving all of the client-side memory, serialization, and parsing:

INSERT INTO items (tenant_id)
  SELECT gen_random_uuid() FROM generate_series(1, 4000 * 1000)

(this can also be done in chunks)
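A chunked variant of the above could look like the following on the asker's PostgreSQL 9.6, where `gen_random_uuid()` comes from the pgcrypto extension; this is a sketch, with the batch count chosen arbitrarily:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;  -- provides gen_random_uuid() before PostgreSQL 13

-- Insert one million rows per statement instead of all four million
-- at once, so each statement (and its transaction) stays smaller.
INSERT INTO items (tenant_id)
  SELECT gen_random_uuid() FROM generate_series(1, 1000 * 1000);
-- repeat the INSERT three more times, or drive it from a shell loop
```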

Ry-

It looks like the Node process simply ran out of memory. As others have said, you can either chunk the data you're loading or raise the heap limit. Note that the flag must be passed to Node itself, not appended after the script name (where it would be treated as an argument to your script), so with ts-node the easiest route is the environment variable:

NODE_OPTIONS=--max-old-space-size=4096 npx ts-node index.ts

or registering ts-node with Node directly:

node --max-old-space-size=4096 -r ts-node/register index.ts

But this assumes you have 4 GB of RAM available.

Lewis