3

I am using bulkCreate and uupdate

const item = await models.Gsdatatab.bulkCreate(gsdatamodel,{updateOnDuplicate: ["SCRIP","LTP","OHL","ORB15","ORB30","PRB","CAMARILLA"]});

I see the timestamps(createdAt and updatedAt) are not getting updated in DB after the the update. Do I need to explicitly pass those two in the bulKCreate to get them updated each time there is an update or is there any option I am missing. Also the id is getting incremented while rows are getting updated. I dont want the id column to auto increment in case of update.

I am using the extended model creation for defining the model

Siba Swain
  • 93
  • 7

1 Answers1

3

The following was run using

  • MySQL Server version: 8.0.25 MySQL Community Server
  • Sequelize version 6.6.5

Summary

Timestamps: The values returned from the .bulkCreate method can be misleading. You will need to query for the items after doing a bulkUpdate to find the new values. To quote the sequelize docs for version 6:

The success handler is passed an array of instances, but please notice that these may not completely represent the state of the rows in the DB. This is because MySQL and SQLite do not make it easy to obtain back automatically generated IDs and other default values in a way that can be mapped to multiple records. To obtain Instances for the newly created values, you will need to query for them again.

Also, to update the updatedAt column, it will need to be included in the array parameter for updateOnDuplicate. Otherwise, it will not receive a new timestamp.

Non-sequential primary keys: The next auto_increment value for the MySQL primary key appears to be incremented when an update is being done. I'm not really sure if there's a way to prevent this from happening. However, it is still possible to insert rows that have primary keys which have been skipped over by the auto_increment mechanism. Also, according to another answer on stackoverflow concerning non-sequential primary keys, there should be no impact on efficiency. As an alternative, bulkCreate statements could be separated into two groups, one for inserts and one for updates, which could then be done separately using sequelize. The downside is that there would be extra queries to determine whether incoming data already exists in the database in order to decide between inserts versus updates.

Here's a code sample:

let {
        Sequelize,
        DataTypes,
    } = require('sequelize')

async function run () {
    let sequelize = new Sequelize(process.env.DB_NAME, process.env.DB_USER, process.env.DB_PASSWORD, {
            host:       'localhost',
            dialect:    'mysql',
            logging:    console.log
        })

    let Item = sequelize.define('item', {
            name: DataTypes.STRING,
            age: DataTypes.INTEGER
        }, {
            tableName: 'items',
            schema: 'agw_queries'
        })

    await sequelize.sync({ force: true })

    let wait = sec => new Promise( res => setTimeout(res, sec * 1000));

    let items = await Item.bulkCreate([{ name: 'mickey', age: 32 }, { name: 'minnie', age: 30 }])
    console.log()
    console.log('These values are returned upon creation.')
    console.log()
    console.log(JSON.stringify(items, null, 2))

    console.log()
    console.log('These values are returned after a subsequent query.')
    console.log()
    let r = await Item.findAll({})
    console.log(JSON.stringify(r, null, 2))

    console.log()
    console.log('Waiting two seconds ...')
    console.log()
    await wait(2)

    console.log('These values are returned after an update.')
    console.log()
    items = await Item.bulkCreate(
            [
                { id: 1, name: 'mickey mouse', age: 33 },
                { id: 2, name: 'minnie mouse', age: 31 },
                { name: 'goofy', age: 37 }
            ],
            { updateOnDuplicate: [ 'name', 'updatedAt' ] })
    console.log(JSON.stringify(items, null, 2))

    console.log()
    console.log('These values are returned after another subsequent query.')
    console.log()
    r = await Item.findAll({})
    console.log(JSON.stringify(r, null, 2))

    console.log()
    console.log('Waiting two seconds ...')
    console.log()
    await wait(2)

    console.log('These values are returned after an update.')
    console.log()
    items = await Item.bulkCreate(
            [
                { id: 1, name: 'mickey t. mouse', age: 33 },
                { id: 2, name: 'minerva mouse', age: 31 },
                { name: 'donald duck', age: 32 }
            ],
            { updateOnDuplicate: [ 'name', 'updatedAt' ] })
    console.log(JSON.stringify(items, null, 2))

    console.log()
    console.log('These values are returned after another subsequent query.')
    console.log()
    r = await Item.findAll({})
    console.log(JSON.stringify(r, null, 2))

    await sequelize.close()
}

run()

And here's the output

Executing (default): DROP TABLE IF EXISTS `items`;
Executing (default): DROP TABLE IF EXISTS `items`;
Executing (default): CREATE TABLE IF NOT EXISTS `items` (`id` INTEGER NOT NULL auto_increment , `name` VARCHAR(255), `age` INTEGER, `createdAt` DATETIME NOT NULL, `updatedAt` DATETIME NOT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB;
Executing (default): SHOW INDEX FROM `items`
Executing (default): INSERT INTO `items` (`id`,`name`,`age`,`createdAt`,`updatedAt`) VALUES (NULL,'mickey',32,'2021-09-06 12:17:44','2021-09-06 12:17:44'),(NULL,'minnie',30,'2021-09-06 12:17:44','2021-09-06 12:17:44');

These values are returned upon creation.

[
  {
    "id": 1,
    "name": "mickey",
    "age": 32,
    "createdAt": "2021-09-06T12:17:44.042Z",
    "updatedAt": "2021-09-06T12:17:44.042Z"
  },
  {
    "id": 2,
    "name": "minnie",
    "age": 30,
    "createdAt": "2021-09-06T12:17:44.042Z",
    "updatedAt": "2021-09-06T12:17:44.042Z"
  }
]

These values are returned after a subsequent query.

Executing (default): SELECT `id`, `name`, `age`, `createdAt`, `updatedAt` FROM `items` AS `item`;
[
  {
    "id": 1,
    "name": "mickey",
    "age": 32,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:44.000Z"
  },
  {
    "id": 2,
    "name": "minnie",
    "age": 30,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:44.000Z"
  }
]

Waiting two seconds ...

These values are returned after an update.

Executing (default): INSERT INTO `items` (`id`,`name`,`age`,`createdAt`,`updatedAt`) VALUES (1,'mickey mouse',33,'2021-09-06 12:17:46','2021-09-06 12:17:46'),(2,'minnie mouse',31,'2021-09-06 12:17:46','2021-09-06 12:17:46'),(NULL,'goofy',37,'2021-09-06 12:17:46','2021-09-06 12:17:46') ON DUPLICATE KEY UPDATE `name`=VALUES(`name`),`updatedAt`=VALUES(`updatedAt`);
[
  {
    "id": 1,
    "name": "mickey mouse",
    "age": 33,
    "createdAt": "2021-09-06T12:17:46.174Z",
    "updatedAt": "2021-09-06T12:17:46.174Z"
  },
  {
    "id": 2,
    "name": "minnie mouse",
    "age": 31,
    "createdAt": "2021-09-06T12:17:46.174Z",
    "updatedAt": "2021-09-06T12:17:46.174Z"
  },
  {
    "id": 5,
    "name": "goofy",
    "age": 37,
    "createdAt": "2021-09-06T12:17:46.174Z",
    "updatedAt": "2021-09-06T12:17:46.174Z"
  }
]

These values are returned after another subsequent query.

Executing (default): SELECT `id`, `name`, `age`, `createdAt`, `updatedAt` FROM `items` AS `item`;
[
  {
    "id": 1,
    "name": "mickey mouse",
    "age": 32,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:46.000Z"
  },
  {
    "id": 2,
    "name": "minnie mouse",
    "age": 30,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:46.000Z"
  },
  {
    "id": 3,
    "name": "goofy",
    "age": 37,
    "createdAt": "2021-09-06T12:17:46.000Z",
    "updatedAt": "2021-09-06T12:17:46.000Z"
  }
]

Waiting two seconds ...

These values are returned after an update.

Executing (default): INSERT INTO `items` (`id`,`name`,`age`,`createdAt`,`updatedAt`) VALUES (1,'mickey t. mouse',33,'2021-09-06 12:17:48','2021-09-06 12:17:48'),(2,'minerva mouse',31,'2021-09-06 12:17:48','2021-09-06 12:17:48'),(NULL,'donald duck',32,'2021-09-06 12:17:48','2021-09-06 12:17:48') ON DUPLICATE KEY UPDATE `name`=VALUES(`name`),`updatedAt`=VALUES(`updatedAt`);
[
  {
    "id": 1,
    "name": "mickey t. mouse",
    "age": 33,
    "createdAt": "2021-09-06T12:17:48.258Z",
    "updatedAt": "2021-09-06T12:17:48.258Z"
  },
  {
    "id": 2,
    "name": "minerva mouse",
    "age": 31,
    "createdAt": "2021-09-06T12:17:48.258Z",
    "updatedAt": "2021-09-06T12:17:48.258Z"
  },
  {
    "id": 8,
    "name": "donald duck",
    "age": 32,
    "createdAt": "2021-09-06T12:17:48.258Z",
    "updatedAt": "2021-09-06T12:17:48.258Z"
  }
]

These values are returned after another subsequent query.

Executing (default): SELECT `id`, `name`, `age`, `createdAt`, `updatedAt` FROM `items` AS `item`;
[
  {
    "id": 1,
    "name": "mickey t. mouse",
    "age": 32,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:48.000Z"
  },
  {
    "id": 2,
    "name": "minerva mouse",
    "age": 30,
    "createdAt": "2021-09-06T12:17:44.000Z",
    "updatedAt": "2021-09-06T12:17:48.000Z"
  },
  {
    "id": 3,
    "name": "goofy",
    "age": 37,
    "createdAt": "2021-09-06T12:17:46.000Z",
    "updatedAt": "2021-09-06T12:17:46.000Z"
  },
  {
    "id": 6,
    "name": "donald duck",
    "age": 32,
    "createdAt": "2021-09-06T12:17:48.000Z",
    "updatedAt": "2021-09-06T12:17:48.000Z"
  }
]
Andrew
  • 6,144
  • 10
  • 37
  • 54
  • Thanks Andrew for the detailed example. Sequelize seems to be increasing the id field by one every time its doing an update. I think instead sequelize handle the id field I will take care of the id field by getting the record count and increment in case there are new record and keep it as it is when there is an update to any row – Siba Swain Sep 06 '21 at 05:22
  • 1
    @SibaSwain You're right Siba, I don't know how I missed that. I've updated my answer to include what you've observed about the primary keys. To be honest, I don't really have a good solution. I put a couple of ideas into the answer, though. Hope this helps. – Andrew Sep 06 '21 at 12:26