
I have built an application using Express, Postgres, and Sequelize on Google App Engine and I'm having some trouble running a longer migration. This migration simply dumps the data from one of my large tables into elastic search.

As of right now, I have been running my migrations in the pre-start command as such

npm i && sequelize db:migrate

but I notice that Google App Engine has been running my migration over and over again due to the auto-scaling nature of the instances. Is there a better practice for running migrations? Is there a way to only run this migration once and prevent auto-scaling for just the pre-start command?

1 Answer

First, it is necessary to understand how App Engine handles the scaling types:

  • Automatic scaling creates instances based on request rate, response latencies, and other application metrics. You can specify thresholds for each of these metrics, as well as a minimum number of instances to keep running at all times.

  • Basic scaling creates instances when your application receives requests. Each instance will be shut down when the application becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.

  • Manual scaling specifies the number of instances that continuously run regardless of the load level. This allows tasks such as complex initializations and applications that rely on the state of the memory over time.

I recommend choosing manual scaling so you can set the specific number of instances you need, or, if you are going to use automatic scaling, pay attention to the instance limits (max/min and idle instances) and set them explicitly. However, it is up to you to choose the configuration that best suits your requirements.
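For illustration, a minimal app.yaml sketch of both options (the values are placeholders, and exact automatic-scaling key names vary between the standard and flexible environments):

```yaml
# Manual scaling keeps a fixed number of instances running, so startup
# work is not re-triggered by scale events.
runtime: nodejs
manual_scaling:
  instances: 1

# Alternatively, automatic scaling with explicit caps (hypothetical values):
# automatic_scaling:
#   min_instances: 1
#   max_instances: 3
```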

That being said, regardless of the scaling method you choose, it seems that either your script is restarted every time GAE scales, or the script itself is telling your application to repeat the process over and over. It would be useful if you shared details on how you execute your script and what it does, to get a better perspective.

A possible workaround for this task could be to port the functionality of the migration script into the body of an admin-protected handler in the GAE app, which can be triggered with an HTTP request to a particular URL.

It should also be possible to split the potentially long-running migration operation into a sequence of smaller operations (using push task queues), which is much more GAE-friendly.
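The batching idea can be sketched like this, assuming a fetchBatch(cursor, limit) that pages through your Postgres table and an indexRows() that writes to Elasticsearch (both are placeholders for your own code). In a task-queue setup, each iteration would instead enqueue a task carrying the next cursor:

```javascript
// Drain a large table in small batches so no single request runs long.
// fetchBatch must return { rows, nextCursor }; nextCursor === null means done.
async function migrateInBatches(fetchBatch, indexRows, batchSize = 500) {
  let cursor = null;
  let total = 0;
  for (;;) {
    const { rows, nextCursor } = await fetchBatch(cursor, batchSize);
    if (rows.length === 0) break;
    await indexRows(rows); // e.g. an Elasticsearch bulk insert
    total += rows.length;
    cursor = nextCursor;
    if (nextCursor == null) break;
  }
  return total;
}
```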

Also, I suggest you take a look at this thread.

Since I understand you want to migrate your data from PostgreSQL into an Elasticsearch index, I found this tutorial which recommends creating a CSV file from your PostgreSQL database and then converting the data from CSV to JSON, because the Elasticdump tool can load your JSON file as Elasticsearch documents. These steps are in Node.js, so you can create a script on App Engine or in Cloud Functions (depending on the data size) and execute the import, for example:

# import
node_modules/elasticdump/bin/elasticdump --input=formatted.json --output=http://localhost:9200/
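To produce that input file, your parsed CSV rows need to become newline-delimited JSON documents of the shape elasticdump reads (a sketch; the index name and id field are assumptions, and required metadata fields can vary by elasticdump/Elasticsearch version):

```javascript
// Turn plain row objects (e.g. parsed from the CSV export) into the
// newline-delimited document format consumed by elasticdump.
function toElasticdumpLines(rows, index, idField = 'id') {
  return rows
    .map(row => JSON.stringify({
      _index: index,
      _id: String(row[idField]),
      _source: row,
    }))
    .join('\n');
}
```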