0

I need to write a result set from MySQL ina csv format inside a bucket in Google Cloud Storage.

Following the instructions here, I created the following example code:

import cloudstorage
from google.appengine.api import app_identity
import db # My own Mysql wrapper

dump = db.get_table_dump(schema) # Here I made a simples SQL SELECT and fetchall()
bucket_name = app_identity.get_default_gcs_bucket_name()
file_name = "/" + bucket_name + "/finalfiles/" + schema + "/" +"myfile.csv"
with cloudstorage.open(file_name, "w") as gcsFile:
    gcsFile.write(dump)

It did not work 'cause write expects a string parameter and dump is tuple of tuples result from fetchall().

I can't use this approach (or similar) since I can't write files in GAE enviroment and I also can't create a CSV string from tuple like here, due to the size o my result set (Actually, I tried it and it takes too long and it timed out before finish).

So, my question is, which is the best way to get a result set from MySQL and save it as CSV in a Google Cloud Storage Bucket?

James
  • 1,653
  • 2
  • 31
  • 60

1 Answers1

1

I just went through the same problem with PHP. I ended up using the cloud sql api (https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/) with the following workflow:

  1. Create an export bucket (i.e. test-exports)
  2. Give the SQL Instance Read/Write permissions to the bucket created in step 1
  3. Within the application, make a call to instance export (https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/export). This endpoint accepts the SQL to run, as well as a path to an output bucket. (created in step (1))
  4. Step (3) will return back an operation with a 'name' property. You can use this 'name' and poll the operations/get endpoint (https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/operations/get) until the status is returned as DONE

We have a job which performs these steps nightly (as well as an import using the /import command) on 6 tables and have yet to see any issues. The only thing to keep in mind is that only one operation can run on a single database instance at a time. To combat this, you should the top item from the the operations list endpoint (https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/operations/list) to confirm the database is ready before issuing any commands.

dwelling
  • 491
  • 4
  • 5