Background: I'm taking data in my Python/AppEngine project and creating a .tsv file so that I can create charts with d3.js. Right now I'm writing the CSV with each page load; I want to instead store the file once in Google Cloud Storage and read it from there.

Here is how I'm currently writing the file, on every page load:

def get(self):  ## this gets called when loading myfile.tsv from d3.js
    datalist = MyEntity.all()
    self.response.headers['Content-Type'] = 'text/csv'
    writer = csv.writer(self.response.out, delimiter='\t')
    writer.writerow(['field1', 'field2'])
    for eachco in datalist:
        writer.writerow([eachco.variable1, eachco.variable2])

And while inefficient, this is working just fine.

Using this Google Cloud Storage documentation, I've been trying to get something like this working:

def get(self):
    filename = '/bucket/myfile.tsv'
    datalist = MyEntity.all()
    bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    writer = csv.writer(self.response.out, delimiter='\t')
    gcs_file = gcs.open(filename, 'w', content_type='text/csv', retry_params=write_retry_params)
    gcs_file.write(writer.writerow(['field1', 'field2']))
    for eachco in datalist:
        gcs_file.write(writer.writerow([eachco.variable1, eachco.variable2]))
    gcs_file.close()

But I'm getting:

TypeError: Expected str but got <type 'NoneType'>.

I thought that the output of csv.writer would be a string, so I'm not sure why I'm getting the TypeError.

I can think of two possibilities:

  1. I've got something wrong in the code that writes the TSV to Cloud Storage. It should be simple to iterate through the data and write a TSV/CSV file to Cloud Storage though, right?
  2. I've gone about this the wrong way entirely, and should maybe use BlobStore or db.TextProperty() to store this .tsv data instead. (The files aren't that big; definitely well under 1MB.)

I'd appreciate any help!

edit - full traceback

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1530, in __call__
    rv = self.router.dispatch(request, response)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "/mydirectory/myapp/handlers.py", line 21, in dispatch
    webapp2.RequestHandler.dispatch(self)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/mydirectory/myapp/thisapp.py", line 384, in get
    gcs_file.write(writer.writerow(['field1', 'field2']))
  File "lib/cloudstorage/storage_api.py", line 754, in write
    raise TypeError('Expected str but got %s.' % type(data))
TypeError: Expected str but got <type 'NoneType'>.
Jed Christiansen
  • you forget to set the response headers. – Avinash Raj Sep 16 '16 at 06:04
  • can you add the full traceback? – Dan Cornilescu Sep 16 '16 at 06:05
  • I don't know what you're trying to achieve with this `gcs_file.write(str(writer.writerow([eachco.variable1, eachco.variable2])))` line. – Avinash Raj Sep 16 '16 at 06:05
  • Doh! I also tried: gcs_file.write(writer.writerow([eachco.variable1, eachco.variable2])) and it didn't work. I'll edit the code above. – Jed Christiansen Sep 16 '16 at 06:10
  • Dan - added the full traceback above // Avinash - I believe the response headers are set via this: gcs_file = gcs.open(filename, 'w', content_type='text/csv', retry_params=write_retry_params) – Jed Christiansen Sep 16 '16 at 06:17
  • refer to https://stackoverflow.com/questions/43601294/write-csv-to-google-cloud-storage?noredirect=1&lq=1 – Tokci Jul 05 '21 at 16:52
  • If you search for something similar in a Google Cloud Function, see [How to open a file from google cloud storage into a cloud function](https://stackoverflow.com/a/53232676/11154841). – questionto42 Jan 31 '22 at 21:18

2 Answers

You're still attempting to create the writer on a response:

writer = csv.writer(self.response.out, delimiter='\t')

You need to write to the GCS file. Something like this:

    datalist = MyEntity.all()
    bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
    filename = os.path.join(bucket_name, 'myfile.tsv')
    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    gcs_file = gcs.open(filename, 'w', content_type='text/csv', retry_params=write_retry_params)
    writer = csv.writer(gcs_file, delimiter='\t')
    writer.writerow(['field1', 'field2'])
    for eachco in datalist:
        writer.writerow([eachco.variable1, eachco.variable2])
    gcs_file.close()

Notes:

  • not actually tested
  • I also adjusted the filename to use bucket_name
  • if you do this in the get() request, you may want to check whether the file already exists and, if so, use it; otherwise you'd still be generating it on every request. Alternatively, you could move this code into a task or into the .tsv upload handler.
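
The "generate once, reuse afterwards" idea from the last note can be sketched without App Engine. This is only an illustration in Python 3: a plain dict stands in for the bucket, and `build_tsv`, `get_or_create_tsv` and `load_rows` are hypothetical names, not part of the cloudstorage library. In the real handler the existence check and the write would go through the GCS client instead.

```python
import csv
import io


def build_tsv(rows, header=('field1', 'field2')):
    """Format the header and rows as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter='\t')
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def get_or_create_tsv(bucket, path, load_rows):
    """Return the stored TSV, generating it only if it is missing.

    `bucket` is a dict used as a stand-in for cloud storage (an assumption
    made for this sketch); `load_rows` is a callable that fetches the data.
    """
    if path not in bucket:
        bucket[path] = build_tsv(load_rows())
    return bucket[path]
```

With this shape, the expensive query behind `load_rows` only runs when the stored file is missing.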
Dan Cornilescu
  • That works! I still had to use the filename = '/bucket/myfile.tsv' line because that's the format explicitly required by GCS. (The error was ValueError: Path should have format /bucket/filename but got app_default_bucket/myfile.tsv) – Jed Christiansen Sep 16 '16 at 06:39
  • Ah, missing leading '/' I suspect. Try `filename = '/%s/myfile.tsv' % bucket_name` instead (it might not be a good idea to assume the default app bucket's name will be `bucket` in production). – Dan Cornilescu Sep 16 '16 at 06:44
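
The path format from the ValueError above (/bucket/filename) can be captured in a small helper. `gcs_path` is a hypothetical name used only for this sketch; it is not part of the cloudstorage library.

```python
def gcs_path(bucket_name, object_name):
    """Build a path of the form /bucket/filename.

    Stray slashes are stripped first so the result always has exactly one
    leading slash and one separator between bucket and object name.
    """
    return '/%s/%s' % (bucket_name.strip('/'), object_name.lstrip('/'))
```

For example, `gcs_path(bucket_name, 'myfile.tsv')` avoids hard-coding the bucket name while keeping the leading slash that `os.path.join` left out.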

The problem is that writer.writerow doesn't return the formatted row; it writes the row to the file object the writer was created with. Here the call returns None, and you are trying to write that None into gcs_file.
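
A minimal sketch of that behaviour, written for Python 3 with `io.StringIO` standing in for the GCS file (an assumption made so the snippet runs anywhere). The formatted row ends up in the file object, not in the return value of `writerow`. In Python 3 `writerow` passes back whatever the underlying `write` call returned, which is still not the row text.

```python
import csv
import io

buf = io.StringIO()  # stand-in for the object returned by gcs.open(...)
writer = csv.writer(buf, delimiter='\t')

# The row is written into buf as a side effect of this call.
result = writer.writerow(['field1', 'field2'])

# The formatted text has to be read back from the file object itself.
text = buf.getvalue()
```

So the writer should be created on the file you want the rows to land in, and the result of `writerow` should not be passed to `write`.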

Robert F.