I am using a library (Apache Libcloud) to make requests to Google Storage. Inside the library, a HEAD request is made to the URL of the object being queried. The code is in the library's get_object method (search for def get_object). The important line is

response = self.connection.request(object_path, method='HEAD')

According to the Google documentation for HEAD Object responses, a content-length header should be part of the response:

Response

...
expires: Wed, 30 May 2018 21:41:23 GMT
date: Wed, 30 May 2018 21:41:23 GMT
cache-control: private, max-age=0
last-modified: Wed, 30 May 2018 20:36:34 GMT
etag: "2218880ef78838266ecd7d4c1b742a0e"
x-goog-generation: 1486161811706000
x-goog-metageneration: 15
x-goog-stored-content-encoding: identity
x-goog-stored-content-length: 328
content-type: image/jpg
x-goog-hash: crc32c=HBrbzQ==
x-goog-hash: md5=OCydg52+pPG1Bwawjsl7DA==
x-goog-storage-class: STANDARD
accept-ranges: bytes
content-length: 328  # <--- here
...

However, it is missing for some files (but not all). I do receive an x-goog-stored-content-length header in both cases, but the library needs content-length.

The content-length header is used a bit further down the call chain in def _headers_to_object, where I just get a KeyError due to the missing header:

    def _headers_to_object(self, object_name, container, headers):
        hash = headers['etag'].replace('"', '')
        extra = {'content_type': headers['content-type'],
                 'etag': headers['etag']}
        meta_data = {}

        if 'last-modified' in headers:
            extra['last_modified'] = headers['last-modified']

        for key, value in headers.items():
            if not key.lower().startswith(self.http_vendor_prefix + '-meta-'):
                continue

            key = key.replace(self.http_vendor_prefix + '-meta-', '')
            meta_data[key] = value

        obj = Object(name=object_name, size=headers['content-length'],  # <-- here
                     hash=hash, extra=extra,
                     meta_data=meta_data,
                     container=container,
                     driver=self)
        return obj

The question is: what could I have done to the file when I uploaded it that causes Google Storage to omit that header? Or is this a bug in Google Storage (I doubt it)?

Some more info:

  • The content type is application/json
  • This file is gzip encoded.
  • It works for other gzip encoded files
  • I am using the Apache Libcloud API 3.3.0
  • I don't think it's a bug within libcloud since the documentation of HEAD specifies the content-length header, but it works if I overwrite _headers_to_object to use x-goog-stored-content-length.
  • I am unable to reproduce this with a file that I could make public at the moment to demonstrate
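
The workaround mentioned in the list above can be sketched as a small fallback helper. This is only a sketch under assumptions: object_size_from_headers is a hypothetical name, not part of libcloud, and it assumes that x-goog-stored-content-length is an acceptable substitute when content-length is absent. For gzip-encoded objects the stored length is the size of the compressed bytes.

```python
def object_size_from_headers(headers):
    """Return the object size in bytes from a HEAD response header mapping.

    Hypothetical helper (not libcloud API): prefers content-length and
    falls back to x-goog-stored-content-length when it is missing.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}

    for name in ("content-length", "x-goog-stored-content-length"):
        if name in lowered:
            return int(lowered[name])

    # Mirror the original failure mode, but with a clearer message.
    raise KeyError(
        "neither content-length nor x-goog-stored-content-length is present"
    )
```

An overridden _headers_to_object could then pass size=object_size_from_headers(headers) instead of indexing headers['content-length'] directly.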
RunOrVeith
  • I found that this is the case when a file is not public, but gzip encoded and we send a request with `Accept-Encoding: gzip`. I opened issues in [libcloud](https://github.com/apache/libcloud/issues/1544) and [google storage](https://issuetracker.google.com/issues/177896087) for it – RunOrVeith Jan 19 '21 at 18:01
  • Was this fixed? – Jason V Feb 27 '23 at 05:16
  • No, they're saying it's even a bug that it sometimes works and it should be removed in all cases where compressive encoding is used (see last answer to the google ticket linked above). There is no ETA for that. Within libcloud I created a workaround, so there it should be fixed. – RunOrVeith Feb 27 '23 at 10:07
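
The condition described in the comments above could be checked by sending two HEAD requests that differ only in Accept-Encoding and comparing the response headers. The following is a rough sketch using only the standard library; build_head_request and head_headers are hypothetical helper names, and the bearer token is an assumption about how the non-public object is authorized.

```python
import urllib.request


def build_head_request(url, accept_gzip, token=None):
    """Build (but do not send) a HEAD request for the given object URL."""
    headers = {"Accept-Encoding": "gzip" if accept_gzip else "identity"}
    if token is not None:
        # Assumption: the object is private and an OAuth 2.0 bearer token is used.
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(url, method="HEAD", headers=headers)


def head_headers(request):
    """Send the request and return the response headers with lowercase names."""
    with urllib.request.urlopen(request) as response:
        return {key.lower(): value for key, value in response.headers.items()}
```

Calling head_headers on both variants and checking whether "content-length" is in each result shows whether the header disappears only when gzip is accepted.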

1 Answer

I also needed this when integrating with a third-party provider that does not accept gzip-encoded media URLs.

I ended up solving it by creating a "proxy API route" on my backend that adds a "content-length" response header, based on the "x-goog-stored-content-length" header from Google Cloud Storage.

Example (Node.js):

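const axios = require("axios");

// Route handler body: `mediaUrl` is the Cloud Storage object URL and
// `res` is the Node.js (or Express) response object.
async function proxyMedia(mediaUrl, res) {
  // Fetch the object as a stream so it is not buffered in memory.
  const fetchMediaResponse = await axios({
    method: "get",
    url: mediaUrl,
    responseType: "stream"
  });

  const contentType = fetchMediaResponse.headers["content-type"];
  // Size of the object as stored in Cloud Storage; it only matches the
  // streamed body if the bytes are forwarded exactly as stored.
  const contentLength =
    fetchMediaResponse.headers["x-goog-stored-content-length"];

  res.writeHead(200, {
    "content-type": contentType,
    "content-length": contentLength
  });

  return fetchMediaResponse.data.pipe(res);
}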
feletodev