I am using a library (Apache Libcloud) to make requests to Google Storage.
Inside the library, a HEAD request is made to the URL of the object being queried in Google Storage.
You can find the code for it here; search for def get_object.
The important line is:
response = self.connection.request(object_path, method='HEAD')
According to the Google documentation of HEAD response objects, a content-length field should be part of the response:
Response
...
expires: Wed, 30 May 2018 21:41:23 GMT
date: Wed, 30 May 2018 21:41:23 GMT
cache-control: private, max-age=0
last-modified: Wed, 30 May 2018 20:36:34 GMT
etag: "2218880ef78838266ecd7d4c1b742a0e"
x-goog-generation: 1486161811706000
x-goog-metageneration: 15
x-goog-stored-content-encoding: identity
x-goog-stored-content-length: 328
content-type: image/jpg
x-goog-hash: crc32c=HBrbzQ==
x-goog-hash: md5=OCydg52+pPG1Bwawjsl7DA==
x-goog-storage-class: STANDARD
accept-ranges: bytes
content-length: 328 # <--- here
...
However, it is missing for some files (but not all files).
I do receive an x-goog-stored-content-length entry in both cases, but the library needs content-length.
The content-length header is used a bit further down the call chain in def _headers_to_object, where I just get a KeyError due to the missing header:
    def _headers_to_object(self, object_name, container, headers):
        hash = headers['etag'].replace('"', '')
        extra = {'content_type': headers['content-type'],
                 'etag': headers['etag']}
        meta_data = {}
        if 'last-modified' in headers:
            extra['last_modified'] = headers['last-modified']
        for key, value in headers.items():
            if not key.lower().startswith(self.http_vendor_prefix + '-meta-'):
                continue
            key = key.replace(self.http_vendor_prefix + '-meta-', '')
            meta_data[key] = value
        obj = Object(name=object_name, size=headers['content-length'],  # <-- here
                     hash=hash, extra=extra,
                     meta_data=meta_data,
                     container=container,
                     driver=self)
        return obj
The question is: what could I possibly have done to the file when I uploaded it that causes Google Storage not to send that header? Or is this a bug within Google Storage (I doubt it)?
Some more info:
- The content type is application/json
- This file is gzip encoded.
- It works for other gzip encoded files
- I am using the Apache Libcloud API 3.3.0
- I don't think it's a bug within libcloud since the documentation of HEAD specifies the content-length header, but it works if I overwrite _headers_to_object to use x-goog-stored-content-length.
- I am unable to reproduce this with a file that I could make public at the moment to demonstrate
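
For reference, the workaround I mention in the list above can be sketched roughly as below. This is only a minimal illustration, not libcloud's actual code: resolve_object_size is a hypothetical helper name of my own, and it assumes the headers arrive as a plain string-to-string mapping. It prefers content-length and falls back to x-goog-stored-content-length when that header is absent. Note that the stored length is the size of the object as stored, so for a gzip encoded object it would be the compressed size.

```python
from typing import Mapping, Optional


def resolve_object_size(headers: Mapping[str, str]) -> Optional[int]:
    """Return the object size in bytes, or None if no length header is present.

    Hypothetical helper (not part of libcloud): prefers 'content-length' and
    falls back to 'x-goog-stored-content-length'.
    """
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}

    for name in ("content-length", "x-goog-stored-content-length"):
        value = lowered.get(name)
        if value is not None:
            return int(value)

    # Neither header was sent; let the caller decide how to handle that.
    return None
```

Inside an overridden _headers_to_object, the size argument would then be resolve_object_size(headers) instead of headers['content-length'], which avoids the KeyError for the affected files.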