
I'm trying to support large file uploads for a Cloud Run (and App Engine) project. There are some constraints that prevent the usual workarounds from working:

  • The clients are .NET 4.0 applications, which means HTTP/2 is not available (HTTP/2 is what gets you around at least Cloud Run's 32 MB request size limit)
  • Legacy clients cannot be upgraded, so chunked uploads are not available to them, and backwards compatibility is a requirement
  • Signed URLs to Cloud Storage are the current solution and work well; however, some percentage of clients do not work at all because the customer's IT department has blocked googleapis.com (but not our company domain)
  • Asking the customer's IT department to unblock googleapis.com is difficult or a non-starter

This leads me to the conclusion that I should set up a forward proxy so that signed URLs can get around the IT restrictions by going through our GCP project/company domain. I would accomplish this with a Compute Engine instance running nginx or Squid (or similar), plus a load balancer that directs URLs of a certain pattern to the forward proxy, which would rewrite the URL to the correct Cloud Storage signed URL and forward the request.

However, this seems like a bit of a clunky solution. Is there something simpler, native to GCP, that accomplishes what I'm trying to do?

Gillespie
  • It sounds to me like Cloud Load Balancing with a bucket backend already covers this use case: https://cloud.google.com/load-balancing/docs/https/ext-load-balancer-backend-buckets#buckets_as_load_balancer_backends. You might also need Cloud CDN for authentication: https://cloud.google.com/cdn/docs/using-signed-urls. Not posting as an answer since I've never used either of them, so I'm not sure it covers everything. – somethingsomething Jun 07 '22 at 08:40
  • GCP's load balancer doesn't seem to allow simple proxy passes; it makes you select from a dropdown of existing backend services. In my case I got it to work by connecting to a GCE instance group that just runs nginx with a proxy_pass config. I'll post my nginx conf file as an answer in case anyone is interested. – Gillespie Jun 07 '22 at 21:00

1 Answer


I was able to proxy Cloud Storage signed URLs using nginx:

events {
  worker_connections 1024;
}

http {
  # Raise the default 1M body-size limit so large uploads aren't rejected
  client_max_body_size 500M;

  server {
    listen 80;
    listen [::]:80;

    server_name mydomain;

    location /storagepxy/ {
      # The trailing slash makes nginx replace /storagepxy/ with /
      # when forwarding, so the rest of the signed URL passes through intact
      proxy_pass https://storage.googleapis.com/;
    }
  }
}

I then set up a GCP load balancer to direct any request whose path starts with /storagepxy/ to a Compute Engine instance group running nginx with the above config.

Thus, I could read and write Cloud Storage objects using requests of the form:

GET mydomain/storagepxy/[cloud storage signeduri]
PUT mydomain/storagepxy/[cloud storage signeduri]
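The translation is just a prefix swap: replace the Cloud Storage host with the proxy host and path prefix, leaving the signed path and query string untouched. A minimal client-side sketch (here `mydomain` and `/storagepxy/` are the example names used in this answer, not fixed values):

```python
# Sketch: rewrite a Cloud Storage signed URL so it goes through the proxy.
# GCS_HOST is fixed by Google; PROXY_PREFIX is whatever your domain and
# nginx location block use ("mydomain"/"storagepxy" are this answer's examples).
GCS_HOST = "https://storage.googleapis.com/"
PROXY_PREFIX = "https://mydomain/storagepxy/"

def proxied_url(signed_url: str) -> str:
    """Swap the Cloud Storage host for the proxy host + path prefix."""
    if not signed_url.startswith(GCS_HOST):
        raise ValueError("not a storage.googleapis.com signed URL")
    # Everything after the host (bucket, object, X-Goog-* query params)
    # must pass through byte-for-byte, or the signature check fails.
    return PROXY_PREFIX + signed_url[len(GCS_HOST):]
```

For example, `proxied_url("https://storage.googleapis.com/example-bucket/cat.jpeg?X-Goog-Expires=900")` yields `https://mydomain/storagepxy/example-bucket/cat.jpeg?X-Goog-Expires=900`.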

So if you had a signed URL like:

https://storage.googleapis.com/example-bucket/cat.jpeg?X-Goog-Algorithm=
GOOG4-RSA-SHA256&X-Goog-Credential=example%40example-project.iam.gserviceaccount
.com%2F20181026%2Fus-central1%2Fstorage%2Fgoog4_request&X-Goog-Date=20181026T18
1309Z&X-Goog-Expires=900&X-Goog-SignedHeaders=host&X-Goog-Signature=247a2aa45f16
9edf4d187d54e7cc46e4731b1e6273242c4f4c39a1d2507a0e58706e25e3a85a7dbb891d62afa849
6def8e260c1db863d9ace85ff0a184b894b117fe46d1225c82f2aa19efd52cf21d3e2022b3b868dc
c1aca2741951ed5bf3bb25a34f5e9316a2841e8ff4c530b22ceaa1c5ce09c7cbb5732631510c2058
0e61723f5594de3aea497f195456a2ff2bdd0d13bad47289d8611b6f9cfeef0c46c91a455b94e90a
66924f722292d21e24d31dcfb38ce0c0f353ffa5a9756fc2a9f2b40bc2113206a81e324fc4fd6823
a29163fa845c8ae7eca1fcf6e5bb48b3200983c56c5ca81fffb151cca7402beddfc4a76b13344703
2ea7abedc098d2eb14a7

You could proxy it via:

https://mydomain/storagepxy/example-bucket/cat.jpeg?X-Goog-Algorithm=
GOOG4-RSA-SHA256&X-Goog-Credential=example%40example-project.iam.gserviceaccount
.com%2F20181026%2Fus-central1%2Fstorage%2Fgoog4_request&X-Goog-Date=20181026T18
1309Z&X-Goog-Expires=900&X-Goog-SignedHeaders=host&X-Goog-Signature=247a2aa45f16
9edf4d187d54e7cc46e4731b1e6273242c4f4c39a1d2507a0e58706e25e3a85a7dbb891d62afa849
6def8e260c1db863d9ace85ff0a184b894b117fe46d1225c82f2aa19efd52cf21d3e2022b3b868dc
c1aca2741951ed5bf3bb25a34f5e9316a2841e8ff4c530b22ceaa1c5ce09c7cbb5732631510c2058
0e61723f5594de3aea497f195456a2ff2bdd0d13bad47289d8611b6f9cfeef0c46c91a455b94e90a
66924f722292d21e24d31dcfb38ce0c0f353ffa5a9756fc2a9f2b40bc2113206a81e324fc4fd6823
a29163fa845c8ae7eca1fcf6e5bb48b3200983c56c5ca81fffb151cca7402beddfc4a76b13344703
2ea7abedc098d2eb14a7

Note: If your bucket path contains URL-encoded characters such as colons, you'll need a slightly more complicated nginx config:

# This is a simple nginx configuration file that will proxy URLs of the form:
#   https://mydomain/storagepxy/[signed uri]
# to
#   https://storage.googleapis.com/[signed uri]
#
# For use in GCP, you'll likely need to create an instance group in compute engine running nginx with this config
# and then hook up a load balancer to forward requests starting with /storagepxy to it
worker_processes auto; # Auto should spawn 1 worker per core

events {}
http {
  client_max_body_size 500M;

  server {
    listen 80; # IPv4
    listen [::]:80; # IPv6
    server_name mydomain;

    location /storagepxy/ {
      # To resolve storage.googleapis.com
      resolver 8.8.8.8;

      # We have to do it this way in case filenames have URL-encoded characters in them
      # See: https://stackoverflow.com/a/37584637
      # Also note, if the URL does not match the rewrite rules then return 400
      rewrite ^ $request_uri;
      rewrite ^/storagepxy/(.*) $1 break;
      return 400;

      proxy_pass https://storage.googleapis.com/$uri;
    }
  }
}
Gillespie