Google App Engine: unpredictable cost and discrepancy between app engine dashboard vs billing export

Question

I have been exploring the App Engine settings for a small data science web application for two weeks. Since it is a personal project billed to my own wallet, I tried a few different parameters in app.yaml to reduce the "frontend instances" cost. Several changes in, I got an unexpected ~10x cost surge, which was painful. So that the money is not wasted, I decided to learn from it and understand the behaviour. (Don't worry, I have temporarily shut down my app.)

Version 1 app.yaml:

service: my-app
runtime: python37
instance_class: F4
env: standard
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1
default_expiration: "1m"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server

Version 1, billing result (usage.amount_in_pricing_units exported from the billing account): ~100 hr/day, the same as the Frontend Instance Hours shown in the App Engine billing status. This is understandable, because I had one F4 instance constantly running idle, which translates into 24*4=96 frontend instance hours per day. Adding the instance usage from actual requests (from me only), ~100 hr/day seems reasonable.
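The 24*4=96 arithmetic above can be written as a small helper. This is a minimal sketch, assuming the F1-equivalent multipliers (F1=1, F2=2, F4=4) implied by the pricing documentation quoted later in this thread; the function name `billed_instance_hours` is made up for illustration and is not part of any Google API.

```python
# F1-equivalent multipliers per instance class. Assumption: these match the
# App Engine standard pricing page (an F4 hour bills as four F1 hours).
CLASS_MULTIPLIER = {"F1": 1, "F2": 2, "F4": 4}


def billed_instance_hours(instance_class, instances, wall_hours):
    """Return frontend instance hours billed at the F1 rate."""
    return CLASS_MULTIPLIER[instance_class] * instances * wall_hours


# One idle F4 instance kept alive for a full day (Version 1).
print(billed_instance_hours("F4", 1, 24))  # 96

# One F2 instance kept alive for a full day (the intent of Version 2).
print(billed_instance_hours("F2", 1, 24))  # 48
```

Under these assumptions, a single always-on F2 instance should bill about 48 hours per day, which is the baseline to compare the Version 2 numbers against.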

In Version 2, I intended to lower the instance class and the number of instances. I also lengthened default_expiration, hoping it would help the app start quicker, and changed a few other settings that I thought wouldn't matter much:

service: my-app
runtime: python37
instance_class: F2
env: standard
automatic_scaling:
  min_instances: 1
  max_instances: 1
  target_cpu_utilization: 0.85
  max_concurrent_requests: 80
  max_pending_latency: 6s
default_expiration: "3h"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server

Version 2, billing result (usage.amount_in_pricing_units exported from the billing account): ~800+ hr/day, ouch! In contrast, the Frontend Instance Hours in the App Engine dashboard billing status were less than 60 hr/day, as expected. This is where I got lost:

Why is the usage in the billing export so much larger than in the App Engine dashboard, and where does that usage come from?
Where can I find and track indicators of this unaccounted usage, in the App Engine dashboard or elsewhere?

Billing shows the total cost of your project. How many App Engine services are running in your project, and how many instances can you see? As mentioned in the [documentation](https://cloud.google.com/appengine/pricing): `The bill will apply the appropriate multiple of instance hours for each instance class you use. For example, if you use an F4 instance for one hour, you see "Frontend Instance" billing for four instance hours at the F1 rate.` This indicates that your project has ~16.6 F2 instances running 24 hours a day: 800 hr/day at the F1 rate equals 400 F2 hours, and 400 hr / 24 = `~16.6 instances running for 24 hours` — Jan Hernandez, Jan 13 '21 at 21:29
Thanks a lot @JanHernandez! This is the puzzling part. Whenever I checked the App Engine dashboard, there were only 1-2 instances running. The frontend instance hours shown on the App Engine dashboard never exceeded 60 hr, whereas the exported billing usage says 800 hr. — alcoholfreebear, Jan 14 '21 at 15:50
This is likely the answer. https://stackoverflow.com/questions/47125661/pricing-of-google-app-engine-flexible-env-a-500-lesson — alcoholfreebear, Jan 14 '21 at 16:38
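The estimate in the first comment can be reproduced with a few lines. This is a sketch only: it assumes every billed hour comes from F2 instances (multiplier 2) that run all day, and the helper name `equivalent_instances` is hypothetical.

```python
def equivalent_instances(billed_hours_per_day, class_multiplier):
    """Convert F1-rate billed hours per day into always-on instances."""
    return billed_hours_per_day / class_multiplier / 24


# Billing export: ~800 hr/day, all assumed to be F2 usage.
print(round(equivalent_instances(800, 2), 1))  # 16.7

# App Engine dashboard: at most ~60 hr/day.
print(round(equivalent_instances(60, 2), 2))   # 1.25
```

The dashboard figure matches the 1-2 instances that were visible, while the billing figure implies roughly 16 to 17 F2 instances running somewhere in the project (the comment truncates this to ~16.6).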

alcoholfreebear · Accepted Answer · 2021-01-31T08:21:09.853

2021-01-16 Solution for issue #1.

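While I was waiting for Google Billing Support to get back to me, I found this: Pricing of Google App Engine Flexible env, a $500 lesson (https://stackoverflow.com/questions/47125661/pricing-of-google-app-engine-flexible-env-a-500-lesson)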

Namely, past deployed versions of the app were also consuming frontend instance hours, which needed real-world confirmation. (To my surprise, this had nothing to do with the app.yaml file!) So I deleted all the past versions of the app and let it run for two days with the following app.yaml file, while observing instance hours and billing records.

service: my-app
runtime: python37
instance_class: F2
env: standard
automatic_scaling:
  min_instances: 1
  max_instances: 2
  max_idle_instances: 1
  target_cpu_utilization: 0.85
  max_concurrent_requests: 80
  max_pending_latency: 6s
default_expiration: "1m"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server

This should always keep one F2 instance running and scale up to a maximum of 2 instances. This time both the App Engine dashboard and the exported billing usage agreed on ~50 frontend instance hours per day. Yes! The daily cost was cut down to 1/16.
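The cleanup step (deleting past versions) can be sketched as follows. `gcloud app versions list` and `gcloud app versions delete` are real commands, but everything else here is an assumption for illustration: the record fields `id`, `service` and `traffic_split` are modeled on what `gcloud app versions list --format=json` is expected to return and should be checked against your own output, and the version IDs are invented. The script only builds the delete command; it does not run it.

```python
def stale_version_ids(versions, service):
    """Return IDs of versions of `service` that receive no traffic.

    `versions` is a list of dicts; the keys used here are assumed field
    names, not a documented schema.
    """
    return [
        v["id"]
        for v in versions
        if v["service"] == service and v.get("traffic_split", 0) == 0
    ]


def delete_command(service, version_ids):
    """Build (but do not execute) a gcloud command deleting the versions."""
    return ["gcloud", "app", "versions", "delete",
            "--service", service, *version_ids]


# Hypothetical records standing in for parsed `gcloud app versions list` output.
sample = [
    {"id": "20210105t101500", "service": "my-app", "traffic_split": 0.0},
    {"id": "20210110t090000", "service": "my-app", "traffic_split": 0.0},
    {"id": "20210116t120000", "service": "my-app", "traffic_split": 1.0},
]

stale = stale_version_ids(sample, "my-app")
print(" ".join(delete_command("my-app", stale)))
```

Old versions that still keep a minimum or idle instance alive would plausibly explain the extra billed hours, which is why only versions with no traffic are selected here; review the list by hand before deleting anything.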

This solves cost question #1, but #2 remains to be answered. It is very problematic that the App Engine dashboard does not show all the billed usage of frontend instances. Yesterday I heard from the Google Billing Support Team, and the answer was not helpful: it mainly discussed instance numbers in app.yaml. They seem unaware of this issue, so I will have to let them know.

2021-01-31 Follow-up on issue #2.

The Google Billing Support Team responded swiftly, acknowledged the discrepancy between the App Engine dashboard and the billing export, and agreed to adjust the billing for me. As a result, the charges for the spiky days were refunded. Kudos to them!
