I support a data management solution built on Google Cloud Platform. As our product matures, more and more teams and individuals are adopting it, which means more people storing and searching for data and racking up costs. We need to better understand how much each of these users/workflows is costing us so that we can eventually start charging them for using our services.
I already have the billing data for the Google Cloud Platform project our solution runs on exported to BigQuery. I've observed that 70-80 percent of the project's Google Cloud Platform bill is attributed to App Engine (as a product), so I'm currently focusing on splitting App Engine costs. Here's a condensed view of one day's App Engine costs for the project, pulled from BigQuery (a sketch of the query I used follows the table):
| Row | product | resource_type | start_time | end_time | cost | usage_amount | usage_unit |
|-----|---------|---------------|------------|----------|------|--------------|------------|
| 1 | App Engine | Simple Searches | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.1473 | 3946.0 | requests |
| 2 | App Engine | Flex Instance RAM | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.6816 | 3.710851743744E14 | byte-seconds |
| 3 | App Engine | Search Document Storage | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.505028 | 8.0921704558464E15 | byte-seconds |
| 4 | App Engine | Code and Static File Storage | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.0 | 5.96811043008E13 | byte-seconds |
| 5 | App Engine | Datastore Entity Writes | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.085804 | 67669.0 | requests |
| 6 | App Engine | Other Search Ops | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.0 | 1732.0 | requests |
| 7 | App Engine | Out Bandwidth | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.273014 | 3.516638423E9 | bytes |
| 8 | App Engine | Datastore Read Ops | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 1.494541 | 2540902.0 | requests |
| 9 | App Engine | Search Document Indexing | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.05012 | 3.7645832E7 | bytes |
| 10 | App Engine | Datastore Storage | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 1.72891 | 2.7716055728688E16 | byte-seconds |
| 11 | App Engine | Flex Instance Core Hours | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 5.0496 | 345600.0 | seconds |
| 12 | App Engine | Task Queue Storage | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.0 | 5.14512E8 | byte-seconds |
| 13 | App Engine | Datastore Small Ops | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 0.0 | 16166.0 | requests |
| 14 | App Engine | Backend Instances | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 206.080588 | 1.4870202339153E7 | seconds |
| 15 | App Engine | Frontend Instances | 2017-08-20 07:00:00 UTC | 2017-08-20 08:00:00 UTC | 1.35596 | 198429.126958 | seconds |
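For anyone wanting to reproduce this view, it came from a query along these lines (legacy BigQuery SQL; the dataset and table names here are specific to my setup, and the column names simply mirror the headers above, so adjust for your own export schema):

```sql
-- Pull one day's App Engine rows from the billing export.
-- Dataset/table names are mine; column names mirror the table above and
-- may be nested differently (e.g. usage.amount) in your export schema.
SELECT product, resource_type, start_time, end_time,
       cost, usage_amount, usage_unit
FROM [my-data-management-solution:billing.gcp_billing_export]
WHERE product = 'App Engine'
  AND start_time >= TIMESTAMP('2017-08-20')
  AND start_time < TIMESTAMP('2017-08-21');
```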
Question 1: By the way, for anybody familiar with Google Cloud Platform billing exports: an entry with `start_time` `2017-08-20 07:00:00 UTC` and `end_time` `2017-08-20 08:00:00 UTC` reflects costs incurred on 2017-08-20, not 2017-08-19, right?
Now, I understand that associating these App Engine costs with App Engine activity is never going to be an exact mapping (Google Cloud Platform does not bill per action, and there will be fixed and, I assume, shared resource costs; please correct me if I'm wrong!), but I'd still like to get a sensible estimate. My first attempt was to check Google's logged estimated cost per request, so I created a sink for the App Engine request logs and waited for the numbers to roll in. However, the total estimated cost for all requests on a given day comes out very low with this approach:
```sql
SELECT SUM(protoPayload.cost) AS cost_total
FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820];
```
yields
| Row | cost_total |
|-----|------------|
| 1 | 3.2711573326337837 |
That accounts for barely 1.5% of the total App Engine cost (the rows above sum to roughly 217.45)!
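To start digging into where those estimates come from (and into Question 2 below), I can at least break them down by module and version. This is just a diagnostic sketch in legacy BigQuery SQL, using the `moduleId` and `versionId` fields the request log export exposes:

```sql
-- Diagnostic sketch: see where Google's per-request cost estimates
-- concentrate, grouped by App Engine module and version.
SELECT
  protoPayload.moduleId AS module,
  protoPayload.versionId AS version,
  COUNT(*) AS request_count,
  SUM(protoPayload.cost) AS est_cost
FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820]
GROUP BY module, version
ORDER BY est_cost DESC;
```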
Question 2: Which `resource_type`(s) (from the Google Cloud Platform billing export) do the request log cost estimates correspond or contribute to?
About 95% of my App Engine costs are attributed to the Backend Instances `resource_type`. I did some cursory research into what backend instances are (including this video claiming that Google was moving away from the backend/frontend instance distinction altogether). I assume (or may have read) that Google relies on undisclosed algorithms of its own to spin up, shut down, and otherwise manage these instances. As such…
Question 3 (the big question): How can I get some visibility into how much individual user/workflow actions contribute to total App Engine costs, or at minimum to App Engine Backend Instances costs, for a Google Cloud Platform project (restricting this to actions performed via App Engine is fine)? Is that possible without resorting to something like regressing costs against user activity and building an ML model? Is hoping to gain insight into how this black box works (from both the scaling and the pricing perspective), or even assuming that App Engine costs correlate somewhat directly with user activity, reasonable at all?
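To make the kind of answer I'm after concrete: the crude fallback I can think of is to distribute the billed Backend Instances cost across requests in proportion to some per-request signal from the logs, e.g. reported CPU megacycles. Here's a sketch in legacy BigQuery SQL; the choice of `megaCycles` as a proxy and of `moduleId` as the grouping key are assumptions on my part, and this ignores idle instance time entirely:

```sql
-- Rough allocation sketch: split the 206.080588 Backend Instances cost
-- (from the billing export above) by each module's share of reported CPU
-- megacycles. Assumes instance cost scales with per-request CPU and
-- ignores idle instance time, so treat the output as an estimate.
SELECT
  log.protoPayload.moduleId AS module,
  SUM(log.protoPayload.megaCycles) * 206.080588 / MAX(tot.mc_total) AS allocated_cost
FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820] AS log
CROSS JOIN (
  SELECT SUM(protoPayload.megaCycles) AS mc_total
  FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820]
) AS tot
GROUP BY module
ORDER BY allocated_cost DESC;
```

What I'd love to know is whether an allocation like this is defensible at all, or whether App Engine's instance scheduling makes it meaningless.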
Additional Information
Our data management solution uses its own concept of identity, and I'm not expecting Google to magically figure that out. I can currently link `request_log` entries to users by parsing Stackdriver logs, and I'll work out the user-workflow associations myself or get them from another tool. Just in case: is there anything that does some of this out of the box? One Stack Overflow comment mentioned Potamus, but the repository is no longer available, and there's hardly any information out there about what it did in the first place.
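For illustration, once that linkage is materialized, splitting at least the per-request cost estimates by user could look like the following (legacy BigQuery SQL; the `user_map.request_users` table and its columns are hypothetical, i.e. something I'd build from the Stackdriver log parsing, not anything that exists out of the box):

```sql
-- Hypothetical mapping table user_map.request_users(request_id, user_id)
-- built by parsing Stackdriver logs; join it to the request log and sum
-- Google's per-request cost estimates per user.
SELECT
  users.user_id AS user_id,
  COUNT(*) AS request_count,
  SUM(log.protoPayload.cost) AS est_cost
FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820] AS log
JOIN [my-data-management-solution:user_map.request_users] AS users
  ON log.protoPayload.requestId = users.request_id
GROUP BY user_id
ORDER BY est_cost DESC;
```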
If App Engine cost splitting turns out not to be a big deal, how about other products like Cloud Storage? That will be my next target, although associating Cloud Storage costs (both the actual storage costs, which are potentially negligible, and the more expensive I/O costs) with App Engine activity seems even less tractable at this point.