54

I have recently performed a migration to Google Cloud Platform, and I really like it.

However, I can't find a way to monitor the memory usage of the Dataproc VM instances. As you can see in the attachment, the console provides utilization info about CPU, disk and network, but not about memory.

Without knowing how much memory is being used, how is it possible to understand whether there is a need for extra memory?

[Screenshot: Dataproc VM monitoring console showing CPU, disk and network utilization, but no memory]

Daniele B
  • As far as I can tell, there is no easy way. But if you set it a bit too low, you will get an alert with a concrete recommendation. – Adam Bittlingmayer Mar 12 '18 at 07:29
  • Is Carlos's answer acceptable? If so please accept it to make it easier for people to find. – skeller88 Oct 11 '19 at 19:18
  • @skeller88 I'm not sure if it's acceptable. I got the agent but can only log CPU and disk read. Not sure if there's a missing step in his response. – probitaille Nov 08 '19 at 16:02

7 Answers

26

By installing the Stackdriver agent in GCE VMs, additional metrics like memory usage can be monitored. Stackdriver also offers alerting and notification features. Nevertheless, agent metrics are only available for premium-tier accounts.

See this answer for Dataproc VMs.
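
If the agent is installed but memory metrics still don't show up (as one of the comments below mentions), a quick sanity check is to confirm the agent is actually running on the VM. A minimal sketch, assuming the legacy stackdriver-agent package on a Debian/Ubuntu GCE instance:

# Check that the Stackdriver monitoring agent is installed and running.
sudo service stackdriver-agent status

# Restart it after any configuration change so new metrics get picked up.
sudo service stackdriver-agent restart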

Carlos
  • Do we need to _manually_ install this on VMs, or is there a way to request it be installed when VMs are created? – quickshiftin Apr 20 '18 at 15:42
  • Yes, the agent has to be manually installed in GCE instances. If you are using [GKE it can be set up](https://cloud.google.com/kubernetes-engine/docs/how-to/monitoring) while creating the cluster. – Carlos Apr 24 '18 at 22:59
  • My Stackdriver agent doesn't log VM memory, only CPU and disk read by default. Do we need to edit something to get instance memory? – probitaille Nov 08 '19 at 16:01
10

Stackdriver currently only reports RAM metrics out of the box for the E2 machine family. Other instance types such as N1, N2, ... are not covered and need the agent.

See the latest documentation for what is supported: https://cloud.google.com/monitoring/api/metrics_gcp#gcp-compute

[Screenshot: Stackdriver metrics list]
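
For reference, a rough sketch of pulling that built-in E2 memory metric from the Monitoring API; the project ID and time range are placeholders, and the metric name assumes an E2 instance (no agent required):

# PROJECT_ID and the interval are placeholders; adjust them for your project.
PROJECT_ID=my-project
curl -s -G "https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/timeSeries" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type="compute.googleapis.com/instance/memory/balloon/ram_used" resource.type="gce_instance"' \
  --data-urlencode 'interval.startTime=2021-01-01T00:00:00Z' \
  --data-urlencode 'interval.endTime=2021-01-01T01:00:00Z'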

sketchthat
user2314327
7

Well, you can use the /proc/meminfo virtual file system to get information on current memory usage. You can create a simple bash script that reads the memory usage from /proc/meminfo, run it periodically as a cron job, and have it send an alert email if the memory usage exceeds a given threshold (see the sketch below the link).

See this link: https://pakjiddat.netlify.app/posts/monitoring-cpu-and-memory-usage-on-linux
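
As a rough illustration (not taken from the linked article), a minimal sketch of such a script; the threshold, recipient address and the use of `mail` are assumptions and would need adapting:

#!/bin/bash
# check_mem.sh - send an alert email when memory usage exceeds a threshold.
# THRESHOLD and ADMIN_EMAIL are placeholders; 'mail' must be configured (e.g. via mailutils).
THRESHOLD=90
ADMIN_EMAIL="admin@example.com"

# MemTotal and MemAvailable are reported in kB in /proc/meminfo.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
used_pct=$(( (total_kb - avail_kb) * 100 / total_kb ))

if [ "$used_pct" -ge "$THRESHOLD" ]; then
    echo "Memory usage on $(hostname) is ${used_pct}%" | mail -s "High memory usage" "$ADMIN_EMAIL"
fi

It could then be scheduled with cron, e.g. */5 * * * * /usr/local/bin/check_mem.sh to run every five minutes.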

Nadir Latif
4

This is the most up-to-date answer here.

How to see memory usage in GCP?

  1. Install the agent on your virtual machine. Takes less than 5 minutes.
curl -sSO https://dl.google.com/cloudagents/add-monitoring-agent-repo.sh
sudo bash add-monitoring-agent-repo.sh
sudo apt-get update
sudo apt-get install stackdriver-agent

The code snippet should install the most recent version of the agent, but for an up-to-date guide you can always refer to https://cloud.google.com/monitoring/agent/installation#joint-install.

  2. After it's installed, within a minute or two you should see the additional metrics in the Monitoring section of GCP: https://console.cloud.google.com/monitoring

[Screenshot: where to find memory metrics in the GCP Monitoring UI]

Explanation: why is it invisible by default?

Metrics (such as CPU usage or memory usage) can be collected in different places. For instance, CPU usage is a piece of information that the host (the machine with special software running your virtual machine) can collect. With memory usage and virtual machines, though, it's the underlying operating system that manages it (the operating system of your virtual machine). The host cannot really know how much is used, because all it can see of the memory given to that virtual machine is a stream of bytes.

That's why the idea is to install agents inside that virtual machine that collect the metrics from the inside and ship them somewhere they can be interpreted. There are many types of agents available out there, but Google promotes its own - the Monitoring Agent - and it integrates well with the rest of the GCP suite.

Adam Sibik
  • Some GCP environments don't need the user to install the agent, e.g., Dataproc as shown in the question. – Dagang Feb 03 '21 at 18:17
1

The agent metrics page may be useful: https://cloud.google.com/monitoring/api/metrics_agent

You'll need to install Stackdriver. See: https://app.google.stackdriver.com/?project="your project name"

The Stackdriver metrics page will provide some guidance. You will need to change the "project name" (e.g. sinuous-dog-133823) to suit your account:

https://app.google.stackdriver.com/metrics-explorer?project=sinuous-dog-133823&timeSelection={"timeRange":"6h"}&xyChart={"dataSets":[{"timeSeriesFilter":{"filter":"metric.type=\"agent.googleapis.com/memory/bytes_used\" resource.type=\"gce_instance\"","perSeriesAligner":"ALIGN_MEAN","crossSeriesReducer":"REDUCE_NONE","secondaryCrossSeriesReducer":"REDUCE_NONE","minAlignmentPeriod":"60s","groupByFields":[],"unitOverride":"By"},"targetAxis":"Y1","plotType":"LINE"}],"options":{"mode":"COLOR"},"constantLines":[],"timeshiftDuration":"0s","y1Axis":{"label":"y1Axis","scale":"LINEAR"}}&isAutoRefresh=true

This REST call will get you the memory usage. You will need to modify the parameters to suit your project name (e.g. sinuous-dog-133823) and other params to suit your needs.

GET /v3/projects/sinuous-cat-233823/timeSeries?filter=metric.type="agent.googleapis.com/memory/bytes_used" resource.type="gce_instance"& aggregation.crossSeriesReducer=REDUCE_NONE& aggregation.alignmentPeriod=+60s& aggregation.perSeriesAligner=ALIGN_MEAN& secondaryAggregation.crossSeriesReducer=REDUCE_NONE& interval.startTime=2019-03-06T20:40:00Z& interval.endTime=2019-03-07T02:51:00Z& $unique=gc673 HTTP/1.1
Host: content-monitoring.googleapis.com
authorization: Bearer <your token>
cache-control: no-cache
Postman-Token: 039cabab-356e-4ee4-99c4-d9f4685a7bb2
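
As a rough equivalent outside Postman, the same request could be issued with curl; the project ID, time range and token handling below are placeholders taken from the example above:

# Placeholder project ID and interval; the filter matches the agent's memory/bytes_used metric.
PROJECT_ID=sinuous-dog-133823
curl -s -G "https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/timeSeries" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type="agent.googleapis.com/memory/bytes_used" resource.type="gce_instance"' \
  --data-urlencode 'aggregation.alignmentPeriod=60s' \
  --data-urlencode 'aggregation.perSeriesAligner=ALIGN_MEAN' \
  --data-urlencode 'interval.startTime=2019-03-06T20:40:00Z' \
  --data-urlencode 'interval.endTime=2019-03-07T02:51:00Z'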
StvnBrkdll
0

VM memory metrics are not available by default; they require the Cloud Monitoring agent.

The UI you are showing is Dataproc, which already has the agent installed but disabled by default, so you don't have to reinstall it. To enable the Cloud Monitoring agent for Dataproc clusters, set --properties dataproc:dataproc.monitoring.stackdriver.enable=true when creating the cluster (see the sketch below). Then you can monitor VM memory and create alerts in the Cloud Monitoring UI (not integrated with the Dataproc UI yet).
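
For illustration, a minimal sketch of creating such a cluster with gcloud; the cluster name and region are placeholders:

# Hypothetical cluster name and region; the property enables the monitoring agent on the cluster VMs.
gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --properties=dataproc:dataproc.monitoring.stackdriver.enable=true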

Also see this related question: Dataproc VM memory and local disk usage metrics

Dagang
-2

This article is now out of date, as Stackdriver is now a legacy agent. It has been replaced by the Ops Agent. Please read the latest articles on GCP about migrating to the Ops Agent.
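
For reference, a minimal sketch of installing the Ops Agent on a Linux VM, based on Google's published install script (assumes a supported Linux distribution; check the migration guide for specifics):

# Download and run Google's Ops Agent install script (replaces the legacy Stackdriver agents).
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install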

WDAdmin
  • This looks more like a comment rather than a new answer. – Eric Aya May 09 '22 at 14:00
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community May 11 '22 at 05:41