Google Cloud (GCE) StackDriver Installation

John Contreras
Published in OriginMaster
Mar 18, 2018

Using Google Cloud VMs, known as Google Compute Engine (GCE), has been a very pleasant experience, with a few exceptions. One of those exceptions was getting StackDriver working. It turns out installation is straightforward, but it’s easy to forget a key step or to make a wrong assumption about the difference between a paid account and a premium tier account.

Let’s rewind. By default you get some very basic metrics when you deploy a new GCE instance. One of those metrics is CPU utilization. As with any VM, the next logical metric you’d want to understand is memory utilization. Unfortunately, memory utilization is not available by default and requires a separate Google Cloud product called StackDriver. StackDriver was a startup that Google acquired in 2014 (https://techcrunch.com/2014/05/07/google-acquires-cloud-monitoring-service-stackdriver/). As a result, the StackDriver experience inside of Google Cloud still feels a little disjointed, but the integration works well. Beyond capturing additional metrics, StackDriver also provides other very compelling features for logging, live debugging, tracing, and alerting based on monitoring metrics.

We run Docker containers inside of GCE, and while we can use standard Linux or Docker commands to check memory utilization at a point in time, we’d like to view trends over time and set up alerts when memory is running low.

// linux memory utilities
> top
> free
// docker memory utilities
> docker stats
> docker container stats <container_id>
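
If you want that point-in-time number in a script rather than on screen, the figures that free reports can also be read straight from /proc/meminfo. Here is a minimal sketch, assuming a Linux kernel that exposes MemAvailable (3.14 or later). The function name mem_used_percent is made up for this example and is not part of any tool mentioned here.

```shell
# Print the percentage of memory in use, based on a meminfo-style file.
# Usage: mem_used_percent [path]   (path defaults to /proc/meminfo)
# Assumption: the file has MemTotal and MemAvailable lines, both in kB.
mem_used_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      # Fail instead of printing a misleading number if a field is missing.
      if (total > 0 && avail != "") printf "%.0f\n", (total - avail) * 100 / total
      else exit 1
    }
  ' "${1:-/proc/meminfo}"
}
```

Calling mem_used_percent with no argument reads the live /proc/meminfo on the VM. This still only gives a snapshot, which is exactly why we wanted something that tracks the value over time.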

We looked at running our own ELK or Prometheus installation, but the ease of getting up and running with StackDriver quickly proved to be the winning variable in this case (note: we may reconsider in the future when we have more time to re-evaluate).

Installing StackDriver

Great, let’s install StackDriver. First, you’ll need to turn StackDriver on (an additional paid service). Don’t forget to do this.

The StackDriver agent can be installed on GCE, on Amazon (AWS), and likely on other platforms. My focus here is on installing to GCE, where the instructions are very easy (full instructions here https://cloud.google.com/monitoring/agent/install-agent )

You’ll need to run these commands from the terminal on each VM you want to collect metrics from.

> curl -sSO https://dl.google.com/cloudagents/install-monitoring-agent.sh
> sudo bash install-monitoring-agent.sh

Boom! Done. Right?

Jump back over to StackDriver and try to pull those metrics up by going to Resources > Metrics Explorer.

StackDriver says displaying those metrics could take a few minutes. No problem, wait and wait, then try using the following settings.

Note: you won’t see the GCE VM instance option if you missed a step. This is a good first indication that something could be wrong.

Nope. Nothing. I get a screen that says “Chart Definition Invalid”.

What could I possibly be doing wrong? Well, folks, the documentation says it very clearly:

Premium Tier service in the Stackdriver account, which is required to use the Stackdriver Monitoring agent or AWS.

Let me emphasize. This is not just a paid account. We already had a paid account, so I assumed we were fine there. This is an additional upgrade you need to make (the upgrade button should appear near the navigation inside the StackDriver context). If you forget this step, everything will look correct, but no data will flow to your StackDriver account metrics. You MUST upgrade to the premium tier account. Once you do this, you should start seeing your metrics flow into StackDriver in near real-time.

The StackDriver Agent runs as a local service on the target machine and collects those additional metrics to submit to the StackDriver API. As long as your VM has access to those Google APIs, you should be OK. If your VMs are locked down, you may need to add the proper proxy or firewall rules. That is beyond the scope of this article, and the official StackDriver docs can hopefully help there.
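
If you suspect a locked-down VM, a quick reachability check can rule the network in or out before you dig into firewall rules. This is a rough sketch, not something from the official docs. It assumes curl is installed and that the agent talks to monitoring.googleapis.com, which is the Monitoring API host as far as I know. The function names can_reach and report_reachability are invented for this example.

```shell
# Assumed endpoint for the Monitoring API; override it if your setup differs.
MONITORING_URL="${MONITORING_URL:-https://monitoring.googleapis.com}"

# Succeeds if any HTTP response arrives from URL within TIMEOUT seconds.
# An error status such as 404 still counts: it shows the network path is open.
# Usage: can_reach URL [TIMEOUT_SECONDS]
can_reach() {
  curl --silent --output /dev/null --max-time "${2:-5}" "$1" 2>/dev/null
}

# Prints "reachable: URL" or "unreachable: URL".
# Usage: report_reachability URL [TIMEOUT_SECONDS]
report_reachability() {
  if can_reach "$1" "${2:-5}"; then
    echo "reachable: $1"
  else
    echo "unreachable: $1"
  fi
}
```

On the VM, running report_reachability "$MONITORING_URL" tells you whether the API host answers at all. To confirm the agent itself is up, sudo service stackdriver-agent status should do it, assuming the service is named stackdriver-agent on your image. A reachable API and a running agent with no metrics points back at the account tier rather than the network.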

I love Google Cloud, but understanding billing is quite tricky, so it’s also possible that we weren’t paying for StackDriver yet and had only activated/connected it to our project. I assumed we were already paying for it, since we could use StackDriver just fine for most things before the upgrade to premium tier.

During my Googling for an answer, I didn’t find too many people discussing StackDriver in general (even though it’s a fantastic tool), so hopefully this article helps at least one person save some time :)
