
Cloud Insights or Native Tooling? How to Best Monitor Your Full Stack

There was once a time when a service was delivered to an end-user with only a few components. You probably had some storage that was somehow attached to a server, and an application on that server that represented the entire service. Managing this environment using element managers was probably okay because you had entirely different teams managing each one of these components.

Today, that setup couldn’t be further from reality. A service is typically delivered to an end-user through a multitude of microservices running on top of clustered resources, possibly across multiple clouds with some on-premises storage sprinkled in, too. This level of complexity means that supporting the service using native element management tooling involves consulting many such tools along the way.

Changes in the industry and the tooling that supports it mean that using disparate element managers isn’t as daunting a proposition as it once could have been, though. Cloud service providers offer an extensive toolset for first-party and select third-party services, and there’s a rich ecosystem of open source tooling that can be tailored to your exact requirements for any component of your service stack. If there’s a metric that exists for a service or resource, no matter how low-level, you can bet there’s a tool out there that will collect and visualize data and provide you with alerts. Such a system can be invaluable for deep-dive troubleshooting.

Hopefully, however, you aren’t performing deep-dive troubleshooting on a regular basis. If you want an overall view of how your service is performing, a sprawl of tooling can present a challenge. Most of the time (for day-to-day operations, triage, and troubleshooting) you only need a centralized view of the entire stack, with alerting and visualizations for key metrics.

With Cloud Insights, we’ve developed just that: a centralized view of key signals across infrastructure resources and clouds, key indicators for the services that utilize those resources, and end-user experience KPIs.

But What If I Only Use One Cloud?

When discussing multi-cloud management with various organizations, I hear this regularly: I only use one cloud, so why would I want a tool for multi-cloud management?

What I find interesting is that once I start scratching at the surface, this perception often isn’t entirely accurate. Many organizations, of course, have formal commitments and preferences for one cloud provider over another—and yours may be one of these—but the reality so often differs from what’s on paper.

Shadow IT consists of users who pick the providers they prefer, with the services that they believe most closely suit their applications, and who often go about their business entirely under the radar. It’s easier than ever for users to operate like this because there’s no physical infrastructure to acquire and no big up-front costs to deal with. In the case of open source, there may be no direct costs whatsoever, though there is an operational cost to tooling sprawl. Shadow IT tends to happen regardless of the measures and mandates that specify the use of one provider over another. I think it’s best to ensure that these users and departments have access to a centralized tool for management oversight, to give them at least some degree of visibility.

Maybe you really do use only one cloud, and you may find that the native tooling adequately meets your current needs. But consider: will this always be the case? Will you always use the same cloud?

A Wholesale Overview of Your Services

Developing and integrating toolsets is key to efficiently managing your environment, but doing so for first-party cloud provider tooling does have the potential to bring a level of inertia to your workloads. Agility is one of the most important benefits of using the cloud, and integrating native tooling makes your cloud provider sticky: good for them, but perhaps not for you.

With a multi-cloud tool such as Cloud Insights, you can use API access to integrate once and cover any current or future cloud providers, as well as any infrastructure or services you may run in your own data centers. Cloud Insights provides normalized metrics, yielding like-for-like comparisons between providers.
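To make the idea of normalized metrics concrete, here is a minimal Python sketch. It does not use the Cloud Insights API; the provider names, field names, units, and sample values are all made up for illustration. It simply shows why converting every reading to a common unit is what makes a like-for-like comparison possible.

```python
from dataclasses import dataclass

# Hypothetical raw readings. The provider names, fields, units, and values
# are illustrative assumptions, not output from any real provider or tool.
RAW_SAMPLES = [
    {"provider": "cloud_a", "metric": "disk_latency", "value": 4.0, "unit": "ms"},
    {"provider": "cloud_b", "metric": "disk_latency", "value": 6500.0, "unit": "us"},
    {"provider": "on_prem", "metric": "disk_latency", "value": 0.003, "unit": "s"},
]

# Multipliers that convert each source unit to milliseconds.
TO_MILLISECONDS = {"s": 1000.0, "ms": 1.0, "us": 0.001}


@dataclass(frozen=True)
class NormalizedSample:
    provider: str
    metric: str
    value_ms: float


def normalize(sample: dict) -> NormalizedSample:
    """Convert one raw reading to milliseconds so providers can be compared."""
    factor = TO_MILLISECONDS[sample["unit"]]
    return NormalizedSample(
        provider=sample["provider"],
        metric=sample["metric"],
        value_ms=sample["value"] * factor,
    )


normalized = [normalize(s) for s in RAW_SAMPLES]

# With every reading in the same unit, a direct comparison is meaningful.
slowest = max(normalized, key=lambda s: s.value_ms)

if __name__ == "__main__":
    for sample in normalized:
        print(f"{sample.provider}: {sample.value_ms:.2f} ms")
    print(f"Highest latency: {slowest.provider}")
```

Compared in their raw form, these three readings would be misleading, because each is reported in a different unit. Once normalized, they can sit side by side on one dashboard.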

I don’t think there’s a case for entirely abandoning native element management tooling, though. There will always be some esoteric metrics needed for certain services under certain circumstances, and you may not be able to collect that data with a heterogeneous management tool. If you can consult such a tool for most of your troubleshooting, monitoring, and management needs, however, it will dramatically reduce the level of effort required to keep the lights on, giving you more time to develop and improve your services.

Even in cases that call for the type of granular metrics that only native tooling can provide, a full-stack view of the entire service can greatly reduce mean time to resolution (MTTR) by enabling even non-experts to identify where the issue lies, so that the experts can fix it. Your management may not know, for example, what an acceptable elastic heap usage is for a given application. They probably want to know when the end-user response times get too high, though, and if they can visualize the whole stack in one dashboard, they can easily identify where the problem lies if the KPI is the only attribute in the red.

Monitor Your Full Stack in Minutes

If you want to try it out for yourself, you can be up and running in minutes with a 14-day free trial of Cloud Insights for cross-platform monitoring and troubleshooting of your full stack.
