Should I go with native cloud monitoring tools or leave it to third-party observability tools?
Native cloud monitoring solutions such as AWS CloudWatch and Azure Monitor are powerful tools, but whether they are enough to achieve converged observability depends on several factors.
Native cloud monitoring tools
A native cloud monitoring tool provides you with a number of benefits.
- Ease of implementation: Native cloud monitoring tools are built into the respective cloud platforms, so they are easy to implement.
- Cost: Basic services are generally included in your subscription cost. However, for a “production-grade” monitoring solution, you may incur additional costs. The extent of these costs depends on your monitoring requirements. (For example, a live log ingestion solution that keeps one year of data on faster storage with high availability will naturally come with a higher price tag than a basic system monitoring solution.)
- Monitoring dashboards: Native cloud monitoring tools provide built-in monitoring dashboards, so you do not need to install a separate console to visualize your monitoring metrics. You can find more information about AWS CloudWatch in my post on this topic.
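To make the cost point above concrete, here is a rough sketch of how ingestion volume, retention, storage tier, and high availability push the bill up. The function name, the 30-day month, and every price in it are placeholders I made up for illustration; they are not taken from any cloud vendor's price list, so plug in your own numbers.

```python
def monthly_log_cost(daily_gb, retention_days, replicas,
                     ingest_price_per_gb, storage_price_per_gb_month):
    """Rough steady-state monthly cost of a log pipeline (hypothetical model)."""
    # Ingestion is charged on everything sent in during a 30-day month.
    ingest_cost = daily_gb * 30 * ingest_price_per_gb
    # At steady state, the retained volume is daily volume times retention.
    retained_gb = daily_gb * retention_days
    # High availability is modeled as extra replicas of the retained data.
    storage_cost = retained_gb * replicas * storage_price_per_gb_month
    return ingest_cost + storage_cost


# Placeholder prices, NOT real vendor pricing.
INGEST_PRICE = 0.50          # per GB ingested
STANDARD_STORAGE = 0.03      # per GB per month
FAST_STORAGE = 0.10          # per GB per month, faster tier

# Basic system monitoring: small volume, two weeks of retention, one copy.
basic = monthly_log_cost(5, 14, 1, INGEST_PRICE, STANDARD_STORAGE)

# Live log ingestion: larger volume, one year on fast storage, two copies.
production = monthly_log_cost(50, 365, 2, INGEST_PRICE, FAST_STORAGE)

print(f"basic: {basic:.2f} per month, production: {production:.2f} per month")
```

Even with made-up prices, the shape of the result is the useful part: retention and replication multiply the storage term, so they tend to dominate the bill long before ingestion does.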
At the same time, native cloud monitoring tools work within a boundary. They are not built to be a full observability platform, so there are some challenges as well.
- Technical boundary: Native cloud monitoring tools offer the features promised by the cloud vendor. For day-to-day SRE responsibilities, you may need more: multi-cloud monitoring, hybrid-infrastructure monitoring, custom application monitoring, integration with x or y, and so on. The native tool will not provide all of these.
- Depth of monitoring: Recently someone told me that they received an alert for high memory utilization. The alert did not help them figure out why that specific memory area was high. They had to find and run the relevant commands themselves to get more information.
- Coverage/custom monitoring: We go to the cloud to run business services, and we may run commercial off-the-shelf (COTS) products there. Many such products are not monitored by native cloud monitoring tools: CRM tools, ERP applications, call center applications, virtual desktops, etc.
- Not completely free: As I said above, there is a free tier, but production-grade monitoring may come at a cost. Sophisticated MELT (metrics, events, logs, and traces) monitoring or observability will cost you more.
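To illustrate the depth-of-monitoring point, here is a minimal sketch of the kind of follow-up someone ends up doing by hand after a bare "high memory" alert: listing which processes hold the most resident memory. It assumes a Linux-like `ps` that accepts `-A -o pid,rss,comm` and reports RSS in KiB; the function names and the sample output are hypothetical, and this is not what any particular monitoring tool runs.

```python
import subprocess


def parse_ps_output(ps_output):
    """Parse `ps -A -o pid,rss,comm` text into a list of dicts."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # command names may contain spaces
        if len(parts) != 3:
            continue
        pid, rss_kib, command = parts
        if not (pid.isdigit() and rss_kib.isdigit()):
            continue
        rows.append({"pid": int(pid), "rss_kib": int(rss_kib),
                     "command": command.strip()})
    return rows


def top_memory_processes(ps_output, limit=5):
    """Return the `limit` processes with the largest resident memory."""
    rows = parse_ps_output(ps_output)
    return sorted(rows, key=lambda row: row["rss_kib"], reverse=True)[:limit]


def collect_ps_output():
    """Run ps on the local host (assumes a Linux-like ps is installed)."""
    result = subprocess.run(["ps", "-A", "-o", "pid,rss,comm"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical sample output, so the sketch can be read without running ps.
SAMPLE_PS_OUTPUT = """\
  PID    RSS COMMAND
    1   1200 systemd
  842 512000 java
  907  98000 postgres
 1033   4300 sshd
"""

for row in top_memory_processes(SAMPLE_PS_OUTPUT, limit=2):
    print(row["pid"], row["command"], row["rss_kib"], "KiB")
```

On a real host you would pass `collect_ps_output()` instead of the sample text. The point is that the alert alone did not carry this information; someone had to go and fetch it.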
Observability tools
A third-party observability vendor can help you overcome the limitations of native cloud monitoring. There are open source tools, and there are proprietary tools. I have seen people pick the right tools for their objectives and succeed in their SRE journey. Unfortunately, I have also seen people embark on ambitious, expensive SRE journeys, using different tools for different tiers of observability, and struggle to achieve results. Setting these stories aside, let’s list the pros and cons.
They definitely have advantages:
- Achieving full/converged observability: They help you achieve visibility across all the tiers of MELT. With the right data mining and dashboards, they can offer you information, not just data, which will empower you to make decisions.
- Distributed Tracing: Whether you run in one cloud, multiple clouds, or a hybrid setup, a cloud-aware observability tool (not a conventional one) should give you a unified console across all the environments.
- Customization/tailor-made: If you have a requirement, there is no guarantee that popular cloud vendors will listen to you and build it for you, even if you are willing to pay! But your observability vendor may be willing to provide you with the solution because they do not see you as a tenant; they see you as a potential account!
- Try before use: Your vendor may be willing to offer you a trial before you make the actual purchase.
All these come with their cons as well:
- Cost: First, these tools come at a cost. Unless you choose open source tools, none of them are free, and even open source tools cost something to maintain and support.
- Long purchase cycle: You may have to go through a procurement process such as an RFP, an RFQ, or another tender method.
- Expertise: Not all tools are a “one-stop solution.” Sometimes you may need to go with different expert solutions for different tiers of MELT. Correlating across all these tiers may be a challenge.
- System requirements: You may need to reserve resources such as virtual machines, databases, and storage space, and then maintain them. SaaS solutions help you avoid this.
- Security: As you install a third-party solution, you need to ensure the security settings are in alignment with your corporate policies.
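The correlation challenge mentioned under Expertise can be sketched in a few lines. Suppose logs come from one tool and traces from another: they can only be joined if both carry a shared identifier. The `trace_id` field, the function name, and the sample records below are all assumptions for illustration; real tools name and format these fields differently.

```python
from collections import defaultdict


def correlate_by_trace_id(logs, spans):
    """Group log records and trace spans that share a trace_id.

    Returns (correlated, unmatched_logs). Logs without a trace_id cannot
    be joined to any span, which is exactly the gap that appears when
    separate tools cover separate MELT tiers.
    """
    correlated = defaultdict(lambda: {"logs": [], "spans": []})
    unmatched_logs = []
    for record in logs:
        trace_id = record.get("trace_id")
        if trace_id is None:
            unmatched_logs.append(record)
        else:
            correlated[trace_id]["logs"].append(record)
    for span in spans:
        correlated[span["trace_id"]]["spans"].append(span)
    return dict(correlated), unmatched_logs


# Hypothetical records exported from two different tools.
sample_logs = [
    {"trace_id": "t-1", "message": "payment timeout"},
    {"trace_id": "t-2", "message": "order created"},
    {"message": "cron job finished"},  # no trace_id, so it stays unmatched
]
sample_spans = [
    {"trace_id": "t-1", "name": "POST /pay", "duration_ms": 5021},
    {"trace_id": "t-2", "name": "POST /orders", "duration_ms": 84},
]

correlated, unmatched = correlate_by_trace_id(sample_logs, sample_spans)
for trace_id, bundle in correlated.items():
    print(trace_id, len(bundle["logs"]), "log(s),", len(bundle["spans"]), "span(s)")
print("unmatched logs:", len(unmatched))
```

A converged tool does this join for you. With separate expert tools, you either build and maintain something like it yourself or live with the unmatched pile.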
So I am not here to provide you with a binary verdict. The best approach depends on your project deliverables and the depth of monitoring you need, so discuss it with your SRE consultant or vendor. Of course, keep the architecture of the system, the day-to-day problems you need to address, and your budget allocation at hand.
Wrap up
Native cloud monitoring tools are valuable, but they may not be enough for full observability in all cases. Consider your specific needs and explore additional tools and strategies to achieve a comprehensive understanding of your system’s health, performance, and user experience.
—
This post is written as part of the #WriteAPageADay campaign of BlogChatter.