Contemporary monitoring
With the advent of byte code instrumentation (BCI) in the late 1990s, application performance management took a giant leap into what is known as "inside-out monitoring," that is, monitoring from inside the application. Before that, application monitoring was largely limited to tracking CPU, memory, disk, and process availability. BCI opened new possibilities for how applications could be monitored and what could be measured from an application performance perspective. Use of BCI was pioneered by Wily Technology, which was later acquired by CA (now part of Broadcom). Instrumentation support was subsequently standardized in the Java platform through JSR 163, which introduced the java.lang.instrument API. This worked well for monitoring monolithic applications.
Starting in 2010, with cloud and microservices becoming popular, applications became significantly more complex. A single request could pass through numerous services (authentication, authorization, third-party services, and so on). At the same time, a new paradigm for deploying applications emerged: teams began to deploy more frequently onto dynamic infrastructure that could scale up and down on demand.
With these changes, the traditional monitoring approach using BCI proved to be insufficient. Simply collecting more metrics only added to the noise. The volume, variety, and velocity of incoming data, popularly known as the “three Vs,” required a new approach.
What is observability?
Observability has been an IT buzzword for more than five years. According to Wikipedia, “Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. In control theory, the observability and controllability of a linear system are mathematical duals.”
Traditional monitoring helps teams answer questions about situations in which they already know what to look for, that is, known-known problems. This includes questions like “What is the CPU utilization of my process?” or “What is the latency of my request?” However, modern applications with complex request fulfillment paths touching numerous services and systems can have multiple points of failure and many unknowns. Do we have the right data and tools to ask the right questions? Is the system observable enough to answer questions like “How is my system doing overall?” and “Why is it not working?” Observability allows teams to answer these types of questions and helps them interpret the data.
Three pillars of observability
There are three primary aspects of observability:
- Logs: Logs have been around forever and record discrete events that can assist in troubleshooting. The challenge with logs is volume and context. Scoping each log entry so that it carries its context is the key to making sense of logs.
- Metrics: Typically, metrics are time-series data that users can alert on. Generally, metrics answer the "what" question. Per the Google SRE guide, latency, traffic, errors, and saturation are the four golden signals.
- Traces: Traces track a request end to end. This helps reveal key information about latencies and errors as a request crosses various system boundaries, and can help pinpoint the class and method causing the issue.
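To make the three pillars concrete, here is a minimal Python sketch of how logs, metrics, and traces for a single request can be tied together with a shared trace ID. Everything in it is an assumption for illustration: the in-memory `LOGS`, `METRICS`, and `SPANS` stores, the helper names, and the `gateway`, `auth`, and `payment` services are hypothetical and stand in for real backends and applications. It covers three of the four golden signals (latency, traffic, errors) and leaves out saturation.

```python
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory stores standing in for real log, metric, and trace backends.
LOGS = []                      # structured log events
METRICS = defaultdict(list)    # metric name -> list of (timestamp, value) samples
SPANS = []                     # finished trace spans


def log(trace_id, service, message):
    """Record a log event scoped with the trace ID so it can be found in context."""
    LOGS.append({"ts": time.time(), "trace_id": trace_id,
                 "service": service, "message": message})


def record_metric(name, value):
    """Append one time-series sample for the named metric."""
    METRICS[name].append((time.time(), value))


@contextmanager
def span(trace_id, service, operation):
    """Time one hop of a request and record it as a span, noting any error."""
    start = time.perf_counter()
    error = None
    try:
        yield
    except Exception as exc:
        error = repr(exc)
        log(trace_id, service, f"{operation} failed: {exc}")
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        SPANS.append({"trace_id": trace_id, "service": service,
                      "operation": operation, "duration_ms": duration_ms,
                      "error": error})
        record_metric(f"{service}.latency_ms", duration_ms)  # latency
        record_metric(f"{service}.requests", 1)              # traffic
        if error:
            record_metric(f"{service}.errors", 1)            # errors


def handle_request(fail_payment=False):
    """Simulate one request crossing three hypothetical services."""
    trace_id = uuid.uuid4().hex
    try:
        with span(trace_id, "gateway", "POST /checkout"):
            with span(trace_id, "auth", "verify_token"):
                log(trace_id, "auth", "token verified")
            with span(trace_id, "payment", "charge"):
                if fail_payment:
                    raise RuntimeError("card declined")
                log(trace_id, "payment", "charge accepted")
    except RuntimeError:
        pass  # the failure is already captured in logs, metrics, and spans
    return trace_id


ok_id = handle_request()
bad_id = handle_request(fail_payment=True)

# Metrics answer "what": how many payment errors occurred?
payment_errors = sum(value for _, value in METRICS["payment.errors"])
# Traces answer "where": which hops of the failed request reported an error?
failed_hops = [s["service"] for s in SPANS if s["trace_id"] == bad_id and s["error"]]
# Logs scoped by trace ID supply the context for that one request.
bad_logs = [e["message"] for e in LOGS if e["trace_id"] == bad_id]
```

In this sketch the metric only says that a payment error happened, the spans show that it originated in the `payment` hop and propagated to `gateway`, and the trace-scoped log lines explain why. A real deployment would typically use an instrumentation library rather than hand-rolled helpers like these.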
Observability made easy with AIOps from Broadcom
AIOps and observability technologies from Broadcom provide full stack monitoring and observability of application, infrastructure, and network data. This solution can pull data from Broadcom monitoring solutions as well as third-party tools and open-source technologies. The solution can correlate different entities from these data sources and show metrics, logs, and traces in context and in a single pane of glass. This helps reveal some of the known problems and provides enough information to investigate the unknowns.
Srikant Noorani
Srikant Noorani, Client Services Architect focusing on AIOps and Observability, has over 20 years of experience working on complex technical challenges. A hands-on architect with a passion for guiding enterprises in their digital transformation journey, Srikant has worked on the largest APM deployments plus DevOps,...