As enterprises continue their cloud and container journeys as part of modernization efforts, they are realizing that the “hybrid reality” is here to stay. For many, moving every service to the cloud or to containers is not viable, so at least some services will have to remain on premises.
This presents unique challenges and ongoing complexity for monitoring and observability. Enterprises will need to manage both services deployed in on-prem environments and any dependent services in cloud and container infrastructures.
I recently spoke with a Broadcom customer who is facing the challenges of this hybrid reality: Although many downstream services have migrated to the cloud, a number of their strategic and critical legacy apps are “stuck” in the on-prem vCenter cluster.
This customer’s challenge is common to many organizations: They remain committed to the goals of their modernization effort while also having to meet the monitoring requirements of a hybrid reality. The goals of optimizing IT resources and agility, delivering flawless user experiences, and achieving business-aware IT therefore put additional pressure on IT teams. Monitoring teams must still ensure comprehensive observability, minimize MTTR/MTTI, and produce KPIs that measure IT's alignment with the business.
DX Operational Observability (DX O2) is uniquely able to address the challenges of this complex hybrid IT landscape. The product blends traditional monitoring, full stack observability, and state-of-the-art AIOps to deliver on three important themes:
Observe with confidence: Builds on the product’s ability to aggregate performance data from both Broadcom and non-Broadcom data sources and then to normalize, correlate, and enrich that data in a unified data lake. This comprehensive coverage and analysis addresses observability gaps across the IT estate, while ensuring each persona receives the information relevant to them, with the context they need.
Connected domain intelligence: Uses logs, metrics, traces, topology, end user experience data, and other sources to stitch together both full-stack and end-to-end analysis of complex systems and services. One example, Triage Inspector, is an enhancement that marries observability and AIOps to provide a seamless triaging and troubleshooting experience. By assessing data across app, infrastructure, network, events, and logs in context, such as time and topology, Triage Inspector helps teams rise above alarm overload. It looks at the full picture of available signals and uses GenAI to present a summary of the issue, detail likely culprits, and guide users with root cause analysis and recommended next steps.
Business-aware IT: Helps IT teams prioritize alarms, emerging issues, remediation, capacity allocations, and other work based on the impact to the business. For example, service analytics capabilities of DX O2 enable organizations to dynamically create logical groups of IT elements that contribute to a specific business or IT service. Alarms, metrics, notifications, performance measures, and more are, in turn, enhanced with this understanding of services, which is shared across IT teams and domains. In addition, each team can understand the health, performance, and availability of the service and drill down to information specific to their domain.
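To make the idea of service-aware prioritization concrete, here is a minimal Python sketch. It is not the DX O2 API or data model; the `Service` and `Alarm` classes, the tag-based grouping rule, and the numeric criticality and severity scales are all assumptions made for illustration. It groups elements into services by tag and ranks alarms by the criticality of the service they affect before considering raw severity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    criticality: int   # hypothetical scale: higher means more business impact
    tags: frozenset    # an element carrying any of these tags belongs to the service


@dataclass(frozen=True)
class Alarm:
    element: str
    element_tags: frozenset
    severity: int      # hypothetical scale: 1 (info) to 5 (critical)


def services_for(alarm, services):
    """Return the services whose tag rule matches the alarmed element."""
    return [s for s in services if s.tags & alarm.element_tags]


def prioritize(alarms, services):
    """Sort alarms by business impact first, then by raw severity."""
    def impact(alarm):
        matched = services_for(alarm, services)
        top = max((s.criticality for s in matched), default=0)
        return (top, alarm.severity)
    return sorted(alarms, key=impact, reverse=True)


# Illustrative data only: one on-prem VM, one cloud pod, one untagged lab VM.
services = [
    Service("payments", criticality=5, tags=frozenset({"app:payments"})),
    Service("intranet", criticality=1, tags=frozenset({"app:wiki"})),
]
alarms = [
    Alarm("vm-wiki-01", frozenset({"app:wiki", "env:onprem"}), severity=5),
    Alarm("pod-pay-7", frozenset({"app:payments", "env:cloud"}), severity=3),
    Alarm("vm-lab-02", frozenset({"env:onprem"}), severity=4),
]
ranked = [a.element for a in prioritize(alarms, services)]
```

In this sketch the moderate-severity alarm on the payments pod ranks ahead of the critical alarm on the wiki VM, because the service it belongs to matters more to the business.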
Now, back to the customer situation. This customer has a fair number of business-critical applications whose services are split between the on-prem vCenter data center and microservices deployed in a public cloud infrastructure. To meet their organizational modernization goal, they use DX O2 for both the vCenter and cloud-native environments.
The screenshot below shows a snapshot of the solution’s complete, end-to-end observability, from the legacy app to data center and beyond, out-of-the-box.
This is possible because DX O2 can ingest data from any source and then normalize, correlate, analyze, and enrich it. With the aggregated data available, the customer began leveraging the following capabilities of DX O2.
Service-based triage and service-based monitoring: This allows the customer to organize the IT environment and prioritize operational effort based on critical business services. IT can readily see the business impact of an issue, instead of just reacting to a server alert. The service-centric view also streamlines root cause analysis and helps the right ticket reach the right team on time, which in turn reduces MTTR and MTTI. In this way, the service-centric view brings to fruition the “observe with confidence” and “business-aware IT” pillars described above.
Service view
Service dependency (app to infrastructure)
Service detail view
Capacity planning: This is a key use case and a regulatory requirement for this customer, who needs to understand how resources are used from both CapEx and resource utilization perspectives. They need to avoid over-provisioning while still ensuring they have sufficient capacity to meet demand. By leveraging Capacity Analytics within DX O2, IT teams are able to make informed, data-driven decisions that align with business needs.
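As a rough illustration of the kind of question capacity planning answers, the Python sketch below fits a straight line to daily utilization samples and projects how many days remain before a threshold is crossed. This is a generic least-squares trend, not a description of how Capacity Analytics works; the function name, the 80% threshold, and the sample values are assumptions made for the example.

```python
def days_until_threshold(daily_usage_pct, threshold_pct=80.0):
    """Fit a least-squares line to daily usage and project the threshold crossing.

    Returns 0.0 if the fitted usage is already at or above the threshold,
    None if usage is flat or falling, otherwise the projected number of days.
    """
    n = len(daily_usage_pct)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(daily_usage_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_usage_pct))
    slope = sxy / sxx                      # percentage points per day
    intercept = mean_y - slope * mean_x
    latest_fit = intercept + slope * (n - 1)
    if latest_fit >= threshold_pct:
        return 0.0
    if slope <= 0:
        return None
    return (threshold_pct - latest_fit) / slope


# Illustrative samples: cluster CPU usage growing two points per day.
cpu_pct = [50.0, 52.0, 54.0, 56.0, 58.0]
runway = days_until_threshold(cpu_pct, threshold_pct=80.0)
flat = days_until_threshold([40.0, 40.0, 40.0])
```

A long runway on a large cluster can hint at over-provisioning, while a short one signals that capacity should be added before demand outgrows it. A real forecast would also need to account for seasonality and planned changes, which a straight line ignores.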
Triage Inspector: One of the key strengths of DX O2 is its ability to provide a holistic view of data. In a recent incident, the provisioning of new VMs caused resource starvation that impacted a number of existing VMs and critical applications. Triage Inspector quickly identified the suspect by assessing data across layers (including signals from metrics, events, logs, and traces) in context with time and topology. Triage Inspector also used built-in GenAI to summarize the problem as readable text. This helped the teams avoid finger pointing and get quickly to the heart of the issue. Triage Inspector provides a comprehensive view and intelligent data analysis that aligns with the “connected domain intelligence” pillar.
Triage Inspector summary
GenAI summarization
Logs for triage in context
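The idea of assessing signals "in context with time and topology" can be sketched in a few lines of Python. This is a simplified illustration, not Triage Inspector's actual logic: the `Signal` class, the `HOST_OF` topology map, the 15-minute window, and all element names are hypothetical. Given a symptom, it looks for change events on elements that share a host with the affected VM and that occurred shortly before the symptom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    ts: int        # seconds since an arbitrary epoch
    element: str
    kind: str      # "change" or "alarm" in this sketch
    detail: str


# Hypothetical topology: each VM maps to the host it runs on.
HOST_OF = {
    "vm-app-1": "esx-01",
    "vm-app-2": "esx-01",
    "vm-new-9": "esx-01",
    "vm-db-3": "esx-02",
}


def suspects(symptom, signals, host_of, window_s=900):
    """Return change signals on the symptom's host that precede it within the window."""
    host = host_of.get(symptom.element)
    if host is None:
        return []
    related = [
        s for s in signals
        if s.kind == "change"
        and host_of.get(s.element) == host
        and 0 <= symptom.ts - s.ts <= window_s
    ]
    # Closest in time to the symptom first.
    return sorted(related, key=lambda s: symptom.ts - s.ts)


signals = [
    Signal(1000, "vm-new-9", "change", "VM provisioned"),
    Signal(1100, "vm-db-3", "change", "config update"),
    Signal(100, "vm-app-2", "change", "patch applied"),
    Signal(1300, "vm-app-1", "alarm", "CPU ready time high"),
]
symptom = signals[-1]
found = [s.element for s in suspects(symptom, signals, HOST_OF)]
```

Here only the newly provisioned VM is returned: the database change happened on a different host, and the patch on `vm-app-2` falls outside the time window. Combining both filters is what narrows a noisy event stream down to a short list of likely culprits.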
In summary, DX O2 is designed to help organizations wherever they are in their IT modernization journey, whether embracing a hybrid model or moving entirely to cloud and containers.