LEARNING PATH

Rapid and Accurate Isolation of Issues in Modern Networks

Learn how Network Observability by Broadcom helps customers optimize operations with Rapid and Accurate Isolation of Issues.

8 chapters to review, 2 hours estimated completion


Use Case Overview

As reported by Enterprise Management Associates, Inc. (EMA), only 6% of organizations can get useful insights from their tools, leading to longer and costlier triage and problem resolution times (MTTD/MTTI/MTTR).

The expansion of diverse networks and networking technologies has caused network operations (NetOps) teams to adopt disparate tools to manage these networks. This growth in network usage, siloed tools, and the adoption of software-defined technologies has created a dramatic proliferation in event and alarm volumes. With so many events and alarms, troubleshooting takes too much time and effort.

To date, the tools employed within many organizations support only a narrow set of products and vendors, while the number of distinct technologies in use continues to grow. As a result, NetOps teams continue to contend with lengthy triage efforts, inefficient root cause analysis, and inadequate governance of configuration changes. Organizations are therefore more exposed to outages and performance issues.

This spiraling complexity creates an increasingly untenable situation for NetOps teams and the organizations they support. After all, slow is the new down. Outages, and even slow performance, are more damaging to organizations than ever. In an always-on digital world, corporate clients and consumers are increasingly unwilling to put up with downtime or lagging performance, and today’s digitized services make it easier to switch to another vendor at any time.

The only way to combat these issues is by investing in an advanced network monitoring and management solution that provides:

- A highly scalable, unified data model. Every piece of multi-vendor network data needs to be collected by one solution. This one solution must be able to collect, normalize, and correlate disparate data sets from across the organization’s multi-vendor, -technology, and -protocol network environments. This data needs to be presented in intelligent, unified views of network health, delivering the “one source of truth” that eludes many NetOps teams today.
- Advanced analytics. Advanced and patented analytics must be applied to this collected data. Teams need analytics that correlate network fault, performance, and flow data. These analytics enable teams to uncover patterns, identify issues faster, and anticipate how changes will affect the user experience or network health.
- Intelligent triage workflows. The results of the collected data and analytics must be presented to the operator in easy-to-understand troubleshooting workflows. The solution must minimize alarm noise, so NetOps teams can quickly diagnose issues and get to the root cause. This solution must also enable teams to quickly dive into a specific technology domain to get the details required.

These three areas do not work in isolation but together provide the observability needed to manage today’s complex networks. For example, if you need to look at the logs on a specific device you are troubleshooting, you should not have to connect to that device to get your answers. An advanced network monitoring and management solution should have already collected those logs, analyzed those logs, found any patterns or anomalies, and revealed the root cause in easy triage workflows. Further, these workflows should be integrated with standard operating procedures that minimize the number of clicks needed to resolve issues.

To learn more about how Network Observability by Broadcom helps with the rapid and accurate isolation of issues, read the white paper: Rapid and Accurate Isolation of Issues in Modern Networks.

The Network Observability by Broadcom Approach

Network Observability by Broadcom is the solution that brings together active and passive monitoring approaches to provide an environment suitable for rapid and accurate isolation of errors that impact the business network environment. The solution offers comprehensive abilities for customization to accommodate unique and diverse technologies, environments, and performance criteria. The solution combines the multi-technology automated discovery capabilities for the internally managed network provided by DX NetOps with the active external network monitoring provided by AppNeta.

The DX NetOps platform consists of a set of software components that are deployed on-premises or in a hybrid cloud and then integrated to power the different DX NetOps capabilities.

The DX NetOps platform consists of the following software components:

- DX NetOps Performance Management provides a unified portal across the network landscape and gathers SNMP performance statistics from devices and controllers.
- DX NetOps Spectrum provides fault monitoring, proactive network change management, fault isolation, and root cause analysis.
- DX NetOps Virtual Network Assurance (VNA) provides modern network monitoring to collect data from SDN and NFV controllers and orchestrators.
- NetOps Flow provides flow monitoring to capture flow exports produced in various flow protocol definitions.

AppNeta is a SaaS platform, provisioned by Broadcom, that utilizes strategically installed Monitoring Points to monitor the different aspects of the network and provide data to the AppNeta UI and, when integrated, to the DX NetOps Portal.

Unified Data Model

NetOps teams need various types of data to intelligently manage modern, multi-vendor networks, including logs, flow information, configuration details, and more. Too often, this data is held in disparate locations and managed by different tools. This means each team needs to toggle between its unique set of tools and multiple interfaces when issues arise, which leads to inefficiency and complexity, and ultimately slows triage.

NetOps teams today need a solution that is:
- Flexible and network-agnostic. Understand complex properties and relationships within and across multiple data types, including logical and physical networks and the large volume and variety of data consumed and generated by monitoring and SDN systems.
- Post-deployment-extensible. Future-proof the organization by enabling new technologies and data types to be quickly incorporated within the model without enforcing the need to manually identify and tag elements. As networks and technologies evolve, our unified data model is prepared to capture and correlate new data types.
- Multi-dimensional. Reconcile elements from multiple layers and with multiple facets into unified models with relationships and highly contextual data. Every new layer of information provided by a data source increases understanding and enables the users to consume relevant data from multiple dimensions in a single pane of glass.

Advanced Analytics

Diagnosing network issues and managing performance continues to get more complex. For teams short on resources and expertise, locating the source of network outages and performance issues is increasingly akin to finding a needle in a large, constantly shifting haystack.

NetOps teams use a patchwork of scripts and isolated tools. This scenario intensifies the administrative burden and leads to lengthy troubleshooting efforts. Isolating the root cause of issues in modern networks requires advanced analytics that deliver the insights needed for fast, accurate identification of network errors and performance issues.

Intelligent Triage Workflows

NetOps teams face a skill shortage. This is often exacerbated by executive leadership's preference to keep teams and operating budgets lean. Consequently, training and development resources are scarce, and internal staff are relatively inexperienced. For these reasons, ensuring teams stay current with advanced, rapidly evolving technologies and best practices is becoming increasingly difficult.

When organizations have teams using multiple, disconnected network management tools, they experience a higher percentage of problems resulting from manual errors. If a large number of tools are employed, it typically leads to suboptimal processes and policies because each tool will have overlapping capabilities, making it difficult, if not impossible, to enforce consistent controls. To reduce the associated risks, there is a compelling need to consolidate network management tools as much as possible. This consolidation is vital in limiting errors and improving overall network management practices.

Implement Network Observability for Rapid and Accurate Isolation of Issues

The Network Observability by Broadcom solution gives NetOps teams full visibility into the modern network environment, with the ability to discover devices, monitor faults, and gather performance metrics from both owned networks and service provider networks.

Configuring the Network Observability by Broadcom solution for end-to-end coverage to utilize proactive insights typically involves the following procedures:

- Work with Broadcom Services or a certified Broadcom partner to design and install DX NetOps.
- Discover devices with DX NetOps to obtain a device topology, process network faults, and gather performance statistics.
- Group network items in DX NetOps to align with critical business functions for alarming and reporting.
- Customize dashboards in DX NetOps to create visual workflows for reporting performance and drill-downs into problem areas.
- Create relevant threshold monitoring profiles.
- Use plugins to integrate data sets from vendors in SDN, SDDC, SD-WAN, NFV, and Wi-Fi into a single alert and reporting solution.
- Discover and map layer 2 and layer 3 topologies to visualize device dependencies and use those dependencies in root cause analysis.
- Configure active network monitoring to eliminate blind spots in external networks.
- Configure flow monitoring and deep packet inspection separately or together to identify applications and gain insight into network utilization.
- Deploy AppNeta Monitoring Points strategically throughout the network to gain visibility into the performance of externally managed networks, such as cloud and transit networks.

The End-to-End Network Operations Coverage Learning Path describes a full installation for fault and performance monitoring along with the deployment of AppNeta Monitoring Points. This learning path, Rapid and Accurate Isolation of Issues, describes how to use the performance metrics gathered from that end-to-end implementation.

Note: Typically, Broadcom Services or a certified Broadcom partner completes the design, installation, and configuration of DX NetOps and integrates it with other tools, such as AppNeta.

For more information on DX NetOps installation, refer to the DX NetOps Installation and Configuration Learning Path.

For an overview of Broadcom’s experience-driven approach to network observability, refer to the online course: Experience-Driven NetOps: Overview 100.

Achieving a Unified Data Model

DX NetOps offers teams a complete, domain-agnostic, multi-vendor network management solution to counter these challenges. DX NetOps delivers high-scale monitoring that enables fast data collection for various networking data types. Further, network teams need a solution to deliver the high scalability required to aggregate and correlate the massive volumes of data today’s networks generate. DX NetOps provides a highly scalable data model that collects, cleans, correlates, and normalizes data from disparate vendors, technologies, and protocols to provide an enterprise-wide view of network health. Unlike other data repositories, the DX NetOps data model has been fine-tuned to work with the largest networking data sets. DX NetOps Performance Management uses Vertica as a database. This scalable, highly available database stores network item descriptions, performance, and other detailed information for projections. DX NetOps provides this out of the box.
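To make the idea of normalizing multi-vendor data concrete, the following is a minimal Python sketch, not the DX NetOps data model. The vendor names, field names, and the `normalize` function are all hypothetical and exist only for illustration. It maps two differently shaped interface records onto one common schema so they can be compared side by side.

```python
def normalize(record):
    """Map a vendor-specific interface record to one common schema.

    All vendor names and field names here are hypothetical.
    """
    vendor = record["vendor"]
    if vendor == "vendor_a":
        # Hypothetical vendor A reports utilization as a fraction from 0 to 1.
        return {
            "device": record["host"],
            "interface": record["ifName"],
            "utilization_pct": record["util"] * 100.0,
        }
    if vendor == "vendor_b":
        # Hypothetical vendor B reports throughput and link speed in bits per second.
        return {
            "device": record["device_name"],
            "interface": record["port"],
            "utilization_pct": 100.0 * record["bps"] / record["speed_bps"],
        }
    raise ValueError(f"no normalizer for vendor: {vendor}")
```

Once every source is reduced to the same fields and units, a single query or threshold can be applied across all of them, which is the practical benefit a unified model aims for.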

To learn more about the power of the Vertica database, refer to Vertica Documentation.

To learn more about the installation of the Data Repository, refer to Prepare to Install the Data Repository (DX NetOps Documentation).

Integrating External Data

The major integration point with DX NetOps is DX NetOps Virtual Network Assurance (VNA). VNA is a flexible and scalable software gateway covering the largest number of technology stacks, including traditional, SDN, SDDC, SD-WAN, NFV, and hybrid cloud architectures. This integration adds all the relevant metrics to the NetOps team’s single view. One example of this integration is adding AppNeta metrics into the Data Repository.

For an example of integration that enhances the NetOps data store, refer to the AppNeta integration video.

To learn more about integration with DX NetOps Virtual Network Assurance, refer to Building (DX NetOps Documentation).

To learn more about DX NetOps Virtual Network Assurance installation and configuration, refer to the following resources:
- DX NetOps 23.3.x: Install and Integrate Virtual Network Assurance 200 (Academy Course)
- Configure a VNA Plug-In using DX NetOps Portal (Video)
- Verify VNA Data Integration with DX NetOps Portal (Video)

The following are other methods of DX NetOps integration:
- Performance integration: Integrating (DX NetOps Documentation)
- Fault management: Integrator Overview (DX NetOps Documentation)

Power of Advanced Analytics

Network Observability by Broadcom provides DX NetOps Spectrum as the infrastructure management solution with integrated service, fault, and configuration functionality for modeling, monitoring, and reporting across multiple network device types and technologies. Using a “trust but verify” methodology, DX NetOps Spectrum provides an automated and intelligent management approach for your particular environment, whether you are a service provider or an enterprise.

Knowing about a problem is no longer enough. Predicting and preventing problems, pinpointing their root cause, and prioritizing issues based on impact are requirements for today’s management solutions. The number and variety of possible fault, performance, and threshold problems mean that no single approach to root cause analysis is suited for all scenarios. For this reason, DX NetOps Spectrum incorporates model-based inductive modeling technology (IMT), topology-based fault isolation, and policy-based condition correlation technology to provide an integrated, intelligent approach to drive efficiency and effectiveness in managing IT infrastructure as a business service.

For a detailed explanation of DX NetOps Advanced Analytics, refer to Interpreting Events with Intelligence to Find Root Cause (Broadcom Software Academy).

Fault Isolation and Alarm Suppression

DX NetOps Spectrum offers a patented, algorithmic approach to fault management. The solution proactively polls component status and generates events based on threshold violations. It does this while analyzing fault domains, which are collections of alarms affected by the same failure. Complete discovery of the network is the key to fault suppression and to unlocking the power of this out-of-the-box feature.

The core of the DX NetOps Spectrum fault root cause analysis (RCA) solution is its patented Inductive Modeling Technology (IMT). IMT uses an object-oriented modeling paradigm with model-based reasoning analytics. DX NetOps Spectrum fault management most often uses IMT for physical and logical topology analysis, as the software can automatically map topological relationships through its efficient, automated discovery engine.
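The general idea behind topology-based fault isolation can be illustrated with a minimal Python sketch. This is not the IMT implementation; the topology, the device names, and the `isolate_faults` function are hypothetical. Given a map of device connections and the set of devices that failed to answer polls, the sketch treats an unresponsive device as a likely root cause when it sits next to a device that is still reachable from the management station, and it suppresses the remaining unresponsive devices as downstream symptoms.

```python
from collections import deque


def isolate_faults(topology, manager, unresponsive):
    """Split unresponsive devices into likely root causes and suppressed symptoms.

    topology: dict mapping each device to the set of its direct neighbours
    manager: the device from which polling originates
    unresponsive: set of devices that did not answer polls
    """
    # Breadth-first search from the manager through responsive devices only.
    reachable = {manager}
    queue = deque([manager])
    while queue:
        node = queue.popleft()
        for neighbour in topology.get(node, ()):
            if neighbour not in reachable and neighbour not in unresponsive:
                reachable.add(neighbour)
                queue.append(neighbour)

    # An unresponsive device adjacent to a reachable one is a likely root cause;
    # unresponsive devices further downstream are treated as symptoms.
    root_causes = {
        device
        for device in unresponsive
        if any(n in reachable for n in topology.get(device, ()))
    }
    suppressed = set(unresponsive) - root_causes
    return root_causes, suppressed


# Hypothetical topology: manager - core - dist1 - {access1, access2}
TOPOLOGY = {
    "manager": {"core"},
    "core": {"manager", "dist1"},
    "dist1": {"core", "access1", "access2"},
    "access1": {"dist1"},
    "access2": {"dist1"},
}
```

With this topology, if `dist1`, `access1`, and `access2` all stop responding, only `dist1` is reported as a root cause and the two access devices are suppressed. The sketch also shows why complete discovery matters: a missing link in the topology map would cause a symptom to be misreported as a separate fault.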

To learn more about DX NetOps Spectrum discovery, refer to the following video:

Refer to the Demonstrating Fault Isolation and Alarm Impact in DX NetOps Spectrum OneClick video for an example of model-based inductive modeling technology providing topology-based fault isolation.

To learn more about DX NetOps advanced analytics, refer to NetOps 101: An Algorithmic Approach to NetOps (Bright Talk Video).

Policy-Based Condition Correlation Technology

NetOps teams need a broader set of capabilities to perform more complex user-defined or user-controlled correlations. A condition is similar to a state. An event can set a condition and clear it. It is also possible for an event to set a condition that requires a user action to clear. The condition exists from the time it is set until the time it is cleared.

Many devices in an IT infrastructure provide a specific function. The device-level function is often without context as it relates to the functions of other devices. Most managed devices can emit event streams, which are local to each component. A simple example is when a response time test identifies a result exceeding a threshold. At the same time, an event may identify a condition of a router port exceeding a transmit bandwidth threshold. These conditions are seemingly disparate, as they are created independently and without context or knowledge of each other. In reality, the two are quite related.
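The set-and-clear behavior of conditions, and the way two independent conditions can be related, can be sketched in a few lines of Python. This is an illustrative model only, not the DX NetOps Spectrum condition correlation engine; the `ConditionTracker` class, its rule format, and the condition names are hypothetical.

```python
class ConditionTracker:
    """Tracks conditions that events set and clear, then correlates them."""

    def __init__(self, rules):
        # rules: list of (symptom_condition, cause_condition) pairs
        self.rules = rules
        self.active = set()

    def handle_event(self, condition, action):
        """Apply an event to a condition; action is 'set' or 'clear'."""
        if action == "set":
            self.active.add(condition)
        elif action == "clear":
            self.active.discard(condition)
        else:
            raise ValueError(f"unknown action: {action}")

    def correlate(self):
        """Return the (symptom, cause) pairs whose conditions are both active."""
        return [
            (symptom, cause)
            for symptom, cause in self.rules
            if symptom in self.active and cause in self.active
        ]
```

For the example in the text, a rule pairing a hypothetical `response_time_high` condition with `port_tx_bandwidth_high` would report nothing while only one of them is set, and would report the pair once both are active at the same time.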

To learn more about condition correlation, refer to Condition Correlation (DX NetOps Documentation).

Event Management System

Event rules can be established to allow a higher-order correlation of event streams. Event Rule processing is required when the event stream is the only source of management information. For example, this situation can occur when DX NetOps Spectrum accepts event streams from devices and applications that it does not directly monitor, so that DX NetOps Spectrum can only listen—it cannot talk. DX NetOps Spectrum provides many out-of-the-box event rules, but also provides easy-to-use methods for creating new rules using one or more of the event rule types.
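One way to picture a higher-order rule over an event stream is a rate check: raise a new event when too many events of one kind arrive within a time window. The Python sketch below is a generic illustration under that assumption and does not reproduce DX NetOps Spectrum rule syntax or behavior; the `EventRateRule` class and its parameters are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class EventRateRule:
    """Fires when `threshold` events are seen within `window_seconds`."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def observe(self, timestamp):
        """Record one event; return True if the rate threshold is reached."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while timestamp - self.timestamps[0] > self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.threshold
```

A rule created as `EventRateRule(threshold=3, window_seconds=60)` stays quiet for isolated events and fires only when three arrive within one minute, which turns a noisy stream into a single actionable signal.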

To learn more about the DX NetOps Spectrum event management system, refer to Event Configuration (DX NetOps Documentation).

Historical Performance Analytics

Advanced analytics are also provided through DX NetOps Performance Management. Historical performance metrics are used to baseline performance to understand normal network patterns over time, identify abnormal activity, and spot emerging trends. To be effective, teams need a solution to filter out noise and establish a reliable baseline for all key network metrics and indicators. DX NetOps must also use advanced threshold functions to detect and alert on any network anomalies, abnormal patterns, and developing trends or issues that can affect network performance and user experience quality.

Baseline averages help characterize past performance for selected monitored metrics and assess present performance. The data aggregator continually calculates baseline averages and related standard deviations as each hour passes. The standard deviation provides a statistical indicator of how much variability exists in the population data that is factored into the baseline average calculations.
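The relationship between a baseline average, its standard deviation, and an anomaly check can be shown with a short Python sketch. It is a simplified illustration, not the calculation DX NetOps performs: grouping samples by hour of day, using the population standard deviation, and the default of two deviations are all assumptions made for this example, as are the function names.

```python
from statistics import mean, pstdev


def hourly_baseline(samples):
    """Compute a per-hour baseline from (hour_of_day, value) samples.

    Returns a dict mapping hour -> (average, standard_deviation).
    """
    by_hour = {}
    for hour, value in samples:
        by_hour.setdefault(hour, []).append(value)
    return {hour: (mean(values), pstdev(values)) for hour, values in by_hour.items()}


def is_anomalous(value, baseline, deviations=2.0):
    """Return True if value lies more than `deviations` std devs from the average."""
    average, std_dev = baseline
    return abs(value - average) > deviations * std_dev
```

For samples of 40, 50, and 60 at the same hour, the baseline average is 50 with a standard deviation of roughly 8.2, so a reading of 55 is within the normal range while a reading of 80 is flagged. A wider spread in the historical data produces a larger standard deviation and therefore a more tolerant check.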

DX NetOps Performance Management can calculate future values based on historical metric data. Metric projection is useful for capacity planning. For example, teams can calculate the projected interface utilization to verify that the interface bandwidth is sufficient for a specific time in the future.
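As a rough illustration of metric projection, the Python sketch below fits a straight line to historical samples by least squares and evaluates it at a future point. The linear trend is an assumption chosen for simplicity; the projection method used by DX NetOps Performance Management is not described here, and the `linear_projection` function is hypothetical.

```python
def linear_projection(points, future_x):
    """Fit y = slope * x + intercept by least squares and evaluate at future_x.

    points: list of (x, y) pairs, for example (day_number, utilization_percent)
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * future_x + intercept
```

If interface utilization was 40% on day 0, 46% on day 30, and 52% on day 60, the fitted trend rises by 0.2 percentage points per day and projects about 70% on day 150. Comparing that figure with the capacity limit of the interface indicates whether an upgrade is needed before that date.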

To learn more about using and configuring metrics to provide insights into network monitoring, refer to the Proactive Insights Learning Path (Academy).

Entire Application Delivery Chain Fault Analytics

Network Observability by Broadcom also provides AppNeta active network monitoring accomplished through the TruPath™ packet train dispersion technology. TruPath sends and receives many varied short sequences of packets, which are referred to as packet trains. Packet trains are transmitted using Internet Control Message Protocol (ICMP) or User Datagram Protocol (UDP). Packets are sent to defined end hosts or targets, which can be any endpoint that can respond to an ICMP-based ping or can send back a Transmission Control Protocol (TCP) or UDP packet.

Using this technology, TruPath can build up a complete set of network statistics very quickly—in many cases, in just tens of seconds. TruPath uses special patterns designed to detect if instrumentation packets are interfering with each other. If that happens, it takes more varied samples over a longer time scale to ensure that the resulting statistics are clean.

By sending multiple sets of distinct packet sequences, TruPath can analyze a wide range of traffic conditions that a user on a network path might experience. TruPath probes the path repeatedly with the packet sequences, obtaining a statistically significant collection of responses for each type. TruPath will detect when samples are captured during rapidly changing conditions and adjust its measurement patterns accordingly.
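The basic principle behind packet dispersion measurement can be sketched in Python. This is the generic textbook technique, not the proprietary TruPath algorithm: packets sent back to back are spread apart by the slowest link on the path, so the gap between their arrival times indicates that link's capacity. The function name and the use of the median gap are assumptions made for this illustration.

```python
from statistics import median


def estimate_capacity_bps(arrival_times, packet_size_bytes):
    """Estimate bottleneck capacity from the arrival times of one packet train.

    arrival_times: receive timestamps in seconds for packets sent back to back
    packet_size_bytes: size of each packet in the train
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two packets")
    # Dispersion is the spacing between consecutive arrivals.
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]
    # The median reduces the influence of gaps stretched by cross traffic.
    dispersion = median(gaps)
    if dispersion <= 0:
        raise ValueError("arrival times must be increasing")
    return packet_size_bytes * 8 / dispersion
```

For 1250-byte packets arriving about one millisecond apart, the estimate is roughly 10 Mbps. A single train gives only a noisy estimate, which is consistent with the text's point that many varied trains are sampled to obtain statistically significant results.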

Unlike packet flooding technologies available on the market, the TruPath approach delivers high accuracy without requiring an intrusively high instrumentation load on the network path. Because of the technology’s low overhead, network operations teams can run TruPath in production and through third-party networks for end-to-end visibility.

To learn more about the TruPath technology, refer to Understanding Network Monitoring (AppNeta documentation).

Monitoring Network Configurations

Configuration management is the process of identifying and monitoring configurations of single devices and device families that comprise a network. Devices include routers, hubs, and switches. Using DX NetOps to capture, monitor, compare, update, and recover device configurations increases network uptime and speeds recovery from erroneous changes.

Network configuration changes can significantly impact network reliability and business continuity. It is important to be able to monitor change and also use policies to monitor for standards in network configurations.

To take advantage of Network Configuration Management, configure devices to send SNMP traps to DX NetOps Spectrum. These traps should indicate when a fault occurs or a configuration change is made to the device.

To learn more about managing network configurations, refer to Network Configuration Manager (DX NetOps documentation).

Unified Portal

The DX NetOps Portal provides intelligent workflows that help triage network issues rapidly. It brings together disparate data from across multi-vendor, multi-tech, multi-protocol network architectures for a unified view of global network health. With DX NetOps, teams can work with one alarm portal that suppresses hundreds or thousands of collected alerts from across a multi-vendor network landscape.

Using the DX NetOps Portal, network operators can perform the day-to-day tasks of monitoring the network's health and responding to alarms and major network events, and network planners can identify areas where the network can be optimized.

For a walkthrough of the DX NetOps Portal, refer to DX NetOps 23.3.x: Getting Started with DX NetOps Portal 100 (Academy Course).

To learn more about the DX NetOps Portal, refer to Using (DX NetOps Documentation).

Network Observability by Broadcom Rapid and Accurate Isolation of Issues

For too long, networks and the tools teams have used to manage those networks have kept getting more complex. Consequently, teams spend too much time, effort, and money trying to find the root cause of issues. These trends simply can’t continue.

Network Observability by Broadcom provides a solution to manage increasingly complex networks. This solution helps find the root cause of an issue within the network.

For more information on how to implement Network Observability by Broadcom for other use cases, explore other learning paths.
For technical documentation for Network Observability, refer to DX NetOps and AppNeta.

Visit our Small Bytes page for a complete list of upcoming and on-demand presentations in the Network Observability series.

To learn about Network Observability by Broadcom, refer to Make Every Network Work for You.

For more information, contact Broadcom.