    January 31, 2023

    Outages Happen. Now What?

    Network outages happen more often than you think. We may not experience them directly or even know they're occurring at all. When outages affect household names like Facebook, Amazon, Microsoft, and others, however, we're sure to find out after the fact that there was an issue.

    Depending on the user's activities and the duration of the issue, stress and frustration levels can vary. When a marketer can’t get that ground-breaking advertisement up on Facebook, they can get antsy. When a hybrid worker can’t place an order for that amazing home office equipment deal on Amazon, they can feel cheated. And when we’re unable to finish up that Microsoft PowerPoint presentation that our boss is waiting for, we can get very stressed out.

    Imagine how much more stressful it can be if you’re the one responsible for the service levels being delivered. Further, think about what it would be like if you’re responsible for operations at a large-scale enterprise that delivers business- or mission-critical digital services. First, the direct cost of outages can be massive. The average cost of IT downtime is $5,600 per minute or $336,000 per hour according to Gartner. Second, outages can cause more than just immediate revenue loss: staff productivity can suffer, data and other precious assets can get lost, customers can get angry and frustrated, the brand can be damaged, and the organization’s compliance status can be put at risk.
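
    To make the per-minute figure concrete, here is a minimal Python sketch that scales the Gartner average quoted above to an outage of any length. The function name `downtime_cost` and the 90-minute example are illustrative assumptions, and the result covers direct cost only, not the indirect effects listed above.

```python
# Average cost of IT downtime per minute, as quoted from Gartner above.
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Estimate the direct cost of an outage lasting `minutes` (hypothetical helper)."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # 60 minutes reproduces the hourly figure; 90 minutes is an illustrative case.
    for label, minutes in [("1 hour", 60), ("90 minutes", 90)]:
        print(f"{label}: ${downtime_cost(minutes):,.0f}")
```

    An organization with its own per-minute figure can pass it as `cost_per_minute` in place of the industry average.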

    Why Do Outages Happen and What Can You Do About It?

    Outages happen for a number of reasons. Cyberattacks, including ransomware, distributed denial of service (DDoS), and malware, continue to rank among the top causes of outages. Human errors such as typos, misconfigurations, and cutting corners by ignoring documented procedures or applying unauthorized shortcuts are also common culprits.

    What’s more, outages are becoming more common. Those related to software, network, and system problems are on the rise because of the complexity that comes with adopting cloud technologies and software-defined architectures. Networks are especially problematic: according to Uptime’s 2022 Data Center Resiliency Survey, networking-related issues have been the top cause of downtime over the past three years.

    For years, many network operations (NetOps) teams have relied on network monitoring tools to manage availability and performance within the four walls of their organization’s data centers. But the connectivity demands of today’s digital business are driving the need for a new approach. Now, NetOps teams need a way to gain better visibility and control of both internally managed networks and the networks run by external organizations, including cloud providers and ISPs. The question then becomes, “In today’s hyper-connected and multi-cloud environments, what can you do to prevent outages or respond faster when they happen?”

    The adoption of experience-driven network observability and management can help. This approach is a superset of network monitoring: it lets NetOps teams understand, manage, and optimize the performance of digital services. Teams gain visibility into the end-to-end user experience delivery chain, including every communication path and potential degradation point, so they can get ahead of issues before those issues affect end users.

    Three Ways to Protect User Experience in an Outage

    Experience-driven network observability and management tools and practices can help your NetOps teams gain actionable insights about the current and future state of a network. They deliver these insights by ingesting telemetry on network device performance; network and internet paths; alarms, faults, logs, and configurations; cloud and SaaS application performance; network traffic flows; and user experience metrics. Armed with this intelligence, NetOps teams can take the following actions to protect the user experience from the damaging effects of outages:

    1. Identify user impact first. Many times, outages will impact certain applications or regions, but not others. At any time, you need to know the state of the network and how the user experience is affected by changing network conditions. With experience-driven network observability and management tools, teams can identify any application in use at any location, continuously measure its performance for each user, and understand the impact on the network that delivers it. In the case of the recent Microsoft outage, the culprit was a network connectivity issue that arose between users and Microsoft applications.

      Figure 1: Graphs reveal connection outages and high amounts of loss and latency.
    2. Isolate where issues are located and which ones matter most. With these tools, teams can quickly and accurately identify the root cause of a problem. Knowing the source of the issue either shows that your own network is not at fault or accelerates problem resolution. Robust event correlation techniques can help you understand how outages and performance issues are affecting actual end-user experience and application delivery. As a result, you can prioritize remediation efforts based on business impact rather than simply on alarm duration or severity.

    3. Employ active monitoring of the network. To use apps like Office 365, which run in Microsoft’s networks, users’ connections may traverse a huge number of network hops. In the example below, over the course of 30 minutes and then one hour, the number of dynamic paths varies for a single device targeting Office 365. This illustrates how dynamic cloud environments can be: small changes can have big consequences for connectivity. That heightens the value of actively testing network delivery to SaaS and web applications, so you can proactively find and fix issues before they affect users. With experience-driven network observability and management tools, your teams can actively and continuously measure the end-to-end health, performance, and availability of the network.

      Figure 2: Multi-path route visualization shows how routes terminate at the edge of the Microsoft network.
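
    The three actions above can be sketched in a few lines of Python. This is not how any particular product works; it is a minimal illustration that assumes active probes have already been collected as (application, site, round-trip time) samples. It computes loss and median latency per application and site, flags degraded paths, and ranks them by the number of users affected rather than by alarm severity. Every name here (`ProbeSample`, `PathHealth`, `summarize`, `prioritize`), the sample data, and the 5% loss and 150 ms latency thresholds are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class ProbeSample:
    app: str                  # application under test, e.g. "Office 365"
    site: str                 # location the probe ran from
    rtt_ms: Optional[float]   # round-trip time; None means no reply (loss)


@dataclass
class PathHealth:
    app: str
    site: str
    loss_pct: float
    median_rtt_ms: Optional[float]  # None when every probe was lost
    users: int                      # users at the site (assumed known)


def summarize(samples, users_by_site):
    """Group samples by (app, site) and compute loss and median latency."""
    groups = {}
    for s in samples:
        groups.setdefault((s.app, s.site), []).append(s.rtt_ms)
    results = []
    for (app, site), rtts in groups.items():
        replies = [r for r in rtts if r is not None]
        loss_pct = 100.0 * (len(rtts) - len(replies)) / len(rtts)
        mid_rtt = median(replies) if replies else None
        results.append(PathHealth(app, site, loss_pct, mid_rtt, users_by_site.get(site, 0)))
    return results


def prioritize(paths, loss_threshold_pct=5.0, rtt_threshold_ms=150.0):
    """Keep degraded paths and order them by users affected (illustrative thresholds)."""
    degraded = [
        p for p in paths
        if p.loss_pct >= loss_threshold_pct
        or (p.median_rtt_ms is not None and p.median_rtt_ms >= rtt_threshold_ms)
    ]
    return sorted(degraded, key=lambda p: p.users, reverse=True)


# Made-up sample data, for illustration only.
SAMPLES = [
    ProbeSample("Office 365", "London", 40.0),
    ProbeSample("Office 365", "London", None),
    ProbeSample("Office 365", "London", 45.0),
    ProbeSample("Office 365", "London", None),
    ProbeSample("Office 365", "Austin", 30.0),
    ProbeSample("Office 365", "Austin", 32.0),
    ProbeSample("CRM", "Pune", 210.0),
    ProbeSample("CRM", "Pune", 190.0),
]
USERS_BY_SITE = {"London": 1200, "Austin": 800, "Pune": 300}

if __name__ == "__main__":
    for p in prioritize(summarize(SAMPLES, USERS_BY_SITE)):
        print(f"{p.app} @ {p.site}: loss {p.loss_pct:.0f}%, "
              f"median RTT {p.median_rtt_ms} ms, {p.users} users")
```

    Sorting by affected users is only one possible proxy for business impact; revenue per application or the presence of a service-level agreement could be substituted in the sort key.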

    How Broadcom Can Help

    With Broadcom, your team can establish optimized NetOps capabilities. With our solutions, you can minimize the risk and impact of network outages, streamline operations, and maximize network performance and availability. In the process, these solutions help you capitalize more fully on revenue opportunities. Register now and join our 30-minute Small Bytes webinar on February 1st at 12 PM EST to learn more about how to troubleshoot Microsoft Teams issues in today’s hybrid work environments.

    Gedeon Hombrebueno

    Gedeon focuses on bringing the Network Observability by Broadcom solution to market. The solution enhances network visibility to boost network operations efficiency and user experience—key to today’s business success. Gedeon has extensive product marketing, product management, and integrated marketing experience in...
