    April 8, 2022

    The Future of Monitoring: Turning Unknown Unknowns into Known Knowns

    To assess risk, national security and intelligence professionals have long used the concepts of known knowns, known unknowns, and unknown unknowns. The idea of unknown unknowns dates to 1955, when American psychologists Joseph Luft (1916–2014) and Harrington Ingham (1916–1995) developed the Johari window. The concept is still used in risk assessments today, and it applies to technology as well.

    The unknown unknowns are the threats and potential problems that remain invisible until their impact manifests. When they do, they create consequences that can disrupt the status quo. 

    In business, and particularly in technology, unknown unknowns pose a serious problem and often remain elusive to IT teams. As applications have become more complex, it has become harder to track risks. Traditional application performance monitoring (APM) has focused on monitoring for known problems. You knew the performance baseline for an application, as well as its expectations and performance indicators, and you set thresholds and static alerts to flag when performance deviated. You were monitoring for known knowns. Today, this is no longer sufficient.

    APM Needs to Address the Unknown Unknowns

    Modern apps are extremely complex and multi-layered. You can no longer depend on monitoring for known problems. In the cloud, the application stack is so complex that issues can happen anywhere and disrupt app performance. These app disruptions can have a negative impact on the user experience, customer/user satisfaction, and even customer retention and business revenue. 

    But can an APM solution really monitor for unknown unknowns? 

    Yes, it can, and here’s how. To monitor every layer of the environment, and to understand how those layers interact, an APM solution needs a topology model that keeps pace with dynamic, constantly changing environments.
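
    To make the idea of a topology model concrete, here is a minimal sketch in Python. It is not how DX APM SaaS represents its topology; the `Topology` class, the layer labels, and the component names are all hypothetical, chosen to match the scenario later in this post. It only illustrates the basic shape: components in different layers, dependencies between them, and a model that is updated as the environment changes.

```python
from collections import defaultdict


class Topology:
    """Toy topology: a directed graph where an edge A -> B means A depends on B."""

    def __init__(self):
        self.layer = {}                     # component name -> layer label
        self.depends_on = defaultdict(set)  # component -> its direct dependencies

    def add_component(self, name, layer):
        self.layer[name] = layer

    def add_dependency(self, source, target):
        self.depends_on[source].add(target)

    def remove_component(self, name):
        """Drop a component and every edge that points to or from it."""
        self.layer.pop(name, None)
        self.depends_on.pop(name, None)
        for targets in self.depends_on.values():
            targets.discard(name)

    def reachable_from(self, start):
        """Return every component that `start` depends on, directly or indirectly."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for dep in self.depends_on.get(node, ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen


# Hypothetical components spanning several layers of the stack.
topo = Topology()
topo.add_component("acme-business-service", "business service")
topo.add_component("activity-service", "application")
topo.add_component("authenticator-service", "microservice")
topo.add_component("authenticator-pod", "container")
topo.add_dependency("acme-business-service", "activity-service")
topo.add_dependency("activity-service", "authenticator-service")
topo.add_dependency("authenticator-service", "authenticator-pod")

print(sorted(topo.reachable_from("acme-business-service")))

# A container goes away: the model is updated in place rather than rebuilt.
topo.remove_component("authenticator-pod")
print(sorted(topo.reachable_from("acme-business-service")))
```

    Walking the graph from a business service down to its dependencies is what lets a problem in a lower layer be tied back to the user-facing service it affects.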

    Along with a topology model, a solution must have the intelligence needed to determine when a metric deviates from norms. A solution needs to be able to correlate events and anomalies and identify probable root causes, then automatically remediate the issue. An APM solution such as Broadcom’s DX APM SaaS does this with a rich topology based on our strengths in monitoring applications, infrastructures, and networks. We have a built-in, automated remediation engine that leverages the topology and helps customers automatically remediate issues when they occur. 
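
    The difference between a static alert and a check against learned norms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm DX APM SaaS uses: the `DynamicBaseline` class, the rolling mean and standard deviation approach, the three-sigma band, the 90% static limit, and the sample values are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class DynamicBaseline:
    """Flags a value that falls outside a band learned from recent samples."""

    def __init__(self, window=30, sigmas=3.0, min_samples=5):
        self.values = deque(maxlen=window)  # recent samples considered normal
        self.sigmas = sigmas
        self.min_samples = min_samples      # must be at least 2 for stdev()

    def is_anomaly(self, value):
        anomalous = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            # Guard against a flat history, where the deviation would be zero.
            band = self.sigmas * max(stdev(self.values), 1e-9)
            anomalous = abs(value - mu) > band
        if not anomalous:
            # Simplification: only normal samples feed the baseline, so a
            # spike does not immediately widen the band.
            self.values.append(value)
        return anomalous


STATIC_CPU_LIMIT = 90.0  # hypothetical static threshold, in percent

baseline = DynamicBaseline()
samples = [20, 22, 21, 19, 23, 20, 21, 60]  # hypothetical CPU utilization

for cpu in samples:
    static_alert = cpu > STATIC_CPU_LIMIT
    dynamic_alert = baseline.is_anomaly(cpu)
    print(f"cpu={cpu:>3}%  static_alert={static_alert}  dynamic_alert={dynamic_alert}")
```

    In this sketch the jump to 60% never crosses the static limit, so a static alert stays silent, while the baseline check flags it because the service normally runs near 20%. A production system would also need to account for seasonality and for legitimate shifts in the normal level, which this sketch ignores.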

    DX APM SaaS is capable of dynamically monitoring for unknown unknowns, enabling businesses to elevate their level of observability and create confidence that they won’t run into surprises. 

    What does this look like? Here’s a fictional scenario to illustrate. We’ll show how John, an SRE/DevOps engineer, can use DX APM SaaS to query captured instrumentation data, then examine and correlate it to derive insight and uncover unknown unknowns. That is observability in practice.

    1. John is notified of a performance problem with the ACME Business Service. It appears that several users are unable to log into the site. John logs into DX APM SaaS and opens up the experience view for service health and notices that the service is indeed “Red.” Next, he drills down into service health. 

    [Figure 1]

    2. He sees that the activity service application is reporting a problem with several failed transactions. He sees that APM has already detected a problem and is reporting an issue with the authenticator service. He clicks on the problem to view the associated details in a problem analysis notebook view.

    [Figure 2]

    3. In the notebook view, John views the problem details and notices that there are issues reported with several different technology tiers of the application stack.

    [Figure 3]

    4. He notices that the system has identified the authenticator service as problematic and selects the service.

    [Figure 4]

    5. He opens up the traces tab to take a look at the automatically collected transactions and sees several error traces. He opens the first trace and sees that the request from the first service has timed out with a 400 error.

    [Figure 5]

    6. At the same time, he also observes that only some transactions are timing out while others are successful.

    [Figure 6]

    7. He wonders what could have changed with the application and brings up the timeline view to see change events. He notices a new version of the authenticator service was released to production last night. The new version was re-implemented as a microservice in a new technology stack. Furthermore, the specific team that developed the new service decided to instrument it with OpenTracing for their observability needs. Could there be an issue with the new service?

    [Figure 7]

    8. He turns his attention to the alarms. The system is reporting abnormally high CPU utilization on the new service. In addition, the system has detected a change in the deployment configuration, with the number of replicas going down from six to one. This is obviously not right: the new service was checked in with an incorrect deployment configuration. This error left the service under-provisioned, resulting in performance issues.

    [Figure 8]

    As you can see, DX APM SaaS automatically detects issues via dynamic baselines and correlates seemingly unrelated sets of events, both performance events and change events, in a unified topology. This helps John quickly diagnose the problem, down to the specific change that caused it: here, an incorrect deployment configuration. In this way, the solution converts unknown unknowns into known knowns.
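
    The correlation step in John's investigation can be sketched as follows. This is a simplified illustration, not the DX APM SaaS correlation engine: the `Event` type, the `probable_causes` function, the 24-hour lookback, the component names, and the timestamps are all hypothetical. The idea is simply to pair a detected anomaly with recent change events on components that the topology says are related.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    time: datetime
    component: str
    kind: str    # "change" or "anomaly"
    detail: str


def probable_causes(anomaly, events, related, lookback=timedelta(hours=24)):
    """Return change events on related components that happened at most
    `lookback` before the anomaly, newest first."""
    candidates = [
        e for e in events
        if e.kind == "change"
        and e.component in related
        and timedelta(0) <= anomaly.time - e.time <= lookback
    ]
    return sorted(candidates, key=lambda e: e.time, reverse=True)


# Hypothetical events loosely modeled on the scenario above.
events = [
    Event(datetime(2022, 4, 1, 10, 0), "authenticator-service", "change",
          "log level changed"),               # too old to be a likely cause
    Event(datetime(2022, 4, 7, 22, 0), "authenticator-service", "change",
          "replicas changed from 6 to 1"),
    Event(datetime(2022, 4, 8, 8, 0), "billing-service", "change",
          "cache size changed"),              # not related in the topology
]
anomaly = Event(datetime(2022, 4, 8, 9, 0), "authenticator-service", "anomaly",
                "abnormally high CPU utilization")

# Components the topology links to the failing business service (assumed).
related = {"activity-service", "authenticator-service"}

for cause in probable_causes(anomaly, events, related):
    print(cause.time, cause.component, cause.detail)
```

    Filtering by both time and topology is what narrows the list: the older change and the change on an unrelated service drop out, leaving the replica change as the one candidate worth a closer look.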

    Learn more about how DX APM SaaS can help you address unknown unknowns.  

    Nishant Kabra

    Nishant is a Product Manager for Application Performance Management. He is passionate about releasing market-leading solutions covering full-stack observability for applications.
