    September 9, 2025

    Observability and Monitoring Governance (Part 2 of 4)

    A Process Framework and Best Practices for Getting Started

    7 min read

    Key Takeaways
    • Discover many of the downstream benefits of establishing strong monitoring governance.
    • See why it is vital to strike an effective balance between over-monitoring and insufficient coverage.
    • Gain best practices on how and where to start establishing monitoring governance.

    “How did we fail to monitor xyz prior to this incident?”

    “We should monitor everything. We have so many tools…maybe we already are!”

    “Are we vetting applications prior to deployment, including security apps that may adversely affect application performance and responsiveness?”

    “Where and how can we streamline monitoring so our app owners and support teams receive the information they need, when they need it, but avoid information overload, false alerts, and irrelevant data that lacks context?”


    Part one of this blog series, “Observability and Monitoring Governance,” covered several benefits of strong monitoring governance. There are also significant downstream benefits, including:

    • Aligning monitoring with business unit goals
    • Avoiding unnecessary monitoring and alerts
    • Vetting new business-critical apps being introduced into the environment
    • Optimizing resource allocation
    • Improving incident management and troubleshooting
    • Fostering a transparent and collaborative environment between business units and IT
    • Informing strategic planning and decision-making

    The answer to the monitoring paradox: Ask why

    IT can monitor nearly everything in or connected to the enterprise estate, including servers, Kubernetes clusters, network-connected power supplies, data center hardware and systems, and even individual database tables within DB2 for z/OS. Although monitoring everything ensures there are no observability gaps, over-monitoring can be problematic and expensive. Monitoring teams frequently contend with an overwhelming volume of alarms, and the sheer number and redundancy of those alarms can diminish their effectiveness.

    Alarm noise elimination is a best practice and key to monitoring governance success

    For robust control, NOCs and SOCs rely heavily on custom alarm filtering, as well as precise, real-time alerting and intelligent remediation. These capabilities are essential for networks, applications, OS and hardware, enterprise workloads, and other areas. Filter alarms so teams see only actionable, relevant events. There must be zero alarm noise from other NOC and SOC observability views and consoles!
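    As an illustration of alarm filtering, here is a minimal Python sketch that keeps only alarms above a severity threshold and suppresses repeats from the same source within a time window. The `Alarm` record, its field names, the severity scale, and the default thresholds are all assumptions made for this example; they do not reflect any specific monitoring product.

```python
from dataclasses import dataclass


# Hypothetical alarm record; the field names are illustrative only.
@dataclass(frozen=True)
class Alarm:
    source: str       # host or service that raised the alarm
    check: str        # what was measured, e.g. "cpu" or "disk"
    severity: int     # assumed scale: 1 = informational ... 5 = critical
    timestamp: float  # seconds since some reference point


def filter_alarms(alarms, min_severity=3, dedup_window=300.0):
    """Keep alarms at or above min_severity, dropping repeats of the same
    (source, check) pair raised within dedup_window seconds of the last
    alarm that was kept for that pair."""
    last_kept = {}
    kept = []
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        if alarm.severity < min_severity:
            continue  # below the actionable threshold
        key = (alarm.source, alarm.check)
        previous = last_kept.get(key)
        if previous is not None and alarm.timestamp - previous < dedup_window:
            continue  # redundant repeat of a recent alarm
        last_kept[key] = alarm.timestamp
        kept.append(alarm)
    return kept
```

    In practice, the threshold and window would be set per team and per service as part of the governance process rather than hard-coded.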

    Insufficient monitoring, on the other hand, presents equally significant and costly challenges that potentially harm business operations, team productivity, and personal careers. Therefore, the wisest approach involves striking an effective balance.

    Monitoring governance provides a discipline that enables teams to confidently prioritize work so they can monitor what matters most without introducing new risk. It’s about aligning monitoring and business goals to prioritize and precisely focus monitoring on key metrics, systems, and applications that truly impact business operations, rather than indiscriminately attempting to monitor every single component and data point. Monitoring governance provides answers to these questions:

    • Which resources in or connected to the IT estate are business critical?
    • What is an IT asset’s normal or healthy behavior, as measured against a baseline value such as its typical response time?
    • What level of monitoring (frequency and granularity) is needed to identify issues when they arise, or predict issues before they affect the business?
    • When things change in the IT estate, or before they change, is there an established formal process to inform monitoring teams so they can adjust monitoring coverage?
    • Can monitoring teams say “No” to new monitoring requests? What might justify this?
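    The baseline question above can be made concrete with a small sketch. The Python below summarizes historical measurements, such as response times, and flags a new value that strays too far from them. The function names, the use of mean and standard deviation, and the three-standard-deviation tolerance are assumptions chosen for illustration, not a prescribed method.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical measurements as (mean, standard deviation).
    Requires at least two samples."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, tolerance=3.0):
    """Return True when value lies more than `tolerance` standard
    deviations from the baseline mean."""
    center, spread = baseline
    if spread == 0:
        # A perfectly flat history: any change counts as a deviation.
        return value != center
    return abs(value - center) > tolerance * spread
```

    For example, with historical response times of 100, 102, 98, 101, and 99 ms, a reading of 103 ms falls within the tolerance, while 150 ms is flagged.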

    Well-conceived IT monitoring plans are essential for supporting business goals. IT administrators must understand and prioritize the underlying business rationale (the why) for each monitoring request. This is crucial to prevent wasting resources on irrelevant data. This also helps teams dedicate sufficient effort to reducing alert noise and ensuring monitoring contributes effectively to business objectives.

    Best practices: How and where to start monitoring governance

    Executive sponsorship

    • Secure executive and business department buy-in to champion monitoring governance efforts.

    Pilot program and participant selection

    • Identify a core team of technical experts with several years of monitoring experience and demonstrated affinity for aligning with business needs.
    • Consider starting with a single subject matter expert who can lead the project.
    • Brainstorm ideas, collect all input, and reach general consensus on scope, roles, goals, and project leadership.

    Strategic vision

    • Encourage a "big thinking" approach that orients team members around critical services, business metrics (revenue, profit/loss, transaction volume), brand reputation, and worst-case scenarios like outages and service failures. Confirm with executives and business departments.

    Goal setting

    • Establish clear objectives aimed at achieving comprehensive monitoring coverage for all critical IT assets involved in business transactions and processing.

    Actionable plan

    • Develop an initial list of high-priority, achievable goals with corresponding timelines.

    Ask questions to prompt urgency: What are the consequences of not implementing monitoring governance?

    What would the impact to business and users be if important IT assets, such as claims applications or systems, were to crash or become unavailable?

    • What revenue loss occurs if our mortgage application servers are unavailable for 20 minutes?
    • How much revenue could be lost if our e-trading systems and apps fail for as few as five minutes?
    • What percentage of our yearly revenue is associated with this application's availability or designed performance?
    • Should we also consider monitoring disruptions due to disaster recovery (DR) in our project plan?
      • Disaster recovery is a complex and vital process. It is particularly challenging when dealing with large, intricate systems, high data volumes, and poorly governed environments. Effective monitoring and strong governance are essential for a smooth and successful disaster recovery effort.
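    To put rough numbers behind the revenue questions above, a back-of-the-envelope calculation can help. The Python sketch below assumes that revenue accrues evenly across the year, which is a deliberate simplification: an outage during peak trading or application hours would cost more than this estimate suggests. The function name and parameters are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def estimated_revenue_loss(annual_revenue, outage_minutes, revenue_share=1.0):
    """Estimate revenue lost during an outage.

    annual_revenue: total yearly revenue, in any currency
    outage_minutes: how long the application was unavailable
    revenue_share:  fraction of yearly revenue tied to this application
                    (assumed value between 0 and 1)

    Assumes revenue is spread uniformly over every minute of the year.
    """
    per_minute = annual_revenue * revenue_share / MINUTES_PER_YEAR
    return per_minute * outage_minutes
```

    Even a coarse figure like this gives executives and business departments a shared starting point for deciding which assets justify the most monitoring attention.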

    Ultimately, it is up to executive teams and business departments to understand the background first, then make informed monitoring decisions once all aspects are given full consideration. Here are a few important areas that IT must prioritize:

    Consider scope

    To be effective, monitoring governance must eventually encompass business-critical IT assets. Still, starting small is better than not starting at all. Ideally, start with a business system that delivers critical, time-sensitive services.

    • Avoid scope creep in the early phases of defining and adopting monitoring governance.
    • Design and follow an MVP-type project plan.

    Roles and responsibilities

    Improvements in monitoring governance will benefit teams tasked with managing the health, performance, and availability of IT assets. Consider these aspects:

    • Develop a focused task force of monitoring experts who embrace the change-agent role. It can be helpful to start with an application team since they understand business relevance and are likely to represent multiple perspectives, such as infrastructure, network, APIs, and so on. Empower them and insist on close collaboration.
    • Begin with a broad-based organizational initiative. Executive sponsorship and a strong change-agent culture are crucial for elevating the importance of monitoring governance and establishing an effective implementation.
    • Define and assign a separate monitoring governance role to act as “point-person change agent.” A change agent champions and drives the adoption of new monitoring processes, tools, and cultural shifts that are required for effective monitoring governance. This could be a project manager, IT manager, senior administrator, or business leader who can serve as an advisor. It is often helpful for the monitoring governance role to be an employee or team separate from the actual monitoring team.

    Timing

    • Establish a RACI matrix for monitoring governance early.
    • The sooner monitoring governance becomes a part of a deployment, the better.
    • After-the-fact monitoring governance never fully catches up to organizational needs. The monitoring environment continues to grow and evolve—eventually becoming unmanageable.

    Communication

    • Business units must clearly communicate their monitoring and reporting needs to IT.
    • Monitoring administrators must ask why something needs to be monitored.
    • Experienced administrators may offer alternative ways to achieve the related monitoring goals.

    Continuous improvement: Project management and tracking

    • Project management software can be a valuable tool for governing enterprise monitoring. This software can provide centralized tracking, workflow automation, risk management, and reporting capabilities, ultimately leading to better outcomes, improved communication, and more consistent adherence to monitoring governance best practices.

    For big endeavors in IT, people often talk in terms of journeys. Monitoring governance is more akin to shifting to a healthier diet. You can start small, build momentum, sustain progress, and see the benefits of incremental improvements along the way.

    Any incremental progress made to fill a void of “no monitoring governance” will offer benefits. And, with improvements in observability technologies complemented by AIOps advances, there are few excuses to accept observability gaps, struggle with too many alarms, lack confidence in monitoring coverage, or continue to work in reactive mode.  

    Steve Danseglio

    With over 25 years of expertise in IT, Steve Danseglio is a seasoned technical professional with a proven track record of supporting Fortune 500 clients through complex enterprise software solutions. He excels in driving customer success through a combination of technical proficiency and strategic vision. Steve has...
