    May 11, 2023

    IT Operations in 2023: AI/ML & Automation Will Continue to Be the North Star

    The use of statistics, advanced algorithms and AI/ML is becoming omnipresent. The benefits are visible in every walk of life, from web searches, to movie and retail recommendations, to auto-completing our emails. Of course, not many anticipated the dramatic entrance of generative AI in the form of ChatGPT for writing college essays and poetry on arcane topics.

    The benefits of these technologies are all around us, although less obvious, in applications such as cost optimization in manufacturing, real-time safety and fuel economy adjustments in automobiles, and life-saving solutions in healthcare. While it may take years for Smart-Quant to mature enough to meet our science fiction-level expectations, the growing investment and progress around us in these areas is undeniable.

    To make progress, these models need data…lots of data.

    Few parts of society generate and can capture more data than IT Operations. So, the use of the above techniques for solving IT Operations challenges is not surprising. With strong monitoring tools and ever more things to monitor, the volume of data continues to grow unabated.

    Data sources

    Feeding the AIOps machine are varied data sources. To generate meaningful insights, AIOps benefits from a variety of data from direct or synthetic monitoring of applications, infrastructures, networks, and user experiences.

    AIOps also needs to make sense of the data.

    Powerful normalization and correlation (using statistical models and knowledge of IT assets and service architectures) are of course needed to structure the data for clean analysis.

    Topologies

    Most IT Operators love service maps. In order to build these service maps, AIOps must derive or ingest inventory and relationships from multiple sources. Application topologies can be extracted from cross-transaction traces, while network topologies can be established from device logs and connectivity tests between different devices.
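
    As a rough illustration of extracting an application topology from traces, the sketch below derives service-to-service dependencies from parent/child span relationships. The span fields (span_id, parent_id, service) and the sample service names are assumptions made for this example, not the schema of any particular tracing product.

```python
def edges_from_spans(spans):
    """Derive directed (caller, callee) service edges from trace spans.

    Each span is a dict with hypothetical keys: span_id, parent_id, service.
    """
    service_of = {span["span_id"]: span["service"] for span in spans}
    edges = set()
    for span in spans:
        # Root spans have no parent, so the lookup returns None for them.
        caller = service_of.get(span.get("parent_id"))
        if caller is not None and caller != span["service"]:
            edges.add((caller, span["service"]))
    return edges


# Illustrative spans from a single transaction (made-up data).
spans = [
    {"span_id": "a", "parent_id": None, "service": "web-frontend"},
    {"span_id": "b", "parent_id": "a", "service": "cart-service"},
    {"span_id": "c", "parent_id": "b", "service": "cart-service"},
    {"span_id": "d", "parent_id": "b", "service": "inventory-db"},
]
topology_edges = edges_from_spans(spans)
```

    Spans that stay within one service (span "c" here) add no edge, so the result contains only cross-service dependencies.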

    AIOps needs to normalize and enrich this data with additional attributes from third-party sources and persist it in a uniform graph data model. Two elements typically comprise a graph model:

    1. Vertices/Nodes. Vertices are the entities of the model and can hold any number of attributes or key-value pairs that help describe the entities.
    2. Edges. Also known as dependencies or relationships, edges provide the relevant connections between two vertices. A relationship always has a direction, a start node, and an end node. Although they are directed, relationships can be navigated in either direction.

    ML algorithms can easily query such a graph model and use it as a primary dimension for all analyses. For example, for a typical use case like performance problem isolation, the ML algorithm can easily identify the hosts and application components involved in a business transaction and use this to set the data scope for analyzing relevant performance metrics and alarms.
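
    To make the graph model and the scoping step more concrete, here is a minimal in-memory sketch. The class name TopologyGraph, the relation labels ("calls", "runs_on") and the sample entities are all hypothetical; a production AIOps platform would typically rely on a dedicated graph store rather than Python dictionaries.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """An entity in the model, described by free-form key-value attributes."""
    id: str
    attributes: dict = field(default_factory=dict)


class TopologyGraph:
    def __init__(self):
        self.vertices = {}
        self.out_edges = defaultdict(list)  # start id -> [(relation, end id)]
        self.in_edges = defaultdict(list)   # end id -> [(relation, start id)]

    def add_vertex(self, vertex_id, **attributes):
        self.vertices[vertex_id] = Vertex(vertex_id, attributes)

    def add_edge(self, start, relation, end):
        # Edges are directed, but indexed both ways so they can be
        # navigated in either direction.
        self.out_edges[start].append((relation, end))
        self.in_edges[end].append((relation, start))

    def scope(self, root):
        """Ids of all vertices reachable from root by following edges forward."""
        seen = {root}
        queue = deque([root])
        while queue:
            current = queue.popleft()
            for _, nxt in self.out_edges[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def dependents(self, vertex_id):
        """Ids of vertices with an edge pointing at vertex_id (reverse navigation)."""
        return [start for _, start in self.in_edges[vertex_id]]


graph = TopologyGraph()
graph.add_vertex("checkout", kind="business_transaction")
graph.add_vertex("cart-service", kind="application_component")
graph.add_vertex("payment-service", kind="application_component")
graph.add_vertex("host-01", kind="host")
graph.add_vertex("host-02", kind="host")
graph.add_vertex("host-99", kind="host")  # not involved in checkout
graph.add_edge("checkout", "calls", "cart-service")
graph.add_edge("checkout", "calls", "payment-service")
graph.add_edge("cart-service", "runs_on", "host-01")
graph.add_edge("payment-service", "runs_on", "host-02")

# Data scope for isolating a performance problem in the checkout transaction.
in_scope = graph.scope("checkout")
hosts_in_scope = sorted(
    v for v in in_scope if graph.vertices[v].attributes["kind"] == "host"
)
```

    Here hosts_in_scope holds only host-01 and host-02, so an analysis of checkout performance can ignore metrics and alarms from host-99. The reverse index also answers impact questions, such as which components depend on a given host.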

    The overabundance of wonderful data will drive a self-reinforcing cycle of progress for AI/ML and advanced algorithms.

    • More data will result in greater need for AI/ML.  
    • More AI/ML will motivate us to capture, clean and analyze more data.

    Rinse and repeat.

    Given the limits of humans, we turn to automation.

    For AIOps, here are a few areas that beg to be automated:

    • Auto-discovery of entities added to or removed from the IT environment. What new monitoring data is available to the AIOps machine? What information is no longer available or relevant?
    • Auto-correlation of data. Which entities are associated with the business service I’m responsible for? (See my blog, “IT Operations in 2023: Business Services Become a Viable Organizing Principle”)
    • Auto-ticketing. Which teams or individuals should be notified when certain performance thresholds are breached, or better, when certain thresholds may be breached in the future?
    • Auto-remediation. Automatically remediate frequently occurring issues in the production environment. This significantly reduces unplanned downtime and the MTTR.
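
    As one possible illustration of ticketing on thresholds that "may be breached in the future", the sketch below fits a least-squares line to recent metric samples and estimates when the trend would reach a threshold. The function name, the sample values and the linear-trend assumption are all ours; real AIOps tooling would likely use more robust forecasting.

```python
def predict_breach_time(samples, threshold):
    """Estimate when a linear trend through (time, value) samples reaches threshold.

    Returns None when there is too little data or the metric is not rising.
    A result earlier than the latest sample means the fitted trend has
    already crossed the threshold.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # flat or falling trend never reaches the threshold
    intercept = mean_v - slope * mean_t
    return (threshold - intercept) / slope


# Hypothetical disk usage in percent, sampled once per hour.
disk_usage = [(0, 50.0), (1, 55.0), (2, 60.0), (3, 65.0)]
breach_hour = predict_breach_time(disk_usage, threshold=90.0)
```

    With these made-up samples the usage rises five points per hour, so the estimate is hour 8, which leaves time to open a ticket for the owning team before the threshold is actually breached.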

    AIOps without automation is a non-starter.

    Automation within AIOps will march ahead in 2023, perhaps in smaller incremental steps as IT practitioners test, validate and gradually trust the automation, while relinquishing some level of control in favor of greater productivity in other aspects of their job.

    Practitioners will welcome and adopt automation when they have substantial oversight and control of it.

    Despite the emergence of AIOps, the IT Operations community as a whole remains cautious. Early adopters who sought first-mover advantage or who had greater tolerance for risk have achieved measurable success, tuning expectations, solution requirements, and adoption plans as they learn on-the-fly.

    Other, early- or mid-majority type buyers approached AIOps with more limited expectations and narrowly scoped adoption plans. By constraining AIOps adoption to a single business application team or geography, they could limit risk and isolate other parts of their organization from the chaotic learning associated with emergent, transformative technologies.

    Predictability, consistency, and explainability are vital for IT Operations. Equally vital is mining the treasure trove of monitoring data available to practitioners and automating repetitive, error-prone tasks.

    This is why AIOps as a technology segment and transformative approach to IT Operations will “cross the chasm” in 2023. It promises to be a watershed year:

    • Technical AIOps and observability solutions have improved dramatically.
    • There is greater appreciation of the transformative power of AIOps, both technically and organizationally.
    • Expectations for and understanding of AI/ML and advanced algorithms for enterprise-scale AIOps have transitioned from hype to sanity. Consider the advent of ChatGPT and how it has opened multiple doors in natural language and conversational analytics.
    • Pressure to work more efficiently in IT Operations has reached a tipping point (again!)

    The paradigm in IT Operations is shifting. All indicators point to AIOps with powerful AI/ML and automation.

    For IT Operations, the most recent phase of applying AI/ML to large datasets and combining automated analytics and actions began about five years ago. This prompted Gartner to coin the term “AIOps” to encapsulate artificial intelligence/machine learning (and advanced algorithms), data analysis and automation for IT Operations.

    AIOps and Observability from Broadcom with Service Observability will help you streamline IT Operations, achieve business goals, and provide cross-functional visibility like never before.

    New blogs and additional resources on this and related topics can be found on the AIOps blog at Broadcom Software Academy.

    Tag(s): AIOps , ITOps , DX OI , ML , AI

    Adeesh Fulay

    Adeesh is the Head of Engineering for DX Operational Intelligence & Data Platform at Broadcom.
