May 11, 2023

IT Operations in 2023: AI/ML & Automation Will Continue to Be the North Star

The use of statistics, advanced algorithms and AI/ML is becoming omnipresent. The benefits are visible in every walk of life, from web searches, to movie and retail recommendations, to auto-completing our emails. Of course, not many anticipated the dramatic entrance of generative AI in the form of ChatGPT for writing college essays and poetry on arcane topics.

The benefits of these technologies are also all around us in less obvious applications, such as cost optimization in manufacturing, real-time safety and fuel economy adjustments in automobiles, and life-saving solutions in healthcare. While it may take years for AI/ML to mature enough to meet our science fiction-level expectations, the growing investment and progress around us in these areas is undeniable.

To make progress, these models need data…lots of data.

Few parts of society generate, and can capture, more data than IT Operations, so it is not surprising that these techniques are being applied to IT Operations challenges. With strong monitoring tools and ever more things to monitor, the volume of data keeps growing unabated.

Data sources

Varied data sources feed the AIOps machine. To generate meaningful insights, AIOps needs a variety of data from direct or synthetic monitoring of applications, infrastructures, networks, and user experiences.

AIOps also needs to make sense of the data.

Powerful normalization and correlation (using statistical models and knowledge of IT assets and service architectures) are needed to structure the data for clean analysis.
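As a rough illustration of what normalization can look like, the Python sketch below maps records from two monitoring tools onto one common schema so they can be correlated by entity. The tool names (`tool_a`, `tool_b`), their field names, and the `Metric` schema are all assumptions made up for this example, not the format of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """Hypothetical common schema that every source is normalized into."""
    entity: str      # lower-cased host name
    name: str        # canonical metric name
    value: float     # percentage, 0-100
    timestamp: int   # epoch seconds


def normalize(source: str, raw: dict) -> Metric:
    """Map a raw record from a (hypothetical) tool to the common schema."""
    if source == "tool_a":
        # Assumed: tool_a reports CPU as a percentage and time in epoch seconds.
        return Metric(raw["hostname"].lower(), "cpu.utilization",
                      float(raw["cpu_pct"]), int(raw["ts"]))
    if source == "tool_b":
        # Assumed: tool_b reports CPU as a 0-1 fraction and time in milliseconds.
        return Metric(raw["node"].lower(), "cpu.utilization",
                      float(raw["cpu_util"]) * 100.0, int(raw["time_ms"]) // 1000)
    raise ValueError(f"unknown source: {source}")


a = normalize("tool_a", {"hostname": "WEB-01", "cpu_pct": 72.5, "ts": 1683800000})
b = normalize("tool_b", {"node": "web-01", "cpu_util": 0.5, "time_ms": 1683800060000})
```

Once both records share an entity name, metric name, unit, and time base, correlating them becomes a simple join on `entity` and `name`.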

Topologies

Most IT Operators love service maps. In order to build these service maps, AIOps must derive or ingest inventory and relationships from multiple sources. Application topologies can be extracted from cross-transaction traces, while network topologies can be established from device logs and connectivity tests between different devices.
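To make the idea of extracting an application topology from traces concrete, here is a minimal Python sketch. It assumes each trace span carries a `span_id`, a `parent_id`, and a `service` name; those field names and the sample services are hypothetical, and real tracing formats differ in detail.

```python
def edges_from_spans(spans):
    """Return the set of (caller, callee) service pairs implied by parent/child spans."""
    service_of = {s["span_id"]: s["service"] for s in spans}
    edges = set()
    for s in spans:
        parent = s.get("parent_id")
        # Only record an edge when the call crosses a service boundary.
        if parent in service_of and service_of[parent] != s["service"]:
            edges.add((service_of[parent], s["service"]))
    return edges


spans = [
    {"span_id": "1", "parent_id": None, "service": "web"},
    {"span_id": "2", "parent_id": "1", "service": "checkout"},
    {"span_id": "3", "parent_id": "2", "service": "checkout"},  # internal call, no edge
    {"span_id": "4", "parent_id": "3", "service": "payments-db"},
]
topology = edges_from_spans(spans)
```

The resulting pairs are the directed relationships that a service map would draw between application components.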

AIOps needs to normalize and enrich this data with additional attributes from third-party sources and persist it in a uniform graph data model. Two elements typically comprise a graph model:

Vertices/Nodes. Vertices are the entities of the model and can hold any number of attributes or key-value pairs that help describe the entities.
Edges. Also known as dependencies or relationships, edges provide the relevant connections between two vertices. A relationship always has a direction, a start node, and an end node. Although they are directed, relationships can be navigated in either direction.

ML algorithms can easily query such a graph model and use it as a primary dimension for all analyses. For example, for a typical use case like performance problem isolation, the ML algorithm can easily identify the hosts and application components involved in a business transaction and use this to set the data scope for analyzing relevant performance metrics and alarms.
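The scoping step described above can be sketched in a few lines of Python. This is an in-memory toy, not a real graph database: the `Graph` class, the node identifiers, the `kind` attribute, and the relationship names are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class Graph:
    """Toy graph model: nodes with attributes, directed edges with a relationship name."""

    def __init__(self):
        self.nodes = {}                     # node id -> attribute dict
        self.out_edges = defaultdict(list)  # start node -> [(relationship, end node)]

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, start, rel, end):
        self.out_edges[start].append((rel, end))

    def reachable(self, start):
        """Breadth-first walk along edge direction; returns all nodes reached, including start."""
        seen, queue = {start}, deque([start])
        while queue:
            for _, nxt in self.out_edges[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen


g = Graph()
g.add_node("txn:checkout", kind="transaction")
for svc, host in [("cart-svc", "host-1"), ("pay-svc", "host-2"), ("search-svc", "host-3")]:
    g.add_node(svc, kind="component")
    g.add_node(host, kind="host")
    g.add_edge(svc, "RUNS_ON", host)
g.add_edge("txn:checkout", "USES", "cart-svc")
g.add_edge("txn:checkout", "USES", "pay-svc")

# Everything the checkout transaction depends on sets the data scope.
scope = g.reachable("txn:checkout")
hosts = sorted(n for n in scope if g.nodes[n]["kind"] == "host")

# Keep only the metrics that belong to entities in scope.
metrics = [{"entity": "host-1", "cpu": 91}, {"entity": "host-3", "cpu": 97}]
in_scope = [m for m in metrics if m["entity"] in scope]
```

Here the busy `host-3` is ignored because it does not serve the checkout transaction, which is the point of using the topology as the primary dimension for analysis.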

The overabundance of wonderful data will drive a self-reinforcing cycle of progress for AI/ML and advanced algorithms.

More data will result in greater need for AI/ML.
More AI/ML will motivate us to capture, clean and analyze more data.

Rinse and repeat.

Given the limits of humans, we turn to automation.

For AIOps, here are a few areas that beg to be automated:

Auto-discovery of entities added to or removed from the IT environment. What new monitoring data is available to the AIOps machine? What information is no longer available or relevant?
Auto-correlation of data. Which entities are associated with the business service I’m responsible for? (See my blog, “IT Operations in 2023: Business Services Become a Viable Organizing Principle”)
Auto-ticketing. Which teams or individuals should be notified when certain performance thresholds are breached, or better, when certain thresholds may be breached in the future?
Auto-remediation. Which frequently occurring issues in the production environment can be remediated automatically? Doing so significantly reduces unplanned downtime and mean time to repair (MTTR).
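As one hedged example from the list above, auto-ticketing on a threshold that may be breached in the future could be sketched as follows. The sketch fits a simple least-squares line to recent samples and extrapolates it, which is only one of many possible forecasting approaches; the `ROUTES` table, team names, and ticket fields are hypothetical.

```python
def predict_breach(samples, threshold, horizon):
    """Fit a least-squares line to evenly spaced samples (at least two) and
    report whether the line reaches `threshold` within `horizon` future intervals."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / var_x
    intercept = mean_y - slope * mean_x
    forecast = intercept + slope * (n - 1 + horizon)
    return forecast >= threshold


# Hypothetical routing table: which team owns which kind of metric.
ROUTES = {"disk": "storage-team", "cpu": "platform-team"}


def maybe_open_ticket(metric, samples, threshold, horizon):
    """Return a ticket dict for the owning team if a breach is predicted, else None."""
    if predict_breach(samples, threshold, horizon):
        return {"assignee": ROUTES[metric], "metric": metric, "threshold": threshold}
    return None


disk_usage = [70, 72, 74, 76, 78]  # percent used, one sample per hour
ticket = maybe_open_ticket("disk", disk_usage, threshold=90, horizon=8)
no_ticket = maybe_open_ticket("disk", disk_usage, threshold=90, horizon=3)
```

With usage growing two points per hour, a breach of 90% is predicted within eight hours but not within three, so only the first call produces a ticket for the storage team.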

AIOps without automation is a non-starter.

Automation within AIOps will march ahead in 2023, perhaps in smaller incremental steps as IT practitioners test, validate and gradually trust the automation, while relinquishing some level of control in favor of greater productivity in other aspects of their job.

Practitioners will welcome and adopt automation when they have substantial oversight and control of it.

Despite the emergence of AIOps, the IT Operations community as a whole remains cautious. Early adopters who sought first-mover advantage or who had greater tolerance for risk have achieved measurable success, tuning expectations, solution requirements, and adoption plans as they learned on the fly.

Other buyers, of the early- or mid-majority type, approached AIOps with more limited expectations and narrowly scoped adoption plans. By constraining AIOps adoption to a single business application team or geography, they could limit risk and isolate other parts of their organization from the chaotic learning associated with emergent, transformative technologies.

Predictability, consistency, and explainability are vital for IT Operations. Equally vital are mining the treasure trove of monitoring data available to them and automating repetitive, error-prone tasks.

This is why AIOps as a technology segment and transformative approach to IT Operations will “cross the chasm” in 2023. It promises to be a watershed year:

Technical AIOps and observability solutions have improved dramatically.
There is greater appreciation of the transformative power of AIOps, both technically and organizationally.
Expectations for and understanding of AI/ML and advanced algorithms for enterprise-scale AIOps have transitioned from hype to sanity. Consider the advent of ChatGPT and how it has opened multiple doors in natural language and conversational analytics.
Pressure to work more efficiently in IT Operations has reached a tipping point (again!).

The paradigm in IT Operations is shifting. All indicators are pointing to AIOps with powerful AI/ML and automation.

For IT Operations, the most recent phase of applying AI/ML to large datasets and combining automated analytics and actions began about five years ago. This prompted Gartner to coin the term “AIOps” to encapsulate artificial intelligence/machine learning (and advanced algorithms), data analysis and automation for IT Operations.

AIOps and Observability from Broadcom with Service Observability will help you streamline IT Operations, achieve business goals, and provide cross-functional visibility like never before.

New blogs and additional resources on this and related topics can be found on the AIOps blog at Broadcom Software Academy.

Tag(s): AIOps, ITOps, DX OI, ML, AI

Adeesh Fulay

Adeesh is the Head of Engineering for DX Operational Intelligence & Data Platform at Broadcom.
