AIOps Summit Recap: Why Observability is Key, and How to Get There

“It’s not rocket science.”

We have all heard that line, and quite often it applies: it’s easy to overthink or unnecessarily overcomplicate matters. Don’t tell that to someone who’s responsible for network performance and continuity today, however. As today’s networks continue to get more dynamic, interconnected, and complex, managing them starts to feel very much like rocket science.

We recently held our AIOps and Observability Virtual Summit, titled “Mission Control: The Journey from Data to Insights.” The event, now available on demand, offered compelling discussions on AIOps and observability, and it revealed some striking parallels between network operations and rocket science.

The event featured keynote discussions from Nick Lippis, Co-Founder and Co-Chair of ONUG; Andre Kindness, Principal Analyst at Forrester; and Serge Lucio, Vice President and General Manager of the Enterprise Software Division at Broadcom. The summit also featured leaders from Broadridge, Fujitsu, and Telefonica O2.

In the following sections, I’ll outline some of the top insights I’ve taken from the event, highlighting some of the key parallels between rocket science and network operations today.

The Stakes Are High

It doesn’t get any more mission critical than a rocket launch. Lives are at stake, as are massive investments, and failures will inevitably be very high profile.

There are undeniable parallels between these realities and network infrastructure, where failures can likewise be devastating.

In his keynote, Nick Lippis spoke about how the network connects everything. He shared how, of the planet’s population of 7 billion people, more than 5 billion are network connected. He went on to point out that even that number is dwarfed by the 35 billion devices that rely on network connectivity.

For the global enterprise, it’s not an overstatement to say everything is network connected and network reliant. The network is relied upon for everything from how services are delivered to customers, to how remote users get work done, to how cloud services are accessed, and the list goes on. Outages, and even lagging performance that diminishes users’ digital experience, can have significant impacts on an enterprise’s finances, customer satisfaction, and brand.

Complexity is Growing Astronomically

Within the realm of aeronautics, the intensity of demands placed on components and systems is enormous. Further, the systems and processes needed to build, monitor, and control spacecraft are extremely complex.

Similarly, the reality is that networks have also become exceedingly complex. Plus, with growing reliance on network as a service, zero-trust models, multi-cloud approaches, SD-WAN, and more, the complexity network teams contend with keeps expanding rapidly.

In his keynote, Lippis outlined how usage of cloud services grows linearly, but the data generated by multi-cloud operations expands exponentially. Lippis explained how this introduces the concept of “complexity inflation.” It’s now not uncommon for multi-cloud environments to generate a couple million logs a second.
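To put that rate in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the figure of 2 million logs per second is my reading of “a couple million,” and the 200-byte average log size is purely an illustrative assumption, not a number from the keynote.

```python
# Back-of-the-envelope estimate of daily log volume.
# Both inputs are illustrative assumptions, not figures from the summit.
LOGS_PER_SECOND = 2_000_000   # assumed reading of "a couple million logs a second"
AVG_LOG_BYTES = 200           # hypothetical average size of one log record

SECONDS_PER_DAY = 24 * 60 * 60

logs_per_day = LOGS_PER_SECOND * SECONDS_PER_DAY
bytes_per_day = logs_per_day * AVG_LOG_BYTES
terabytes_per_day = bytes_per_day / 1e12  # decimal terabytes

print(f"{logs_per_day:,} logs per day")
print(f"{terabytes_per_day:.2f} TB per day")
```

Under those assumptions, that works out to roughly 173 billion log records, or about 35 TB, every day, which is well beyond what any team can review by hand.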

Little Issues Can Have a Big Impact

In his session, Andre Kindness spoke about his background in aerospace. He worked on pressure systems used in rockets, so when it comes to rocket science, he knows what he’s talking about.

Kindness spoke about how critical it was to observe details from across the rocket’s ecosystem. For example, he outlined how pipes that transmitted liquid oxygen had to be monitored for flow, temperature, and pressure, and how even slight variations in these metrics could potentially have a major impact.

Similarly, in today’s networks, even seemingly small glitches and errors can have a major impact if not handled effectively. This is why, as in aerospace, network environments need to be tracked and governed in a comprehensive fashion. This is driving the acute need for observability. Kindness outlined how observability is now a key enabler of nearly every major initiative within the enterprise, supporting artificial intelligence, network automation, service chaining, zero trust, and much more.

Observability is Evolving

In the rocket control tower, having timely intelligence is essential. To avoid catastrophe, operators must be able to anticipate and adapt to changing conditions and mitigate potential problems, or face disastrous consequences.

Within the world of networks, observability has become foundational for the same reasons. However, it is important to recognize that the concept of observability has evolved. It used to simply be about understanding the state of individual elements. Now it’s about taking a broader view, seeing how everything is interconnected, and ultimately managing all the components responsible for delivering a digital service. Teams then need to move from a basic understanding of metrics to establishing a composite view and defining KPIs for their digital services that are meaningfully aligned with business outcomes.
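As a simple illustration of what a composite view can look like, the Python sketch below rolls per-component metrics up into service-level KPIs. Everything in it is hypothetical: the component names, the numbers, the latency budget, and the assumption that the service depends on its components in series (so availabilities multiply and latencies add). It is not how any particular product computes these values.

```python
from dataclasses import dataclass


@dataclass
class ComponentHealth:
    """Metrics for one element in the delivery chain (hypothetical fields)."""
    name: str
    availability: float  # fraction of time the component is up, 0.0 to 1.0
    latency_ms: float    # typical latency the component adds


def service_kpi(components, latency_budget_ms):
    """Roll component metrics up into service-level KPIs.

    Assumes a serial dependency chain: the service is up only when every
    component is up, and end-to-end latency is the sum of the parts.
    """
    availability = 1.0
    total_latency = 0.0
    for component in components:
        availability *= component.availability
        total_latency += component.latency_ms
    return {
        "availability": availability,
        "latency_ms": total_latency,
        "within_budget": total_latency <= latency_budget_ms,
    }


# Illustrative chain for a single digital service (made-up values).
chain = [
    ComponentHealth("sd-wan-edge", 0.999, 12.0),
    ComponentHealth("cloud-gateway", 0.998, 20.0),
    ComponentHealth("app-backend", 0.995, 45.0),
]

kpi = service_kpi(chain, latency_budget_ms=100.0)
print(f"Service availability: {kpi['availability']:.4f}")
print(f"End-to-end latency: {kpi['latency_ms']:.1f} ms")
print(f"Within latency budget: {kpi['within_budget']}")
```

Each component here looks healthy on its own, yet the composite availability comes out lower than that of any single element. That gap is the reason a service-level view tells you more than a set of element-level dashboards.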

Customers Highlight the Power of Observability

The summit also featured discussions with several Broadcom customers, including executives from Broadridge, Fujitsu, and Telefonica O2. These interviews underscored how advancements in observability are fueling real business gains. Following are a few of my highlights from these engaging interviews:

Broadridge. For this leading fintech provider, it was essential to establish an innovative infrastructure based on the latest technologies. This infrastructure has been instrumental in powering their differentiated digital services and fueling their growth. For the operations team, it was critical to establish end-to-end observability of their critical applications, so they could continue to deliver resilient, high-quality digital services to customers.
Fujitsu. The team at Fujitsu is using AIOps to support their Site Reliability Engineering approaches. In the process, the team has been able to establish up-to-date availability insights that fuel more informed planning. As a result, they can better support the needs of the business and more quickly adapt as those needs change.
Telefonica O2. Telefonica O2 has established a single point of visibility and reporting across their highly heterogeneous environment, which is composed of dynamic SDN/SD-WAN infrastructure and diverse technologies from a range of vendors. With these unified capabilities, the team is better equipped to deliver a consistent customer experience.

Conclusion

In this post, I’ve tried to recap some of the highlights from the AIOps and Observability Summit, but you owe it to yourself to see the event and all the great material provided. To watch, simply visit the AIOps and Observability Summit’s on-demand page any time.