August 10, 2021

Automated Remediation #4: Key Steps and Guiding Principles

by: Paul Weschler

In the following sections, we outline some key steps and principles that can promote the success of an automated remediation initiative.

Start with Business Outcomes

For any significant effort, it’s vital to define the anticipated business outcome. It’s essential that this outcome is clearly understood by all stakeholders and participants, and that it is effectively aligned with top-level business objectives.

By taking this approach, teams can benefit from a number of advantages. By collectively working towards a common business outcome, teams establish a foundation that enables fast, smart, and aligned decision making. At the same time, this approach still grants individuals the autonomy to creatively devise the best tactics and approaches. Further, common alignment around business outcomes serves as a bridge that connects different teams and enables constructive collaboration.

Teams embarking on an automated remediation initiative in particular should employ these strategies. As outlined earlier in this series, boosting efficiency in the operation of IT infrastructure represents a key imperative for virtually any business. This enhanced operational efficiency can enable a range of improved business outcomes, such as controlling costs, speeding delivery of innovative services, and improving service levels. It is vital to map your automated remediation initiative to specific business outcomes and ensure those outcomes are aligned with top-level business objectives.

Decide Where to Focus

With AIOps and Automic Automation, there are virtually limitless possibilities. With so many opportunities, however, it can be difficult to know where to begin, so it is important to pick a useful target. For example, you can start with your top five incidents and use those to guide initial observation and analysis efforts. It is also important to prioritize based on risk. From this starting point, teams can then build up a set of automations that continue to increase efficiency.
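One way to make the "top five incidents, prioritized by risk" idea concrete is sketched below. This is only an illustration in plain Python, not a feature of AIOps or Automic Automation: the incident type names, the risk weights, and the scoring rule (occurrence count multiplied by risk weight) are all assumptions you would replace with your own data and policy.

```python
from collections import Counter


def rank_incident_types(incidents, risk_weights, top_n=5):
    """Rank incident types by (occurrence count * risk weight).

    incidents    -- iterable of incident type names (hypothetical data)
    risk_weights -- dict mapping incident type to a numeric weight;
                    types without an entry default to a weight of 1
    top_n        -- how many incident types to return
    """
    counts = Counter(incidents)
    scored = {
        incident_type: count * risk_weights.get(incident_type, 1)
        for incident_type, count in counts.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical incident history and risk weights, for illustration only.
history = ["disk_full"] * 4 + ["service_down"] * 2 + ["cert_expiry"]
weights = {"disk_full": 1, "service_down": 3, "cert_expiry": 2}

for incident_type, score in rank_incident_types(history, weights):
    print(incident_type, score)
```

In this sketch a less frequent but riskier incident type can outrank a frequent, low-risk one, which is the point of weighting by risk rather than counting alone.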

Establish an Automation Team

Establishing an automation team early on can be very effective. These teams can be virtual, sharing responsibilities with existing roles and organizations, or they can consist of dedicated staff. However, it is important that the team’s domain of responsibility spans the enterprise’s entire technical infrastructure. This is a critical foundation that can be invaluable in ensuring that, as much as possible, common tools are leveraged across teams.

Define a Pragmatic Scope

Early in the process, as teams perform discovery and planning, it is important to define a pragmatic scope, establishing the breadth of the technical implementation. Depending on the target business outcome, the scope may vary. However, it is important to define an achievable technical scope by identifying a target timeframe and assessing constraints. Constraints can include staffing restrictions, unavailability of specific technologies, and control limitations.

Generally, it is better to define multiple, smaller-scope efforts that are each designed to have an impact on the target outcome. Ideally, these efforts should be designed to set the stage for future implementation iterations or sprints. In order to avoid getting stuck in never-ending analysis phases, we suggest creating time limitations for each phase.

Don’t Wait for Perfect; Launch and Iterate

While we’d all like to get it right every time, it is important to recognize that looking for perfection can stifle fast progress towards your objectives. The reality is that both AIOps and Automic support flexible change, so your first implementation doesn’t need to be perfect.

Have your teams iterate in short cycles to keep momentum, and continue to validate efforts along the way to ensure they are adding value. In this effort, teams can be well served by leveraging the Scaled Agile Framework (SAFe). SAFe is a formal framework that offers proven, integrated principles and practices. By leveraging SAFe, teams can more effectively manage projects with multiple work streams. SAFe is structured for time-bound iterations that align with defined scopes. As a result, with SAFe, each iteration can have a meaningful impact on the target outcome. Figure 1 highlights the main elements of the SAFe approach.


Figure 1. An overview of the main elements within the SAFe framework.

Build Up an Automation Library

Automations can be both very common and very specific to an enterprise. Tasks such as restarting a server, adjusting memory allocations, or reading a log file are standard tasks that apply readily across different teams and organizations. Today, there are many prepackaged capabilities and solutions for automating these standard tasks.

However, in order to be applied to a specific implementation, these tasks must all be customized with enterprise-specific information. With Automic Automation, your teams can construct, and expand upon, a library of enterprise-specific building blocks. You can start simple, and then add sophistication and refinements as you go. The solution enables you to take a modular building-block approach, so you can maximize reusability and adaptability. In this way, you can get started quickly, while adapting to enterprise-specific processes and policies.
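The building-block idea can be sketched in a few lines of plain Python. This is not the Automic Automation API or its object model; the registry, the block names, and the site configuration below are hypothetical and exist only to show how generic tasks can be kept separate from the enterprise-specific values they are run with.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory library of reusable building blocks.
LIBRARY: Dict[str, Callable[..., str]] = {}


def building_block(name: str):
    """Decorator that registers a function in the library under a name."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        LIBRARY[name] = func
        return func
    return register


@building_block("restart_server")
def restart_server(host: str) -> str:
    # Placeholder: a real block would call your orchestration tooling.
    return f"restart requested for {host}"


@building_block("read_log")
def read_log(host: str, path: str) -> str:
    # Placeholder: a real block would fetch and return log content.
    return f"read {path} on {host}"


def run_workflow(steps: List[Tuple[str, List[str]]],
                 site_config: Dict[str, str]) -> List[str]:
    """Run generic blocks in order, filling in enterprise-specific values."""
    results = []
    for block_name, param_names in steps:
        params = {name: site_config[name] for name in param_names}
        results.append(LIBRARY[block_name](**params))
    return results


# Enterprise-specific values live in one place (hypothetical examples).
site_config = {"host": "app01", "path": "/var/log/app.log"}
workflow = [("read_log", ["host", "path"]), ("restart_server", ["host"])]
print(run_workflow(workflow, site_config))
```

Keeping the generic blocks and the site-specific configuration apart is what makes the blocks reusable: a new workflow is a new list of steps, not new code.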

Further, with the Automic Marketplace, you don’t have to start from scratch. The marketplace contains hundreds of downloadable action packs, solutions, and templates. Compared to open-source code repositories, the Automic Automation Marketplace is curated, helping ensure you only get proven assets from trusted developers. By using assets from the marketplace, you can build almost any type of customized automation. If you develop an automation pack, you can even share it with other Automic Automation users on the marketplace.

Define Business Services

Ultimately, business services are what really matter. The fact that a set of infrastructure elements is performing well doesn’t matter if the business services that customers or users rely on are down or running in a degraded fashion. Given this, it is important to map automated remediation initiatives to business services.

By defining and associating alarms and automation with service definitions, teams can significantly enhance the value realized from their automation efforts. This enables teams to evolve from a low-level infrastructure focus to building the capabilities that have a meaningful impact on what really matters: the quality of the services the business delivers.

Through the AIOps console, teams can interactively associate elements with business services, without having to do extensive or inflexible configuration. Teams can continue to expand, restructure, and refine these service definitions to ensure they stay optimally aligned with key business outcomes.
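Conceptually, a service definition is a mapping from a business service to the elements it depends on, which lets an alarm on one element be read in terms of the services it puts at risk. The sketch below illustrates that lookup in plain Python. It does not reflect how the AIOps console stores or evaluates service definitions, and the service and host names are made up.

```python
from typing import Dict, List, Set

# Hypothetical service definitions: business service -> infrastructure elements.
SERVICE_MAP: Dict[str, Set[str]] = {
    "online_checkout": {"web01", "db01"},
    "reporting": {"db01", "etl01"},
}


def services_affected(alarm_element: str,
                      service_map: Dict[str, Set[str]]) -> List[str]:
    """Return the business services that depend on the alarming element."""
    return sorted(
        service
        for service, elements in service_map.items()
        if alarm_element in elements
    )


# An alarm on a shared database touches both services in this example.
print(services_affected("db01", SERVICE_MAP))
```

With a lookup like this, remediation for an element shared by several services can be prioritized ahead of one that affects a single, less critical service.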

To learn more, see the “services definition” page on the Tech Docs portal.

In the next and final Skill Builder, we’ll provide some additional details you need as you embark on your automated remediation implementation.