A constant challenge in business is aligning stakeholders, customers, and employees behind a single mission. Carefully crafted plans often fail spectacularly as a result of complexity. Some of these efforts fail slowly; some even before execution begins.
Especially when collaboration is important, simplicity can make challenging work innately more understandable, measurable, and engaging. When work cuts across multiple functional groups who may have different priorities, measures of success, or perceptions of risk, a simple framework can help everyone understand how they contribute to the larger mission and why the mission matters.
Simplicity makes it easier to communicate, enlist collaborators, measure progress, and implement plans. As a result, teams can clearly understand how to fulfill a larger mission.
This principle applies to IT operations. Delivering reliable business services is both incredibly complex and massively important to modern businesses, but the overarching strategy for doing so doesn't have to be complex.
In this article, I outline a simple framework teams can use to assess and prioritize digital services, a critical first step for establishing, communicating, and committing to a simple mission. For related reading, see the blog post by Adeesh Fulay, Head of Engineering for DX Operational Intelligence: “Business Services Become a Viable Organizing Principle.”
This three-level framework is not unique: IT operations teams use this and similar approaches, under various names, to assess and prioritize work, allocate resources, and so on. Here, the focus is on how it can be applied to digital services.
This framework consists of the following three steps:
To illustrate, let’s consider digital business services you might find at a telecommunications company:
Most of the services in this example are self-explanatory and have direct counterparts in other industries. While each of these is intuitively important, it helps to evaluate them using the three-level framework.
Business Service | Impact | Quantified Impact |
Online Store: Shopping Cart Service | | |
Call Center: Incoming Call Queue Service | | |
Customer Relationship Management: New/Update Record Service | | |
HR: New Employee Onboarding Service | | |
Enterprise Reporting: Dashboarding Service | | |
IT Support: Ticketing Service | | |
Corporate Website: Analytics and Reporting Service | | |
Next, for each service, consider the question: “What is the objective and quantifiable business impact of the service becoming unavailable?”
To help guide ourselves through the conversations with our stakeholders, we start by dividing these services into three buckets:
At this point, it’s worth noting that not all organizations are driven by revenue or productivity. In these cases, you would substitute “Cannot fulfill the primary mission of the organization,” and “Degraded ability of staff to support the mission” respectively.
Once business impact is determined and quantified, we can classify our services using these simple rules:
Here’s the completed table:
Business Service | Impact | Quantified Impact |
Business Critical | | |
Online Store: Shopping Cart Service | Lost revenue | $50,000 per hour |
Call Center: Incoming Call Queue Service | Lost revenue | $30,000 per hour |
Productive | | |
Customer Relationship Management: New/Update Record Service | Lost productivity | 500 internal users blocked |
HR: New Employee Onboarding Service | Lost productivity | 100 internal users blocked |
Best Effort | | |
Enterprise Reporting: Dashboarding Service | Best effort | 20 internal users inconvenienced |
IT Support: Ticketing Service | | |
Corporate Website: Analytics and Reporting Service | None | Unknown |
This table forms the basis for communicating our strategy to the wider organization.
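For teams that keep their service inventory in code, the classification shown in the completed table can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the mapping from impact type to tier is an assumption inferred from how the table groups its rows, and the `Service`, `classify`, and `group_by_tier` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: tiers follow the grouping in the completed table.
# Lost revenue -> Business Critical, lost productivity -> Productive,
# anything else (including unknown or missing impact) -> Best Effort.
TIER_BY_IMPACT: Dict[str, str] = {
    "Lost revenue": "Business Critical",
    "Lost productivity": "Productive",
}
DEFAULT_TIER = "Best Effort"


@dataclass
class Service:
    name: str
    impact: Optional[str] = None             # e.g. "Lost revenue"
    quantified_impact: Optional[str] = None  # e.g. "$50,000 per hour"


def classify(service: Service) -> str:
    """Return the tier for a service based on its impact type."""
    return TIER_BY_IMPACT.get(service.impact or "", DEFAULT_TIER)


# Rows copied from the completed table; empty cells are left as None.
SERVICES: List[Service] = [
    Service("Online Store: Shopping Cart Service", "Lost revenue", "$50,000 per hour"),
    Service("Call Center: Incoming Call Queue Service", "Lost revenue", "$30,000 per hour"),
    Service("Customer Relationship Management: New/Update Record Service",
            "Lost productivity", "500 internal users blocked"),
    Service("HR: New Employee Onboarding Service",
            "Lost productivity", "100 internal users blocked"),
    Service("Enterprise Reporting: Dashboarding Service",
            "Best effort", "20 internal users inconvenienced"),
    Service("IT Support: Ticketing Service"),
    Service("Corporate Website: Analytics and Reporting Service", "None", "Unknown"),
]


def group_by_tier(services: List[Service]) -> Dict[str, List[str]]:
    """Group service names under their tier, highest priority first."""
    groups: Dict[str, List[str]] = {
        "Business Critical": [],
        "Productive": [],
        "Best Effort": [],
    }
    for service in services:
        groups[classify(service)].append(service.name)
    return groups


if __name__ == "__main__":
    for tier, names in group_by_tier(SERVICES).items():
        print(tier)
        for name in names:
            print(f"  - {name}")
```

Keeping the rules in a single lookup table means that a mission-driven organization could swap in its own impact labels without touching the rest of the code.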
For digital services classified as “critical,” the service level objective (SLO) should be ambitious. Gapless, state-of-the-art monitoring should be prioritized. Follow-the-sun L1 support should be available for users. Engineers should be on call 24/7 in case of failures. Continual improvement processes should be in place to ensure these services stay at a high level of reliability. There should be an uncompromising emphasis on the quality of the user experience.
For digital services classified as “productive,” the SLO should be reasonable. A good standard of availability and infrastructure monitoring should be implemented. Support should be available during business hours. A more reactive stance may be employed in case of failures, so long as the SLO is maintained.
For digital services classified as “best effort,” the SLO can be significantly more lenient. A basic standard of availability monitoring is sufficient. Ideally, these services should be outsourced to a third party. If they must be kept in-house, set the expectation that resources will go first to “critical” and “productive” services, and that users may occasionally need to “make do” during exceptional failures and resource constraints. If failure rates rise to the point that they affect revenue, productivity, or other key metrics, a more reliable alternative should be found for these services.
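One way to make these per-tier expectations easy to share across teams is to record them as a small lookup table. The sketch below is only illustrative: the `TierPolicy` fields and the `policy_for` helper are hypothetical names, and the values are qualitative summaries of the three paragraphs above, not numeric SLO targets. A value of `None` means the article does not specify one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TierPolicy:
    slo: str
    monitoring: str
    support: Optional[str]   # None: not specified for this tier
    failure_response: str


# Qualitative summaries of the tier descriptions above.
POLICIES: Dict[str, TierPolicy] = {
    "critical": TierPolicy(
        slo="ambitious",
        monitoring="gapless, state-of-the-art",
        support="follow-the-sun L1",
        failure_response="engineers on call 24/7",
    ),
    "productive": TierPolicy(
        slo="reasonable",
        monitoring="availability and infrastructure",
        support="business hours",
        failure_response="reactive, so long as the SLO is maintained",
    ),
    "best effort": TierPolicy(
        slo="lenient",
        monitoring="basic availability",
        support=None,
        failure_response="users may occasionally need to make do",
    ),
}

# Assumption: the table label "Business Critical" and the prose label
# "critical" refer to the same tier.
ALIASES: Dict[str, str] = {"business critical": "critical"}


def policy_for(tier: str) -> TierPolicy:
    """Look up the policy for a tier name, ignoring case and whitespace."""
    key = tier.strip().lower()
    key = ALIASES.get(key, key)
    return POLICIES[key]  # raises KeyError for an unknown tier


if __name__ == "__main__":
    for name in ("Business Critical", "Productive", "Best Effort"):
        policy = policy_for(name)
        print(f"{name}: SLO={policy.slo}, monitoring={policy.monitoring}")
```

Raising an error for an unknown tier is a deliberate choice here: a service that has not been classified should prompt a conversation with stakeholders instead of silently receiving a default policy.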
So that’s it! In a nutshell, this is how the three-level IT operations framework is applied to digital business services. With this simple framework, you can assess and prioritize your digital business services and clearly communicate a simple mission to multiple teams across IT. I hope this inspires you to think about the strengths and weaknesses of your current strategy for delivering IT services to your customers, employees, and partners.