Most organizations that existed before the cloud era opt for a dual environment of public and private cloud, also known as a hybrid cloud setup. Hybrid cloud gives you the best of both worlds: the power and cost-effectiveness of the cloud combined with the control of a data center. Yet monitoring a hybrid cloud can be challenging.
Nine in ten enterprises currently use multiple cloud vendors, and eight in ten share data between public cloud and on-premises applications, according to a survey of cloud professionals by Dimensional Research. TechBeacon provides a good summary of the survey.
1) Different Metrics and Tools to Track
The growing diversity of environments across private and public clouds makes monitoring more complex. Performance metrics differ from one environment to the next. One environment may report metrics every second, while another reports at one-minute intervals. Even when two tools track the same metric, the names and labels differ and must be correlated before the data is useful.
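As a rough illustration of that correlation step, the sketch below renames metrics to one canonical name and averages them into one-minute buckets so that per-second and per-minute sources line up. The source labels, the name mapping, and the `normalize` function are all assumptions made for this example, not part of any particular monitoring tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from tool-specific metric names to one canonical name.
CANONICAL_NAMES = {
    "CPUUtilization": "cpu_utilization_percent",    # cloud-style name (illustrative)
    "node_cpu_percent": "cpu_utilization_percent",  # on-prem-style name (made up)
}


def normalize(samples, interval_seconds=60):
    """Rename metrics and average them into fixed-width time buckets.

    samples: iterable of (source, metric_name, unix_timestamp, value) tuples.
    Returns {(source, canonical_name, bucket_start): mean_value}.
    """
    buckets = defaultdict(list)
    for source, name, timestamp, value in samples:
        # Fall back to the original name when no mapping is known.
        canonical = CANONICAL_NAMES.get(name, name)
        # Round the timestamp down to the start of its bucket.
        bucket_start = int(timestamp) - int(timestamp) % interval_seconds
        buckets[(source, canonical, bucket_start)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


samples = [
    ("onprem", "node_cpu_percent", 120, 10.0),  # fine-grained samples
    ("onprem", "node_cpu_percent", 150, 30.0),
    ("cloud", "CPUUtilization", 120, 50.0),     # one sample per minute
]
unified = normalize(samples)
```

After this step, both environments expose the same metric name at the same resolution, which is the precondition for comparing them side by side.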
The tooling is also different for each platform. While organizations have their legacy monitoring tools like Nagios, they now also have cloud vendor monitoring tools like AWS CloudWatch and open source monitoring tools like Prometheus. There is some overlap between the metrics of each of these tools, while some metrics are unique to each tool.
The challenge is to unify all these metrics and attain a unified view of the hybrid system, end-to-end. This “single pane of glass” view is the holy grail of hybrid cloud monitoring. None of the purpose-built monitoring tools can deliver an end-to-end view. That requires a separate tool that can integrate all metrics from all tools and make them available in a way that is meaningful and usable.
2) Integrating the Entire Stack
The private and public clouds need to be integrated at all levels - infrastructure, data, networking, and application. At the infrastructure layer, instances need to be spun up and destroyed between the private and public cloud environments as workloads are shifted between the two.
At the data layer, storage and transfer of data need to be seamless across environments. Additionally, some requests may need to process data that lives in more than one environment.
At the networking layer, things like load balancing and service discovery should cover all environments. Also, during these times of remote work, VPN access has taken center stage in most IT organizations. Finally, applications should be integrated via API, and these APIs should be compatible across the board.
With so many moving parts at every layer of the stack, it's easy to see why things can go wrong with hybrid cloud. SLAs aren't uniform because there are multiple vendors to manage, which shifts more responsibility in-house. When failures happen, they disrupt the end-user experience.
3) Security
As the stack expands, so does the attack surface. With additional components and services to secure, security monitoring is of key importance.
For a data center, security practices start with securing the physical facility and hardware. Then there are network and device security measures like firewalls and anti-virus software. At the application level, user access needs to be configured via SSO or LDAP. Finally, data needs to be protected against loss and backed up for disaster recovery.
Some of these practices, like the security of physical premises, are rendered moot in a cloud platform. But some, like data backup, need to be continued even in the cloud.
The cloud operates on a shared responsibility model in which the cloud vendor handles the security of the platform, while the organization remains responsible for security 'in' the cloud platform. Cloud security involves a completely different approach to IAM and new tools for data encryption and key management.
This makes compliance and governance all the more challenging, as they need to span both private and public clouds. Finally, add the threat monitoring needed to catch phishing, DDoS attacks, and downloads of vulnerable container images, and you have a security nightmare.
4) Cost Control
Additional resources drive up the TCO (total cost of ownership) quickly. If unused resources were a drawback on-prem, that problem is even worse in the cloud. The cloud is cheap at the start, but as traffic volume grows and the number of cloud services in use increases, it's easy to end up with sticker shock.
Monitoring hybrid cloud is essential to prevent this. It requires keeping track of resource utilization at the infrastructure level. Monitoring done right should yield opportunities to reduce costs with hybrid cloud without compromising on performance. Additionally, it requires alerting whenever usage crosses a threshold.
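A minimal sketch of such a check might look like the following. The resource names, the 80% and 20% thresholds, and the `review_utilization` helper are assumptions chosen for illustration; in practice the thresholds would depend on the workload, and the utilization figures would come from the monitoring tools already in place.

```python
def review_utilization(utilization, high=0.8, low=0.2):
    """Split resources into those needing an alert and those that look idle.

    utilization: {resource_name: fraction of capacity in use, 0.0 to 1.0}.
    high: alert when usage is at or above this fraction (assumed value).
    low: flag as a cost-saving candidate at or below this fraction (assumed value).
    Returns (alerts, idle) as two sorted lists of resource names.
    """
    alerts, idle = [], []
    for resource, value in sorted(utilization.items()):
        if value >= high:
            alerts.append(resource)  # usage crossed the alert threshold
        elif value <= low:
            idle.append(resource)    # likely over-provisioned or unused
    return alerts, idle


# Hypothetical readings from both sides of a hybrid setup.
readings = {"onprem-db-1": 0.95, "cloud-vm-7": 0.05, "cloud-vm-8": 0.50}
alerts, idle = review_utilization(readings)
```

The same pass surfaces both sides of the cost problem: resources close to capacity that need attention, and resources that are paid for but barely used.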
Shift to AIOps
To counter these challenges, organizations need a completely different monitoring practice, one that leverages machine learning and artificial intelligence to augment humans and monitoring tools. AIOps (Artificial Intelligence for IT Operations) is designed to address this challenge. AIOps combines monitoring for all the purposes listed above and provides a “single pane of glass” view of hybrid cloud.
CIOs looking to make the transition to a modern and agile cloud system should leverage the power of AIOps, which can help them meet the demands of monitoring hybrid cloud. AIOps will also help make this transition seamless as it builds confidence when running and managing a newly set up hybrid cloud.
Read my other post, Best Practices for Monitoring a Hybrid Cloud Environment, for more about using AIOps to monitor hybrid cloud.
For additional resources on AIOps, visit Enterprise Software Academy’s AIOps page.
Twain Taylor
Twain Taylor is a technology analyst and contributing writer at Fixate IO.