    April 28, 2021

    Understanding the Performance of Your AWS Services: How AIOps Can Help

    The most important part of providing software solutions for your customers is ensuring that you’re meeting their needs. To do this, you must consider two key things: first, your software must perform the tasks that your consumers expect, and second, it must perform those tasks quickly, efficiently, and accurately. This article examines the second requirement, focusing specifically on performance.

    In particular, we will examine services that are deployed into the Amazon Web Services (AWS) public cloud. We’ll explain what constitutes a high-performing service and why it’s essential to monitor performance. Then we will explore performance monitoring tools that are available in the AWS environment and compare them to third-party monitoring solutions.

    This article aims to help you understand the options that are available to you for monitoring and managing your services’ performance. We also hope to help you get set up with the tools that will enable you to provide your consumers with the best possible user experience.

    Factors that Affect Performance

    When you devise a strategy for analyzing performance, you can divide what you monitor into two categories: underlying infrastructure and user experience. If the service is deployed on a virtual machine using Amazon Elastic Compute Cloud (EC2), you should be aware of the following metrics:

    • Memory utilization: available vs. used memory
    • CPU utilization
    • Network utilization: incoming and outgoing data rates
    • Disk utilization: input/output metrics and available vs. used storage

    Each of these metrics will give you a window into whether or not you are using the appropriate infrastructure as well as how it is performing. You can also gain insights and implement changes if the instance is consistently operating at the upper or lower limits of its capacity. You might want to move services that operate at the upper limit to a type of instance that has more resources. In contrast, those that operate at the lower limit could be moved to a smaller instance type, which will reduce your operating costs.
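
    The right-sizing reasoning above can be sketched in a few lines of Python. This is only an illustration: the 80% and 20% limits, the 90% "consistently" share, and the function name are assumptions chosen for the example, not AWS recommendations.

```python
# Hypothetical limits for illustration; tune them for your own workloads.
UPPER_LIMIT = 80.0   # percent utilization treated as "upper limit"
LOWER_LIMIT = 20.0   # percent utilization treated as "lower limit"
CONSISTENT_SHARE = 0.9  # share of samples that must sit at a limit


def sizing_recommendation(samples, upper=UPPER_LIMIT, lower=LOWER_LIMIT,
                          share=CONSISTENT_SHARE):
    """Suggest a sizing action from a series of utilization percentages."""
    if not samples:
        return "no data"
    high = sum(1 for s in samples if s >= upper) / len(samples)
    low = sum(1 for s in samples if s <= lower) / len(samples)
    if high >= share:
        return "consider a larger instance type"
    if low >= share:
        return "consider a smaller instance type"
    return "keep the current instance type"


cpu_samples = [91.0, 88.5, 94.2, 90.1]
print(sizing_recommendation(cpu_samples))
```

    In practice you would feed this kind of check with several days or weeks of samples, so that a single busy hour does not drive a resizing decision.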

    If your application is containerized and deployed using Amazon Elastic Container Service (ECS) or Amazon Elastic Kubernetes Service (EKS), then you will have fewer infrastructure metrics to monitor. You should still track CPU and memory utilization within your nodes and clusters and adjust your configurations accordingly.

    The second part of your monitoring strategy should be monitoring user experience. If your users cannot reach your service, or if a load balancer is throttling their requests, you’ll lose customers even if your infrastructure metrics look fantastic. To monitor consumer experience, you should be aware of the following:

    • Request and response times
    • Number of requests over time
    • Error rates broken down by error type

    Each of these metrics will help you understand the volume of calls that your service is handling, how long they are taking, and whether user requests are successful. It’s essential to establish a baseline measurement for each one. Once you have a baseline, you should observe how each metric changes over time and then respond to deviations from that baseline.
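
    One simple way to express "establish a baseline, then respond to deviations" is to summarize historical values by their mean and standard deviation and flag anything that falls too far outside that range. The sketch below assumes this approach; the function names, the sample response times, and the tolerance of three standard deviations are illustrative choices, and real systems often use more robust methods.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize historical values as (mean, standard deviation).

    Needs at least two values, because a sample standard deviation
    is undefined for a single point.
    """
    return mean(history), stdev(history)


def deviates(value, baseline, tolerance=3.0):
    """Return True when value is more than `tolerance` standard
    deviations away from the baseline mean."""
    center, spread = baseline
    if spread == 0:
        return value != center
    return abs(value - center) > tolerance * spread


# Hypothetical response times in milliseconds from a normal week.
response_times_ms = [120, 118, 125, 122, 119, 121, 123]
baseline = build_baseline(response_times_ms)

print(deviates(124, baseline))  # within the normal range
print(deviates(310, baseline))  # far outside the baseline
```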

    AWS-Provided Tools

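    As a comprehensive cloud service provider, AWS gives all of its customers access to Amazon CloudWatch. CloudWatch is a metrics repository for AWS-hosted services. Many services publish basic metrics at no additional charge; for EC2, basic monitoring reports at five-minute intervals, and detailed monitoring at one-minute intervals is available for an additional charge. Custom metrics can be published at standard resolution (one-minute granularity) or at high resolution (down to one-second granularity). To reduce storage demands, CloudWatch aggregates metrics over time, which means that your performance data becomes less granular the longer it is stored: sub-minute data points are kept for three hours, one-minute data points for 15 days, five-minute data points for 63 days, and one-hour data points for 455 days.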
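
    To see why aggregated data is less specific, consider what happens when one-minute readings are averaged into five-minute readings. The sketch below is a simplified illustration of that effect on an averaged series; it is not CloudWatch's actual roll-up implementation, and the sample values are made up.

```python
from statistics import mean


def downsample(values, factor):
    """Average consecutive groups of `factor` data points into one point each."""
    return [mean(values[i:i + factor]) for i in range(0, len(values), factor)]


# Ten hypothetical one-minute CPU readings with a brief spike at minute 3.
one_minute = [20, 22, 21, 95, 23, 20, 21, 22, 20, 21]
five_minute = downsample(one_minute, 5)

print(max(one_minute))  # the spike is visible at one-minute resolution
print(five_minute)      # the averaged series no longer shows it clearly
```

    The short spike to 95% that is obvious in the one-minute series is smoothed into a much lower five-minute average, which is why recent, fine-grained data is more useful for diagnosing brief incidents.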

    Fig.1 Example of CloudWatch Metrics for an EC2 Instance

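    CloudWatch also allows you to set limits and alarms on your metrics. For example, you can set an alarm to trigger when the memory usage exceeds 85% on an instance for more than three minutes. (For EC2, memory metrics are not collected by default; they require the CloudWatch agent on the instance.) You can also configure the alarm to send a notification through Amazon SNS, trigger an Auto Scaling or EC2 action, or reach another service through a webhook subscribed to the notification topic.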
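
    The logic behind an alarm such as "memory above 85% for three minutes" can be sketched as a check over the most recent data points. This is a simplified model, assuming one data point per minute and that every point in the window must breach the threshold; the function name and defaults are illustrative, although the three state names match the states CloudWatch alarms report.

```python
def alarm_state(datapoints, threshold=85.0, periods=3):
    """Return the alarm state for a series of one-minute readings.

    "ALARM" when the last `periods` readings all exceed `threshold`,
    "INSUFFICIENT_DATA" when there are too few readings, else "OK".
    """
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Hypothetical memory utilization readings, one per minute.
print(alarm_state([70.0, 86.0, 88.5, 91.2]))  # last three all above 85
print(alarm_state([86.0, 88.5, 84.0]))        # latest reading recovered
```

    Requiring several consecutive breaches, rather than a single one, keeps short bursts from paging your team.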

    An unfortunate downside of CloudWatch is that, while it does provide access to a wide variety of metrics for each of its services, you need to know which metrics you’re looking for and how to combine them to get actionable insights. At its core, CloudWatch is just a collection and reporting service, so when it comes to detecting anomalies and monitoring intelligently, you’ll need something more.

    Leveraging Third-Party APM Experts

    Ideally, you’ll want your engineers to devote their time to adding new features and improving your software’s performance. One of the benefits of using standardized services and hosting them in the cloud is that you can leverage the expertise of those whose sole focus is on application performance management (APM). You can add an agent to your instances or a sidecar application to your container environment that will gather essential metrics and transmit them to an APM provider.
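
    As a rough picture of what such an agent does, the sketch below gathers one metric from the host and packages it as JSON, ready to be transmitted. It is a minimal illustration using only the Python standard library: the field names and payload shape are assumptions, real APM agents collect far more, and the step of sending the payload to a provider's endpoint is deliberately left out because each provider defines its own API.

```python
import json
import shutil
import socket
import time


def collect_metrics(path="/"):
    """Gather a small, illustrative set of host metrics."""
    usage = shutil.disk_usage(path)
    used_percent = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "host": socket.gethostname(),
        "timestamp": int(time.time()),
        "disk_used_percent": round(used_percent, 2),
    }


def build_payload(metrics):
    """Serialize metrics to UTF-8 JSON bytes for transmission."""
    return json.dumps(metrics).encode("utf-8")


payload = build_payload(collect_metrics())
print(payload.decode("utf-8"))
```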

    APM providers typically include standard dashboards and monitoring in their product offerings. In most cases, you can enable these systems quickly and begin monitoring intelligently with just a few hours of work. Some providers now also offer artificial intelligence for IT operations (AIOps) solutions, which can add considerable value to your performance monitoring strategy.

    AIOps and Proactive Monitoring with Thresholds and Alerts

    AIOps combines machine learning and data science with a performance monitoring solution. AIOps provides automated remediation capabilities, enables you to detect problems sooner, and ultimately improves your consumers’ experience. AIOps can help you improve performance as well as identify new ways to increase your efficiency and responsiveness.

    If you would like to learn more about AIOps, how it works, and the potential benefits of using it, The Definitive Guide to AIOps is an excellent place to start. This white paper defines AIOps in more detail, explores the underlying principles and technologies, and explains how you can apply it to your organization. You can also download the AIOps from Broadcom solution brief, which provides specific details about this AIOps product offering.

    Providing a high-performing user experience is essential for meeting your users’ needs. Fortunately, partnering with experts at organizations like Broadcom makes it easy to achieve this goal, ensuring that your software is reliable and that your customers can access it easily.

    Tag(s): AIOps

    Mike Mackrory

    Mike Mackrory is a Global citizen who has settled down in the Pacific Northwest — for now. By day he works as a Lead Engineer on a DevOps team, and by night, he writes and tinkers with other technology projects. When he's not tapping on the keys, he can be found hiking, fishing, and exploring both the urban and rural...
