AWS Senior Architect Discusses How to Optimize Costs by Preventing Excessive IOPS

30 Jul 2024

Businesses are always looking for ways to optimize storage so that their most demanding workloads and applications perform well. Data striping is a technique in which data is divided into stripes that are written across several storage volumes, which increases logical capacity and improves performance. It is particularly useful for workloads that require high IOPS or large block storage, such as enterprise-grade data warehouses or in-memory databases like SAP HANA. Provisioning the right amount of IOPS, however, remains a challenge: under-provisioning hurts application performance, while over-provisioning adds unnecessary cost for capacity that goes unused. The goal is a balance between performance and cost efficiency.

In his blog post titled “Prevent IOPS over-provisioning by monitoring striped Amazon EBS volumes within EC2 instance limits”, Ranjith Rayaprolu, a Senior Solutions Architect at Amazon Web Services (AWS), describes a solution to this problem.

Understanding the Problem

When managing enterprise-class workloads on Amazon EBS, it is common practice to combine multiple EBS volumes so that their logical size and IOPS performance add up. Configuring RAID 0 data striping across several Amazon EBS volumes is one example. However, each Amazon EC2 instance type has its own IOPS limit. For instance, a customer may set up four io2 volumes, each with 40,000 provisioned IOPS, to reach a total target of 160,000 IOPS, while an EC2 instance type such as r6i.24xlarge may support at most 120,000 IOPS. The result is over-provisioning: the extra 40,000 provisioned IOPS add cost without adding performance, and the workload falls short of its 160,000 IOPS target.
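
The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not code from the original post: the function name `stripe_headroom` is hypothetical, and the 120,000 IOPS limit is simply the figure quoted above, passed in as a parameter rather than looked up from AWS.

```python
from typing import Sequence


def stripe_headroom(volume_iops: Sequence[int], instance_limit: int) -> int:
    """Return instance_limit minus the summed provisioned IOPS.

    A negative result is the number of provisioned IOPS that the
    instance cannot use (hypothetical helper, for illustration only).
    """
    return instance_limit - sum(volume_iops)


# Figures from the example above: four io2 volumes at 40,000 IOPS each,
# striped on an instance assumed to support at most 120,000 IOPS.
volumes = [40_000] * 4
headroom = stripe_headroom(volumes, instance_limit=120_000)

print(f"Provisioned: {sum(volumes):,} IOPS")
print(f"Headroom:    {headroom:,} IOPS")  # negative means over-provisioned
```

A negative headroom is the signal the rest of the solution automates: it marks IOPS that are paid for but cannot be delivered by the instance.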

Ranjith’s Solution: Automating IOPS Monitoring

To solve this issue, Ranjith proposes using AWS Config Custom Rules with AWS Lambda to automate the monitoring and reporting of aggregate IOPS.

Here is a breakdown of the solution:

  • AWS Lambda Function: Develop a Lambda function that computes the total IOPS of the striped Amazon EBS volumes attached to an EC2 instance and compares it with the maximum IOPS that instance supports.

  • AWS Config Custom Rules: Establish custom rules that invoke the Lambda function whenever the configuration of an EC2 instance or EBS volume changes.

  • Compliance Evaluation: The Lambda function checks whether the aggregated IOPS exceeds the instance’s limit and reports the compliance status to AWS Config.

  • Notifications and Reporting: Ensure that AWS Config sends alerts when over-provisioning is detected, so that it can be corrected before it adds extra cost.
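
The compliance evaluation step can be sketched as a pure Python function. This is a minimal sketch under stated assumptions, not the code from Ranjith’s post: `evaluate_instance` is a hypothetical name, the instance ID is a placeholder, and the instance limit is supplied by the caller. The returned dictionary uses the field names of the AWS Config evaluation structure as commonly used in custom rules; check them against the AWS Config API reference before relying on them. In a real Lambda function the volume IOPS would come from the EC2 API and the result would be sent to AWS Config (for example with boto3's `put_evaluations`); both steps are left out here so the sketch runs without AWS credentials.

```python
from typing import Dict, Sequence


def evaluate_instance(
    instance_id: str,
    volume_iops: Sequence[int],
    instance_limit: int,
    ordering_timestamp: str,
) -> Dict[str, str]:
    """Build one AWS Config-style evaluation for an EC2 instance.

    Hypothetical helper: compares the summed provisioned IOPS of the
    attached volumes with a caller-supplied instance limit.
    """
    total = sum(volume_iops)
    if total > instance_limit:
        compliance = "NON_COMPLIANT"
        annotation = (
            f"Aggregate provisioned IOPS {total} exceeds "
            f"instance limit {instance_limit}."
        )
    else:
        compliance = "COMPLIANT"
        annotation = (
            f"Aggregate provisioned IOPS {total} is within "
            f"instance limit {instance_limit}."
        )
    return {
        "ComplianceResourceType": "AWS::EC2::Instance",
        "ComplianceResourceId": instance_id,
        "ComplianceType": compliance,
        "Annotation": annotation,
        "OrderingTimestamp": ordering_timestamp,
    }


# Placeholder instance ID and timestamp, for illustration only.
result = evaluate_instance(
    "i-0123456789abcdef0", [40_000] * 4, 120_000, "2024-07-30T00:00:00Z"
)
print(result["ComplianceType"], "-", result["Annotation"])
```

Keeping the comparison separate from the AWS calls makes the rule logic easy to unit test before it is wired into a Lambda handler.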

Benefits of the Solution

  • Cost Minimization: Companies avoid paying for over-provisioned IOPS by making sure that the aggregated IOPS of their EBS volumes stays within the limit of the instance they are attached to.

  • Maximizing Performance: Automated checks keep provisioned IOPS aligned with what each instance can actually deliver, so performance targets are set against realistic limits.

  • Scalability: Manually monitoring a growing number of EC2 instances and EBS volumes is not sustainable. This solution scales with the environment, running compliance checks continuously as resources are added or changed.

Real-Life Application

Imagine a customer running an extremely I/O-intensive database on an r6i.24xlarge instance. They initially provision four io2 volumes totaling 160,000 IOPS, only to discover later that the instance supports at most 120,000 IOPS. With Ranjith’s solution in place, the over-provisioning would be detected automatically. The customer would then be prompted either to reduce the IOPS on each volume or to move to an instance type that supports the required IOPS, optimizing cost and performance without manual monitoring.
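
For the first remediation option, reducing the IOPS on each volume, the new per-volume figure can be worked out as follows. This is a rough sketch with a hypothetical helper name, `per_volume_iops_target`. It assumes the IOPS are spread evenly across the stripe and that the striped volumes are the only ones counting against the instance limit; it also ignores any per-volume minimums or maximums for the volume type.

```python
def per_volume_iops_target(instance_limit: int, volume_count: int) -> int:
    """Largest equal per-volume IOPS whose sum fits within instance_limit.

    Hypothetical helper for illustration; assumes an even split across
    all volumes in the stripe.
    """
    if volume_count <= 0:
        raise ValueError("volume_count must be positive")
    return instance_limit // volume_count


# Figures from the scenario above: four volumes, 120,000 IOPS instance limit.
target = per_volume_iops_target(instance_limit=120_000, volume_count=4)
print(f"Provision at most {target:,} IOPS per volume")
```

With the numbers from this scenario, each of the four volumes would be reduced from 40,000 to 30,000 provisioned IOPS.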

Conclusion

Ranjith Rayaprolu’s approach to preventing IOPS over-provisioning combines AWS Config with AWS Lambda to automate the monitoring and reporting of IOPS compliance. It helps optimize cost while supporting the performance of enterprise applications and reducing wasted resources. By deploying this solution, enterprises gain a sustainable, scalable way to manage both storage performance and cost.