When searching for performance gains in database-backed applications, improving data access is often one of the first steps. With SQL databases, common performance killers are missing indexes, out-of-date statistics, I/O bottlenecks, and inefficient query plans. When the tuning exercise fails to deliver the necessary improvements, a common remedy is to employ typical database scaling methods: scaling up and scaling out.
Scaling is likely the quickest route to better performance. In Microsoft Azure, a simple click of a button lets you scale your database almost instantly. However, this approach can be costly.
There are two options when it comes to scaling out Microsoft Azure SQL: sharding and adding readable replicas. Data can be sharded (using an appropriate shard key) across multiple databases. Unfortunately, for many codebases this is a non-trivial task, and it adds costs for the additional database nodes.
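To make the shard-key idea concrete, here is a minimal sketch of routing requests to one of several database shards by hashing a shard key (for example, a tenant ID). The connection strings and hash scheme are illustrative assumptions, not the article's actual implementation.

```python
import hashlib

# Hypothetical shard connection strings (placeholders).
SHARD_CONNECTIONS = [
    "Server=shard0.database.windows.net;...",
    "Server=shard1.database.windows.net;...",
    "Server=shard2.database.windows.net;...",
]

def shard_for(shard_key: str) -> str:
    """Deterministically map a shard key to one database connection."""
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(SHARD_CONNECTIONS)
    return SHARD_CONNECTIONS[index]
```

Because the mapping is deterministic, every request for the same tenant lands on the same shard; the hard part in practice is migrating existing data into this layout.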
The second option is to add readable replicas for the read-only workloads. Again, this introduces additional costs (multiple databases) and is only available in the Premium database tier.
There are alternate strategies that can potentially increase your data access performance and reduce costs at the same time. Here’s an actual use case to explain how:
Recently, we looked at ways to scale our system further without having to incur additional costs. The main bottleneck of the system was the data layer. More specifically, we were hitting Microsoft Azure SQL limits on both CPU utilization and concurrent requests. Using a P2 database, we were able to achieve 194 application requests per second when executing our performance test harness.
| SQL Azure Database | P2 ($930/month) |
| --- | --- |
| Database CPU % | 100% |
| Application Requests Per Second* | 194 |
*Application Requests Per Second represents system requests that invoke a particular business process that utilizes additional services (e.g., Azure Storage, Service Bus).
The database workloads of this particular system are heavier on the reads than the writes, so either of the database scaling techniques could have been applied. However, to avoid incurring additional costs, we decided to look elsewhere for our performance gains. Our solution was to introduce a caching layer (Redis Cache) for our read-only workloads to alleviate some of the database pressure. By applying the cache-aside pattern to the heaviest read workloads, we were able to improve throughput by 72% and cut database CPU utilization by 54% (from 100% to 46%).
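The cache-aside pattern above can be sketched in a few lines. This is a minimal illustration, not our production code: `cache` stands in for a Redis client (a plain dict here; with Redis you would set a TTL, e.g. via `SETEX`), and `query_database` stands in for the SQL read.

```python
import json

def make_cache_aside(query_database, cache):
    """Return a reader that consults `cache` before the database.

    `query_database` is the expensive SQL read; `cache` is any
    dict-like store (in production, a Redis client with a TTL).
    """
    def read(key):
        cached = cache.get(key)
        if cached is not None:            # cache hit: skip the database
            return json.loads(cached)
        value = query_database(key)       # cache miss: read from SQL
        cache[key] = json.dumps(value)    # populate for later readers
        return value
    return read
```

Every repeated read of a hot key is then served from the cache, which is what relieves the database CPU and concurrent-request limits.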
| Metric | No Caching | Caching | % Gain |
| --- | --- | --- | --- |
| Database CPU % | 100% | 46% | −54% (lower is better) |
| Application Requests Per Second | 194 | 334 | +72% |
With the improvement in performance we were able to drop a database tier (P1: $465) while still handling our peak load with adequate headroom. Even with the additional costs of the cache (C2: $167) our total monthly costs have been reduced by 32% ($930 to $632).
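The cost arithmetic above, spelled out (figures taken directly from the paragraph):

```python
# Monthly costs, in dollars, from the article's figures.
before = 930           # P2 database alone
after = 465 + 167      # P1 database + C2 Redis Cache
savings_pct = round((before - after) / before * 100)
print(after, savings_pct)   # → 632 32
```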
Fig. 1: The difference in Application Requests Per Second with a cache and without a cache.
Fig. 2: The difference in monthly costs with a cache and without a cache.
When developing in the cloud, there is a multitude of services available right at your fingertips. Now is the time to utilize them.
Post is written by Marc Dugas, Lixar Technical Prime and Certified Microsoft Azure Specialist.