Cooling Down Hot Partitions in Azure Cosmos DB

When working with Azure Cosmos DB, one of the primary design activities you will undertake is appropriate partition design. Here's how to go about it.

Jun 08, 2023 • 4 Minute Read

When working with Azure Cosmos DB, one of the primary design activities you will undertake is appropriate partition design. Partitioning allows you to chunk the data within individual containers, creating distinct subsets called logical partitions. Using partitioning in this way enables Cosmos DB to make good on its promise of “limitless horizontal scalability.” 

So let’s dive in and find out a bit more about what makes partitions both cool and hot.

Even Distribution is Cool

Logical partitions are defined by a key attribute in the data, such as a customer identifier, a warehouse location, or a product category. These are appropriately called partition keys. Selecting a good partition key isn’t always easy though! The idea is to select a logical partition key that will naturally result in even distribution across the underlying physical partition infrastructure. 

For example, if you have hundreds of thousands of customers in a single region, but 90 percent of them are located in the city of Springfield, then setting a partition key of “city” would result in an uneven distribution. Most of the traffic and storage would land on the single logical partition for Springfield, creating what is referred to as a “hot” partition.
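To put a number on that skew, here is a minimal sketch in plain Python. It does not talk to Cosmos DB at all; the customer records, the `customerId` property, and the second city are made up for illustration. It simply counts how much of the data each candidate partition key value would own.

```python
from collections import Counter


def partition_share(items, key):
    """Return each partition key value's share of the items, largest first."""
    counts = Counter(item[key] for item in items)
    total = sum(counts.values())
    return [(value, count / total) for value, count in counts.most_common()]


# Hypothetical data: 90 of 100 customers live in Springfield.
customers = (
    [{"customerId": f"c{i}", "city": "Springfield"} for i in range(90)]
    + [{"customerId": f"c{i}", "city": "Shelbyville"} for i in range(90, 100)]
)

by_city = partition_share(customers, "city")
by_customer = partition_share(customers, "customerId")

print(by_city[0])         # ('Springfield', 0.9)
print(by_customer[0][1])  # 0.01
```

Partitioning by city puts 90 percent of the items behind one key value, while a higher-cardinality property such as the hypothetical `customerId` spreads them out evenly. Whether that key also suits your query patterns is a separate design question.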

…Hot Partitions, Not So Cool

Now, sometimes “hot” can be good: hot tamales, hot dogs, and hot days by the swimming pool. Hot partitions aren’t quite as fun. They regularly exceed the throughput made available to them, which results in throttling (rate-limited requests that come back with HTTP 429) and failed requests. Some hot partitions can also put storage pressure on physical resources that, while scalable to a point, do have practical limitations.

Microsoft provides several ways to set and tune throughput on your Cosmos DB databases and containers. Throughput configuration and careful partitioning choices can often avoid hot partitions, and you can learn all about throughput, partition key design and other hot topics in my course, DP-420: Designing and Implementing Cloud-Native Applications Using Microsoft Azure Cosmos DB. However, there are times when even a good partition key, combined with careful throughput configuration, is not enough to stay cool.

Hierarchical Partitioning Provides Some Shade

Fortunately, Microsoft recently released two new capabilities that can help. The first is hierarchical partitioning, which allows you to choose up to three levels of partitioning. 

For example, suppose you work for a hotel chain, where many of the franchise owners have more than one location. It might make sense to choose the FranchiseID property as a key, but if that turned out to still result in hot partitions, you can now partition first by FranchiseID and then by HotelID. And if you’re still in hot water, try one more level, such as a building number or floor number. 
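As a rough sketch of what the hotel example could look like, the snippet below builds a partition key definition with up to three levels and reads the matching values out of a document. It is plain Python rather than SDK code. The `MultiHash` kind and version 2 are my understanding of the definition shape from Microsoft's documentation, so verify them against the current docs. The `FloorNumber` property, the helper names, and the sample booking are hypothetical.

```python
MAX_LEVELS = 3  # the article's stated limit for hierarchical partitioning


def hierarchical_key(*paths):
    """Build a partition key definition for one to three key paths."""
    if not 1 <= len(paths) <= MAX_LEVELS:
        raise ValueError(f"expected 1 to {MAX_LEVELS} paths, got {len(paths)}")
    # Assumption: "MultiHash" / version 2 is the shape used for
    # hierarchical keys; check the current Cosmos DB documentation.
    return {"paths": list(paths), "kind": "MultiHash", "version": 2}


def partition_key_values(document, definition):
    """Read the value at each key path, in hierarchy order."""
    return [document[path.lstrip("/")] for path in definition["paths"]]


partition_key_definition = hierarchical_key(
    "/FranchiseID", "/HotelID", "/FloorNumber"
)

# Hypothetical document for one room booking.
booking = {
    "id": "booking-001",
    "FranchiseID": "franchise-17",
    "HotelID": "hotel-204",
    "FloorNumber": 3,
}

print(partition_key_values(booking, partition_key_definition))
# ['franchise-17', 'hotel-204', 3]
```

The order of the paths matters: the broadest grouping (the franchise) comes first, and each further level subdivides the one before it.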

This is similar to the existing synthetic key partitioning capability in Cosmos DB, but with far less complexity for design, development, and implementation. This feature is still in preview, is not required for the DP-420 exam and, therefore, is not yet included in my certification course. But you can read more about it here: Hierarchical partition keys in Azure Cosmos DB (preview) | Microsoft Learn

Targeted Throughput when You Need a Firehose

The second preview feature, which I am even more excited about, is the ability to assign throughput across physical partitions. 

By default, Azure Cosmos DB divides up provisioned throughput equally across all physical partitions. That means that while hot partitions are choking on their own smoke, data in low-traffic partitions are sitting in the air conditioning and probably wishing they could turn it down a bit. 

For these scenarios, you now have the ability to redistribute your provisioned throughput across physical partitions, without having to increase overall throughput based on the hottest partition. It’s a little like having zoned thermostats. Suppose you have 10,000 RU/s provisioned. You can decide to assign the majority of it, 6,000 RU/s, to the hot partition and split the remaining 4,000 RU/s among the cooler rooms of your Cosmos house.
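The thermostat arithmetic can be sketched in a few lines of Python. This only models the numbers; it is not the preview feature's API. The count of five physical partitions is an assumption made for the example, since the real number depends on your container.

```python
def even_split(total_ru, partition_count):
    """Default behavior: every physical partition gets the same share."""
    return [total_ru // partition_count] * partition_count


def redistribute(total_ru, partition_count, hot_index, hot_ru):
    """Give hot_ru to one partition and split the remainder evenly."""
    if partition_count < 2:
        raise ValueError("need at least two partitions to redistribute")
    if not 0 <= hot_ru <= total_ru:
        raise ValueError("hot_ru must be between 0 and total_ru")
    cool_ru = (total_ru - hot_ru) // (partition_count - 1)
    return [hot_ru if i == hot_index else cool_ru for i in range(partition_count)]


# Assumption: 10,000 RU/s spread over five physical partitions.
print(even_split(10_000, 5))
# [2000, 2000, 2000, 2000, 2000]

print(redistribute(10_000, 5, hot_index=0, hot_ru=6_000))
# [6000, 1000, 1000, 1000, 1000]
```

In both cases the total stays at 10,000 RU/s. The hot partition goes from 2,000 to 6,000 RU/s without raising the overall provisioned throughput, and the cooler partitions give up what they weren't using.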

To learn more about how to sign up for this preview feature, along with current limitations and caveats, visit Redistribute throughput across partitions (preview) in Azure Cosmos DB | Microsoft Learn.

In the meantime, another unrelated life tip: If pool parties are but a dream for you, and it feels like Rome is burning down all around, whip out the marshmallows and pull up a chair. This too shall pass.

Amy Coughlin

Amy Coughlin is a Pluralsight Author and Senior Azure Training Architect with over 30 years of experience in the tech industry, mainly focused on Microsoft stack services and databases. She's living the dream of combining her love of technology with her passion for teaching others.