Cosmos DB: Partition Key explained for architects

Why choosing the right Partition Key is critical to avoid unnecessary costs and performance bottlenecks

04/13/2024

If you've ever worked with Azure Cosmos DB, you've probably encountered this deceptively simple question during the initial setup: "What should be the Partition Key?" It looks like a minor detail, but trust me: get it wrong and you'll be paying for it. Literally. Both in RU/s (Request Units per second) and in headaches while you try to figure out why your queries are so slow.

In this article, I'll break down how partitioning works in Cosmos DB, why the Partition Key is arguably the most important design decision you'll make, and how to choose it wisely. With code examples, of course — we're engineers, not philosophers.

What is partitioning in Cosmos DB?

Cosmos DB is a globally distributed, horizontally scalable NoSQL database. To achieve this scalability, it splits your data into smaller chunks called logical partitions. Each logical partition contains all items that share the same Partition Key value.

Behind the scenes, Cosmos DB groups logical partitions into physical partitions. A physical partition is an actual unit of storage and throughput. You don't control which physical partition your data lands on — Cosmos DB handles that automatically based on a hash of your Partition Key.

Think of it like a library: the Partition Key is the shelf label. If you put all your books on the same shelf, it gets overcrowded while the rest of the library sits empty. That's exactly what happens with a bad Partition Key — and it's called a hot partition.
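To make the shelf analogy concrete, here is a minimal simulation sketch. It assumes a stand-in hash (the first bytes of SHA-256, modulo the partition count) and hypothetical names (PartitionSimulation, BucketFor, Distribute); Cosmos DB uses its own hashing and range assignment, so treat this as an illustration of the effect, not of the real algorithm. It needs .NET 5 or later for SHA256.HashData.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class PartitionSimulation
{
    // Maps a partition key value to one of `physicalPartitions` buckets.
    // Illustrative only: this is NOT the hash function Cosmos DB uses.
    public static int BucketFor(string partitionKeyValue, int physicalPartitions)
    {
        if (physicalPartitions <= 0)
            throw new ArgumentOutOfRangeException(nameof(physicalPartitions));

        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(partitionKeyValue));
        uint prefix = BitConverter.ToUInt32(hash, 0);
        return (int)(prefix % (uint)physicalPartitions);
    }

    // Counts how many items land in each bucket for a sequence of key values.
    public static int[] Distribute(IEnumerable<string> keyValues, int physicalPartitions)
    {
        var counts = new int[physicalPartitions];
        foreach (string keyValue in keyValues)
        {
            counts[BucketFor(keyValue, physicalPartitions)]++;
        }
        return counts;
    }
}
```

Feeding it 10,000 items that all share the key value "Delivered" fills exactly one bucket, whereas 10,000 distinct customer ids spread across all of them. That single full bucket is the overcrowded shelf.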

Why does the Partition Key matter so much?

The Partition Key directly impacts three critical aspects of your Cosmos DB experience:

  • Performance — Queries that target a single partition are fast (in-partition queries). Cross-partition queries are significantly slower and more expensive.
  • Cost — Throughput (RU/s) is distributed evenly across physical partitions. A hot partition means wasted RU/s on idle partitions while the overloaded one throttles your requests.
  • Scalability — A single logical partition has a 20 GB storage limit. A logical partition cannot be split across physical partitions, so once a single key value reaches that limit, further writes to it are rejected and your data is stuck.

In short: a bad Partition Key can turn a world-class distributed database into an expensive, slow, single-node experience. And no amount of money thrown at RU/s will fix a fundamentally flawed partition strategy.

How to choose the right Partition Key

Here are the golden rules I follow when choosing a Partition Key:

  • High cardinality — The property should have many distinct values. A boolean field like IsActive would give you at most two logical partitions. Don't do that.
  • Even distribution — Data should be spread evenly across partitions. Avoid keys that concentrate most data on a few values.
  • Query alignment — The key should match your most frequent query patterns. If 90% of your queries filter by CustomerId, that's likely your best candidate.
  • Write distribution — Avoid creating write hotspots where all inserts hit the same partition.

A practical example: e-commerce orders

Let's say we're building an e-commerce platform and we need to store orders. Here's our model:


public class Order
{
    public string Id { get; set; }
    public string CustomerId { get; set; }
    public string Status { get; set; }
    public DateTime CreatedAt { get; set; }
    public decimal TotalAmount { get; set; }
    public List<OrderItem> Items { get; set; }
}
                

Let's evaluate some Partition Key candidates:

Option 1: /status — The terrible choice

Order statuses are typically a handful of values: Pending, Processing, Shipped, Delivered, Cancelled. That's five logical partitions for your entire dataset. The Delivered partition will be enormous while Cancelled might be tiny. Low cardinality + uneven distribution = disaster.

Option 2: /id — The naive choice

Using the unique ID gives you perfect distribution — every item is in its own partition. Sounds great, right? The problem: you'll never query a single partition unless you already know the exact ID. Every query by customer, by date range, or by status becomes a cross-partition fan-out query. Expensive and slow.

Option 3: /customerId — The sweet spot

Most queries in an e-commerce app are scoped to a customer: "show me my orders", "show my recent orders", "show order details". Using CustomerId as the Partition Key means all of a customer's orders live in the same partition — single-partition queries for the most common access patterns. With thousands or millions of customers, you also get excellent distribution.
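If you have a representative sample of your data, you can put rough numbers on the three options before committing. The sketch below is a hypothetical helper (KeyStats and PartitionKeyEvaluator are my names, not part of any SDK) that reports the number of distinct key values and the share of items held by the largest logical partition. It only measures distribution; query alignment still has to be judged from your access patterns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// DistinctValues: number of logical partitions the sample would produce.
// LargestShare: fraction of all items that fall into the biggest one (0..1).
public record KeyStats(int DistinctValues, double LargestShare);

public static class PartitionKeyEvaluator
{
    public static KeyStats Evaluate<T>(
        IReadOnlyCollection<T> items,
        Func<T, string> keySelector)
    {
        if (items.Count == 0)
            throw new ArgumentException("Sample must not be empty.", nameof(items));

        // One group per distinct key value, i.e. per logical partition.
        List<int> groupSizes = items
            .GroupBy(keySelector)
            .Select(group => group.Count())
            .ToList();

        return new KeyStats(
            groupSizes.Count,
            (double)groupSizes.Max() / items.Count);
    }
}
```

Calling Evaluate(orders, o => o.Status), Evaluate(orders, o => o.Id) and Evaluate(orders, o => o.CustomerId) on the same sample shows the trade-off side by side: status yields a handful of values with one dominant share, id yields one value per item, and customerId should land in between with many values and a small largest share.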

Setting up the Partition Key with .NET

When creating a Cosmos DB container programmatically, the Partition Key is defined at creation time and cannot be changed in place afterwards. Choose wisely!


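using Microsoft.Azure.Cosmos;

// Serialize C# properties as camelCase so that CustomerId is stored as
// "customerId" (matching the partition key path) and Id as the required "id".
var cosmosClient = new CosmosClient(
    "your-connection-string",
    new CosmosClientOptions
    {
        SerializerOptions = new CosmosSerializationOptions
        {
            PropertyNamingPolicy = CosmosPropertyNamingPolicy.CamelCase
        }
    });

var database = await cosmosClient.CreateDatabaseIfNotExistsAsync("ECommerceDb");

var containerProperties = new ContainerProperties
{
    Id = "Orders",
    PartitionKeyPath = "/customerId"
};

// ContainerResponse converts implicitly to Container, which exposes the
// query and read methods used in the rest of this article.
Container container = await database.Database.CreateContainerIfNotExistsAsync(
    containerProperties,
    throughput: 400); // Start with 400 RU/s, autoscale later if needed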
                

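Note that PartitionKeyPath uses the JSON property path, not the C# property name. It's case-sensitive and must match the serialized JSON output. The v3 .NET SDK serializes with Newtonsoft.Json by default and keeps C# property names as written (CustomerId, not customerId), so either configure a camelCase naming policy on the client or annotate the properties. If you plug in System.Text.Json with a camelCase naming policy instead, make sure your path reflects that.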
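A cheap way to catch a casing mismatch before it reaches production is to serialize a sample item and look for the property yourself. The sketch below is a hypothetical helper (PartitionKeyPathCheck is not part of the SDK) that uses System.Text.Json as a stand-in serializer and only handles single-segment paths such as /customerId. It assumes you pass the same naming options your Cosmos client actually uses; if the client serializes differently, the check tells you nothing.

```csharp
using System;
using System.Text.Json;

public static class PartitionKeyPathCheck
{
    // Returns true when the serialized item has a top-level JSON property whose
    // name matches the partition key path exactly (the match is case-sensitive).
    public static bool HasPartitionKeyProperty(
        object item,
        string partitionKeyPath,
        JsonSerializerOptions? options = null)
    {
        string propertyName = partitionKeyPath.TrimStart('/');

        string json = JsonSerializer.Serialize(item, item.GetType(), options);
        using JsonDocument document = JsonDocument.Parse(json);

        return document.RootElement.TryGetProperty(propertyName, out _);
    }
}
```

With default options, an object with a CustomerId property fails the check for /customerId. With new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase } it passes. That is exactly the mismatch the note above warns about.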

Querying with the Partition Key

The best practice is to always include the Partition Key in your queries. This ensures Cosmos DB routes the query to a single partition:


// Efficient: single-partition query
var query = new QueryDefinition(
    "SELECT * FROM c WHERE c.customerId = @customerId AND c.status = @status")
    .WithParameter("@customerId", "customer-123")
    .WithParameter("@status", "Shipped");

var options = new QueryRequestOptions
{
    PartitionKey = new PartitionKey("customer-123")
};

using var iterator = container.GetItemQueryIterator<Order>(query, requestOptions: options);

while (iterator.HasMoreResults)
{
    var response = await iterator.ReadNextAsync();
    Console.WriteLine($"RU charge: {response.RequestCharge}");

    foreach (var order in response)
    {
        Console.WriteLine($"Order {order.Id} - {order.Status}");
    }
}
                

Now compare that with a cross-partition query (no Partition Key specified):


// Expensive: cross-partition fan-out query
var query = new QueryDefinition(
    "SELECT * FROM c WHERE c.status = @status")
    .WithParameter("@status", "Shipped");

// No PartitionKey specified — Cosmos DB queries ALL partitions
using var iterator = container.GetItemQueryIterator<Order>(query);

while (iterator.HasMoreResults)
{
    var response = await iterator.ReadNextAsync();
    // RU charge will be significantly higher!
    Console.WriteLine($"RU charge: {response.RequestCharge}");
}
                

The difference in RU consumption between these two queries can be 10x or more, depending on the number of physical partitions. That's real money on your Azure bill every month.

Hierarchical Partition Keys

Since 2023, Cosmos DB supports hierarchical Partition Keys (also called subpartitioning) with up to 3 levels. This is a game-changer for multi-tenant scenarios or datasets with natural hierarchies.


// Hierarchical Partition Key: TenantId > CustomerId
var containerProperties = new ContainerProperties
{
    Id = "Orders",
    PartitionKeyPaths = new List<string> { "/tenantId", "/customerId" }
};

Container container = await database.Database.CreateContainerIfNotExistsAsync(
    containerProperties,
    throughput: 400);
                

With hierarchical keys, you can query at any level of the hierarchy:


// Query all orders for a tenant (first level)
var tenantOptions = new QueryRequestOptions
{
    PartitionKey = new PartitionKeyBuilder()
        .Add("tenant-abc")
        .Build()
};

// Query a specific customer within a tenant (both levels)
var customerOptions = new QueryRequestOptions
{
    PartitionKey = new PartitionKeyBuilder()
        .Add("tenant-abc")
        .Add("customer-123")
        .Build()
};
                

This avoids the classic dilemma: "should I partition by tenant or by customer?" — now you can have both. It also helps prevent the 20 GB logical partition limit from being hit on large tenants.
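The reason the 20 GB concern eases is that, with a hierarchical key, the logical partition is defined by the full key path rather than by the first level alone. A rough sketch of that effect, using a hypothetical StoredItem record and made-up sizes rather than anything read from Cosmos DB:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: just enough to group items and add up their sizes.
public record StoredItem(string TenantId, string CustomerId, long SizeInBytes);

public static class LogicalPartitionSizing
{
    // Largest logical partition if the container were keyed on /tenantId only.
    // Throws InvalidOperationException when `items` is empty.
    public static long LargestByTenant(IEnumerable<StoredItem> items) =>
        items
            .GroupBy(item => item.TenantId)
            .Max(group => group.Sum(item => item.SizeInBytes));

    // Largest logical partition when the full path is /tenantId + /customerId.
    public static long LargestByTenantAndCustomer(IEnumerable<StoredItem> items) =>
        items
            .GroupBy(item => (item.TenantId, item.CustomerId))
            .Max(group => group.Sum(item => item.SizeInBytes));
}
```

For a tenant with three customers holding 10 GB each, the first method reports 30 GB (over the limit for a single logical partition), while the second reports 10 GB per tenant and customer pair.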

Common anti-patterns

Here are the mistakes I've seen (and sometimes made myself — no judgment please) in production:

  • Using a timestamp as Partition Key — Append-only workloads will always hammer the latest partition while historical ones sit idle. Classic hot partition.
  • Using a low-cardinality field — Fields like country, status, or category will create a handful of oversized partitions.
  • Ignoring query patterns — Choosing a key that distributes data perfectly but doesn't match your read patterns means every query is a cross-partition fan-out.
  • Forgetting about the 20 GB limit — "It'll never get that big" is famous last words. Plan for growth.
  • Querying by id when a point read would do — A point read (ReadItemAsync) needs both the item id and its Partition Key value. If you don't have the key at hand, you fall back to a query instead of a direct lookup, costing ~3x more RUs.
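For the last point, this is roughly what a point read looks like with the v3 .NET SDK. It is a sketch, not a drop-in: PointReads and TryReadAsync are hypothetical names, it assumes the container is partitioned on a string property such as customerId, and it needs the Microsoft.Azure.Cosmos package plus a real account or the emulator to run.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class PointReads
{
    // Reads one item by id AND partition key value, which lets Cosmos DB go
    // straight to the right partition. Returns null when the item is missing.
    public static async Task<T> TryReadAsync<T>(
        Container container,
        string id,
        string partitionKeyValue) where T : class
    {
        try
        {
            ItemResponse<T> response = await container.ReadItemAsync<T>(
                id,
                new PartitionKey(partitionKeyValue));

            Console.WriteLine($"RU charge: {response.RequestCharge}");
            return response.Resource;
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            // A missing item surfaces as a 404 exception, not as a null response.
            return null;
        }
    }
}
```

Comparing the logged RU charge with the charge of an equivalent SELECT by id is a quick way to see the difference on your own data.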

Monitoring partition usage

You can't improve what you can't measure. Use Azure Monitor and the Cosmos DB Insights workbook to keep an eye on:

  • Partition Key range throughput distribution — Check if RU consumption is evenly spread or concentrated on a few partitions.
  • Storage per partition — Watch for partitions approaching the 20 GB limit.
  • Throttled requests (429s) — If you're getting throttled on some partitions but have plenty of RU/s left globally, you have a hot partition problem.

You can also read the container's provisioned throughput and Partition Key configuration programmatically:


// Check container throughput and partition info
var throughput = await container.ReadThroughputAsync(new RequestOptions());
Console.WriteLine($"Current throughput: {throughput.Resource.Throughput} RU/s");

// Read container properties to verify partition key configuration
var containerResponse = await container.ReadContainerAsync();
Console.WriteLine($"Partition Key Path: {containerResponse.Resource.PartitionKeyPath}");
                

Cost impact: a real-world perspective

Let me put this into perspective with some rough numbers. Consider a container with 10 physical partitions and 10,000 RU/s provisioned:

  • Each physical partition gets 1,000 RU/s.
  • If a hot partition needs 3,000 RU/s, it will be throttled — even though you're only using 3,000 out of 10,000 globally.
  • To fix it by brute force, you'd need to scale to 30,000 RU/s. That's 3x the cost for the same workload, just because of a bad Partition Key.
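The arithmetic behind those three bullets fits in a few lines. This is a simplified model with hypothetical names (ThroughputMath and its methods): it assumes provisioned throughput is split evenly across physical partitions, as described above, and ignores autoscale and burst capacity.

```csharp
using System;

public static class ThroughputMath
{
    // Provisioned RU/s are divided evenly across physical partitions.
    public static double PerPartitionShare(double provisionedRuPerSecond, int physicalPartitions)
    {
        if (physicalPartitions <= 0)
            throw new ArgumentOutOfRangeException(nameof(physicalPartitions));

        return provisionedRuPerSecond / physicalPartitions;
    }

    // A partition is throttled once its demand exceeds its even share,
    // no matter how idle the other partitions are.
    public static bool IsThrottled(
        double partitionDemandRuPerSecond,
        double provisionedRuPerSecond,
        int physicalPartitions) =>
        partitionDemandRuPerSecond > PerPartitionShare(provisionedRuPerSecond, physicalPartitions);

    // Total RU/s needed so that the hottest partition's share covers its demand.
    public static double BruteForceThroughput(
        double hottestPartitionDemandRuPerSecond,
        int physicalPartitions)
    {
        if (physicalPartitions <= 0)
            throw new ArgumentOutOfRangeException(nameof(physicalPartitions));

        return hottestPartitionDemandRuPerSecond * physicalPartitions;
    }
}
```

With the numbers above, PerPartitionShare(10000, 10) gives 1,000 RU/s, IsThrottled(3000, 10000, 10) is true, and BruteForceThroughput(3000, 10) gives 30,000 RU/s.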

At enterprise scale, that can mean thousands of dollars per month wasted. Your FinOps team will not be amused. And unlike compute resources, you can't change the Partition Key on an existing container — you'd have to migrate your data to a new container with a better key. Not exactly a fun Friday afternoon activity.

Summary

The Partition Key is the single most important design decision when working with Cosmos DB. It affects performance, cost, and scalability — the three pillars that keep architects up at night.

Key takeaways:

  • Choose a Partition Key with high cardinality and even data distribution.
  • Align your key with your most common query patterns to avoid cross-partition queries.
  • Always provide the Partition Key in queries and point reads to minimize RU consumption.
  • Consider hierarchical Partition Keys for multi-tenant or hierarchical data models.
  • Monitor your partition throughput distribution — hot partitions are silent budget killers.
  • Remember: the Partition Key cannot be changed after container creation. Design it right from day one.

For more details, I recommend the official Microsoft documentation on partitioning and the hierarchical partition keys documentation.

Happy partitioning — and may your RU/s always be evenly distributed!
