A Cloud Guru, Azure Lab

Optimize Query Performance in Cosmos DB for NoSQL

In this lab, you will work through a short series of real-world scenarios that describe performance issues in a Cosmos DB implementation. For each scenario, you will be given a list of optimization ideas to research and consider. The suggested optimizations include adjusting throughput on a container, customizing the indexing policy, changing Time to Live (TTL) settings, and changing the consistency level. After researching and selecting the optimization idea most likely to improve the situation, you will use the Azure portal to apply the solution. No coding experience is required, but familiarity with Azure Cosmos DB for NoSQL databases and with navigating the Azure portal will give you a head start. Novices who are willing to do a little research using the links provided, along with the lab guide and solution videos, should also be able to complete the lab successfully.

Path Info

Level: Beginner
Duration: 30m
Published: Dec 16, 2022

Table of Contents

  1. Challenge

    Scenario 1: Optimize for Inconsistent Request Traffic

    Your first consultation is with a team that works with the ADT (admissions, discharges, and transfers) data for a hospital administrative application. ADT messages arrive in the Messages container on the ADT database. The team is plagued by 429 errors and clear evidence of throttling during high-traffic times, which coincide with elective surgery check-ins early in the morning and late-evening admissions in emergency rooms. Team members have tried setting the container throughput to 10,000 RU/s to prevent throttling during high-traffic periods, but more than half the time, the container consumes only about 4,000 RU/s. The team lead knows they are underutilizing the higher throughput setting and wasting a lot of money, but she is not sure how to address it. Given the optimization ideas below, what do you recommend? You may want to navigate to the Cosmos DB account, database, and container to check current settings before you settle on any solutions.

    After selecting one or more optimization ideas, navigate to the appropriate resource(s) and apply your solution(s).

    Optimization Ideas

    • Modify throughput on a container
    • Modify the indexing policy on a container
    • Modify TTL on a container
    • Modify the consistency level on the account
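
One way to weigh the throughput idea is to compare a fixed (manual) setting against autoscale. The sketch below is an illustration under stated assumptions, not the lab's official solution. The price per 100 RU/s per hour and the 24-hour traffic profile are hypothetical. The autoscale model used here (scaling between 10% of the configured maximum and the maximum, billed each hour for the highest RU/s reached, at 1.5 times the manual rate on a single-write-region account) follows the publicly documented behavior as understood at the time of writing, so check current pricing before relying on it.

```python
# Hypothetical list price: USD per 100 RU/s per hour for manual throughput.
MANUAL_RATE_PER_100_RUS_HOUR = 0.008
# Assumption: autoscale is billed at 1.5x the manual rate (single write region).
AUTOSCALE_RATE_MULTIPLIER = 1.5
# Assumption: autoscale never scales below 10% of the configured maximum.
AUTOSCALE_FLOOR_FRACTION = 0.10


def manual_cost(provisioned_rus, hours):
    """Cost of a fixed throughput setting held for the given number of hours."""
    return provisioned_rus / 100 * MANUAL_RATE_PER_100_RUS_HOUR * hours


def autoscale_cost(max_rus, hourly_peak_rus):
    """Cost of autoscale, billed per hour for the highest RU/s reached."""
    floor = max_rus * AUTOSCALE_FLOOR_FRACTION
    rate = MANUAL_RATE_PER_100_RUS_HOUR * AUTOSCALE_RATE_MULTIPLIER
    total = 0.0
    for peak in hourly_peak_rus:
        # Clamp each hour's peak to the range autoscale can actually occupy.
        billed = min(max(peak, floor), max_rus)
        total += billed / 100 * rate
    return total


# Hypothetical day: 6 busy hours at 10,000 RU/s, 18 quieter hours near 4,000 RU/s.
daily_profile = [10_000] * 6 + [4_000] * 18

print(f"manual:    {manual_cost(10_000, 24):.2f} USD/day")
print(f"autoscale: {autoscale_cost(10_000, daily_profile):.2f} USD/day")
```

Under these assumed numbers, autoscale comes out cheaper (about 15.84 versus 19.20 USD per day) while still allowing bursts up to 10,000 RU/s. A container that sits near its maximum for most of the day would tip the other way because of the higher autoscale rate.
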
  2. Challenge

    Scenario 2: Optimize Common Read Queries

    Your second meeting is with a data engineer, who supports the data analytics team. Users are complaining of extremely slow performance for nearly all query executions. The analytics team works with data on the Analytics database, in a container called PrimaryCareProviders.

    The data engineer has found that nearly all of the queries are identical, except that users return different properties in different forms, depending on the report or visualization they are populating. In particular, almost every query uses the same WHERE clause:

    SELECT
    c.last_name
    ,c.gender
    ,c.age
    ,c.city
    ,c.pcp_id
    FROM c
    WHERE c.pcp_id = [some value]
    

    The engineer has also identified a few queries that fail entirely if they include this ORDER BY clause: ORDER BY c.age

    What can you suggest? You may want to navigate to the Cosmos DB account, database, and container to check current settings before you settle on any solutions.

    After selecting one or more optimization ideas, navigate to the appropriate resource(s) and apply your solution(s).

    Optimization Ideas

    • Modify throughput on a container
    • Modify the indexing policy on a container
    • Modify TTL on a container
    • Modify the consistency level on the account
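
To make the indexing idea concrete, the sketch below builds an indexing policy document in the JSON shape Cosmos DB for NoSQL uses (`indexingMode`, `includedPaths`, `excludedPaths`) and checks which properties it covers. It assumes, without confirmation from the scenario, that the slow filter on `c.pcp_id` and the failing `ORDER BY c.age` both stem from those paths not being indexed. `build_policy` and `is_indexed` are hypothetical helpers written for this illustration, and `is_indexed` handles only simple top-level paths and the root wildcard.

```python
import json


def build_policy(indexed_properties):
    """Index only the listed top-level properties and exclude everything else."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": f"/{name}/?"} for name in indexed_properties],
        "excludedPaths": [{"path": "/*"}],
    }


def is_indexed(policy, property_name):
    """Simplified check: explicit paths win, otherwise fall back to the root wildcard."""
    if policy.get("indexingMode") == "none":
        return False
    target = f"/{property_name}/?"
    included = {entry["path"] for entry in policy.get("includedPaths", [])}
    excluded = {entry["path"] for entry in policy.get("excludedPaths", [])}
    if target in excluded:
        return False
    if target in included:
        return True
    return "/*" in included


# pcp_id serves the WHERE clause; age serves ORDER BY c.age.
policy = build_policy(["pcp_id", "age"])
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that can be pasted into the container's indexing policy editor in the portal. Excluding every other path is a design choice that lowers write cost, but it also means any future query that filters or sorts on another property would need the policy updated first.
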
  3. Challenge

    Scenario 3: Reduce Storage Costs

    Your third consultation is with the Internet of Things (IoT) team, which manages hospital sensors that stream device messages to a container on Cosmos DB for NoSQL. The messages are first stored in a container called DeviceIntake, and then an Azure Function with a trigger on the DeviceIntake change feed sends the data along to Event Hubs. An Azure Stream Analytics job pulls from Event Hubs to transform and augment the data before sending it along to another Cosmos DB container called DeviceMessages. Both containers are on a database called IoT_Team. The architecture is working well, which is somewhat surprising, given that the team created both containers with the default settings and has made no adjustments since.

    No clients read from the DeviceIntake container; once an audit process runs every night to ensure that all raw messages in DeviceIntake have safely arrived in DeviceMessages, no other read operations are performed on that container. All read operations are executed against DeviceMessages, and the typical access pattern involves point reads; SQL queries are rare.

    There are no reported performance issues, but upper management has seen some cost management reports from the IT manager and is very concerned by how rapidly the storage costs are growing. Given that the cost is high enough to get the attention of upper management, the team is eager to hear your ideas for quickly addressing storage costs.

    You may want to navigate to the Cosmos DB account, database, and containers to check current settings before you settle on any solutions.

    After selecting one or more optimization ideas, navigate to the appropriate resource(s) and apply your solution(s).

    Optimization Ideas

    • Modify throughput on a container
    • Modify the indexing policy on a container
    • Modify TTL on a container
    • Modify the consistency level on the account
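
For the TTL idea, it helps to see how a container-level default and an item-level override combine. The sketch below models the documented rules: with no default TTL on the container nothing expires, a default of -1 turns TTL on without expiring items unless an item sets its own `ttl`, and a positive value expires items that many seconds after their last-modified timestamp `_ts`. The 48-hour retention for DeviceIntake is a hypothetical value chosen to outlast the nightly audit, and `expires_at` is an illustrative helper, not part of any SDK.

```python
def expires_at(item, container_default_ttl):
    """Return the epoch second at which the item expires, or None if it never does."""
    if container_default_ttl is None:
        # TTL is off for the container, so item-level ttl values are ignored.
        return None
    ttl = item.get("ttl", container_default_ttl)
    if ttl == -1:
        # -1 means "never expire" at either the container or the item level.
        return None
    return item["_ts"] + ttl


# Hypothetical retention: keep raw DeviceIntake messages for 48 hours.
DEVICE_INTAKE_DEFAULT_TTL = 48 * 60 * 60

raw_message = {"id": "msg-1", "_ts": 1_700_000_000}
print(expires_at(raw_message, DEVICE_INTAKE_DEFAULT_TTL))
```

With a default like this on DeviceIntake, raw messages would be removed automatically after the audit window, while DeviceMessages could keep its own retention rules.
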

The Cloud Content team comprises subject matter experts hyper focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!
