Query Real-Time IoT Data in Azure Stream Analytics

1.5 hours
  • 2 Learning Objectives

About this Hands-on Lab

Learning how to set up an Azure Stream Analytics job is well-documented — and is arguably the least important part of planning for this IoT data processing option. On the other hand, developing the expertise to write meaningful and useful queries and transformations against real-time streaming data is well worth your time. But gathering enough meaningful data to mock an IoT solution at scale can be challenging. In this lab, we supply you with a robust sample of data and a series of real-world scenarios to help you practice some of the most practical and common IoT data query patterns.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Set Up Sample Data and Write a Query
  1. Log in to the Azure portal using lab credentials in a new InPrivate or incognito window.
  2. Download the mock data file as described in the Additional Resources section of this lab.
  3. Navigate to the Azure Stream Analytics job in the portal.
  4. Navigate to the Query, and look for Upload sample input to upload the JSON file with mock data. Once the data is uploaded, you should see sample data below the query pane, in the Input Preview tab. This may take several seconds.
  5. Select Test query to run the sample query. The query should return 1860 rows with the following 4 column headers: time, dspl, temp, hmdt.

    Note: The point of this lab is to write queries and not to test your ability to get the sample data loaded. If you have any trouble, at this point, go ahead and check the lab guide or the solution video.

  6. Write a query that aliases 3 of the 4 column names for better readability and returns data from sensorA.
    • Reasonable column names would be: time (no alias), SensorName, Temperature, and Humidity.
    • Correctly written, your query should return 389 rows in the Test results tab below the query pane.
Write 3 Queries
  1. Building on your first query in the last objective, add a column called Status that returns the string Alert if the temperature is above 100 and the humidity is below 40. If it’s not, it should return OK in the status column.

    Hints:

    • This query uses a CASE statement.
    • This query should return the same 389 rows as the first query challenge, but with an additional column that includes an alert when the described temperature and humidity conditions are present.
  2. Suppose you want to get the average temperature, every 1 minute, over all sensors. You want to display 2 columns: the timestamp representing the minute and the average temperature for that minute. Write a query that fulfills these requirements.

    Hints:

    • Apply the appropriate aggregation to the temp column, and use the tumbling window function set to group and report every 1 minute.
    • Use the TIMESTAMP BY time clause to use the time field from the sample data as the event timestamp.
    • This query should return 32 rows with average temperatures between 101 and 114 (rounded).
    • Because streaming data queries are typically interested in understanding the data output in timeframes, a good majority of the stream analytics queries will include one of a few windowing functions. See the Additional Resources section of this lab for a links to Azure Stream Analytics query documentation.
  3. In casual observation, you note that the sample data is a slice of data from a single day, and the messages are sent at inconsistent intervals. You want to get a better sense of the shape of the data by using overlapping intervals of time to smooth out anomalies, so you write a query to check every 1 minute for the count of messages sent over the last 5 minutes. You decide to limit your query to data from Sensor E.

    Write a query to return 2 columns: the timestamp representing the interval/event timestamp and a column with the count of events over the last 5 minutes.

    Hints:

    • Use a hopping window function to "hop" each minute, with a window size of 5 minutes.
    • Use the TIMESTAMP BY time clause to use the time field from the sample data as the event timestamp.
    • This query should return 36 rows, with timestamps 1 minute apart, and the count of messages sent in the previous 5 minutes.
    • Because streaming data queries are typically interested in understanding the data output in timeframes, a good majority of the stream analytics queries will include one of a few windowing functions. See the Additional Resources section of this lab for a links to Azure Stream Analytics query documentation.

Additional Resources

Imagine you are planning an IoT solution based on weather data from windmills on a small windfarm. You have a JSON document with mock data from several sensors, supplied to you by the weather station manufacturer. You want to write and test queries over the mock data in preparation for the live, streaming data that will be flowing through Azure Stream Analytics, as soon as the weather stations come online and start pumping telemetry to the hub.

You are working with the lead data analyst to answer certain questions about the form and content of the data by using the SQL-like query language available for use in Azure Stream Analytics. The first few of several queries you expect to write are included in this lab.

Link and Instructions to Download Mock IoT Data

  1. Visit the GitHub repository to download the sample data file.
  2. Click on the Raw button in the upper right of the file viewer.
  3. Right-click or CTRL-click, depending on your OS and mouse configuration, and select the Save As option.
  4. Add a .JSON extension to the file name before saving.

Basic SQL-Like Query Structure

The query language in Azure Stream Analytics is a subset of T-SQL, but it is close enough to a recognizable form of SQL that most users refer to as such. All of the queries you write will be in the following general form:

SELECT
[columnName] AS [columnAlias],
[some kind of aggregation]([columnName]) AS [columnAlias]  .  .  .
INTO
Output
FROM
streamingInput
[Additional filtering, grouping/aggregation/windowing, transformation clauses]

Other Azure Stream Analytics Querying Resources

What are Hands-on Labs

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Sign In
Welcome Back!

Psst…this one if you’ve been moved to ACG!

Get Started
Who’s going to be learning?