Azure Data Lake Gen2 From the Command Line

30 minutes
  • 2 Learning Objectives

About this Hands-on Lab

Azure Data Lake Gen2 is built on Azure Blob Storage but offers additional features. With Data Lake Gen2, you can store unstructured Blob data hierarchically, providing greater flexibility in how your data is organized. In this lab, you will have the opportunity to work with Azure Data Lake Gen2 storage from a Linux command line. You will retrieve, edit, and upload some Azure Data Lake Gen2 data from within the Bash Azure Cloud Shell.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Download the configuration file.

Log in to the Azure portal. In a separate tab, open the Azure Cloud Shell (Bash) at shell.azure.com.

An existing storage account and file share are available for the Cloud Shell. After selecting Bash for the Azure Cloud Shell, open Advanced Settings and set the following:

  • For Cloud Shell region, select West US.
  • For Storage Account, select Use existing, then choose the storage account with the name that begins with cloudshell.
  • For File Share, select Use existing and enter cloudshell.

Authenticate with the Azure Storage service.

azcopy login

The command prints a URL and an authentication code. Open the URL and enter the code to authenticate the AzCopy CLI tool.

Set an environment variable containing the name of the storage account so that you can easily refer to it. You can find the storage account name in the Azure portal; it begins with sattconfigs.

storage_account=<storage account name>
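
If you would rather not copy the name from the portal, the lookup can probably be done from the shell as well. The following is a minimal sketch, assuming the Azure CLI (az) in Cloud Shell is already signed in and that your lab user is allowed to list storage accounts. find_storage_account is a hypothetical helper name, not something the lab provides.

```shell
# Sketch: print the name of the first storage account whose name starts
# with the given prefix. find_storage_account is a hypothetical helper.
find_storage_account() {
  # JMESPath starts_with() filters on the account name; "| [0]" keeps the
  # first match, and tsv output prints the bare value without quotes.
  az storage account list \
    --query "[?starts_with(name, '$1')].name | [0]" \
    --output tsv
}
```

You could then set the variable with storage_account=$(find_storage_account sattconfigs) and check it with echo "$storage_account" before continuing.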

Download the configuration file from Azure Data Lake.

azcopy copy "https://${storage_account}.dfs.core.windows.net/configuration/inventory/processor/invprocessor.conf" invprocessor.conf

You can verify the file downloaded successfully by viewing the contents. You should see some configuration data.

cat invprocessor.conf
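
For a check that does not depend on reading the output by eye, a small test for a non-empty file may help. This is only a sketch; check_downloaded is a hypothetical helper name.

```shell
# Sketch: report whether a downloaded file exists and is non-empty.
# check_downloaded is a hypothetical helper, not part of the lab.
check_downloaded() {
  # -s is true only when the file exists and has a size greater than zero.
  if [ -s "$1" ]; then
    echo "OK: $1"
  else
    echo "ERROR: $1 is missing or empty" >&2
    return 1
  fi
}
```

For example, check_downloaded invprocessor.conf returns a non-zero exit status if the download did not produce a usable file.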

Make the requested changes and upload the edited configuration file.

Edit the configuration file:

vi invprocessor.conf

Change the numTreads configuration value to 100:

...
numTreads=100
...
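
If you prefer not to use an interactive editor, the same change can be sketched with sed. This assumes the key sits at the start of its own line in key=value form, as the lab details describe. set_config_value is a hypothetical helper name.

```shell
# Sketch: set a key=value line in a config file without opening an editor.
# set_config_value is a hypothetical helper: $1 = file, $2 = key, $3 = value.
set_config_value() {
  # Rewrite the line that starts with "key=", keeping the original as FILE.bak.
  sed -i.bak "s/^$2=.*/$2=$3/" "$1"
  # Succeed only if the file now contains exactly "key=value" on a line.
  grep -q "^$2=$3\$" "$1"
}
```

For this lab the call would be set_config_value invprocessor.conf numTreads 100. A non-zero exit status means no line starting with that key was found, so the file was left unchanged.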

Upload the edited file to Azure Data Lake, replacing the existing file:

azcopy copy invprocessor.conf "https://${storage_account}.dfs.core.windows.net/configuration/inventory/processor/invprocessor.conf"
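
To confirm the upload replaced the remote file, one option is to download it again and compare it with your local copy. The sketch below assumes azcopy is still authenticated from the earlier login step. verify_upload is a hypothetical helper name.

```shell
# Sketch: download the remote copy to a temporary file and compare it
# byte-for-byte with the local file.
# verify_upload is a hypothetical helper: $1 = local file, $2 = remote URL.
verify_upload() {
  check_copy=$(mktemp)
  # Fetch the remote file; give up if the download itself fails.
  if ! azcopy copy "$2" "$check_copy" >/dev/null; then
    rm -f "$check_copy"
    return 1
  fi
  # cmp -s is silent and exits 0 only when the two files are identical.
  if cmp -s "$1" "$check_copy"; then status=0; else status=1; fi
  rm -f "$check_copy"
  return "$status"
}
```

For example: verify_upload invprocessor.conf "https://${storage_account}.dfs.core.windows.net/configuration/inventory/processor/invprocessor.conf" && echo "upload verified".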

Additional Resources

Your company, Store All the Things!, is using Azure Data Lake Gen2 to manage configuration data, which is used to configure some internal applications. One such application, a backend inventory data processor, requires a configuration change in order to increase the number of threads it uses.

Download the configuration data from Azure Data Lake Gen2, make the requested change to the file, and re-upload the edited file. You can do all of this using the Bash Azure Cloud Shell.

Some additional details:

  • The configuration file is located in a storage account with a name that begins with sattconfigs.
  • The configuration file is in a container called configuration.
  • The configuration file is called inventory/processor/invprocessor.conf.
  • In the configuration file, change the line that begins with numTreads= to numTreads=100.

If you get stuck, feel free to check out the solution video, or the detailed instructions under each objective. Good luck!
