Using MXNet Basic Classification (AWS SageMaker)

1 hour
  • 5 Learning Objectives

About this Hands-on Lab

MXNet is a heavy hitter in the world of machine learning and artificial intelligence. In this activity, we use MXNet along with Gluon to create an artificial neural network that performs a basic image classification task. We use our own data set and get one step closer to the dream of an automated Lego brick sorting machine. The files used in this lab can be found on GitHub.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Navigate to the Jupyter Notebook

Navigate through the AWS console to the AWS SageMaker page. From there, load the Jupyter Notebook server that has been provided with this hands-on lab.

Load the Data Ready for Training
  1. The data has been saved as MXNet NDArrays in pickled Python object files. Load the data by opening the files and using pickle to load the data into the running environment:

    import pickle

    # Load the pickled training and test data sets
    with open('lego-simple-mx-train', 'rb') as train_fh:
        train_data = pickle.load(train_fh)
    with open('lego-simple-mx-test', 'rb') as test_fh:
        test_data = pickle.load(test_fh)
  2. Define some human-readable class_names for the image data:

    class_names = ['2x3 Brick', '2x2 Brick', '1x3 Brick', '2x1 Brick', '1x1 Brick', '2x2 Macaroni', '2x2 Curved End', 'Cog 16 Tooth', '1x2 Handles', '1x2 Grill']
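  3. Now that the data is loaded, we can convert the data to MXNet tensors and normalize it ahead of training:

    from mxnet.gluon.data.vision import transforms

    transformer = transforms.Compose([
        transforms.ToTensor(),  # tensor conversion (assumed; this line is missing from the listing)
        transforms.Normalize(0.13, 0.31)])
    train_data = train_data.transform_first(transformer)
    test_data = test_data.transform_first(transformer)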
  4. Let’s review some of the data to make sure it loaded correctly and there is no corruption in the images. First, let’s look at just the first image in our data:

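    import matplotlib.pyplot as plt

    train_image_no = 0
    images_data, label_data = train_data[train_image_no]
    plt.imshow(images_data[0].asnumpy())  # first channel; assumes channel-first tensors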
  5. Now, let’s look at the first 20 images in our data set. Do they look right?

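    for i in range(20):
        images_data, label_data = train_data[i]
        plt.subplot(4, 5, i + 1)  # 4 rows x 5 columns of thumbnails
        plt.imshow(images_data[0].asnumpy())  # first channel; assumes channel-first tensors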
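
The arithmetic behind the transform in step 3 can be sketched in plain Python. This is only an illustration, not the MXNet implementation. It assumes 8-bit pixel values (0 to 255) that are first scaled into the range 0 to 1, which is what Gluon's `transforms.ToTensor` does, before `Normalize(0.13, 0.31)` subtracts the mean and divides by the standard deviation. The helper names `to_unit_range` and `normalize` are hypothetical.

```python
def to_unit_range(pixels):
    """Scale 8-bit pixel values (0-255) to floats in the range 0-1."""
    return [p / 255.0 for p in pixels]


def normalize(values, mean=0.13, std=0.31):
    """Shift and scale each value: (x - mean) / std."""
    return [(v - mean) / std for v in values]


# Hypothetical row of pixels: black, mid-gray, white
row = [0, 128, 255]
normalized = normalize(to_unit_range(row))
print([round(v, 3) for v in normalized])
```

After this step, dark pixels end up slightly below zero and bright pixels well above it, which tends to suit gradient-based training better than raw 0 to 255 values.
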
Create an MXNet Model
  1. Define an artificial neural network using the Gluon API for MXNet:

    from mxnet.gluon import nn

    net = nn.HybridSequential(prefix='MLP_')
    with net.name_scope():
        net.add(
            nn.Dense(128, activation='relu'),
            nn.Dense(64, activation='relu'),
            nn.Dense(10, activation=None)
        )
  2. Create a data loader to manage the feeding of data into our model during training:

    import mxnet as mx

    batch_size = 34
    train_loader = mx.gluon.data.DataLoader(train_data, shuffle=True, batch_size=batch_size)
  3. Initialize the model (note we pass in a variable holding the processor type here):

    ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu(0)
    net.initialize(mx.init.Xavier(), ctx=ctx)
  4. Gluon provides a trainer object to maintain the state of the training. We create it here and use it in the training process:

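    from mxnet import gluon
    trainer = gluon.Trainer(
        net.collect_params(), 'sgd',  # 'sgd' is assumed; the listing omits the optimizer
        optimizer_params={'learning_rate': 0.04})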
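
What the stacked `Dense` layers compute can be sketched without MXNet. The following is a minimal illustration, not Gluon's implementation: each layer multiplies its input by a weight matrix and adds a bias, and the hidden layers then apply ReLU. The tiny layer sizes and hand-picked weights are hypothetical, chosen only so the arithmetic is easy to follow. The lab's network uses 128, 64, and 10 units with Xavier-initialized weights.

```python
def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one output per row of `weights`."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        if activation == 'relu':
            total = max(0.0, total)  # ReLU clips negative values to zero
        outputs.append(total)
    return outputs


# Hypothetical 3-input network: 2 hidden units, then 2 output scores
x = [1.0, 2.0, 3.0]
hidden = dense(x, [[0.5, -1.0, 0.0], [0.1, 0.1, 0.1]], [0.0, 0.0], activation='relu')
scores = dense(hidden, [[1.0, 1.0], [-1.0, 2.0]], [0.5, 0.0])
print(hidden, scores)
```

The final layer has no activation, so its outputs are raw scores, one per class, just like the `nn.Dense(10, activation=None)` layer above.
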
Train the MXNet Model
  1. We’re almost ready to train. First, we define the metric to use while we train and the loss function to use. Gluon provides a softmax loss function, so we just use that:

    metric = mx.metric.Accuracy()
    loss_function = gluon.loss.SoftmaxCrossEntropyLoss()
  2. Now, we train. We could write the following code into a fit function, but this inline code does the job:

    from mxnet import autograd

    num_epochs = 10
    history = []
    for epoch in range(num_epochs):
        for inputs, labels in train_loader:
            # Possibly copy inputs and labels to the GPU
            inputs = inputs.as_in_context(ctx)
            labels = labels.as_in_context(ctx)
            # Forward pass
            with autograd.record():
                outputs = net(inputs)
                loss = loss_function(outputs, labels)
            # Backpropagation
            loss.backward()
            metric.update(labels, outputs)
            # Update the weights
            trainer.step(batch_size=inputs.shape[0])
        # Record the evaluation metric and reset it for the next epoch
        name, acc = metric.get()
        history.append(acc)
        metric.reset()
        print('.', end='')
  3. During the training loop, we collected accuracy data in each epoch. Let’s graph this data to get a sense of how the training went:

    plt.figure(figsize=(7, 4))
    plt.title('Model accuracy')
    plt.plot(history)  # one accuracy value per epoch
    plt.show()
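
The quantity that `SoftmaxCrossEntropyLoss` minimizes can be illustrated for a single example in plain Python. This sketch is not the Gluon code. It assumes the loss's default behavior of taking raw, unnormalized scores together with an integer class label, and the function name `softmax_cross_entropy` is hypothetical.

```python
import math


def softmax_cross_entropy(scores, label):
    """Negative log of the softmax probability assigned to the true class."""
    # Subtract the maximum score first for numerical stability
    peak = max(scores)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum_exp - scores[label]


# Hypothetical scores for three classes; the true class is index 0
confident = softmax_cross_entropy([5.0, 0.0, 0.0], 0)
uncertain = softmax_cross_entropy([0.0, 0.0, 0.0], 0)
print(confident, uncertain)
```

A confident, correct prediction gives a loss close to zero, while equal scores across three classes give a loss of log(3). Training pushes the weights toward the first situation.
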
Evaluate and Test the MXNet Model

Now, we use the test data to measure accuracy. Is the accuracy on the test data much lower than the accuracy at the end of training? If so, our model might be overfitting.

  1. We use the Gluon data loader again, this time with the test data:

    test_loader = mx.gluon.data.DataLoader(test_data, shuffle=False, batch_size=batch_size)
  2. And measure the accuracy:

    metric = mx.metric.Accuracy()
    for inputs, labels in test_loader:
        # Possibly copy inputs and labels to the GPU
        inputs = inputs.as_in_context(ctx)
        labels = labels.as_in_context(ctx)
        metric.update(labels, net(inputs))
    print('Validation: {} = {}'.format(*metric.get()))
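
The accuracy metric itself is simple: it is the fraction of examples whose highest-scoring class matches the label. A minimal sketch, independent of MXNet and using made-up scores, with the hypothetical helper name `accuracy`:

```python
def accuracy(labels, outputs):
    """Fraction of rows in `outputs` whose largest score is at the labeled index."""
    correct = 0
    for label, scores in zip(labels, outputs):
        predicted = max(range(len(scores)), key=lambda k: scores[k])
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Made-up scores for four examples and three classes
outputs = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.1, 0.9], [0.7, 0.6, 0.2]]
labels = [1, 0, 2, 1]
print(accuracy(labels, outputs))  # 3 of the 4 predictions match
```

Comparing this value on the test set with the last training-epoch value is the overfitting check described above: a large gap suggests the model has memorized the training images.
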

Now, we can test the model.

  1. In order to make our tests look good, we define a couple of functions to display the results:

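    import numpy as np

    # Function to display the image:
    def plot_image(predictions_array, true_label, img):
      plt.xticks([])
      plt.yticks([])
      plt.imshow(img[0].asnumpy())  # first channel; assumes channel-first tensors
      predicted_label = np.argmax(predictions_array)
      # Green label for a correct prediction, red for a wrong one
      if predicted_label == true_label:
        color = 'green'
      else:
        color = 'red'
      # Print a label with 'predicted class', 'probability %', 'actual class'
      plt.xlabel("{} [{:2.0f}] ({})".format(class_names[predicted_label],
                                            100 * np.max(predictions_array),
                                            class_names[true_label]),
                 color=color)

    # Function to display the prediction results in a graph:
    def plot_value_array(predictions_array, true_label):
      plot = plt.bar(range(10), predictions_array, color="#777777")
      predicted_label = np.argmax(predictions_array)
      # Highlight the predicted class in red and the true class in blue
      plot[predicted_label].set_color('red')
      plot[true_label].set_color('blue')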
  2. Let’s test our model. Choose one of the images from our test set:

    prediction_image_number = 25
  3. Now make a prediction:

    prediction_image, prediction_label = test_data[prediction_image_number]
    predictions_single = net(prediction_image)
  4. Let’s display those results with our function defined earlier:

    plot_value_array(predictions_single[0].asnumpy(), prediction_label)
    plt.xticks(range(10), class_names, rotation=45)
  5. And which block should we have found? In other words, did we get it right?

    plot_image(predictions_single[0].asnumpy(), prediction_label, prediction_image)
  6. Now, let’s get prediction values for all the test images we have. This time we don’t use a data loader; we just iterate through the raw test image data:

    predictions = []
    test_labels = []
    for i in test_data:
        pred_image, pred_label = i
        p = net(pred_image)
        # Keep each prediction and its true label for the summary plot
        predictions.append(p)
        test_labels.append(pred_label)
  7. Finally, let’s use our helper functions to summarize the first 16 images in our test data. How did we do?

    num_rows = 8
    num_cols = 2
    num_images = num_rows*num_cols
    plt.figure(figsize=(15, 16))
    for i in range(num_images):
      plt.subplot(num_rows, 2*num_cols, 2*i+1)
      plot_image(predictions[i][0].asnumpy(), test_data[i][1], test_data[i][0])
      plt.subplot(num_rows, 2*num_cols, 2*i+2)
      plot_value_array(predictions[i][0].asnumpy(), test_data[i][1])
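
Because the last layer is defined with `activation=None`, the values the network returns are raw scores rather than probabilities. If percentages are wanted for the labels, a softmax can be applied first. The sketch below shows one way to do that in plain Python. The scores are made up, the class list is shortened, and `softmax` is a hypothetical helper, not part of the lab's notebook.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ['2x3 Brick', '2x2 Brick', '1x3 Brick']  # shortened list for illustration
raw_scores = [2.0, 0.5, -1.0]  # made-up network outputs
probabilities = softmax(raw_scores)
best = max(range(len(probabilities)), key=lambda k: probabilities[k])
print('{} [{:2.0f}%]'.format(class_names[best], 100 * probabilities[best]))
```

The predicted class is the same with or without the softmax, since it preserves the ordering of the scores. Only the displayed number changes.
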

Additional Resources

This is a follow-along lab. The videos will walk through each of the steps, including loading the Jupyter Notebook server from AWS SageMaker.

Please make sure you are in the us-east-1 (N. Virginia) region when in the AWS console.

The files used in this lab can be found on GitHub.

At the end of the lab videos, take the rest of the time available for your own experimentation.

