Troubleshooting GKE Deployments

30 minutes
  • 4 Learning Objectives

About this Hands-on Lab

You have taken over as your company’s top GKE deployment wizard, but your predecessor has left you only very brief notes on how to deploy any of the company’s required applications and workloads. Through this lab, you will run some basic deployments on GKE and then troubleshoot the inevitable errors that occur, including `ErrImagePull` and `CrashLoopBackOff`.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Create and Connect to a GKE Cluster
  1. From the GCP menu, select Kubernetes Engine.
  2. Wait for the API to be enabled. Then click Create cluster.
  3. Under Node Pools on the left, click default-pool.
  4. Under Size, change the number of nodes to "1".
  5. In Node Pools > default-pool, click Nodes and change the machine type to e2-small.
  6. Click Create to create the cluster. After a few minutes you will see a green tick that shows that your cluster is ready.
  7. Click Connect next to your cluster. Then under Command-line access, click Run in Cloud Shell.
  8. When the Cloud Shell terminal has spawned, hit Return to run the command and click Authorize when prompted. The rest of this lab’s objectives will be completed in the Cloud Shell terminal using kubectl.
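
The console steps above can also be approximated from Cloud Shell with gcloud. This is only a sketch: `create_lab_cluster` is a hypothetical helper, and the cluster name and zone are your own choice because the lab does not specify them.

```shell
# Hypothetical helper: create a one-node e2-small cluster, then fetch its
# credentials so that kubectl talks to it.
# $1 = cluster name, $2 = zone (both are assumptions, not given by the lab).
create_lab_cluster() {
  gcloud container clusters create "$1" --zone "$2" \
    --num-nodes 1 --machine-type e2-small &&
  gcloud container clusters get-credentials "$1" --zone "$2"
}
```

For example, `create_lab_cluster lab-cluster us-central1-a`, where both arguments are placeholders.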

Solve a CrashLoopBackOff Problem

Your predecessor left only these instructions to run a MySQL Pod using kubectl:

kubectl run mysql --image=mysql

After a minute or so, if you check the Pod’s status, it will show an unhealthy Pod, typically in the `CrashLoopBackOff` state.

What could be the cause of the problem?

If you check the Pod logs, you will see that the mysql container needs at least one environment variable to start successfully: one of `MYSQL_ROOT_PASSWORD`, `MYSQL_ALLOW_EMPTY_PASSWORD`, or `MYSQL_RANDOM_ROOT_PASSWORD` must be set.
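
The checks described above could look something like this. `inspect_pod` is a hypothetical helper name; the three kubectl commands inside it are standard.

```shell
# Hypothetical helper: show status, events, and container output for a Pod.
# $1 = Pod name.
inspect_pod() {
  kubectl get pod "$1"        # STATUS column shows states such as CrashLoopBackOff
  kubectl describe pod "$1"   # Events section at the bottom lists restarts and failures
  kubectl logs "$1"           # output written by the container itself
}
```

Here you would run `inspect_pod mysql`. The status and events tell you that the container keeps restarting, while the logs tell you why.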

Delete the Pod and re-create it with a root password for the MySQL server, passed in as an environment variable.
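
One way to do this, as a sketch: `recreate_mysql` is a hypothetical helper, and it assumes you pick `MYSQL_ROOT_PASSWORD` (one of the variables the official mysql image accepts) and supply your own password value.

```shell
# Hypothetical helper: replace the failing mysql Pod with one that has a
# root password set through the --env flag of kubectl run.
# $1 = password of your choosing (do not reuse a real credential in a lab).
recreate_mysql() {
  kubectl delete pod mysql
  kubectl run mysql --image=mysql --env="MYSQL_ROOT_PASSWORD=$1"
}
```

For example, `recreate_mysql 'choose-a-password'`, followed by `kubectl get pod mysql` to watch the status.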

When you check the Pod’s status, it should now show the Pod in the Running state.

Delete this Pod before continuing to the next objective.

Solve an ErrImagePull Problem

Your predecessor left you a note saying that the company uses only the latest cutting-edge version 3.0 of the NGINX web server. You suspect he is playing a joke on you. Nevertheless, you should attempt to create the Pod.

Quite quickly, if you check the Pod’s status and events, they will show you that this Pod can’t run due to an `ErrImagePull` error. There is no version 3.0 of NGINX, so there is no nginx:3.0 container image to pull.
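
The attempt and the check described above might be run as follows. This is a sketch: the Pod name `nginx` is an assumption (the lab does not name the Pod), and `try_nginx_3` is a hypothetical helper.

```shell
# Hypothetical helper: create the Pod from the nonexistent image tag, then
# list its status and the events recorded for it.
try_nginx_3() {
  kubectl run nginx --image=nginx:3.0
  kubectl get pod nginx
  kubectl get events --field-selector involvedObject.name=nginx
}
```

Container logs are of little use here because the container never starts; the pull failure is reported in the Pod’s status and events instead.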

You can fix this by editing the Pod and correcting the image to nginx:latest (the `image` field under `spec.containers`).

Assuming the Pod opened in the default vi editor, press Esc to leave insert mode. Then type `:wq` and press Return to save and quit.
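
If you would rather not use an interactive editor, `kubectl set image` can make the same change. A sketch, assuming the Pod was created with `kubectl run nginx ...`, so that both the Pod and its single container are named `nginx`; `fix_nginx_image` is a hypothetical helper.

```shell
# Hypothetical helper: point the nginx container at an image tag that exists,
# then show the Pod status so you can confirm it reaches Running.
fix_nginx_image() {
  kubectl set image pod/nginx nginx=nginx:latest
  kubectl get pod nginx
}
```

The interactive route is `kubectl edit pod nginx`, which opens the live Pod manifest in your editor.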

Your NGINX Pod should shortly be up and running.

Delete this Pod before continuing to the next objective.

Experience a Pending Pod Problem

For this step, we need to download the YAML file for our deployment. Run the following command to download the file:

curl -O https://raw.githubusercontent.com/ACloudGuru-Resources/Google-Cloud-Professional-Cloud-Developer/main/labs/troubleshooting-gke/nginx-deployment.yml

Using the downloaded deployment file, you first need to create an NGINX deployment and then scale up the deployment to 5 replicas.
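
A sketch of these two steps. Because the deployment’s name is defined inside the downloaded file and is not stated here, this version scales by file reference rather than by name; `deploy_and_scale` is a hypothetical helper.

```shell
# Hypothetical helper: create the deployment from the downloaded manifest,
# then scale the resource defined in that same file.
# $1 = desired number of replicas.
deploy_and_scale() {
  kubectl apply -f nginx-deployment.yml
  kubectl scale -f nginx-deployment.yml --replicas="$1"
}
```

For this lab you would call `deploy_and_scale 5`, then run `kubectl get pods` to check on the replicas.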

After a minute or so, check on the replicas. You’ll see that one or two of the Pods are stuck in a Pending state.

First, describe the deployment to check that it was created and scaled without errors. If there are no errors in the deployment’s event log, go back to the list of Pods and pick one in a Pending state to describe.

In the event log, we’ll see why this pod cannot be scheduled. There simply isn’t enough available CPU in your cluster to schedule the extra pods. Looks like your predecessor was also joking about the recommended cluster sizing!
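
To avoid copying Pod names by hand, the Pending Pods can be selected with a field selector and described in a loop. A sketch; `describe_pending` is a hypothetical helper, and it describes every Pending Pod in the current namespace rather than just one.

```shell
# Hypothetical helper: describe each Pod whose phase is Pending.
# "-o name" prints entries such as pod/<name>, which kubectl describe accepts.
describe_pending() {
  for pod in $(kubectl get pods --field-selector=status.phase=Pending -o name); do
    kubectl describe "$pod"
  done
}
```

Look at the Events section at the end of each description for the scheduler’s reason.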

Increasing the node size or adding more nodes to the cluster will fix this problem.
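
Adding nodes could be done from Cloud Shell along these lines. A sketch only: `add_nodes` is a hypothetical helper, the cluster name and zone are whatever you chose when creating the cluster, and gcloud may ask you to confirm the resize.

```shell
# Hypothetical helper: resize the default node pool of an existing cluster.
# $1 = cluster name, $2 = zone, $3 = new node count (all supplied by you).
add_nodes() {
  gcloud container clusters resize "$1" --zone "$2" \
    --node-pool default-pool --num-nodes "$3"
}
```

For example, `add_nodes lab-cluster us-central1-a 2`, with placeholder name and zone. Once the new node is ready, the Pending Pods should be scheduled onto it.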

Additional Resources

To get started, log in to Google Cloud Platform by opening https://console.cloud.google.com/ in a private browser window. Then sign in using the credentials provided on the lab page.
