Setting Up Google Cloud Text-to-Speech

45 minutes
  • 5 Learning Objectives

About this Hands-on Lab

Although there are many tools for communicating, there are two basic types of communication: written and spoken. Machine learning has made it possible to convert one to the other, producing speech that is very close to a human voice. In this hands-on lab, you’ll step through the process of using the Google Cloud Text-to-Speech API to transform text supplied in a JSON request into an MP3 audio file.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Enable the Cloud Text-to-Speech API
  1. From the Google Cloud console’s main navigation, choose APIs & Services > Library.
  2. Search for "text", and select Cloud Text-to-Speech API.
  3. If necessary, click Enable.
Set Up Service Account
  1. From the main navigation, choose IAM & Admin > Service Accounts.
  2. Click Create Service Account.
  3. Enter a name for the service account (ai-text-to-speech), and click Create.
  4. Skip choosing a role, and click Continue.
  5. Click Create Key.
  6. Set the Key type to JSON, and click Create.
  7. Save the key to your system, and click Close.
  8. Click Done.
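
The console steps in the two sections above also have command-line equivalents. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated against the lab project (as it is in Cloud Shell). The function names and the SA_NAME variable are hypothetical helpers for illustration; the gcloud commands themselves are standard. The email address format assumes an ordinary (not domain-scoped) project ID.

```shell
# Assumption: gcloud is installed and pointed at the lab project.
# SA_NAME matches the service account name used in the console steps.
SA_NAME="ai-text-to-speech"

# Hypothetical helper: enable the Cloud Text-to-Speech API for the active project.
enable_tts_api() {
  gcloud services enable texttospeech.googleapis.com
}

# Hypothetical helper: create the service account and download a JSON key.
create_service_account_key() {
  # Look up the active project ID to build the service account's email address.
  local project_id
  project_id="$(gcloud config get-value project)"

  # Create the service account without granting it any role.
  gcloud iam service-accounts create "$SA_NAME" --display-name="$SA_NAME"

  # Create a JSON private key and save it as key.json in the current directory.
  gcloud iam service-accounts keys create key.json \
    --iam-account="${SA_NAME}@${project_id}.iam.gserviceaccount.com"
}
```

Calling enable_tts_api and then create_service_account_key from the working directory would leave a key.json file in place, which makes the later copy-and-paste of the key unnecessary.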
Retrieve Working Files
  1. Activate the Cloud Shell.

  2. Retrieve the working files:

    git clone https://github.com/linuxacademy/content-gc-ai-services-deepdive
  3. In the Cloud Shell, change directories:

    cd content-gc-ai-services-deepdive/ai-conversations/
  4. Click Launch Editor.

  5. In the Shell Editor, expand the ai-conversations folder and open text-to-speech-request.json.

  6. Review the code.

  7. In your system code editor, open the stored JSON private key and copy the contents.

  8. In the Shell Editor, choose File > New File and name the file key.json.

  9. Paste the clipboard contents into the new file, and choose File > Save.
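
For orientation when reviewing the request file in step 6, here is a sketch of what a Text-to-Speech synthesize request body generally looks like. The field names (input, voice, audioConfig) follow the v1 text:synthesize API; the text and voice settings are illustrative placeholders and are not taken from the repository's file. The sample is written to a separate, hypothetical file name so the lab's own request file is left untouched.

```shell
# Write a sample request body to its own file (hypothetical name) so that
# text-to-speech-request.json from the repository is not overwritten.
# Assumption: the text and voice values below are placeholders for illustration.
cat > sample-request.json <<'EOF'
{
  "input": {
    "text": "Hello from Cloud Text-to-Speech."
  },
  "voice": {
    "languageCode": "en-US",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
EOF
```

The audioEncoding value of MP3 is what lets the decoded response be saved directly as an .mp3 file later in the lab.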

Send Request to Cloud Text-to-Speech API
  1. In the Cloud Shell, enter the following command:

    export GOOGLE_APPLICATION_CREDENTIALS=key.json
  2. Call the Cloud Text-to-Speech API:

    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @text-to-speech-request.json \
      https://texttospeech.googleapis.com/v1/text:synthesize \
      > synthesize-text.txt
  3. In the Shell Editor, open synthesize-text.txt.

  4. Remove the following from the beginning of the file:

    {
      "audioContent": "
  5. From the end of the file, remove the following:

    "
    }
  6. Save the file.

Convert Response to MP3
  1. In the Cloud Shell, enter the following command:

    base64 --decode synthesize-text.txt > synthesized-audio.mp3
  2. Download the MP3 file to your system:

    cloudshell download synthesized-audio.mp3
  3. Click Download.

  4. Open the downloaded MP3 file to hear the results.
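
The manual editing of the response file and the decoding step can also be combined into a single command. The sketch below assumes the unedited API response, in which the "audioContent" key and its base64 value sit on one line; extract_audio is a hypothetical helper name, not part of the lab files.

```shell
# Hypothetical helper: pull the base64 string out of the API response and decode it.
# Usage: extract_audio RESPONSE_FILE OUTPUT_FILE
extract_audio() {
  # sed prints only the value of the "audioContent" field. Base64 text never
  # contains a double quote, so [^"]* captures the whole payload.
  sed -n 's/.*"audioContent"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" \
    | base64 --decode > "$2"
}
```

Run against an unedited response, for example extract_audio synthesize-text.txt synthesized-audio.mp3, this would replace the hand edits from the previous section. If the response holds an error message instead of audio, the output file is simply empty.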

Additional Resources

Your company wants to improve its accessibility on a number of levels, including converting text documents to audio output. You've been asked to run an initial test using the Google Cloud Text-to-Speech API to validate the procedure.

You’ll need to accomplish the following steps to complete your task:

  1. Enable the Cloud Text-to-Speech API.
  2. Set up a service account.
  3. Store the JSON private key.
  4. Retrieve the working files.
  5. Send a request to the Cloud Text-to-Speech API.
  6. Convert the response to MP3.

