Although there are many tools for communicating, communication comes in two basic forms: written and spoken. Machine learning has made it possible to convert one into the other, and text can now be synthesized into speech that sounds very close to a human voice. In this hands-on lab, you’ll step through the process of using the Google Cloud Text-to-Speech API to transform text in a JSON request into a playable MP3 file.
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Enable the Cloud Text-to-Speech API
- From the Google Cloud console’s main navigation, choose APIs & Services > Library.
- Search for "text", and select Cloud Text-to-Speech API.
- If necessary, click Enable.
- Set Up Service Account
- From the main navigation, choose IAM & Admin > Service Accounts.
- Click Create Service Account.
- Enter a name for the service account (ai-text-to-speech), and click Create.
- Skip choosing a role, and click Continue.
- Click Create Key.
- Set the Key type to JSON, and click Create.
- Save the key to your system, and click Close.
- Click Done.
- Retrieve Working Files
Activate the Cloud Shell.
Retrieve the working files:
git clone https://github.com/linuxacademy/content-gc-ai-services-deepdive
In the Cloud Shell, change directories:
cd content-gc-ai-services-deepdive/ai-conversations/
Click Launch Editor.
In the Shell Editor, expand the ai-conversations folder and open text-to-speech-request.json.
Review the code.
In a code editor on your local system, open the downloaded JSON private key and copy its contents.
In the Shell Editor, choose File > New File and name the file key.json.
Paste the clipboard contents into the new file, and choose File > Save.
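The request file in the repository is not reproduced here, but a body for the v1 text:synthesize method generally has three parts: the input text, the voice selection, and the audio configuration. As a rough sketch, the following writes a comparable body to a hypothetical file named sample-request.json. The sample text and the en-US voice settings are assumptions and may differ from what text-to-speech-request.json actually contains.

```shell
# Write an illustrative request body to a hypothetical file (sample-request.json).
# The text and voice values below are assumptions, not the lab file's contents.
cat > sample-request.json <<'EOF'
{
  "input": { "text": "Hello from the Cloud Text-to-Speech API." },
  "voice": { "languageCode": "en-US", "ssmlGender": "FEMALE" },
  "audioConfig": { "audioEncoding": "MP3" }
}
EOF
```

Setting audioEncoding to MP3 is what allows the decoded response to be played as an MP3 file later in the lab.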
- Send Request to Cloud Text-to-Speech API
In the Cloud Shell, point the credentials environment variable at the key file (the relative path assumes key.json is in the current directory):
export GOOGLE_APPLICATION_CREDENTIALS=key.json
Call the Cloud Text-to-Speech API:
curl -X POST -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" -H "Content-Type: application/json; charset=utf-8" -d @text-to-speech-request.json https://texttospeech.googleapis.com/v1/text:synthesize > synthesize-text.txt
In the Shell Editor, open synthesize-text.txt.
Remove the following from the beginning of the file:
{ "audioContent": "
From the end of the file, remove the following:
" }
Save the file.
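The manual edit above can also be scripted. The following is a minimal sketch, assuming the response keeps the whole audioContent value on a single line, as in the snippets shown above. The name extract_audio_content is a hypothetical helper, not part of the lab files or the gcloud tooling.

```shell
# Hypothetical helper: print only the Base64 value of "audioContent"
# from a Text-to-Speech JSON response file passed as the first argument.
# Assumes the key and its value sit on one line; Base64 never contains
# a double quote, so the value ends at the next quote character.
extract_audio_content() {
  sed -n 's/.*"audioContent"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

For example, running extract_audio_content synthesize-text.txt > audio-base64.txt would write only the Base64 text to a hypothetical audio-base64.txt and leave the original response untouched. If the API returned an error instead of audio, the helper prints nothing.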
- Convert Response to MP3
In the Cloud Shell, enter the following command:
base64 --decode synthesize-text.txt > synthesized-audio.mp3
Download the MP3 file to your system:
cloudshell download synthesized-audio.mp3
Click Download.
Open the downloaded MP3 file to hear the results.
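If the file does not play, a quick look at its first bytes can tell whether the decode step produced MP3 data at all. The sketch below assumes a bash shell such as the Cloud Shell. The name looks_like_mp3 is a hypothetical helper that accepts either an ID3 tag or an MPEG frame sync (eleven set bits) at the start of the file. It is only a heuristic, not a full validation.

```shell
# Hypothetical helper: succeed if the file starts like MP3 data.
looks_like_mp3() {
  local head_hex
  # First three bytes as lowercase hex with no separators, e.g. "494433".
  head_hex=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
  case "$head_hex" in
    494433*)   return 0 ;;  # ASCII "ID3": an ID3v2 tag precedes the audio
    ffe*|fff*) return 0 ;;  # 0xFF plus top three bits of the next byte set: frame sync
    *)         return 1 ;;  # anything else, such as leftover JSON or an empty file
  esac
}
```

Running looks_like_mp3 synthesized-audio.mp3 && echo "looks like MP3" in the Cloud Shell before downloading is one way to use it. A failure usually points back to leftover JSON characters in synthesize-text.txt.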