In this lab, we are going to use BigQuery to view a public dataset with sample billing data from a Google Cloud organization.
Note: Right-click the green “Open GCP Window” button and choose “Open Link in Incognito Window.” If the legal agreement hangs for more than a minute, refresh the page in your browser.
Successfully complete this lab by achieving the following learning objectives:
- Launch BigQuery
- From the top left menu, scroll down to Big Data, and select BigQuery.
- Run queries and view results
Now that we are in BigQuery, let’s look at the sample dataset we are going to work with. We will start by selecting every column in the example table to see which fields it contains. Paste the following query into the large Query editor box, then click the Run button:
SELECT * FROM `cloud-training-prod-bucket.arch_infra.billing_data`
The name after FROM, `cloud-training-prod-bucket.arch_infra.billing_data`, is the table from the public dataset we are working with.
If we click the Results tab underneath, we can view the entire table we are going to work with. Feel free to experiment with other queries, such as sorting by cost or usage amount. To sort by the column of your choice, add an ORDER BY clause to the end of the query:
SELECT * FROM `cloud-training-prod-bucket.arch_infra.billing_data` ORDER BY cost DESC
In this query, we are bringing up the entire table contents, but sorting by the highest cost first. You can experiment with other fields as well.
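To try several sort columns without retyping the query, the text can be generated programmatically. The following is a minimal sketch: the helper name `build_sorted_query` and the `SORTABLE_COLUMNS` set are hypothetical, and the column names are taken from the queries used in this lab. It only builds the query string, which you would then paste into the Query editor.

```python
# Table name as used throughout this lab.
TABLE = "cloud-training-prod-bucket.arch_infra.billing_data"

# Assumption: a few of the columns named in this lab's queries.
SORTABLE_COLUMNS = {"cost", "usage_amount", "start_time", "end_time", "product"}


def build_sorted_query(column: str, descending: bool = True) -> str:
    """Return a SELECT * query sorted by one column (hypothetical helper)."""
    # Reject anything outside the known column list so that arbitrary
    # text never ends up inside the SQL string.
    if column not in SORTABLE_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    direction = "DESC" if descending else "ASC"
    return f"SELECT * FROM `{TABLE}` ORDER BY {column} {direction}"


# Largest usage amounts first.
print(build_sorted_query("usage_amount"))
# Earliest records first.
print(build_sorted_query("start_time", descending=False))
```

DESC lists the largest values first; ASC (the default in SQL when no direction is given) lists the smallest first.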
Let’s now do some specific queries. In the same Query editor box, delete the existing contents, and enter the below query to find all charges that were more than 3 dollars:
SELECT product, resource_type, start_time, end_time, cost, project_id, project_name, project_labels_key, currency, currency_conversion_rate, usage_amount, usage_unit FROM `cloud-training-prod-bucket.arch_infra.billing_data` WHERE (cost > 3)
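Next, let’s find which product had the highest total number of records. Sorting by the count in descending order puts that product on the first row:
SELECT product, COUNT(*) AS record_count FROM `cloud-training-prod-bucket.arch_infra.billing_data` GROUP BY product ORDER BY record_count DESC LIMIT 200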
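A GROUP BY result comes back in no guaranteed order, so the product with the most records is only reliably on top when the query sorts by the count. The sketch below shows the pattern offline, using Python's built-in sqlite3 module as a stand-in for BigQuery. The rows are made up for illustration and are not values from the lab's table.

```python
import sqlite3

# Made-up rows for illustration only; NOT data from the lab's table.
rows = [
    ("Pub/Sub", 0.5),
    ("Pub/Sub", 0.2),
    ("Pub/Sub", 1.5),
    ("Compute Engine", 4.0),
    ("Compute Engine", 2.0),
    ("Cloud Storage", 0.1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing_data (product TEXT, cost REAL)")
conn.executemany("INSERT INTO billing_data VALUES (?, ?)", rows)

# Count records per product and list the largest count first.
record_counts = conn.execute(
    """
    SELECT product, COUNT(*) AS record_count
    FROM billing_data
    GROUP BY product
    ORDER BY record_count DESC
    """
).fetchall()

for product, record_count in record_counts:
    print(product, record_count)
```

The same GROUP BY and ORDER BY clauses work unchanged in BigQuery; only the table name differs.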
Looks like Pub/Sub is pretty popular here…
Finally, let’s see which product most frequently cost more than a dollar:
SELECT product, COUNT(*) AS charge_count FROM `cloud-training-prod-bucket.arch_infra.billing_data` WHERE (cost > 1) GROUP BY product ORDER BY charge_count DESC LIMIT 200
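The columns listed in GROUP BY decide what is being counted. Grouping by product alone counts charges per product, while grouping by both cost and product counts rows for each distinct (cost, product) pair. The sketch below compares the two on a few made-up rows, again using Python's built-in sqlite3 module as a stand-in for BigQuery. The data is hypothetical and not from the lab's table.

```python
import sqlite3

# Made-up rows for illustration only; NOT data from the lab's table.
rows = [
    ("Compute Engine", 2.0),
    ("Compute Engine", 2.0),
    ("Compute Engine", 4.0),
    ("Pub/Sub", 1.5),
    ("Pub/Sub", 0.5),  # filtered out by cost > 1
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing_data (product TEXT, cost REAL)")
conn.executemany("INSERT INTO billing_data VALUES (?, ?)", rows)

# One row per product: how many charges exceeded 1.
per_product = conn.execute(
    "SELECT product, COUNT(*) AS charge_count FROM billing_data "
    "WHERE cost > 1 GROUP BY product ORDER BY charge_count DESC"
).fetchall()

# One row per distinct (cost, product) pair: the same charges, split by amount.
per_cost_and_product = conn.execute(
    "SELECT product, cost, COUNT(*) AS charge_count FROM billing_data "
    "WHERE cost > 1 GROUP BY cost, product "
    "ORDER BY charge_count DESC, cost DESC"
).fetchall()

print(per_product)
print(per_cost_and_product)
```

With these sample rows, the per-product query reports three charges for Compute Engine in a single row, whereas the (cost, product) grouping splits them into a row of two and a row of one.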