Big Query (bq cli)
This guide shows how to create a layer of control on top of the bq command line
Requirements
Big Query Setup
- Setup the Bigquery following the topics bellow
- Create a service account with the following permissions:
- BigQuery User
- BigQuery Data Viewer
- Create a JSON type service account key and store in a safe place
For the sake of this guide, we’re following the gcp guide to interact with bigquery.
In case you already have a testing dataset, you can skip this part
Reference: https://cloud.google.com/iam/docs/creating-managing-service-accounts
Hoop Setup
In your local machine, start hoop
shellhoop start
Interacting with Datasets
Create the
bq query
query ConnectionAdd the JSON service account into the
KEY_FILE
inputThis will allow interacting with the bq command line
- Try to execute a query
shellhoop exec -v bqquery -i 'SELECT * FROM babynames.names_2014 LIMIT 1'
Interacting with Jobs
Create a
bq head
connection- Try to fetch it with
bq head -j <session-id>
The previous command was adding the session id as the job id of executed queries on bigquery, thus we could use the previous session id to fetch it.
shellhoop exec bqhead -- -j <PREVIOUS_SESSION_ID>
More Patterns
The bq command line has other actions that could be used as patterns with Hoop.
Allow jobs to be async
shell$ bq query --use_legacy_sql=false --nosynchronous_mode
This allow the pattern of consulting the previous job with
bq head --job_id ...
. It’s useful for queries that take too long to execute.Dry-run commands
Dry run allows showing how much of bytes will be consumed
shell$ bq query --use_legacy_sql=false 'SELECT * FROM babynames.names_2014 limit 5' --dry_run Query successfully validated. Assuming (...), running this query will process 654791 bytes of data.
A plugin could allow queries that has passed by the
--dry-run
mode.Command Line Quota
bq allows specifying the maximum bytes billed, this option could be used as a retriction mechanism
shell$ bq query \ --use_legacy_sql=false \ --maximum_bytes_billed 1 \ 'SELECT * FROM babynames.names_2014 limit 100;' BigQuery error in query operation: Error processing job 'myproject:bqjob_r107f63c4f31d3d37_000001860d50d5f8_1': Query exceeded limit for bytes billed: 1. 10485760 or higher required.