logo

BigQuery

Big Query (bq cli)

This guide shows how to create a layer of control on top of the bq command line

Requirements

Big Query Setup

  1. Setup the Bigquery following the topics bellow
  1. Create a service account with the following permissions:
  • BigQuery User
  • BigQuery Data Viewer
  1. Create a JSON type service account key and store in a safe place
For the sake of this guide, we’re following the gcp guide to interact with bigquery.
In case you already have a testing dataset, you can skip this part
Reference: https://cloud.google.com/iam/docs/creating-managing-service-accounts

Hoop Setup

In your local machine, start hoop
shell
hoop start

Interacting with Datasets

Create the bq query query Connection
Add the JSON service account into the KEY_FILE input
This will allow interacting with the bq command line
  1. Try to execute a query
shell
hoop exec -v bqquery -i 'SELECT * FROM babynames.names_2014 LIMIT 1'

Interacting with Jobs

Create a bq head connection
  1. Try to fetch it with bq head -j <session-id>
The previous command was adding the session id as the job id of executed queries on bigquery, thus we could use the previous session id to fetch it.
shell
hoop exec bqhead -- -j <PREVIOUS_SESSION_ID>

More Patterns

The bq command line has other actions that could be used as patterns with Hoop.
Allow jobs to be async
shell
$ bq query --use_legacy_sql=false --nosynchronous_mode
This allow the pattern of consulting the previous job with bq head --job_id .... It’s useful for queries that take too long to execute.
Dry-run commands
Dry run allows showing how much of bytes will be consumed
shell
$ bq query --use_legacy_sql=false 'SELECT * FROM babynames.names_2014 limit 5' --dry_run Query successfully validated. Assuming (...), running this query will process 654791 bytes of data.
A plugin could allow queries that has passed by the --dry-run mode.
Command Line Quota
bq allows specifying the maximum bytes billed, this option could be used as a retriction mechanism
shell
$ bq query \ --use_legacy_sql=false \ --maximum_bytes_billed 1 \ 'SELECT * FROM babynames.names_2014 limit 100;' BigQuery error in query operation: Error processing job 'myproject:bqjob_r107f63c4f31d3d37_000001860d50d5f8_1': Query exceeded limit for bytes billed: 1. 10485760 or higher required.

Powered by Notaku