🔌Setting up a Google Big Query Connector

Overview

The Google Big Query connector allows you to connect your Google Big query cloud account to Toucan.

Configuring the Google Big Query connector in Toucan

Prerequisites

You will need to have access to a Google Cloud Platform account with a project you would like to use in Toucan. Consult Google Cloud Platform documentation for how to create and manage a project. This project should have a BigQuery dataset for Toucan to connect to.

Google Cloud Platform: creating a service account and JSON file

To enable Toucan to access your BigQuery dataset, you will first require a service account JSON file. Service accounts are designed for non-human users, such as applications like Toucan, to authenticate and authorize their API requests.

Here's a step-by-step process for creating the service account JSON file, as outlined in Google's documentation for setting up a service account for your BigQuery dataset:

  1. Create a Service Account:

    • Access your Google Cloud Platform project console.

    • In the main sidebar menu on the left, navigate to the IAM & Admin section.

    • Select "Service account" Existing service accounts, if any, will be listed.

    • At the top of the screen, click on "+ CREATE SERVICE ACCOUNT."

  2. Fill Out Service Account Details:

    • Provide a name for the service account.

    • Add a description (the service account ID will be generated once you provide a name).

    • Click the "Create" button to create the service account.

  3. Grant Access to the Service Account:

    • To allow Toucan to view and run queries on your dataset, you need to assign roles to the service account.

    • Ensure that you assign the following roles to the service account:

      • BigQuery Data Viewer

      • BigQuery Metadata Viewer

      • BigQuery Job User (distinct from BigQuery User)

    • For more detailed information on roles in BigQuery, consult Google Cloud Platform's documentation.

  4. Create a Key:

    • Once you've assigned the necessary roles to the service account, click on the "Create Key" button.

    • Choose JSON as the key type.

    • The JSON file containing the credentials will be downloaded to your computer.

The key can be downloaded only once. If you delete it, you will need to create a new service account with identical roles to obtain another key.

Configuring the Google Big Query connector with Service account parameters

  • Click on Add connector

  • Fill out the form with the following info:

  • NAME: a name to retrieve your data connector such as "my-google_big_query_connector"

  • In GOOGLECREDENTIALS part, use the information, you will find in your service_account json file you have generated earlier:

    • SERVICE ACCOUNT : service_account

    • PROJECT ID : paste the project id value you will find in your service account json file

    • PRIVATE KEY ID : paste the private key id value you will find in your service account json file

    • PRIVATE KEY : paste the private key value you will find in your service account json file

    • CLIENT EMAIL : paste the client email value you will find in your service acount json file

    • CLIENT ID : paste the client id value you will find in your service account json file

    • AUTHENTICATION URI : paste the authentication URI value you will find in your service account json file

    • TOKEN URI : paste the token URI value you will find in your service account json file

    • AUTHENTICATION PROVIDER X509 CERTIFICATE URL : paste the authentication provider X509 certificate URL value you will find in your service account json file

    • CLIENT X509 CERTIFICATION URL : paste the client X509 certification URL value you will find in your service account json file

  • Dialect: Tick this case if you want to select a specific dialect used by your server between legacy and standard as query standard more information on this documentation. By default we use standard standard

  • OAuth scopes: OAuth 2.0 scopes define the level of access you need to request the Google APIs for more information see this documentation

Once you have filled all the fields, you can click on SAVE

Configuring the Google Big Query connector with JWT

We offer an alternative way to connect to Google Big Query where you can craft a JWT token and transmit this JWT token to Toucan, where it will be used to access your Google Big Query dataset.

Craft a JWT token
# the following code generate a JWT using using google.auth library
import time

# pip install google-api-python-client 
from google.auth  import crypt, jwt

def generate_jwt(
    sa_keyfile, // the path to service account json
    sa_email="account@project-id.iam.gserviceaccount.com",
    audience="your-service-name",
    expiry_length=3600,
):
    '''Generates a signed JSON Web Token using a Google API Service Account.''' 

    now = int(time.time())

    # build payload
    payload = {
        "iat": now,
        "exp": now + expiry_length,
        "iss": sa_email,
        "aud": audience,
        "sub": sa_email,
        "email": sa_email,
    }

    # sign with keyfile
    signer = crypt.RSASigner.from_service_account_file(sa_keyfile)
    jwt_token = jwt.encode(signer, payload)

    return jwt_token

Once you have generated the token.

  • Click on Add connector

  • Click on JWTCredentials

  • In JWTCredentials, you have two fields to fill:

    • PROJECT ID: corresponding to the id of your Google Big Query project

    • JSON WEB TOKEN (JWT) SIGNED: corresponding to your JWT token you obtain by signing it with your service account json

  • Once you have filled the fields, you can click on SAVE

If you use Toucan by embedding it in other software. You could set up the two credentials sections (GoogleCredentials and JWTCredentials) and use variables (for example {{user.secrets.JWT}} and {{user.secrets.project_id}} to set up the JWTCredentials.

This way, you can design your application from a Google Big Query project linked to test data. And in production, you could pass your JWT and the project id linked to your user's production data into an embedContext. They will be interpolated by the values Toucan finds in your token.

Selecting data from Google Big Query

  • Select a DB_schema

  • Select a Table

  • Only keep the columns you need

For more info, see the dedicated section Creating datasets

After selecting data from your connector you will be able to create a dataset thanks to YouPrep using the selection as "source step".

Last updated