๐Add a Google Big Query connector
Connector features
You can use the Toucan Google Big query connector to connect to Google Big Query with a service account authentication or a JWT token and access tables or views with a SQL query or by using our no-code form .
With this connection, you can fetch data from your Google Big Query to fill your charts and dashboards.
Changelog December 23 - This connector is NativeSQL compatible November 2024 - This connector supports hybrid pipelines \
July 25 - This data connector is supported for connection and NativeSQL by our new Data Execution system
Configuring a Google Big Query connection
We offer two ways to connect to Google Big Query:
with the information you can find in the service account json
or by crafting a JWT token a JWT token and transmit this JWT token to Toucan, where it will be used to access your Google Big Query dataset.
Prerequisites when using service_account to authenticate
You will need to have access to a Google Cloud Platform account with a project you would like to use in Toucan. Consult Google Cloud Platform documentation for how to create and manage a project. This project should have a BigQuery dataset for Toucan to connect to.
Google Cloud Platform: creating a service account and JSON file
To enable Toucan to access your BigQuery dataset, you will first require a service account JSON file. Service accounts are designed for non-human users, such as applications like Toucan, to authenticate and authorize their API requests.
Here's a step-by-step process for creating the service account JSON file, as outlined in Google's documentation for setting up a service account for your BigQuery dataset:
Create a Service Account:
Access your Google Cloud Platform project console.
In the main sidebar menu on the left, navigate to the IAM & Admin section.
Select "Service account" Existing service accounts, if any, will be listed.
At the top of the screen, click on "+ CREATE SERVICE ACCOUNT."
Fill Out Service Account Details:
Provide a name for the service account.
Add a description (the service account ID will be generated once you provide a name).
Click the "Create" button to create the service account.
Grant Access to the Service Account:
To allow Toucan to view and run queries on your dataset, you need to assign roles to the service account.
Ensure that you assign the following roles to the service account:
BigQuery Data Viewer
BigQuery Metadata Viewer
BigQuery Job User (distinct from BigQuery User)
For more detailed information on roles in BigQuery, consult Google Cloud Platform's documentation.
Create a Key:
Once you've assigned the necessary roles to the service account, click on the "Create Key" button.
Choose JSON as the key type.
The JSON file containing the credentials will be downloaded to your computer.
The key can be downloaded only once. If you delete it, you will need to create a new service account with identical roles to obtain another key.
Prerequisites when using JWT token to authenticate
# the following code generate a JWT using using google.auth library
import time
# pip install google-api-python-client
from google.auth import crypt, jwt
def generate_jwt(
sa_keyfile, // the path to service account json
sa_email="[email protected]",
audience="your-service-name",
expiry_length=3600,
):
'''Generates a signed JSON Web Token using a Google API Service Account.'''
now = int(time.time())
# build payload
payload = {
"iat": now,
"exp": now + expiry_length,
"iss": sa_email,
"aud": audience,
"sub": sa_email,
"email": sa_email,
}
# sign with keyfile
signer = crypt.RSASigner.from_service_account_file(sa_keyfile)
jwt_token = jwt.encode(signer, payload)
return jwt_tokenFollow the steps described in Add a connector, choose Google Big Query and fill out the form with the following info:
Name (mandatory)
String
Use it to identify your connection
my-google_big_query_conneciion
GoogleCredentials
Boolean
Use this option if you want to authenticate with the information you will find in your service_account json file
Service Account
String
service_account value located in your service account json file
(see your file)
Project ID
String
project id value located in your service account json file
(see your file)
Private Key Id
String
Private Key Id value located in your service account json file
(see your file)
Private key
String
Private key value located in your service account json file
(see your file)
Client email
String
Client email value located in your service account json file
(see your file)
Client ID
String
Client ID value located in your service account json file
(see your file)
Authentication URI
String
Auth URI value located in your service account json file
(see your file)
Token URI
String
Token URI value located in your service account json file
(see your file)
Authentication provider X509 certificate URL
String
Auth Provider Cert URL value located in your service account json file
(see your file)
Client X509 certification URL
String
Client Cert UR value located in your service account json file
(see your file)
JWTCredentials
Boolean
Use this option if you want to authenticate by sending a JWT token
Project ID
String
Project ID corresponds to the id of your Google Big Query project
project_gbq_id
Json Web token (JWT) signed
String
corresponds to your JWT token you obtain by signing it with your service account json
token
Dialect
Boolean
Tick this case if you want to select a specific dialect used by your server between legacy and standard as query standard more information on this documentation. By default we use standard standard
Oauth2 Scope
Array
OAuth 2.0 scopes define the level of access you need to request the Google APIs for more information see this documentation
List of URLs
Retry Policy (optional)
Boolean
Boolean allows to configure a retry policy if the connection is flaky.
max attempts: maximum number of retries before giving up
max_delay: in seconds, above the connection is dropped
wait_time: time in seconds between each retry
Slow Queries' Cache Expiration Time (optional)
Integer
Slow queries' cache expiration time in seconds
Click on the
TEST CONNECTIONbutton thenSAVEthe connection{% hint style="success" %} After successfully configuring the connector, you will be able to find it in the Connector section of the DataHub "Datasource" tab {% endhint %}
If you use Toucan by embedding it in other software. You could set up the two credentials sections (GoogleCredentials and JWTCredentials) and use variables (for example {{user.secrets.JWT}} and {{user.secrets.project_id}} to set up the JWTCredentials.
This way, you can design your application from a Google Big Query project linked to test data. And in production, you could pass your JWT and the project id linked to your user's production data into an embedContext. They will be interpolated by the values Toucan finds in your token.
Create a dataset from a Google Big Query connection
This data connector is supported in simple and code/SQL mode for our current Data Execution system and new Data Execution systems
Select a
DB_schemaSelect a
TableOnly keep the columns you need
For more info, see the dedicated section Create a new dataset from a dataset
After selecting data from your connector you will be able to create a dataset thanks to YouPrep using the selection as "source step".
Last updated
Was this helpful?