📦Components

This document provides an overview of all the different components of the Toucan self-hosted deployment.

It also serves as a guide to the values file.

Since this document tries to describe every component, it might be outdated. You should read the values file to get the latest information.

This document doesn't cover optional Kubernetes-specific parameters unless they are relevant.

NGINX/Tucana

nginx

NGINX is the authentication reverse proxy of the application. All traffic coming into the application is handled by NGINX. NGINX does not authenticate requests by default; an authentication module is installed on it to query the authentication service.

NGINX also serves Tucana, which is the front-end of the application. It is composed of HTML, CSS, JavaScript files and other assets. This is called a static front-end.

Configuration

Configuration-wise, the only thing you should need to configure is the ingress. For more details, see the Configure HTTPS - Parameters chapter.

We do not recommend overriding the NGINX image to update it manually, as the authentication module is only compatible with a specific version of NGINX.

Tucana's configuration also shouldn't be modified.

Laputa (legacy backend)

laputa

Laputa is the legacy backend of Toucan. It handled most of the logic in v2. It is still here to handle the logic that hasn't been migrated to other components.

Tuning

Since Laputa uses synchronous workers, we strongly recommend tuning the number of gunicorn workers. See Tuning - Configuring the threads/workers/connection pool of the components.

Configuration

Configuration-wise, we recommend checking out the Configure feature toggles chapter and, if needed, configuring SMTP for notifications and PDF reports.

During the migration, the Data Execution Service might not be able to handle every v2 connector, so you might still need to configure the TOUCAN_EXTRA_CONNECTORS environment variable.

External S3

If you are using a cloud provider, you might want to use its object storage to store your datasets. You can configure it in the Configure an external S3 chapter.

Layout

layout

Layout is the service responsible for handling most dashboard layouts, chart configurations, and anything related to the arrangement of the front-end.

Configuration

Since the Layout service doesn't do heavy computation and simply serves configurations, it requires no special configuration.

Data Execution

dataexecution

The Data Execution Service is responsible for executing jobs. This includes queries and data processing.

Tuning

This component is heavily affected by the volume of data and the number of concurrent users.

We strongly recommend tuning the number of pod replicas and setting up autoscaling. See Tuning - How-to set resource limits and requests.

We also recommend configuring the number of workers. See Tuning - Configuring the threads/workers/connection pool of the components.
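As an illustration of the replica and autoscaling recommendation in this section, here is a minimal sketch of a standalone Kubernetes HorizontalPodAutoscaler. It is not taken from the Toucan Helm Chart: the resource name, the target Deployment name, the replica bounds and the CPU threshold are all assumptions to adapt, and the values file remains the reference if it exposes its own autoscaling settings.

```yaml
# Hypothetical manifest: names, replica bounds and threshold are placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dataexecution            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dataexecution          # assumption: check the real name with `kubectl get deployments`
  minReplicas: 2                 # placeholder lower bound
  maxReplicas: 6                 # placeholder upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # placeholder: scale out above 70% of the requested CPU
```

CPU utilization is computed relative to the CPU requests of the pods, so a target like this only works once resource requests are set on the component.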

External S3

If you are using a cloud provider, you might want to use its object storage to store your datasets. You can configure it in the Configure an external S3 chapter.

Dataset

dataset

The Dataset service is responsible for managing data source connector configurations and filters.

Configuration

Since the Dataset service doesn't do heavy computation and simply serves configurations, it requires no special configuration.

Impersonate

The Impersonate service is responsible for handling impersonation. Consider this service as an extension of the Curity service.

More precisely, it is used by Laputa to be able to render PDF reports as another user, on a scheduled basis.

Configuration

Since the Impersonate service doesn't do heavy computation and simply serves configurations, it requires no special configuration.

SpiceDB

spicedb

The SpiceDB service is responsible for handling permissions and authorization. It is a highly scalable service and uses a Google Zanzibar-inspired schema for the permissions.

This service is not maintained by Toucan Toco.

Configuration

Since the SpiceDB service doesn't do heavy computation and simply serves permissions, it requires no special configuration.

Tuning

This component is heavily affected by the number of concurrent users.

However, at the moment, we cannot provide a proper way to configure a SpiceDB cluster. If you wish to deploy such a cluster, we recommend using the SpiceDB Operator, although we won't provide any support for it.

HashiCorp Vault

vault

HashiCorp Vault is used to store connector credentials. Other services also fetch from it the OAuth token they need to contact Curity.

HashiCorp Vault is not maintained by Toucan Toco.

Configuration

Since the HashiCorp Vault service doesn't do heavy computation and simply serves configurations, it requires no special configuration.

For now, we don't provide any way to configure an external HashiCorp Vault.

Curity

curity

The Curity service is responsible for handling authentication and user management.

This service is not maintained by Toucan Toco.

Configuration

Since Curity is the authentication service, you probably want to configure SMTP to send password reset emails. See Configure email notifications.

If you are using an external SSO, you can also check out the Configure an external SSO chapter.

Gotenberg

gotenberg

The Gotenberg service is responsible for rendering PDF reports. It is basically a headless Google Chrome instance.

This service is not maintained by Toucan Toco.

Configuration

You can check out the Helm Chart's documentation to learn how to configure Gotenberg.

MongoDB

mongodb

MongoDB is a NoSQL database used by Laputa, the legacy backend, to store data. Ultimately, this service will be removed.

This service is not maintained by Toucan Toco.

Configuration

Since this service will be removed, we don't recommend reconfiguring it or using an external MongoDB.

Garage (S3)

garage

Garage is an S3-compatible distributed object store. Toucan uses it to store user data, and it is the store that replaces MongoDB.

This service is not maintained by Toucan Toco.

Configuration

Although Garage is a distributed store, the Garage instance embedded in the Toucan Helm Chart is a single instance.

You can tune Garage by setting the garage.configuration field. By default, the block size is 1 MB, with a consistency mode of "consistent", and a compression level of 1 using zstd. The DB engine is LMDB.

If you wish to add more buckets, you can edit the garage.buckets field.

If you wish to edit the keys and permissions, you can edit the global.garage.keys/garage.keys and garage.permissions fields.
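To make the defaults listed above concrete, here is a hedged sketch of how they might appear in the values file. Only the garage.configuration path comes from this document. The sub-key names are an assumption: they reuse Garage's own option names (block_size, consistency_mode, compression_level, db_engine), and the chart may name or nest them differently, so check the values file before copying anything.

```yaml
garage:
  configuration:
    # Assumption: sub-keys mirror Garage's own option names.
    # The actual structure is defined by the chart's values file.
    block_size: "1M"                # default block size (1 MB)
    consistency_mode: "consistent"  # default consistency mode
    compression_level: 1            # zstd compression level
    db_engine: "lmdb"               # metadata DB engine
```

Since these are already the defaults, you would only set them explicitly when changing one of them.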

External S3

If you are using a cloud provider, you might want to use its object storage to store your datasets. You can configure it in the Configure an external S3 chapter.

Dragonflies

dragonfly

DragonflyDB is an in-memory key-value store. It is used by Toucan to cache data.

This service is not maintained by Toucan Toco.

Configuration

Due to its simplicity, there shouldn't be any reason to configure DragonflyDB.

Other Components

This short section describes "abstract" notions that you may encounter in the documentation.

  • Tenant ID: The ID of the tenant to which the deployment belongs. This is solely used by Toucan in SaaS mode.

  • Workspace ID: The ID of the deployment. This is solely used by Toucan in SaaS mode.
