⚙️ Configure an external S3

Layout

Toucan requires 2 buckets:

  • dataexecution-cache, the cache for the data execution service.

  • toucan-data, the main bucket used to store your data when users drop files as data sources.

It is recommended to set up 3 keys:

  • dataexecution, which has Read-Write access to the dataexecution-cache bucket.

  • toucan_ro, which has Read-only access to the toucan-data bucket.

  • toucan, which has Read-Write access to the toucan-data bucket.

Configuration

1. Disable the embedded S3

Set these parameters in your values file:

yaml: values.override.yaml
garage:
  enabled: false

2. Set up the credentials

Set these parameters in your values file, so that Toucan can use the credentials:

yaml: values.override.yaml
global:
  s3:
    keys:
      dataexecution:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

      toucan_ro:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

      toucan:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

You do not need to fill in the name, expiration and neverExpires fields.
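If you choose the existingSecret option, the referenced Kubernetes Secret has to exist before the install. The manifest below is a minimal sketch for the dataexecution key. The Secret name `toucan-s3-dataexecution` and the entry name `secret-access-key` are hypothetical, and it assumes that existingSecret replaces only the `secret` field while `id` stays in the values file, as the layout above suggests.

```yaml
# Hypothetical Secret holding the secret access key of the dataexecution S3 key.
apiVersion: v1
kind: Secret
metadata:
  name: toucan-s3-dataexecution # hypothetical name, choose your own
  namespace: toucan # same namespace as the Helm release
type: Opaque
stringData:
  # Hypothetical entry name; it is what existingSecret.key would point to.
  secret-access-key: <AWS_SECRET_ACCESS_KEY>
```

With such a Secret applied, existingSecret.name would be 'toucan-s3-dataexecution' and existingSecret.key would be 'secret-access-key'. The same pattern would apply to the toucan_ro and toucan keys, each with its own Secret or its own entry.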

3. Replace references to garage

Set these parameters in your values file, so that Toucan can connect to the external S3:

yaml: values.override.yaml
laputa:
  config:
    s3_storage:
      bucket_name: '<your-toucan-data-bucket>' # 'toucan-data'
      region_name: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint_url: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'
      verify: true # Check TLS certificate.

dataexecution:
  config:
    specific:
      bucket_name: '<your-dataexecution-cache-bucket>' # 'dataexecution-cache'
      region: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'

vault:
  bootstrap:
    s3:
      # Sadly, this path is hardcoded inside the dataset service code.
      # If you need it to be configurable, feel free to send us feedback.
      path: secret/{{ .Values.dataset.config.environment }}/{{ .Values.global.tenantID }}/{{ .Values.global.workspaceID }}/s3_ro
      uri: "s3://<your-toucan-data-bucket>" # 's3://toucan-data'
      region: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'

dataset:
  config:
    specific:
      vault_secret_paths:
        s3_datasource_upload_path: s3_ro # must match the last segment of vault.bootstrap.s3.path
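
The region and endpoint are repeated under laputa, dataexecution and vault. One optional way to keep them in sync is to use YAML anchors, as sketched below. This assumes all three sections live in the same values file, since anchors are resolved when that file is parsed. The anchor names (`s3Region`, `s3Endpoint`) are arbitrary, and 'fr-par' and 'https://s3.example.com' are placeholder values.

```yaml
laputa:
  config:
    s3_storage:
      bucket_name: 'toucan-data'
      region_name: &s3Region 'fr-par' # defined once, reused below
      endpoint_url: &s3Endpoint 'https://s3.example.com' # placeholder endpoint
      verify: true

dataexecution:
  config:
    specific:
      bucket_name: 'dataexecution-cache'
      region: *s3Region
      endpoint: *s3Endpoint

vault:
  bootstrap:
    s3:
      # path: keep the value shown in the block above
      uri: 's3://toucan-data'
      region: *s3Region
      endpoint: *s3Endpoint
```

Aliases are expanded at parse time, so a value overridden later on the command line (for example with --set) would not propagate to the aliased fields. Repeating the literal values, as in the block above, remains the most explicit option.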

4. Install

At this point, your values.override.yaml should look like this:

yaml: /work/values.override.yaml
# ...

garage:
  enabled: false

global:
  s3:
    keys:
      dataexecution:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

      toucan_ro:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

      toucan:
        id: <AWS_ACCESS_KEY_ID>

        secret: <AWS_SECRET_ACCESS_KEY>
        # OR
        existingSecret:
          name: '<K8S Secret Name>'
          key: '<K8S Secret Key>'

laputa:
  config:
    s3_storage:
      bucket_name: '<your-toucan-data-bucket>' # 'toucan-data'
      region_name: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint_url: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'
      verify: true # Check TLS certificate.

dataexecution:
  config:
    specific:
      bucket_name: '<your-dataexecution-cache-bucket>' # 'dataexecution-cache'
      region: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'

vault:
  bootstrap:
    s3:
      # Sadly, this path is hardcoded inside the dataset service code.
      # If you need it to be configurable, feel free to send us feedback.
      path: secret/{{ .Values.dataset.config.environment }}/{{ .Values.global.tenantID }}/{{ .Values.global.workspaceID }}/s3_ro
      uri: "s3://<your-toucan-data-bucket>" # 's3://toucan-data'
      region: '<your-aws-region>' # 'fr-par', check your S3 provider
      endpoint: '<your-external-s3-endpoint-url>' # 'https://<your-external-s3-endpoint-url>'

dataset:
  config:
    specific:
      vault_secret_paths:
        s3_datasource_upload_path: s3_ro # must match the last segment of vault.bootstrap.s3.path

Then install or upgrade the release:

shell: /work/
helm upgrade --install toucan-stack oci://quay.io/toucantoco/charts/toucan-stack \
  --namespace toucan \
  --values ./values.override.yaml
