🔌Setting up an AWS S3 connector
Configuring the AWS S3 connector in Toucan
The AWS S3 connector lets you access files hosted in an AWS S3 bucket. We use AWS STS (Security Token Service) to authenticate to the S3 bucket via the AssumeRole operation.
Fill in the connection parameters:
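To make the authentication flow more concrete, here is a minimal Python sketch of what an AssumeRole exchange looks like with the boto3 library. It is an illustration only, not Toucan's actual implementation: the function names, the session name, and the idea of passing the STS client as an argument are assumptions made for this example. The role ARN and external ID correspond to the connector parameters of the same name.

```python
from typing import Any, Dict


def assume_role_credentials(
    sts_client: Any,
    role_arn: str,
    external_id: str,
    session_name: str = "s3-connector-sketch",  # hypothetical session name
) -> Dict[str, str]:
    """Exchange a role ARN and external ID for temporary credentials via STS AssumeRole."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        ExternalId=external_id,
    )
    creds = response["Credentials"]
    # Rename the STS response fields to the keyword arguments that
    # boto3.client("s3", ...) accepts.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def s3_client_for_role(role_arn: str, external_id: str) -> Any:
    """Build an S3 client acting under the assumed role (needs boto3 and AWS access)."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    sts = boto3.client("sts")
    return boto3.client("s3", **assume_role_credentials(sts, role_arn, external_id))
```

The temporary credentials returned by STS expire after a while, which is why a connector built this way would request them again rather than store them.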
NAME* : the name of your connector.
BUCKET NAME* : the name of the S3 bucket you want to query data from.
RetryPolicy : a Boolean option that lets you configure a retry policy if the connection is flaky.
max_attempts : maximum number of retries before giving up.
max_delay : maximum delay in seconds, above which the connection is dropped.
wait_time : time in seconds between each retry.
SLOW QUERIES' CACHE EXPIRATION TIME:
PREFIX : a prefix for your objects, similar to a folder path, e.g. marketing/
ROLE ARN* : the Amazon Resource Name (ARN) of the role, an identifier that provides access to AWS resources and is configured with policies. It will be given to you by Toucan support.
EXTERNAL ID* : already set; an ID used in the AWS policy configuration.
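The retry policy fields listed above can be read as a simple loop. The Python sketch below shows one plausible interpretation, not the connector's actual code: it assumes the attempt limit counts every try (including the first), that max_delay is a total time budget in seconds, and that only ConnectionError is retried. The function name call_with_retry and the injectable sleep and clock arguments are hypothetical and exist only to keep the example self-contained.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    max_delay: float = 30.0,
    wait_time: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run `operation`, retrying on ConnectionError within the given limits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    started = clock()
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
        # Give up when attempts are exhausted, or when waiting once more
        # would push the total elapsed time past max_delay (assumed meaning).
        if attempt == max_attempts or clock() - started + wait_time > max_delay:
            break
        sleep(wait_time)
    raise ConnectionError(f"gave up after {attempt} attempt(s)") from last_error
```

With max_attempts=5 and wait_time=2, an operation that fails twice and then succeeds would be tried three times, with a two-second pause before each retry.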
After entering this information, you can test the connection to the AWS S3 bucket to make sure your inputs are correct and working.
If all settings are valid, you will see a success message.
After successfully configuring the connector, you will find it in the Connector section of the DataHub "Datasource" tab.
Selecting data from AWS S3
To create a dataset from AWS S3, click on the "create from" icon. You will then be able to:
Select a file hosted in your S3 bucket
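Selecting a file amounts to picking one object key from the bucket, optionally narrowed by the PREFIX parameter. As a rough illustration of what such a listing involves, the Python sketch below enumerates the keys under a prefix using the boto3 list_objects_v2 paginator. It is not Toucan's implementation; the function name list_files is hypothetical, and the S3 client is passed in as an argument so the example stays independent of any particular credentials setup.

```python
from typing import Any, List


def list_files(s3_client: Any, bucket_name: str, prefix: str = "") -> List[str]:
    """Return the object keys found under `prefix` in `bucket_name`."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # Skip "folder" placeholder objects whose key ends with a slash.
            if not key.endswith("/"):
                keys.append(key)
    return keys
```

Called with a prefix such as marketing/, this returns only the files stored under that path, which mirrors the list you would choose from when selecting a file.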
After selecting data from your connector, you will be able to create a dataset with YouPrep, using the selection as the "source step".