🔌Setting up an AWS S3 connector
Configuring the AWS S3 connector in Toucan
The AWS S3 connector lets you access files hosted in an AWS S3 bucket. We use AWS STS (Security Token Service) to authenticate to the S3 bucket through the AssumeRole operation.
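To make the authentication flow more concrete, here is a minimal sketch of what an STS AssumeRole call followed by an S3 listing can look like with the boto3 library. It is an illustration, not Toucan's actual implementation: the function names, the session name `toucan-s3-session` and the example values are assumptions. The second function needs boto3 installed and valid AWS credentials to run.

```python
def build_assume_role_params(role_arn, external_id, session_name="toucan-s3-session"):
    """Build the keyword arguments for an STS AssumeRole call.

    The session name is a hypothetical value chosen for this example.
    """
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "ExternalId": external_id,
    }


def list_keys_with_assumed_role(bucket, prefix, role_arn, external_id):
    """Assume the role, then list object keys under the prefix.

    Only the first page of results is returned, to keep the sketch short.
    """
    import boto3  # third-party library, imported lazily

    # Exchange the role ARN and external ID for temporary credentials.
    sts = boto3.client("sts")
    response = sts.assume_role(**build_assume_role_params(role_arn, external_id))
    creds = response["Credentials"]

    # Use the temporary credentials to talk to S3.
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in listing.get("Contents", [])]
```

The role ARN and the external ID play the same part here as the Role ARN and ExternalId fields of the connector form described below.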
Name (mandatory)
String
Use it to identify your connection
MyS3Connection
Bucket Name (mandatory)
String
Name of the S3 bucket you want to query data from
bucket_s3_name
Prefix (optional)
String
A prefix for your objects, such as a folder path
marketing/
Role ARN (mandatory)
String
Amazon Resource Name (ARN): an identifier that provides access to AWS resources, configured with policies. It will be given to you by Toucan support.
ExternalId (mandatory)
String
Already set. Represents an ID used in the AWS policy configuration.
Retry Policy (optional)
Boolean
Lets you configure a retry policy if the connection is flaky.
max_attempts: maximum number of retries before giving up
max_delay: maximum delay in seconds, above which the connection is dropped
wait_time: time in seconds between each retry
Slow Queries' Cache Expiration Time (optional)
Integer
Slow queries' cache expiration time in seconds
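As a rough illustration of how the three retry policy settings above can interact, here is a minimal sketch of a retry loop. It is not Toucan's implementation. The function name `call_with_retry` and the `sleep` and `clock` parameters are hypothetical. Treating max_attempts as the total number of tries and max_delay as a cap on total elapsed time are both assumptions made for this example.

```python
import time


def call_with_retry(func, max_attempts=3, max_delay=10.0, wait_time=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call func until it succeeds or the retry budget is exhausted.

    Assumptions for this sketch: max_attempts counts every try,
    max_delay caps the total time spent in seconds, and wait_time
    is a fixed pause in seconds between two tries.
    """
    start = clock()
    attempt = 0
    while True:
        attempt += 1
        try:
            return func()
        except ConnectionError:
            elapsed = clock() - start
            # Give up when attempts run out or the next wait would
            # push the total time past max_delay.
            if attempt >= max_attempts or elapsed + wait_time > max_delay:
                raise
            sleep(wait_time)
```

With `max_attempts=5` and `wait_time=2`, a connection that fails twice and then succeeds would return on the third try after about four seconds of waiting, provided max_delay is large enough.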
Click on the TEST CONNECTION button, then SAVE the connection.
After successfully configuring the connector, you will find it in the Connector section of the DataHub "Datasource" tab.
Selecting data from AWS S3
To create a dataset from AWS S3, click on the "create from icon". You will then be able to:
Select a file hosted in your S3 bucket
After selecting data from your connector, you can create a dataset with YouPrep, using the selection as the "source step".
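The Prefix set in the connector configuration narrows which objects are available for selection. S3 matches a prefix as a plain string match on the start of the object key. The sketch below mimics that behaviour locally, with a hypothetical helper name and made-up object keys.

```python
def keys_under_prefix(keys, prefix=""):
    """Return the object keys that start with the given prefix.

    An empty prefix keeps every key, mirroring a connector
    configured without a Prefix.
    """
    return [key for key in keys if key.startswith(prefix)]


# Made-up object keys, for illustration only.
example_keys = [
    "marketing/2023/leads.csv",
    "marketing/budget.xlsx",
    "sales/q1.csv",
]

print(keys_under_prefix(example_keys, "marketing/"))
# prints ['marketing/2023/leads.csv', 'marketing/budget.xlsx']
```

Because the match is purely textual, a prefix of `marketing` without the trailing slash would also match a key such as `marketing_old/file.csv`. Ending the prefix with `/` keeps the selection inside one folder.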