🔢Creating datasets

On this page, you will learn how to create datasets:

  • From connectors

  • From datasets

Limits on number of rows

The maximum number of rows is 1M, for both live and stored datasets. However, when you use NativeSQL and the query results in a dataset with fewer than 1M rows, no limitation applies.
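If you prepare data outside Toucan, a quick row count before loading can tell you whether a dataset would fit under this limit. The following is a minimal sketch, not part of any Toucan API: `MAX_ROWS`, `count_csv_rows` and `fits_row_limit` are hypothetical names, and it assumes that "1M" means exactly 1,000,000 rows and that a dataset of exactly that size is still accepted.

```python
import csv
import io

# Assumption: "1M" means exactly 1,000,000 rows, and the limit is inclusive.
MAX_ROWS = 1_000_000


def count_csv_rows(text: str) -> int:
    """Count the data rows in CSV text, excluding the header line."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header line, if there is one
    return sum(1 for _ in reader)


def fits_row_limit(row_count: int, max_rows: int = MAX_ROWS) -> bool:
    """Return True when the row count is within the assumed limit."""
    return 0 <= row_count <= max_rows


sample = "city,sales\nParis,10\nLyon,7\n"
print(fits_row_limit(count_csv_rows(sample)))
```

For a real file, you could read its content with `open(path, newline="")` and pass the text to `count_csv_rows`.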

Creating datasets from connectors

You have already configured a connector that you want to use to retrieve your data. To create a dataset from it:

  1. Click the dataset creation button on your configured connector.

  2. Configure how the data is retrieved. Refer to your connector's documentation to set up the configuration.

  3. Validate the configuration

  4. Prepare your data with our no-code transformation tool YouPrep™ (learn more about YouPrep here).

  5. Save your new dataset by clicking the "Create" button (at the bottom).

  6. Give the dataset a name (it must not already be used by another dataset), and select the storage type: stored in Toucan, or LIVE data.

  7. Click "Save" to save your dataset. If you store the dataset in Toucan, you can also click "Save and refresh" to make it available for use.

When configuring how the data is retrieved, note that you can refer to variables (more on variables on this page) instead of giving fixed values.

Creating datasets from datasets

You already have datasets that you want to use as the basis for child datasets. You can create them from the "Datasets" tab of the DataHub.

Follow these steps to create a child dataset from another dataset:

  1. Identify the dataset that you would like to use as the source of your new dataset (“Home 1” in the example), and click the “Create from” button on the right side of its row in the dataset list.

  2. Prepare your data with our no-code transformation tool YouPrep™ (learn more about YouPrep here).

  3. Save your new dataset by clicking the "Create" button (at the bottom).

  4. Give the dataset a name (it must not already be used by another dataset), and select the storage type: stored in Toucan, or LIVE data.

  5. Click "Save" to save your dataset. If you store the dataset in Toucan, you can also click "Save and refresh" to make it available for use.

Tip: you can even create a child dataset from the Story Panel. You'll be redirected to the DataHub tab while you create your new child dataset. Take a look at this video.

Variables

If you use variables within the YouPrep transformations, you can only save your dataset as LIVE.

Dataset refresh

Refreshing a dataset will also refresh all direct and indirect parent datasets.
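To picture which datasets a refresh touches, the sketch below walks a parent graph and lists the dataset together with all of its direct and indirect parents. It is only an illustration: `refresh_order` and the `parents` mapping are hypothetical, and refreshing parents before children is an assumption about ordering, not something this page states.

```python
from typing import Dict, List


def refresh_order(dataset: str, parents: Dict[str, List[str]]) -> List[str]:
    """Return the dataset plus all direct and indirect parents.

    Assumption: parents are listed before the datasets built on them.
    """
    order: List[str] = []
    visited = set()

    def visit(name: str) -> None:
        if name in visited:
            return  # already handled (also guards against cycles)
        visited.add(name)
        for parent in parents.get(name, []):
            visit(parent)
        order.append(name)

    visit(dataset)
    return order


# Hypothetical lineage: "Report" is built from "Cleaned", itself built from "Raw".
lineage = {"Report": ["Cleaned"], "Cleaned": ["Raw"], "Raw": []}
print(refresh_order("Report", lineage))  # ['Raw', 'Cleaned', 'Report']
```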

Stored dataset saving

Saving a stored dataset without refreshing it does not make it available for creating other datasets or for building charts. We therefore advise saving without refreshing only when your dataset is not yet ready to be used.
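The rule can be summarised as a small availability check. This is a simplified sketch with hypothetical names (`DatasetState`, `is_available`), not a Toucan object model. It assumes that a live dataset is usable as soon as it is saved, which this page implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass
class DatasetState:
    """Hypothetical, simplified view of a saved dataset."""
    name: str
    storage: str             # "stored" or "live"
    refreshed: bool = False  # True once "Save and refresh" has completed


def is_available(dataset: DatasetState) -> bool:
    """Can the dataset be used for child datasets or charts?"""
    if dataset.storage == "stored":
        # A stored dataset must be refreshed before it can be used.
        return dataset.refreshed
    # Assumption: a live dataset needs no refresh.
    return True


draft = DatasetState(name="Sales draft", storage="stored")
print(is_available(draft))  # False
```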

Dataset download (stored only)

When the dataset is stored in Toucan, you can download it as a CSV file through the dataset's action menu (on the right side of its row in the dataset list).

Permissions

If you create a new dataset A from a dataset B that has permissions applied, and define dataset A as a stored dataset, the permissions of the parent dataset no longer apply. You will have to define the permissions again on dataset A if you need to secure access to the data.

Dataset column naming

When the dataset is stored within Toucan, the column names should respect the following constraints:

  • The dataset shouldn't contain a column named "_id"

  • The dataset name shouldn't contain dots (".")
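These two constraints are easy to check before you store a dataset. The helper below is a hypothetical sketch (`naming_problems` is not a Toucan function). It assumes an exact, case-sensitive match on the column name "_id".

```python
from typing import Iterable, List


def naming_problems(dataset_name: str, columns: Iterable[str]) -> List[str]:
    """Return a list of naming problems; an empty list means none were found."""
    problems: List[str] = []
    if "." in dataset_name:
        problems.append(f'dataset name "{dataset_name}" contains a dot')
    # Assumption: only the exact name "_id" is disallowed.
    if "_id" in list(columns):
        problems.append('a column is named "_id"')
    return problems


print(naming_problems("sales.2024", ["_id", "amount"]))
print(naming_problems("sales_2024", ["id", "amount"]))  # []
```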
