🔢 Stored and Live Datasets

On this page, we look at the difference between Stored and Live Datasets and the benefits of each.

Stored Datasets

Stored Datasets are datasets that Toucan keeps in its database. They take up disk space, but they usually offer better performance because the data is "already computed".

Toucan offers a data warehousing service for customers who need a database for analytics. Toucan relies on MongoDB as its storage service.

As stored datasets are "already computed":

Using them for another dataset or to fuel a visualization doesn't require any computation time
They need to be refreshed in order to be updated, either manually or automatically
They can't include variables depending on
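
The "already computed" behaviour described above can be pictured with a minimal conceptual sketch. It does not use Toucan's API: the `StoredDataset` class, its `refresh` and `read` methods, and the `source` list are hypothetical names chosen for illustration only.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


class StoredDataset:
    """Conceptual model only (not Toucan's API): keeps a copy of the result."""

    def __init__(self, compute: Callable[[], List[Row]]) -> None:
        self._compute = compute
        self._snapshot: List[Row] = []
        self.refresh()  # first materialization

    def refresh(self) -> None:
        # The computation runs here, not when the data is read.
        self._snapshot = self._compute()

    def read(self) -> List[Row]:
        # Reads return the stored copy; nothing is recomputed.
        return list(self._snapshot)


# `source` stands in for an external data source (assumption for the example).
source: List[Row] = [{"sales": 10}]
stored = StoredDataset(lambda: [dict(row) for row in source])

source.append({"sales": 20})  # the source changes...
stale = stored.read()         # ...but the stored copy still has one row
stored.refresh()              # a manual or scheduled refresh updates it
fresh = stored.read()         # now two rows
```

In this sketch, reading is cheap because the work was done at refresh time, which is also why the data stays unchanged until the next refresh.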

Live Datasets

Live Datasets are datasets that Toucan doesn't store. They are only kept temporarily "in memory" in order to run the computation and send the result to be displayed to the end user.
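
As a rough illustration, a live dataset can be thought of as a computation that runs on every request and keeps nothing afterwards. The sketch below is conceptual and is not Toucan's API: `LiveDataset`, `by_country`, the `variables` dictionary, and the `source` list are all hypothetical names.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Variables = Dict[str, object]


class LiveDataset:
    """Conceptual model only (not Toucan's API): nothing is persisted."""

    def __init__(self, compute: Callable[[Variables], List[Row]]) -> None:
        self._compute = compute

    def read(self, variables: Variables) -> List[Row]:
        # Computed on each request; the result only lives in memory
        # for as long as the caller holds on to it.
        return self._compute(variables)


# `source` stands in for an external data source (assumption for the example).
source: List[Row] = [
    {"country": "FR", "sales": 10},
    {"country": "US", "sales": 20},
]


def by_country(variables: Variables) -> List[Row]:
    # Request-time variables can shape the computation.
    return [row for row in source if row["country"] == variables["country"]]


live = LiveDataset(by_country)
fr_rows = live.read({"country": "FR"})        # one matching row
source.append({"country": "FR", "sales": 5})  # the source changes
fr_rows_later = live.read({"country": "FR"})  # the change is visible at once
```

Because the computation runs at read time, the result always reflects the current state of the source, at the cost of paying the computation time on every request.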

Live Datasets can be built on top of:

Other live datasets: if all datasets in the lineage are live datasets, Toucan does not store data at any point of the data preparation process (so there is no data replication outside your own systems), and the data displayed to users is as fresh as the data in the external data source

Data lineage with only live datasets in Toucan

Stored datasets: in this case the data will be as fresh as its parent dataset in Toucan. Using a live dataset on top of a stored dataset is useful if you want to use variables in the dataset

Data lineage with a stored parent dataset
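
The two lineage cases above can be summed up in a small sketch. It is a conceptual model under stated assumptions, not Toucan's implementation: the `Dataset` class, the `kind` and `parent` fields, and the helper functions are hypothetical, and a dataset without a parent is assumed to read directly from the external data source.

```python
from dataclasses import dataclass
from typing import Iterator, Optional

EXTERNAL = "external data source"


@dataclass
class Dataset:
    name: str
    kind: str                           # "live" or "stored"
    parent: Optional["Dataset"] = None  # None: reads from the external source


def lineage(ds: Dataset) -> Iterator[Dataset]:
    """Yield the dataset itself, then each ancestor up to the root."""
    node: Optional[Dataset] = ds
    while node is not None:
        yield node
        node = node.parent


def replicates_data(ds: Dataset) -> bool:
    """True if any dataset in the lineage is stored."""
    return any(node.kind == "stored" for node in lineage(ds))


def freshness_source(ds: Dataset) -> str:
    """Name of whatever bounds how fresh the data can be."""
    for node in lineage(ds):
        if node.kind == "stored":
            return node.name  # only as fresh as its last refresh
    return EXTERNAL


# Case 1: every dataset in the lineage is live.
raw = Dataset("raw_orders", "live")
all_live = Dataset("orders_fr", "live", parent=raw)

# Case 2: a live dataset built on top of a stored parent.
snapshot = Dataset("orders_snapshot", "stored")
on_stored = Dataset("orders_by_user", "live", parent=snapshot)
```

With these definitions, `replicates_data(all_live)` is false and its freshness is bounded by the external data source, while `on_stored` is bounded by the last refresh of `orders_snapshot`.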

Stored datasets vs. Live datasets benefits

Stored datasets

Useful if you don't have any data warehousing solution in your data stack to build analytics

Already computed: can be faster than live datasets (depending on the data source performance)

Live datasets

If all the parent datasets are also live: no data replication outside of your system, and data as fresh as the data source

Can use variables in the computation step
