🔢Stored and Live Datasets
On this page, we look at the difference between Stored and Live Datasets and the benefits of each.
Stored Datasets are datasets that Toucan keeps in its database. They take up disk space, but usually offer better performance because the data is "already computed".
Toucan offers a Data Warehousing service for customers who need a database for analytics. Toucan relies on MongoDB as its storage service.
As stored datasets are "already computed":
- Using them as input for another dataset or to power a visualization doesn't require any computation time
- They need to be refreshed in order to be updated, either manually or automatically
- They can't include variables depending on
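The refresh behaviour described above can be illustrated with a small toy model. This is only a sketch, not Toucan's implementation or API: the `StoredDataset` class, its `refresh` and `read` methods, and the in-memory `source` list standing in for an external data source are all hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class StoredDataset:
    """Toy model (hypothetical): the result is computed once and kept as a snapshot."""

    def __init__(self, compute: Callable[[], List[Row]]) -> None:
        self._compute = compute
        self._snapshot: Optional[List[Row]] = None

    def refresh(self) -> None:
        # Run the (possibly expensive) computation and keep a copy of the result.
        self._snapshot = [dict(row) for row in self._compute()]

    def read(self) -> List[Row]:
        # Reading never recomputes anything: it only returns the stored copy.
        if self._snapshot is None:
            raise RuntimeError("dataset has never been refreshed")
        return [dict(row) for row in self._snapshot]


# Hypothetical external data source, modelled here as a plain list.
source: List[Row] = [{"region": "EU", "sales": 10}]

stored = StoredDataset(lambda: list(source))
stored.refresh()

# The source changes, but the stored dataset keeps serving its snapshot...
source.append({"region": "US", "sales": 7})
stale_count = len(stored.read())

# ...until it is refreshed again, manually or on a schedule.
stored.refresh()
fresh_count = len(stored.read())
```

In this sketch, reads are cheap because the work was done at refresh time, at the cost of holding a copy of the data and of serving a stale snapshot between refreshes.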
Live Datasets are datasets that Toucan won't store. They are only kept temporarily "in memory" to run the computation and send the result to be displayed to the end user.
Live Datasets can be built on top of:
- Other live datasets: if all datasets in the lineage are live, Toucan does not store data at any point of the data preparation process, so no data is replicated outside your own systems, and the data displayed to users is as fresh as it is in the external data source
- Stored datasets: in this case, the data is as fresh as its parent dataset in Toucan. Using a live dataset on top of a stored dataset is useful if you want to use variables in the dataset
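As a rough illustration of a fully live lineage, the sketch below models live datasets as plain functions that are recomputed on every request. It is a toy model under stated assumptions, not Toucan's API: the function names `live_sales` and `live_total`, the `variables` dictionary, and the in-memory `source` list standing in for an external data source are all hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical external data source, modelled here as a plain list.
source: List[Row] = [
    {"region": "EU", "sales": 10},
    {"region": "US", "sales": 7},
]


def live_sales(variables: Dict[str, str]) -> List[Row]:
    """Live dataset (toy model): recomputed on every call, nothing is kept afterwards."""
    return [dict(row) for row in source if row["region"] == variables["region"]]


def live_total(variables: Dict[str, str]) -> int:
    """Live dataset built on top of another live dataset."""
    return sum(row["sales"] for row in live_sales(variables))


# The variable value is supplied at request time.
eu_before = live_total({"region": "EU"})

# A change in the source is visible on the very next request, with no refresh step.
source.append({"region": "EU", "sales": 5})
eu_after = live_total({"region": "EU"})

# The same datasets answer a different request with a different variable value.
us_total = live_total({"region": "US"})
```

Because nothing is cached anywhere in this chain, every request pays the full computation cost, which is the trade-off against the freshness and the ability to use variables.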
Stored datasets | Live datasets |
---|---|
Useful if you don't have any data warehousing solution in your data stack to build analytics | If all the parent datasets are also live: no data replication outside of your system, and data as fresh as the data source |
Already computed: can be faster than live datasets (depending on the data source performance) | Can use variables in the computation step |