Optimize data performance
Here are a few things you should consider regarding data in Toucan! 💡
Ensure all of your heavy data transformations are done in data preparation.
Make sure that your data preparation is factorized. Why do the same operation ten times when you can do it only once?
If the loading time is still long, open the Network tab of your browser's developer tools to see which requests are slow and how long they take. You can then improve the corresponding query and check afterwards whether the loading time has improved.
Load mode stores data in a MongoDB database. Data is read from MongoDB in load mode to increase data read speed and, consequently, application performance. Without indexes, MongoDB must perform a collection scan, meaning it examines every document in the collection to determine if it matches the query. For a small collection, the difference in rendering may not be noticeable, but it becomes significant for larger collections (for example, several thousand lines). Indexes can accelerate queries by limiting the number of documents (lines) to scan. Indexes can be created on any attribute of a document, allowing MongoDB to locate matching data more quickly.
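To make the difference between a collection scan and an indexed lookup concrete, here is a minimal Python sketch. It is a toy model only: it does not use MongoDB or reproduce how MongoDB implements indexes, and the collection, field names and helper functions are all hypothetical. It simply counts how many documents each approach has to examine.

```python
from collections import defaultdict

# Hypothetical in-memory "collection"; field names are illustrative only.
documents = [
    {"year": 2019 + (i % 5), "brand": f"brand_{i % 10}", "value": i}
    for i in range(10_000)
]


def collection_scan(docs, field, wanted):
    """Examine every document, as a query without an index has to."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc[field] == wanted:
            matches.append(doc)
    return matches, examined


def build_index(docs, field):
    """Map each value of `field` to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc[field]].append(position)
    return index


def index_lookup(docs, index, wanted):
    """Touch only the documents the index points to."""
    positions = index.get(wanted, [])
    return [docs[p] for p in positions], len(positions)


year_index = build_index(documents, "year")
scan_matches, scan_examined = collection_scan(documents, "year", 2021)
index_matches, index_examined = index_lookup(documents, year_index, 2021)

print(f"scan examined {scan_examined} documents")
print(f"index examined {index_examined} documents")
```

Both approaches return the same documents, but the scan examines the whole collection while the lookup only touches the matching ones. The price is the extra structure that has to be built and kept in memory, which is why indexes are worth adding only where queries actually need them.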
For more information, refer to the .
Configuring MongoDB Indexes
To access MongoDB indexes, switch to the staging mode of the application. In Settings, go to the Advanced configuration option, select the etl_config section, and click Edit. Indexes can be created for each domain, with each domain represented as a key in the MONGO_INDEXES configuration block. Example configuration:
The domain domain_a has 3 indexes:
The first is an index on a single field, year.
The second and third are compound indexes.
For compound indexes, the order of fields is important in the index but not in the query. In addition to supporting queries that match all index fields, compound indexes can support queries that match a prefix (a subset at the beginning of the set) of the index fields.
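The prefix rule can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Toucan's configuration syntax and not MongoDB's query planner: the dictionary below only mirrors the idea of one key per domain, every field name except year is made up, and the helper functions are hypothetical. The check treats a query as supported when its fields are exactly the first n fields of an index, in any order.

```python
# Hypothetical indexes for domain_a. Only "year" comes from the text above;
# the other field names are invented for illustration.
MONGO_INDEXES = {
    "domain_a": [
        ["year"],
        ["year", "brand"],
        ["brand", "country", "year"],
    ]
}


def matches_prefix(index_fields, query_fields):
    """True when the queried fields are exactly the first n index fields.

    The order of fields inside the query does not matter, so the
    comparison is done on sets.
    """
    wanted = set(query_fields)
    n = len(wanted)
    return 0 < n <= len(index_fields) and set(index_fields[:n]) == wanted


def supporting_indexes(domain, query_fields):
    """List the indexes of `domain` whose prefix matches the query fields."""
    return [
        index
        for index in MONGO_INDEXES.get(domain, [])
        if matches_prefix(index, query_fields)
    ]


print(supporting_indexes("domain_a", ["year"]))
print(supporting_indexes("domain_a", ["brand", "year"]))
print(supporting_indexes("domain_a", ["country"]))
```

In this sketch a query on year alone is covered by the first two indexes, a query on brand and year (in either order) by the second, and a query on country alone by none, because country is never the leading field of an index.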
Creating indexes everywhere for everything is not a magical solution. It is time-consuming and memory-intensive.
Don’t forget to measure the improvements with the network tab of your browser inspector.
Check if queries are filtered first.
Check if they return only what is displayed on the screen.
Check if part of the query could be done in dataprep and thus increase display speed.
No hard-coded values: always prefer smart rules (e.g. argmax on year instead of 2021).
Check the data pipeline: is it clear and easy to read?
Clear domain names.
Keep only used domains.
Check the date requester construction (if you are using the old one):
Is it prepared with Dataprep? (Otherwise the processing is replayed every time a screen loads.)
Does it provide enough date formats to cover the filtering in all screen queries?
Anticipate year + 1: will it keep working?
Nice to have: a year -1 or month -1 date column if you need it in your screen query calculations.
Check report requester construction:
Is it prepared with YouPrep? (Otherwise the processing is replayed every time a screen loads.)
Nice to have: a column for the order (tip: use a conditional step to create it).
Nice to have: if you need it, a “type” column is useful, and several if you have a hierarchical report (children type, parent type).
Nice to have for hierarchical reports: both a parent/child column pair and the intermediary levels as columns for each child (tip: this can be done with YouPrep as dataprep: rollup + join).
How long do I think this data architecture can last before being improved? (limits in data volume, in query preparation?).
Data validation is in place.
Check if the screens are fast enough.
Check if the home is fast enough.
Check if Mongo indexes are implemented (if needed).
Check if requesters should be used instead of filters.
Are my screens easily readable on mobile?
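Three items from the checklist above (filter first, return only what is displayed, and prefer an argmax on year over a hard-coded value such as 2021) can be illustrated with a small Python sketch. It is not Toucan query syntax: the rows, column names and functions are hypothetical and only show the logic.

```python
# Hypothetical rows; column names are illustrative only.
rows = [
    {"year": 2020, "country": "FR", "sales": 10},
    {"year": 2021, "country": "FR", "sales": 12},
    {"year": 2021, "country": "DE", "sales": 7},
    {"year": 2022, "country": "FR", "sales": 15},
    {"year": 2022, "country": "DE", "sales": 9},
]


def latest_year_rows(data, year_field="year"):
    """Keep the rows of the most recent year found in the data (argmax on year)."""
    if not data:
        return []
    latest = max(row[year_field] for row in data)
    return [row for row in data if row[year_field] == latest]


def screen_query(data, country):
    """Filter first, then keep only the columns the screen displays."""
    filtered = [row for row in latest_year_rows(data) if row["country"] == country]
    return [{"year": row["year"], "sales": row["sales"]} for row in filtered]


print(screen_query(rows, "FR"))
```

Because the latest year is computed from the data, the same query keeps working when rows for a new year are loaded, which is the point of the year + 1 check.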