⚙️ Tuning resources
CPU/Memory
How to set resource limits and requests
# The recommendation is to never use CPU limits, unless you plan to use the Guaranteed QoS class and allocate whole cores (e.g., multi-tenant environments with strict isolation).
# For canopée, you would set low memory (128Mi-256Mi) and cpu requests.
# Canopée is limited by the network, so you would load-balance it.
# Horizontal pod autoscaling is recommended, based on memory or traffic.
canopee:
resourcesPreset: 'none'
resources: {}
## You would put:
# resources:
# requests:
# cpu: '100m'
# memory: '128Mi'
# limits:
# memory: '256Mi'
# HPA
autoscaling:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
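## For example, to scale on memory you could put (illustrative values, not chart defaults; targetMemory is assumed to be a utilization percentage):
# autoscaling:
#   enabled: true
#   minReplicas: '2'
#   maxReplicas: '5'
#   targetMemory: '80'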
# For laputa, you would set high memory (2.5Gi-12Gi) and cpu requests (whole cores: 1, 2, or more).
# Memory limits should be set based on traffic, as laputa does heavy computations.
# Laputa is stateful, so it's not possible to enable horizontal pod autoscaling.
laputa:
resourcesPreset: 'none'
resources: {}
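## For example, you could put (illustrative values within the ranges above; size the limit to your traffic):
# resources:
#   requests:
#     cpu: '1'
#     memory: '2.5Gi'
#   limits:
#     memory: '12Gi'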
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
## You would put:
# vpa:
# enabled: true
# controlledResources: ["cpu", "memory"]
# maxAllowed:
# cpu: '2'
# memory: '12Gi'
# minAllowed:
# cpu: '1'
# memory: '2.5Gi'
# For layout, you would set a low memory (256Mi-386Mi) and cpu requests (10m-100m).
# Memory limits can be set equal to the requests (Guaranteed QoS) or slightly above them (Burstable).
# Horizontal pod autoscaling is recommended based on memory.
layout:
resourcesPreset: 'none'
resources: {}
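## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '256Mi'
#   limits:
#     memory: '386Mi'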
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# For dataset, you would set a medium memory (1.5Gi-3Gi) and low cpu requests (10m-100m).
dataset:
resourcesPreset: 'none'
resources: {}
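## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '1.5Gi'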
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
dataexecution:
config:
specific:
worker:
long:
resource_limits:
# Replace this with a number if you want to set it manually.
# This controls the memory allocator of the data-execution-worker.
memory_bytes: '__AUTOTUNE_MEMORY_BYTES__'
# For dataexecution-api, you would set low memory requests (128Mi-256Mi) and low cpu requests (10m-100m).
api:
resourcesPreset: 'none'
resources: {}
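## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '256Mi'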
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# For dataexecution-worker, you would use a bursting strategy (Burstable QoS):
# mid memory requests (128Mi-256Mi) and high memory limits (1Gi-3Gi), depending on the dataset size.
# Low cpu requests (10m-100m).
worker:
resourcesPreset: 'none'
resources: {}
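## For example, a Burstable setup could look like (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '256Mi'
#   limits:
#     memory: '2Gi'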
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# For impersonate, you would set a very low memory (16Mi-64Mi) and cpu requests (10m-50m).
# Memory limits can be set slightly above the requests (Burstable).
impersonate:
resourcesPreset: 'none'
resources: {}
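## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '10m'
#     memory: '32Mi'
#   limits:
#     memory: '64Mi'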
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# For spicedb, the authorization service, you would set a low memory (64Mi-128Mi) and cpu requests (10m-50m).
# It is important for this service to have very low latency.
# Replication is recommended, and we recommend placing spicedb alongside the layout and laputa services.
# Horizontal pod autoscaling is strongly recommended, based on average CPU usage or traffic.
spicedb:
resourcesPreset: 'none'
resources: {}
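## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '50m'
#     memory: '64Mi'
#   limits:
#     memory: '128Mi'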
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
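## For example, to scale on average CPU you could put (illustrative values; targetCPU is assumed to be a utilization percentage):
# hpa:
#   enabled: true
#   minReplicas: '2'
#   maxReplicas: '5'
#   targetCPU: '70'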
# For vault, the secret management service, you would set a medium memory (128Mi-512Mi) and cpu requests (100m-500m).
vault:
server:
resourcesPreset: 'none'
resources: {}
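## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '250m'
#     memory: '256Mi'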
# HPA
autoscaling:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# Curity, the authentication service, has two components: the admin and the runtime.
# For the admin, you would set memory requests slightly higher than the -Xmx option (2Gi) and low cpu requests (10m-100m).
# For the runtime, you would set memory requests slightly higher than the -Xmx option (2Gi) and medium cpu requests (100m-500m).
# Because Java manages its own memory pool, you can set the limit equal to the request.
# Horizontal pod autoscaling on the runtime is recommended, based on average CPU usage or traffic.
curity:
admin:
resourcesPreset: 'none'
resources: {}
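## For example, with -Xmx2g you could put (illustrative values; memory is sized slightly above the 2Gi heap, limit equal to the request):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '2.5Gi'
#   limits:
#     memory: '2.5Gi'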
extraEnvVars:
# You have to restore the original .extraEnvVars to be able to add new env vars.
- name: PG_PASSWORD
valueFrom:
secretKeyRef:
name: '{{ include "toucan-stack.database.secretName" . }}'
key: '{{ include "toucan-stack.database.keyName" . }}'
- name: JAVA_OPTS
# -Xms is the initial heap size. If Curity is stressing your node even
# when there is no traffic, you can decrease this value to 64m; the
# trade-off is increased latency during traffic spikes.
# -Xmx is the maximum heap size, which should match the memory limit or request.
value: -Xms256m -Xmx2g
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
runtime:
resourcesPreset: 'none'
resources: {}
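## For example, with -Xmx2g you could put (illustrative values; memory is sized slightly above the 2Gi heap, limit equal to the request):
# resources:
#   requests:
#     cpu: '500m'
#     memory: '2.5Gi'
#   limits:
#     memory: '2.5Gi'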
extraEnvVars:
# You have to restore the original .extraEnvVars to be able to add new env vars.
- name: PG_PASSWORD
valueFrom:
secretKeyRef:
name: '{{ include "toucan-stack.database.secretName" . }}'
key: '{{ include "toucan-stack.database.keyName" . }}'
- name: JAVA_OPTS
# -Xms is the initial heap size. If Curity is stressing your node even
# when there is no traffic, you can decrease this value to 64m; the
# trade-off is increased latency during traffic spikes.
# -Xmx is the maximum heap size, which should match the memory limit or request.
value: -Xms256m -Xmx2g
autoscaling:
vpa:
enabled: false
annotations: {}
controlledResources: []
maxAllowed: {}
minAllowed: {}
updatePolicy:
updateMode: Auto
hpa:
enabled: false
minReplicas: ''
maxReplicas: ''
targetCPU: ''
targetMemory: ''
# For gotenberg, the screenshot service, you would set a low memory (64Mi-128Mi) and cpu requests (10m-50m).
# Gotenberg works by receiving jobs, so a Burstable QoS is heavily recommended.
# Set the memory limits up to 2GiB. No CPU limits.
# Horizontal pod autoscaling is recommended based on queue length.
gotenberg:
# gotenberg doesn't have a resourcesPreset.
resources: {}
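## For example, a Burstable setup could look like (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '50m'
#     memory: '128Mi'
#   limits:
#     memory: '2Gi'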
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 100
behavior: {}
extraMetrics: []
targetCPUUtilizationPercentage: 80
# targetMemoryUtilizationPercentage: 80
# For postgresql, you would set a low memory (128Mi-1Gi) and cpu requests (100m-500m).
# As this service is critical, we recommend setting up alerts with a threshold of 1Gi and a very high memory limit like 2Gi.
# As the database is stateful, horizontal pod autoscaling is not possible.
postgresql:
primary:
resourcesPreset: 'none'
resources: {}
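## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '500m'
#     memory: '1Gi'
#   limits:
#     memory: '2Gi'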
# For mongodb, you would set a high memory (256Mi-4Gi) and cpu requests (200m or 1 core) depending on your dataset.
# In our experience, mongodb consumes a lot of memory; in combination with high memory requests, we recommend setting a memory limit of 4Gi or more.
# As the database is stateful, horizontal pod autoscaling is not possible.
mongodb:
resourcesPreset: 'none'
resources: {}
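## For example, you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '1'
#     memory: '2Gi'
#   limits:
#     memory: '4Gi'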
configuration: |-
storage:
wiredTiger:
engineConfig:
# 0.25 is the minimum. Recommendation is 50% of (RAM - 1GiB).
cacheSizeGB: <value>
# OR:
hostInfo:
system:
# Set this to the value of the Kubernetes memory limit, in MB.
# This will be used to compute the WiredTiger cache size.
memLimitMB: <value>
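# Worked example for the configuration above, assuming a 4Gi memory limit:
# cacheSizeGB: 50% of (4GiB - 1GiB) = 1.5, or memLimitMB: 4096.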
## REDIS ##
# For redis, you would set a low memory (64Mi-128Mi) and cpu requests (50m-100m).
# You can set memory limits as you see fit.
# As the database is stateful, horizontal pod autoscaling is not possible.
layout-redis:
resourcesPreset: 'none'
resources: {}
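## For example, for each Redis instance you could put (illustrative values within the ranges above):
# resources:
#   requests:
#     cpu: '100m'
#     memory: '64Mi'
#   limits:
#     memory: '128Mi'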
laputa-redis:
resourcesPreset: 'none'
resources: {}
impersonate-redis:
resourcesPreset: 'none'
resources: {}
dataexecution-redis:
resourcesPreset: 'none'
resources: {}