๐งTroubleshooting Bootstraping
This guide will help you debug common bootstraping issues.
Issues appearing here only happen on bootstrap. Other issues happening at runtime are not documented here.
Post-install job is failing
The post-install job is failing.
The issue is a configuration issue, or a network issue.
Detection
kubectl get jobs --all-namespacesWith jobs looking like:
NAMESPACE NAME STATUS COMPLETIONS DURATION AGE
<namespace> toucan-stack-postinstall Failed 0/1 1m 1dPossible causes
The post-install job initializes:
The admin account.
The permissions of the admin account.
The permissions of the backend.
Therefore, the issues could be linked to:
The reachability of the authentication server Curity.
The reachability of the authorization server SpiceDB.
A programming error.
The first issue is very likely caused by configuration or network issue.
The second issue is very unlikely because the authorization server can only be accessible internally. Meaning, the Helm Chart should have already been configured for the authorization server to be reachable.
Investigation
Logs should indicates the errors. Search the errors on Google.
Issues could be linked to (but not limited to):
A network issue: DNS configuration.
A configuration issue:
Check
curity.runtime.hostnameandglobal.hostnameCheck if TLS is properly configured:
kubectl describe ingress -n <namespace> toucan-stack-curity-runtimekubectl describe ingress -n <namespace> toucan-stack-tucanaIf using cert-manager:
kubectl get certificates -n <namespace>kubectl describe certificate -n <namespace> <certificate-name>kubectl get certificaterequests -n <namespace>kubectl get challenges -n <namespace>kubectl get issuer -n <namespace>
A programming error.
Mitigation
Fix the configuration or the DNS configuration, or contact the support for help.
Vault is in CrashLoopBackOff state
CrashLoopBackOff stateThe vault container is crashing repeatedly, stuck in crash loop.
The issue is a configuration issue, or a network issue.
Detection
With events looking like:
Possible causes
On bootstrap, if the vault container is crashing, it's very likely linked to the PostStart lifecycle Hook.
The lifecycle Hook is responsible for:
Fetching or inserting the vault bootstraping secret.
Install the vault oauthapp plugin.
Connect to the auth server to setup the oauthapp client.
Setup the KV2 secrets engine.
Step 1 can fail if RBAC has been disabled: configuration issue.
Step 2 can fail if the plugin cannot be downloaded: network issue.
Step 3 can fail if the auth server cannot be reached: configuration or network issue.
Lastly, it can also fail due to implementation errors.
Investigation
Mitigation
Failed at step 1
Check if the RBAC is properly configured:
A
RoleandRoleBindingboth namedtoucan-stack-vault-server-secret-manager(or similar) must be present.A
ServiceAccountnamedtoucan-stack-vault-server(or similar) must be present.
Failed at step 2
Check your network and make sure that the plugin can be downloaded from the machine.
Find the plugin URL and checksum at vault.oauthapp.pluginURL and vault.oauthapp.checksum in the values.yaml file:
If you are working in air-gapped, you must host this file somewhere and replace the vault.oauthapp.pluginURL.
Failed at step 3
The authentication server cannot be reached.
Make sure the hostname defined at
curity.runtime.hostnameis accessible from outside Kubernetes:Make sure the DNS is properly configured to redirect the hostname to the
LoadBalancerIP:{% code title="bash" overflow="wrap" %}
{% endcode %} - Make sure no firewall is blocking the connection:
The Kubernetes gateway (traffic FROM kubernetes GOING TO the external network) should not be blocked.
Unknown failures
Contact the support.
Last updated
Was this helpful?