# Backup and Restore

![Infrastructure overview](https://1809014303-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZxYYf1KpgarKMgMsDCrw%2Fuploads%2Fgit-blob-ad184b7b26b55c052f8c3541c6fa05aed3059e09%2Fselfhosted-fullstack.png?alt=media)

Here are the data stores worth backing up:

* PostgreSQL
* Laputa Storage & MongoDB
* Curity config
* S3

## PostgreSQL

### Backup

You can run a Kubernetes CronJob to dump the database regularly. These are the available parameters (shown here with their defaults); override them in your `values.override.yaml`:

{% code title="yaml: values.override.yaml" %}

```yaml
postgresql:
  backup:
    ## @param backup.enabled Enable the logical dump of the database "regularly"
    enabled: false
    cronjob:
      ## @param backup.cronjob.schedule Set the cronjob parameter schedule
      schedule: '@daily'
      ## @param backup.cronjob.timeZone Set the cronjob parameter timeZone
      timeZone: ''

      ## @param backup.cronjob.command Set backup container's command to run
      command:
        - /bin/bash
        - -c
        - PGPASSWORD="${PGPASSWORD:-$(< "$PGPASSWORD_FILE")}" pg_dumpall --clean --if-exists --load-via-partition-root --quote-all-identifiers --no-password --file="${PGDUMP_DIR}/pg_dumpall-$(date '+%Y-%m-%d-%H-%M').pgdump"

      storage:
        ## @param backup.cronjob.storage.enabled Enable using a `PersistentVolumeClaim` as backup data volume
        ##
        enabled: true
        ## @param backup.cronjob.storage.existingClaim Provide an existing `PersistentVolumeClaim` (only when `architecture=standalone`)
        ## If defined, PVC must be created manually before volume will be bound
        ##
        existingClaim: ''
        ## @param backup.cronjob.storage.resourcePolicy Set it to "keep" to avoid removing PVCs during a helm delete operation. Leaving it empty will delete PVCs after the chart is deleted
        ##
        resourcePolicy: ''
        ## @param backup.cronjob.storage.storageClass PVC Storage Class for the backup data volume
        ## If defined, storageClassName: <storageClass>
        ## If set to "-", storageClassName: "", which disables dynamic provisioning
        ## If undefined (the default) or set to null, no storageClassName spec is
        ## set, choosing the default provisioner.
        ##
        storageClass: ''
        ## @param backup.cronjob.storage.accessModes PV Access Mode
        ##
        accessModes:
          - ReadWriteOnce
        ## @param backup.cronjob.storage.size PVC Storage Request for the backup data volume
        ##
        size: 8Gi
        ## @param backup.cronjob.storage.annotations PVC annotations
        ##
        annotations: {}
        ## @param backup.cronjob.storage.mountPath Path to mount the volume at
        ##
        mountPath: /backup/pgdump
        ## @param backup.cronjob.storage.subPath Subdirectory of the volume to mount at
        ## Useful in dev environments and one PV for multiple services.
        ##
        subPath: ''
        ## Fine tuning for volumeClaimTemplates
        ##
        volumeClaimTemplates:
          ## @param backup.cronjob.storage.volumeClaimTemplates.selector A label query over volumes to consider for binding (e.g. when using local volumes)
          ## A label query over volumes to consider for binding (e.g. when using local volumes)
          ## See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#labelselector-v1-meta for more details
          ##
          selector: {}
```

{% endcode %}
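Dump files accumulate on the backup volume without any rotation. As a hedged sketch (the 7-day retention period and the reuse of the `PGDUMP_DIR` variable are assumptions), you could append a pruning step to the CronJob's `command`:

{% code title="bash" overflow="wrap" %}

```shell
# Remove pg_dumpall archives older than 7 days (retention period is an assumption).
# PGDUMP_DIR defaults to the mountPath configured above.
PGDUMP_DIR="${PGDUMP_DIR:-/backup/pgdump}"
if [ -d "$PGDUMP_DIR" ]; then
  find "$PGDUMP_DIR" -name 'pg_dumpall-*.pgdump' -type f -mtime +7 -delete
fi
```

{% endcode %}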

<details>

<summary>Example with S3</summary>

If you want to back up to S3, you can use the [`mountpoint-s3-csi-driver`](https://github.com/awslabs/mountpoint-s3-csi-driver) and mount the S3 bucket at `/backup/pgdump`:

{% code title="yaml: persistentvolume.yaml" overflow="wrap" %}

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-backups-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ''
  mountOptions:
    - prefix postgres-backups/
    - endpoint-url https://s3.fr-par.scw.cloud
    - uid=1001
    - gid=1001
    - allow-other
  csi:
    driver: s3.csi.aws.com
    volumeHandle: postgres-backups-pv
    volumeAttributes:
      bucketName: backups
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-backups-pvc
  namespace: postgresql
spec:
  resources:
    requests:
      storage: 50Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  storageClassName: ''
  volumeName: postgres-backups-pv
```

{% endcode %}

{% code title="yaml: values.override.yaml" %}

```yaml
postgresql:
  backup:
    enabled: true
    cronjob:
      storage:
        enabled: true
        existingClaim: postgres-backups-pvc
```

{% endcode %}

</details>

### Restore

{% hint style="warning" %}
Restoring data to a database may overwrite existing records and result in data loss if not performed carefully. Ensure that a full backup is taken prior to any restore operation. Proceed only if you fully understand the implications of the restoration process. The authors of this document are not responsible for any data loss or system issues resulting from improper use.
{% endhint %}

1. Connect to the database. If you have admin access to the Kubernetes cluster, you can forward the service port:

{% code title="bash" overflow="wrap" %}

```shell
kubectl port-forward -n <namespace> svc/toucan-stack-postgresql-primary 5432:5432
```

{% endcode %}

2. Restore the backup using the command below. Since the dump was made with `pg_dumpall --clean`, it recreates all databases; `$DATABASE` can be any existing maintenance database, such as `postgres`:

   ```shell
   psql -h localhost -p 5432 -U toucan-admin -d "$DATABASE" -f "<backup file>" -v ON_ERROR_STOP=1
   ```
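The dump filenames embed a `%Y-%m-%d-%H-%M` timestamp, so they sort lexicographically. A hedged sketch for locating the most recent dump (`PGDUMP_DIR` stands for the `mountPath` from the backup configuration above):

{% code title="bash" overflow="wrap" %}

```shell
# Timestamped names sort lexicographically, so the last one is the newest.
# If no dump exists, "latest" holds the unexpanded glob pattern.
PGDUMP_DIR="${PGDUMP_DIR:-/backup/pgdump}"
latest=$(printf '%s\n' "$PGDUMP_DIR"/pg_dumpall-*.pgdump | sort | tail -n 1)
echo "latest dump: $latest"
```

{% endcode %}

You can then pass `$latest` as the `-f` argument of the `psql` command above.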

## MongoDB

{% hint style="info" %}
No backup utilities are shipped for MongoDB at the moment, and none are planned.

We plan to deprecate Laputa and MongoDB, which are legacy services from Toucan v2.

The [utilities from v2](https://docs.toucantoco.com/sysadmin/04-manage.html#backup-and-restore) may still work the same way, but this guide uses the standard MongoDB tools instead.
{% endhint %}

This section walks you through the steps required to back up and restore Toucan's MongoDB database **outside of Kubernetes**.

A better approach is to use a Kubernetes CronJob to dump the database, so that you don't need to run the commands manually or forward the port.
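As a hedged sketch of that CronJob approach (the `toucan` namespace, the `toucan-stack-mongodb` service and secret follow the commands below; the `mongo:7` image and the `mongodb-backups-pvc` claim are assumptions you must adapt):

{% code title="yaml: mongodb-backup-cronjob.yaml" overflow="wrap" %}

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongodb-backup
  namespace: toucan
spec:
  schedule: '@daily'
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: mongodump
              image: mongo:7 # assumption: any image shipping mongodump works
              env:
                - name: MONGODB_ROOT_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: toucan-stack-mongodb
                      key: mongodb-root-password
              command:
                - /bin/sh
                - -c
                - mongodump --uri="mongodb://admin:${MONGODB_ROOT_PASSWORD}@toucan-stack-mongodb:27017" --gzip --archive="/backup/mongodump-$(date '+%Y-%m-%d-%H-%M').archive"
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: mongodb-backups-pvc # assumption: create this PVC beforehand
```

{% endcode %}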

### Backup

1. Get the MongoDB credentials:

{% code title="bash" overflow="wrap" %}

```shell
kubectl get secret --namespace toucan toucan-stack-mongodb -o jsonpath='{.data.mongodb-root-password}' | base64 --decode
```

{% endcode %}

2. Forward the MongoDB port if you have admin access to the Kubernetes cluster:

{% code title="bash" overflow="wrap" %}

```shell
kubectl port-forward -n <namespace> svc/toucan-stack-mongodb 27017:27017
```

{% endcode %}

3. Dump the database with `mongodump`:

{% code title="bash" overflow="wrap" %}

```shell
mongodump --uri="mongodb://admin:<password>@localhost:27017" --gzip --archive="<backup file>"
```

{% endcode %}
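Since the archive is gzip-compressed, you can sanity-check it before relying on it. This only validates the compression layer, not the BSON contents; the `ARCHIVE` variable is a stand-in for your backup file:

{% code title="bash" overflow="wrap" %}

```shell
# Verify the gzip integrity of a mongodump archive.
ARCHIVE="${ARCHIVE:-backup.archive}"
if gzip -t "$ARCHIVE" 2>/dev/null; then
  echo "archive OK"
else
  echo "archive corrupt or missing" >&2
fi
```

{% endcode %}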

### Restore

1. Get the MongoDB credentials:

{% code title="bash" overflow="wrap" %}

```shell
kubectl get secret --namespace toucan toucan-stack-mongodb -o jsonpath='{.data.mongodb-root-password}' | base64 --decode
```

{% endcode %}

2. Forward the MongoDB port if you have admin access to the Kubernetes cluster:

{% code title="bash" overflow="wrap" %}

```shell
kubectl port-forward -n <namespace> svc/toucan-stack-mongodb 27017:27017
```

{% endcode %}

3. Restore with `mongorestore`:

{% code title="bash" overflow="wrap" %}

```shell
mongorestore --verbose --uri="mongodb://admin:<password>@localhost:27017" --gzip --archive="<backup file>"
```

{% endcode %}

{% hint style="info" %}
Use `--dryRun` to test the restore before applying it.
{% endhint %}

## Garage S3

Check out their documentation for [backup and restore](https://garagehq.deuxfleurs.fr/documentation/connect/backup/).

## Persistent Volume backup (Laputa Storage, Garage S3, MongoDB, Curity config, PostgreSQL)

Persistent Volume backups are filesystem snapshots, not logical backups.

### Backup

On Kubernetes, a storage provisioner provisioned the volumes for these services. Most storage provisioners can back up the data automatically.

See either your cloud provider or the storage provisioner documentation for more information.

Here's a non-exhaustive list of storage provisioners with backup capabilities:

* [Scaleway - Block Storage snapshot](https://www.scaleway.com/en/docs/block-storage/how-to/create-a-snapshot/)
* [Ceph - Snapshot](https://docs.ceph.com/en/latest/rbd/rbd-snapshot/)
* OVH, GKE, EKS, and others, via their web UI.

You can also look at [external-snapshotter](https://github.com/kubernetes-csi/external-snapshotter/tree/master#usage) if you want a Kubernetes-native way to handle snapshots. Several providers implement its API:

* [Rook (Ceph CSI) - Volume Group Snapshots](https://www.rook.io/docs/rook/latest-release/Storage-Configuration/Ceph-CSI/ceph-csi-volume-group-snapshot/)
* [EKS - CSI Snapshot Controller](https://docs.aws.amazon.com/eks/latest/userguide/csi-snapshot-controller.html) (AWS maintains its own in-house CSI snapshot controller).
* [GKE - Volume Snapshots](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/volume-snapshots)
* [Azure - Azure Disk CSI](https://learn.microsoft.com/en-us/azure/aks/azure-disk-csi)
* [OVH - Volume Snapshots](https://help.ovhcloud.com/csm/en-ie-public-cloud-kubernetes-backup-restore-pv-volume-snapshot?id=kb_article_view\&sysparm_article=KB0037173)
* [OpenStack - Cinder CSI](https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/features.md)
* [Scaleway - Scaleway CSI](https://github.com/scaleway/scaleway-csi#volume-snapshots)
* [NFS CSI](https://github.com/kubernetes-csi/csi-driver-nfs)

{% hint style="warning" %}
There is no `VolumeSnapshot` support for the `local-path-provisioner`. (Instead, snapshot the host volume via the web UI of your cloud provider.)
{% endhint %}

{% hint style="warning" %}
`external-snapshotter` does not schedule snapshots: each `VolumeSnapshot` must be created manually.

That said, open source projects such as [k8s-scheduled-volume-snapshotter](https://github.com/ryaneorth/k8s-scheduled-volume-snapshotter) can create them on a schedule for you.
{% endhint %}

To create a snapshot using `external-snapshotter`:

1. Create a `VolumeSnapshotClass` (if not present):

{% code title="yaml: volumesnapshotclass.yaml" overflow="wrap" %}

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-cinder-snapclass-in-use-v1
deletionPolicy: Delete
driver: cinder.csi.openstack.org # OVH or OpenStack
parameters: # Read the CSI driver documentation for more information
  force-create: 'true'
```

{% endcode %}

You can verify your snapshot class by using `kubectl get volumesnapshotclass`.

2. Create a `VolumeSnapshot`:

{% code title="yaml: volumesnapshot.yaml" overflow="wrap" %}

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgresql-snapshot
  namespace: toucan
spec:
  volumeSnapshotClassName: csi-cinder-snapclass-in-use-v1
  source:
    persistentVolumeClaimName: postgresql-pvc
```

{% endcode %}

After creating the snapshot, a corresponding `VolumeSnapshotContent` object should appear; you can check its status with `kubectl get volumesnapshot -n toucan`.

### Restore

You can restore a snapshot using the Web UI of your cloud provider.

If you are using `external-snapshotter`, restore a snapshot by referencing the `VolumeSnapshot` as the `dataSource` of the `PersistentVolumeClaim`. Set these parameters in the values to restore the snapshot:

{% code title="yaml: values.override.yaml" %}

```yaml
postgresql:
  persistence:
    enabled: true
    storageClass: csi-cinder-high-speed
    dataSource:
      name: postgresql-snapshot
      kind: VolumeSnapshot
      apiGroup: snapshot.storage.k8s.io
    size: 10Gi
```

{% endcode %}

**If you fear losing data, remember to set the reclaim policy (`persistentVolumeReclaimPolicy`) to `Retain` on the `PersistentVolume`** (e.g. `kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'`). You should delete the existing `StatefulSet` and `PersistentVolumeClaim` before upgrading the chart. **If the reclaim policy is set to `Delete`, the previous data will be lost.**
