Datalore 2024.1 Help

Backup, Migration & Restore

General considerations

There are two places where important data resides:

  • PostgreSQL database: it can be backed up and restored using native database tools (or cloud-native tools, if a managed database such as Amazon RDS is used).

  • Block storage: in both the Kubernetes and Docker installation methods, Datalore provides no built-in backup and restore mechanism. Instead, use the tools of the underlying infrastructure provider to back up the volume used by Datalore.
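
As an illustration of the "native database tools" option, here is a minimal sketch that uses pg_dump and pg_restore. The host, user, and database name (`datalore`) are assumptions, as are the helper function names; substitute the values from your own installation.

```shell
#!/bin/sh
# Assumed connection settings; replace with the values for your installation.
DB_HOST="${DB_HOST:-localhost}"
DB_USER="${DB_USER:-postgres}"
DB_NAME="${DB_NAME:-datalore}"

# Build a dump file name from a prefix and a date string,
# e.g. "datalore" and "2024-04-09" give datalore-2024-04-09.dump
backup_file_name() {
  printf '%s-%s.dump' "$1" "$2"
}

# Dump the database in pg_dump's custom format, which pg_restore can read.
backup_db() {
  pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -Fc \
    -f "$(backup_file_name "$DB_NAME" "$(date +%Y-%m-%d)")"
}

# Restore a dump file into an existing database, dropping objects first.
restore_db() {
  pg_restore -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" --clean "$1"
}
```

Calling `backup_db` writes a dated dump file to the current directory, and `restore_db datalore-2024-04-09.dump` loads it back. With a managed database, the provider's snapshot tooling would typically replace both functions.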

Migration

Sometimes you need to migrate an existing environment to another one rather than back it up (for example, when moving from a PoC environment to a production environment that runs on a different Kubernetes cluster).

If that's the case, proceed as follows:

  • Export the metadata from the old environment

  • Deploy pods to a new environment

  • Import PersistentVolume/PersistentVolumeClaims metadata

  • Patch the PV with the UID of the newly created PVC

When deployed in Kubernetes, Datalore uses two volume claims (if installed without Hub):

    $ kubectl get pvc
    NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    postgresql-data-datalore-0   Bound    pvc-2d0f1d24-0ad0-438d-9066-568be44212ca   2Gi        RWO            gp2            17h
    storage-datalore-0           Bound    pvc-1d103578-c395-4d89-9b5f-778864d4dfac   10Gi       RWO            gp2            17h

  1. Patch PVs to prevent them from being automatically deleted:

    kubectl patch pv ${PV_NAME} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

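  2. Save metadata (the --export flag was removed in kubectl 1.18; on newer versions, omit it and delete cluster-specific fields such as metadata.uid, metadata.resourceVersion, and status from the saved files by hand):

    kubectl get pv/${PV_NAME} --export -o yaml > ${PV_NAME}.yaml
    kubectl get pvc/${PVC_NAME} --export -o yaml > ${PVC_NAME}.yaml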

  3. Deploy a new Datalore installation to the new cluster, then delete the new PVC along with the new PV:

    kubectl delete pvc/${PVC_NAME}

  4. Kubernetes uses the UID to determine the connection between a PVC and a PV. Therefore, you need to create the PVC and the PV from the metadata saved in step #2, and then patch the PV with the UID of the newly created PVC:

    kubectl apply -f ${PVC_NAME}.yaml
    PVC_UID=$(kubectl get pvc/${PVC_NAME} -o jsonpath='{.metadata.uid}')
    kubectl apply -f ${PV_NAME}.yaml
    kubectl patch pv ${PV_NAME} -p "{\"spec\":{\"claimRef\":{\"uid\":\"${PVC_UID}\"}}}"
  5. Stop or remove the old deployment. The volume cannot be used in the new cluster as long as it is attached to the old node, so the old PV and PVC need to be removed.
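
The patching steps above can be wrapped in a few small shell helpers. This is a sketch only: the function names are hypothetical, and it assumes that `${PV_NAME}.yaml` and `${PVC_NAME}.yaml` from the "Save metadata" step are in the current directory and that kubectl already points at the new cluster.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Datalore.

# Build the JSON patch that switches a PV to the Retain reclaim policy.
retain_patch() {
  printf '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}

# Build the JSON patch that points a PV's claimRef at the given PVC UID.
claim_ref_patch() {
  printf '{"spec":{"claimRef":{"uid":"%s"}}}' "$1"
}

# Re-bind a saved PV to a freshly created PVC in the new cluster.
# Arguments: PV name, PVC name. Stops at the first failing kubectl call.
rebind_volume() {
  pv="$1"
  pvc="$2"
  kubectl apply -f "${pvc}.yaml" || return 1
  pvc_uid=$(kubectl get "pvc/${pvc}" -o jsonpath='{.metadata.uid}') || return 1
  kubectl apply -f "${pv}.yaml" || return 1
  kubectl patch pv "${pv}" -p "$(claim_ref_patch "${pvc_uid}")"
}
```

For example, `rebind_volume pvc-1d103578-c395-4d89-9b5f-778864d4dfac storage-datalore-0` would handle the storage volume from the listing above. Afterwards, `kubectl get pv,pvc` should report both objects as Bound. Building the patch with printf avoids the escaped quotes of the inline command, which are easy to get wrong.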

Last modified: 09 April 2024