# Create a Cluster

## Provision new cluster
- Confirm you have the `linode-cli` command installed and configured with the IndeVets account by confirming that you can list the current IndeVets clusters:

  ```bash
  linode-cli lke clusters-list
  ```
- Save a name for the new cluster to a shell variable for future commands:

  ```bash
  clusterName="indevets-1.23"
  ```

  Production clusters are generally named `indevets-${kubernetesVersion}`.
- Create the new LKE cluster via `linode-cli`:

  ```bash
  linode-cli lke cluster-create \
      --label "${clusterName}" \
      --region us-east \
      --k8s_version 1.25 \
      --node_pools.type g6-standard-2 --node_pools.count 3 \
      --node_pools.type g6-standard-4 --node_pools.count 2 \
      --node_pools.type g6-dedicated-4 --node_pools.count 2 \
      --control_plane.high_availability true \
      --tags production
  ```
- List clusters:

  ```bash
  linode-cli lke clusters-list
  ```
- Save the `id` of the new cluster to a shell variable for future commands:

  ```bash
  clusterId=36149
  ```
- Wait for all nodes to have `status=ready`:

  ```bash
  watch -n 1 linode-cli lke pools-list $clusterId --text
  ```
- Read the ids of the new pools into shell variables:

  ```bash
  { read productionPoolId; read stagingPoolId; read sandboxPoolId; } \
      <<< $(linode-cli lke pools-list $clusterId --format 'id' --text --no-headers | awk '{print $1}' | uniq)
  ```
- Download and save the `KUBECONFIG` for accessing the new cluster with the `kubectl` client:

  ```bash
  linode-cli lke kubeconfig-view $clusterId --text --no-headers | base64 -d > ~/.kube/"${clusterName}.yaml"
  export KUBECONFIG=~/.kube/"${clusterName}.yaml"
  ```
- Confirm that `kubectl` can list the new, ready nodes:

  ```bash
  kubectl get nodes
  ```
- Apply environment labels and taints to nodes based on their pool ids:

  ```bash
  kubectl label nodes -l lke.linode.com/pool-id=$productionPoolId environment=production
  kubectl label nodes -l lke.linode.com/pool-id=$stagingPoolId environment=staging
  kubectl label nodes -l lke.linode.com/pool-id=$sandboxPoolId environment=sandbox

  kubectl taint nodes -l environment=production environment=production:NoSchedule
  kubectl taint nodes -l environment=sandbox environment=sandbox:NoSchedule
  ```

  **Tip:** These commands should be re-applied whenever any nodes are added or recycled. Each command fails gracefully when it is redundant, so it's safe to err on the side of re-running them frequently. A reusable helper is sketched after this list.
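A minimal sketch of wrapping the labeling and tainting steps above into a single re-runnable script. Assumptions: the `productionPoolId`, `stagingPoolId`, and `sandboxPoolId` variables from the earlier step are set in the environment, and the `|| true` guards (added here, not in the original commands) keep redundant taints from aborting the run:

```bash
#!/usr/bin/env bash
# relabel-nodes.sh — re-apply environment labels and taints after nodes are added or recycled.
# Assumes productionPoolId, stagingPoolId, and sandboxPoolId are set in the environment.
set -u

kubectl label nodes --overwrite -l lke.linode.com/pool-id="$productionPoolId" environment=production
kubectl label nodes --overwrite -l lke.linode.com/pool-id="$stagingPoolId" environment=staging
kubectl label nodes --overwrite -l lke.linode.com/pool-id="$sandboxPoolId" environment=sandbox

# Re-tainting an already-tainted node returns an error, so tolerate it and keep going.
kubectl taint nodes -l environment=production environment=production:NoSchedule || true
kubectl taint nodes -l environment=sandbox environment=sandbox:NoSchedule || true
```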
## Load manifests into new cluster
- Change to a clean clone of the CATS repository:

  ```bash
  cd ~/Repositories/indevets-cats
  ```
- Fetch the latest `releases/k8s-manifests` projection from GitHub:

  ```bash
  git fetch --all
  git holo branch pull --all --force
  ```
- If you're staging a new cluster ahead of putting it live, build the `k8s-manifests-next` projection, which patches all ingress definitions to use the `*.indevets-next.k8s.jarv.us` hostname suffix that can be pointed at the new cluster for testing without interfering with live hostnames:

  ```bash
  git holo project k8s-manifests-next --commit-to=releases/k8s-manifests --fetch='*'
  ```
- Check out the `k8s-manifests` projection:

  ```bash
  git checkout releases/k8s-manifests
  ```
- First, apply CRDs and namespaces to the cluster:

  ```bash
  kubectl apply -Rf ./_/CustomResourceDefinition
  kubectl apply -Rf ./_/Namespace
  ```
- Second, download the `k8s cluster sealed-secrets master keypair` item's attachment from Vaultwarden and apply it to the new cluster to restore the private keys used for decrypting sealed secrets:

  ```bash
  kubectl apply -f ~/Downloads/cluster-sealed-secrets-master.key
  ```
- Third, initialize the sealed-secrets service and load all sealed secrets so that decrypted secrets are in place before other services initialize:

  ```bash
  kubectl apply -Rf ./sealed-secrets
  kubectl apply -f _/ClusterRole/secrets-unsealer.yaml
  kubectl apply -f _/ClusterRoleBinding/sealed-secrets.yaml

  find . \
      -type d \
      -name 'SealedSecret' \
      -print0 \
      | xargs -r0 -n 1 kubectl apply -Rf
  ```
- Finally, apply all remaining resources:

  ```bash
  find . \
      -maxdepth 1 \
      -type d \
      -not -name '.*' \
      -print0 \
      | sort -z \
      | xargs -r0 -n 1 kubectl apply -Rf
  ```

  Some resources will likely fail to apply the first couple of times this command is run, while the resources they depend on are still coming online. Keep re-running the command, with a few seconds of delay between attempts, until there are no errors (a retry loop is sketched after this list).
- Monitor pods coming online across all namespaces:

  ```bash
  kubectl get -A pods
  ```
- If this cluster is not going live immediately, suspend all cron jobs:

  ```bash
  kubectl get --all-namespaces cronjobs \
      --no-headers \
      -o=custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name' \
      | while read namespace name; do
          kubectl patch -n "${namespace}" cronjobs "${name}" \
              -p '{"spec" : {"suspend" : true }}'
      done
  ```
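A minimal sketch of the re-run loop described in the "apply all remaining resources" step, assuming the `releases/k8s-manifests` checkout is the current directory; the 10-attempt cap and 5-second delay are arbitrary choices, not part of the original procedure:

```bash
# Keep applying the top-level manifest directories until a pass completes without errors.
# xargs exits non-zero if any kubectl invocation failed, which drives the retry.
for attempt in $(seq 1 10); do
    echo "apply pass ${attempt}"
    if find . \
        -maxdepth 1 \
        -type d \
        -not -name '.*' \
        -print0 \
        | sort -z \
        | xargs -r0 -n 1 kubectl apply -Rf; then
        echo "all resources applied cleanly"
        break
    fi
    sleep 5
done
```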
## Verify new cluster
- List all `CertificateRequest` objects and verify each is in the ready state
- List all `Secret` objects and verify that the sealed-secrets service has populated them
- List all ingresses and verify that every service loads (example commands for these checks are sketched below)
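A minimal sketch of the verification commands, assuming `kubectl` is still pointed at the new cluster and that the standard cert-manager and sealed-secrets resource names are in use:

```bash
# Certificate requests: every row should report Ready
kubectl get -A certificaterequests

# Secrets: confirm each SealedSecret has produced its corresponding Secret
kubectl get -A sealedsecrets
kubectl get -A secrets

# Ingresses: note the hostnames, then load each one in a browser (or with curl) to confirm it responds
kubectl get -A ingress
```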
## Prepare old cluster to go down
- Put the CATS production pod into maintenance mode on the old cluster by opening a shell on it and running (see the sketch after this list for one way to open that shell):

  ```bash
  artisan down
  ```
- Suspend all cron jobs on the old cluster:

  ```bash
  kubectl get --all-namespaces cronjobs \
      --no-headers \
      -o=custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name' \
      | while read namespace name; do
          kubectl patch -n "${namespace}" cronjobs "${name}" \
              -p '{"spec" : {"suspend" : true }}'
      done
  ```
- Dump the Metabase configuration database from the old cluster:

  ```bash
  kubectl -n metabase exec pod/database-0 -it -- pg_dumpall --clean -U metabase > /tmp/metabase.sql
  ```
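A minimal sketch of the maintenance-mode step. Assumptions: the old cluster's kubeconfig is the active one, and the CATS production workload runs in a namespace and deployment both named `cats-production`; those names are illustrative only, so adjust them to match the actual manifests:

```bash
# Make sure KUBECONFIG points at the OLD cluster before running this.
# "cats-production" namespace/deployment names are assumptions.
kubectl -n cats-production exec -it deploy/cats-production -- artisan down
```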
## Go live with new cluster
- Find the external IP of the `ingress-nginx` `LoadBalancer` service on the new cluster:

  ```bash
  kubectl -n ingress-nginx get services
  ```
- Update the `k8s.indevets.com` and `schedule.indevets.com` A records to point at the new IP (a CLI sketch is included after this list)
- If cron jobs have all been suspended, they can be re-activated once the new cluster is ready to go live:

  ```bash
  kubectl get --all-namespaces cronjobs \
      --no-headers \
      -o=custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name' \
      | while read namespace name; do
          kubectl patch -n "${namespace}" cronjobs "${name}" \
              -p '{"spec" : {"suspend" : false }}'
      done
  ```
- Reset the locally checked-out `releases/k8s-manifests` branch to the latest canonical version on GitHub, getting rid of the version with patched ingresses projected earlier:

  ```bash
  git fetch --all
  git reset --hard origin/releases/k8s-manifests
  ```
- Re-apply all ingress manifests:

  ```bash
  find . \
      -type d \
      -name 'Ingress' \
      -print0 \
      | xargs -r0 -n 1 kubectl apply -Rf
  ```
- Monitor `Certificate` objects progressing to the Ready state, checking the logs of the `cert-manager` pod if there seem to be issues:

  ```bash
  kubectl get -A Certificate
  ```
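A minimal sketch of grabbing the new LoadBalancer IP and updating the A records with `linode-cli`, assuming the zone is hosted in Linode DNS, the ingress controller service is named `ingress-nginx-controller`, and the `domains records-update` action accepts a `--target` flag; the domain and record ids shown are placeholders, so look up the real ones with the list commands first:

```bash
# Grab the external IP of the ingress controller's LoadBalancer service (service name is an assumption).
newIp=$(kubectl -n ingress-nginx get service ingress-nginx-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Find the zone and record ids, then point the A records at the new IP.
linode-cli domains list
domainId=12345          # hypothetical id for the indevets.com zone
linode-cli domains records-list $domainId
linode-cli domains records-update $domainId 67890 --target "$newIp"   # k8s.indevets.com
linode-cli domains records-update $domainId 67891 --target "$newIp"   # schedule.indevets.com
```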
## Update GitHub Actions to deploy to the new cluster
- Generate a kubeconfig file for the `cats-api-deployer` service account and paste a base64-encoded version of it into the `KUBECONFIG_BASE64` secret in the `api` repository's Actions secrets (a generation sketch follows this list)
- Generate a kubeconfig file for the `github-actions` service account and paste a base64-encoded version of it into the `KUBECONFIG_BASE64` secret in the `core` repository's Actions secrets
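Neither step prescribes how the kubeconfig is produced, so here is a minimal sketch assuming a token-based kubeconfig is acceptable, `kubectl` still points at the new cluster, and the cluster is on Kubernetes 1.24+ (for `kubectl create token`). The `cats-api` namespace and the one-year token duration are assumptions (the cluster may cap token lifetimes); repeat with the `github-actions` service account for the `core` repository:

```bash
# ASSUMPTIONS: service-account namespace and token duration are illustrative only.
serviceAccount=cats-api-deployer
namespace=cats-api

# Pull the API server endpoint and CA from the current (new-cluster) kubeconfig.
server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
caData=$(kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
token=$(kubectl -n "$namespace" create token "$serviceAccount" --duration 8760h)

# Write a standalone kubeconfig for the service account.
cat > "/tmp/${serviceAccount}.kubeconfig" <<EOF
apiVersion: v1
kind: Config
clusters:
- name: lke
  cluster:
    server: ${server}
    certificate-authority-data: ${caData}
users:
- name: ${serviceAccount}
  user:
    token: ${token}
contexts:
- name: default
  context:
    cluster: lke
    user: ${serviceAccount}
    namespace: ${namespace}
current-context: default
EOF

base64 < "/tmp/${serviceAccount}.kubeconfig"   # paste the output into KUBECONFIG_BASE64
```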
## Restore configuration to new Metabase instance
- Scale the `metabase` deployment down to 0 (a combined sketch of these scaling steps follows this list)
- Scale the `metabase` database statefulset down to 0
- Delete the `metabase` persistent volume claim
- Scale the `metabase` database statefulset back up to 1
- Once its pod is back online, restore the dump captured above to it:

  ```bash
  cat /tmp/metabase.sql | kubectl -n metabase exec pod/database-0 -it -- psql -U metabase
  ```
- Scale the `metabase` deployment back up to 1
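A minimal sketch of the Metabase restore steps above, assuming `kubectl` is pointed at the new cluster, the deployment is named `metabase`, and the database statefulset is named `database` (consistent with the `database-0` pod used earlier); the PVC name is looked up rather than assumed:

```bash
kubectl -n metabase scale deployment/metabase --replicas=0
kubectl -n metabase scale statefulset/database --replicas=0

# Find and delete the database's persistent volume claim (the name varies by manifest).
kubectl -n metabase get pvc
kubectl -n metabase delete pvc <pvc-name>

kubectl -n metabase scale statefulset/database --replicas=1
kubectl -n metabase wait --for=condition=Ready pod/database-0 --timeout=5m

cat /tmp/metabase.sql | kubectl -n metabase exec -i pod/database-0 -- psql -U metabase

kubectl -n metabase scale deployment/metabase --replicas=1
```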