# ☸️ Kubernetes Cluster Setup Guide

**🔴 Advanced Guide**

This guide covers deploying and managing a production-ready Kubernetes cluster in your homelab, including high availability, storage, networking, and service deployment.

## 🎯 Kubernetes Architecture for Homelab

### **Cluster Design**
```bash
# Recommended cluster topology:

# Control Plane Nodes (3 nodes for HA)
k8s-master-01: 192.168.10.201 (Concord-NUC)
k8s-master-02: 192.168.10.202 (Homelab-VM)
k8s-master-03: 192.168.10.203 (Chicago-VM)

# Worker Nodes (3+ nodes)
k8s-worker-01: 192.168.10.211 (Bulgaria-VM)
k8s-worker-02: 192.168.10.212 (Guava)
k8s-worker-03: 192.168.10.213 (Setillo)

# Storage Nodes (Ceph/Longhorn)
k8s-storage-01: 192.168.10.221 (Atlantis)
k8s-storage-02: 192.168.10.222 (Calypso)
k8s-storage-03: 192.168.10.223 (Anubis)
```

### **Resource Requirements**
```bash
# Control Plane Nodes (minimum)
CPU: 2 cores
RAM: 4 GB
Storage: 50 GB SSD
Network: 1 Gbps

# Worker Nodes (minimum)
CPU: 4 cores
RAM: 8 GB
Storage: 100 GB SSD
Network: 1 Gbps

# Storage Nodes (recommended)
CPU: 4 cores
RAM: 16 GB
Storage: 500 GB+ SSD + additional storage
Network: 10 Gbps (if available)
```

---

## 🚀 Cluster Installation

### **Method 1: kubeadm (Recommended for Learning)**

#### **Prerequisites on All Nodes**
```bash
# Update system
sudo apt update && sudo apt upgrade -y

# Install required packages
sudo apt install -y apt-transport-https ca-certificates curl gpg

# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Load kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
```

#### **Install Container Runtime (containerd)**
```bash
# Install containerd
sudo apt install -y containerd

# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
```

#### **Install Kubernetes Components**
```bash
# Add Kubernetes repository
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

# Install Kubernetes components and pin their versions
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Enable kubelet
sudo systemctl enable kubelet
```

#### **Initialize First Control Plane Node**
```bash
# On k8s-master-01 (192.168.10.201)
sudo kubeadm init \
  --control-plane-endpoint="k8s-api.vish.local:6443" \
  --upload-certs \
  --apiserver-advertise-address=192.168.10.201 \
  --pod-network-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12

# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Save the join commands printed by kubeadm init.
# Control plane join command:
kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH \
  --control-plane --certificate-key CERT_KEY

# Worker join command:
kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH
```

#### **Install CNI Plugin (Flannel)**
```bash
# Install Flannel for pod networking
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Verify installation
kubectl get pods -n kube-flannel
kubectl get nodes
```

#### **Join Additional Control Plane Nodes**
```bash
# On k8s-master-02 and k8s-master-03
# Use the control plane join command from kubeadm init output
sudo kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH \
  --control-plane --certificate-key CERT_KEY

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

#### **Join Worker Nodes**
```bash
# On all worker nodes
# Use the worker join command from kubeadm init output
sudo kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH
```

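Join tokens expire after 24 hours by default, and the certificate key from `--upload-certs` expires after two hours, so nodes added later need fresh values rather than the ones printed at init time. A sketch, run from an existing control plane node:

```bash
# Print a fresh worker join command (creates a new bootstrap token)
kubeadm token create --print-join-command

# Re-upload control plane certificates and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs

# For a new control plane node, append the printed key to the join command:
#   kubeadm join ... --control-plane --certificate-key <NEW_KEY>
```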
### **Method 2: k3s (Lightweight Alternative)**

#### **Install k3s Master**
```bash
# On the first master node
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-init \
  --disable traefik \
  --disable servicelb \
  --write-kubeconfig-mode 644 \
  --cluster-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12

# Get the node token (needed to join other nodes)
sudo cat /var/lib/rancher/k3s/server/node-token
```

#### **Join Additional Masters**
```bash
# On additional master nodes
curl -sfL https://get.k3s.io | sh -s - server \
  --server https://192.168.10.201:6443 \
  --token NODE_TOKEN \
  --disable traefik \
  --disable servicelb

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp /etc/rancher/k3s/k3s.yaml $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

#### **Join Worker Nodes**
```bash
# On worker nodes
curl -sfL https://get.k3s.io | sh -s - agent \
  --server https://192.168.10.201:6443 \
  --token NODE_TOKEN
```

---

## 🗄️ Storage Configuration

### **Longhorn Distributed Storage**

#### **Install Longhorn**
```bash
# Add Longhorn Helm repository
helm repo add longhorn https://charts.longhorn.io
helm repo update

# Create namespace
kubectl create namespace longhorn-system

# Install Longhorn
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn" \
  --set defaultSettings.replicaCount=3 \
  --set defaultSettings.defaultDataLocality="best-effort"

# Verify installation
kubectl get pods -n longhorn-system
kubectl get storageclass
```

#### **Configure Storage Classes**
```bash
# Create storage classes for different use cases
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  diskSelector: "ssd"
  nodeSelector: "storage"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-bulk
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  diskSelector: "hdd"
EOF
```

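The `diskSelector` and `nodeSelector` parameters match tags you assign to Longhorn disks and nodes (for example through the Longhorn UI), so `longhorn-fast` volumes only land on tagged storage. A quick way to exercise a class is a small throwaway PVC; a sketch, assuming the `longhorn-fast` class above exists and its tags are configured:

```bash
# Create a test PVC against the longhorn-fast class
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-fast
  resources:
    requests:
      storage: 1Gi
EOF

# The claim should reach Bound once replicas are placed
kubectl get pvc longhorn-test

# Clean up
kubectl delete pvc longhorn-test
```

If the PVC stays Pending, check that the tags referenced by the selectors actually exist on your storage nodes.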
### **NFS Storage (Alternative)**

#### **Setup NFS Server (on Atlantis)**
```bash
# Install NFS server
sudo apt install nfs-kernel-server

# Create NFS exports
sudo mkdir -p /volume1/k8s-storage/{pv,dynamic}
sudo chown nobody:nogroup /volume1/k8s-storage/
sudo chmod 777 /volume1/k8s-storage/

# Configure exports
echo "/volume1/k8s-storage 192.168.10.0/24(rw,sync,no_subtree_check,no_root_squash)" | sudo tee -a /etc/exports

# Apply exports
sudo exportfs -ra
sudo systemctl restart nfs-kernel-server
```

#### **Install NFS CSI Driver**
```bash
# Install NFS CSI driver
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system \
  --version v4.5.0

# Create NFS storage class
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: atlantis.vish.local
  share: /volume1/k8s-storage/dynamic
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1
EOF
```

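Before pointing the CSI driver at the export, confirm that worker nodes can actually reach and mount it; the nodes also need the NFS client utilities installed. A sketch, assuming Debian/Ubuntu nodes and the export defined above:

```bash
# On each worker node: install the NFS client
sudo apt install -y nfs-common

# List exports visible from this node
showmount -e atlantis.vish.local

# Test-mount the dynamic share, write a file, then clean up
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs -o nfsvers=4.1 atlantis.vish.local:/volume1/k8s-storage/dynamic /mnt/nfs-test
sudo touch /mnt/nfs-test/write-test && sudo rm /mnt/nfs-test/write-test
sudo umount /mnt/nfs-test
```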
---

## 🌐 Networking Configuration

### **Install Ingress Controller (Nginx)**

The controller is exposed as a `LoadBalancer` service; on bare metal that address is handed out by MetalLB (next section), so install MetalLB first or this service will sit in `Pending`.

```bash
# Add Nginx Ingress Helm repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

# Install Nginx Ingress Controller
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=LoadBalancer \
  --set controller.service.loadBalancerIP=192.168.10.240 \
  --set controller.metrics.enabled=true \
  --set controller.podAnnotations."prometheus\.io/scrape"="true" \
  --set controller.podAnnotations."prometheus\.io/port"="10254"

# Verify installation
kubectl get pods -n ingress-nginx
kubectl get svc -n ingress-nginx
```

### **Install MetalLB Load Balancer**
```bash
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml

# Wait for MetalLB to be ready
kubectl wait --namespace metallb-system \
  --for=condition=ready pod \
  --selector=app=metallb \
  --timeout=90s

# Configure IP address pool
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.10.240-192.168.10.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - homelab-pool
EOF
```

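Once the pool is configured, any `LoadBalancer` service should receive an address from 192.168.10.240-250 within a few seconds. A quick smoke test, assuming the cluster can pull public images:

```bash
# Create a throwaway deployment and expose it via MetalLB
kubectl create deployment lb-test --image=nginx
kubectl expose deployment lb-test --port=80 --type=LoadBalancer

# EXTERNAL-IP should show an address from the homelab-pool range
kubectl get svc lb-test

# Clean up
kubectl delete svc lb-test
kubectl delete deployment lb-test
```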
### **Install Cert-Manager**
```bash
# Add Cert-Manager Helm repository
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Install Cert-Manager
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.13.3 \
  --set installCRDs=true

# Create Let's Encrypt ClusterIssuer
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@vish.local
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
```

Note that Let's Encrypt can only issue certificates for publicly resolvable domains, and the HTTP-01 solver requires the ingress to be reachable from the internet. For purely internal `.local` hostnames, use a self-signed or private-CA `ClusterIssuer` instead, or switch to DNS-01 with a real domain you own.

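After the issuer is created, cert-manager should report it as ready, and you can request a certificate explicitly rather than waiting for an ingress annotation to trigger one. A sketch; the `Certificate` name and the `app.example.com` host are illustrative, and the host must be publicly resolvable for ACME validation to succeed:

```bash
# Issuer should show READY=True
kubectl get clusterissuer letsencrypt-prod

# Request a certificate directly
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-cert
  namespace: default
spec:
  secretName: test-cert-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - app.example.com
EOF

# Watch issuance progress and any solver errors
kubectl describe certificate test-cert
```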
---

## 📊 Monitoring and Observability

### **Install Prometheus Stack**
```bash
# Add Prometheus Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create monitoring namespace
kubectl create namespace monitoring

# Install kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName=longhorn-fast \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi \
  --set grafana.persistence.enabled=true \
  --set grafana.persistence.storageClassName=longhorn-fast \
  --set grafana.persistence.size=10Gi \
  --set grafana.adminPassword="REDACTED_PASSWORD" \
  --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.storageClassName=longhorn-fast \
  --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.resources.requests.storage=10Gi

# Verify installation
kubectl get pods -n monitoring
kubectl get svc -n monitoring
```

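Before wiring up ingress, the dashboards can be reached directly with a port-forward, which is also a useful fallback when ingress is misbehaving. Service names assume the Helm release is named `prometheus` as above:

```bash
# Grafana on http://localhost:3000
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80

# Prometheus on http://localhost:9090
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090
```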
### **Create Ingress for Monitoring Services**
```bash
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring-ingress
  namespace: monitoring
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - grafana.k8s.vish.local
        - prometheus.k8s.vish.local
        - alertmanager.k8s.vish.local
      secretName: monitoring-tls
  rules:
    - host: grafana.k8s.vish.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-grafana
                port:
                  number: 80
    - host: prometheus.k8s.vish.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-kube-prometheus-prometheus
                port:
                  number: 9090
    - host: alertmanager.k8s.vish.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus-kube-prometheus-alertmanager
                port:
                  number: 9093
EOF
```

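The `nginx.ingress.kubernetes.io/auth-secret: basic-auth` annotation refers to a secret that must exist in the `monitoring` namespace, or nginx will return errors for these hosts. One way to create it with `htpasswd` from `apache2-utils`; the `admin` username is illustrative:

```bash
# Generate an htpasswd file (prompts for a password)
sudo apt install -y apache2-utils
htpasswd -c auth admin

# The nginx ingress controller expects the file under the key "auth"
kubectl create secret generic basic-auth --from-file=auth -n monitoring
rm auth
```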
### **Install Logging Stack (ELK)**
```bash
# Add Elastic Helm repository
helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch
helm install elasticsearch elastic/elasticsearch \
  --namespace logging \
  --create-namespace \
  --set replicas=3 \
  --set volumeClaimTemplate.storageClassName=longhorn-fast \
  --set volumeClaimTemplate.resources.requests.storage=100Gi

# Install Kibana
helm install kibana elastic/kibana \
  --namespace logging \
  --set service.type=ClusterIP

# Install Filebeat
helm install filebeat elastic/filebeat \
  --namespace logging \
  --set daemonset.enabled=true
```

---

## 🚀 Application Deployment

### **Migrate Docker Compose Services**

#### **Convert Docker Compose to Kubernetes**
```bash
# Install kompose for conversion
curl -L https://github.com/kubernetes/kompose/releases/latest/download/kompose-linux-amd64 -o kompose
chmod +x kompose
sudo mv kompose /usr/local/bin

# Convert existing docker-compose files
cd ~/homelab/Atlantis/uptime-kuma
kompose convert -f docker-compose.yml

# Review and modify the generated manifests:
# add ingress, persistent volumes, resource limits, etc.
```

#### **Example: Uptime Kuma on Kubernetes**
```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: uptime-kuma
  template:
    metadata:
      labels:
        app: uptime-kuma
    spec:
      containers:
        - name: uptime-kuma
          image: louislam/uptime-kuma:1
          ports:
            - containerPort: 3001
          volumeMounts:
            - name: data
              mountPath: /app/data
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: uptime-kuma-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uptime-kuma-data
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-fast
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  selector:
    app: uptime-kuma
  ports:
    - protocol: TCP
      port: 3001
      targetPort: 3001
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: uptime-kuma
  namespace: monitoring
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - uptime.k8s.vish.local
      secretName: uptime-kuma-tls
  rules:
    - host: uptime.k8s.vish.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: uptime-kuma
                port:
                  number: 3001
EOF
```

### **Helm Charts for Complex Applications**

#### **Create Custom Helm Chart**
```bash
# Create new Helm chart
helm create homelab-app

# Directory structure:
# homelab-app/
# ├── Chart.yaml
# ├── values.yaml
# ├── templates/
# │   ├── deployment.yaml
# │   ├── service.yaml
# │   ├── ingress.yaml
# │   └── pvc.yaml
# └── charts/

# Example values.yaml for homelab services:
cat <<EOF > homelab-app/values.yaml
replicaCount: 1

image:
  repository: nginx
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: app.k8s.vish.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.k8s.vish.local

persistence:
  enabled: true
  storageClass: longhorn-fast
  size: 10Gi

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
EOF

# Install chart
helm install my-app ./homelab-app
```

---

## 🔒 Security Configuration

### **Pod Security Standards**
```bash
# Create a namespace enforcing the "restricted" Pod Security Standard
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: secure-apps
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
EOF
```

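Under the `restricted` standard, pods must run as non-root, disable privilege escalation, drop all capabilities, and set a seccomp profile, or the API server rejects them at admission. A minimal compliant pod sketch; the pod name and UID are illustrative:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: restricted-ok
  namespace: secure-apps
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
EOF
```

Try submitting the same pod without the `securityContext` block: the namespace labels should cause the API server to reject it with a list of the violated controls.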
### **Network Policies**

Note: NetworkPolicy objects are only enforced when the CNI plugin supports them. Flannel alone does not enforce network policies; pair it with a policy engine such as Calico or Cilium if you rely on these rules.

```bash
# Example: deny all traffic by default, then allow specific ingress
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow ingress traffic from the ingress-nginx namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 80
EOF
```

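To confirm a policy actually blocks traffic, probe a target pod from a client that the policy should exclude; a sketch with illustrative pod names:

```bash
# Start a target pod matching the policy's app=web-app selector
kubectl run web-app --image=nginx --labels=app=web-app -n default
kubectl expose pod web-app --port=80 -n default

# From a throwaway client in the same namespace, this request should
# time out once the deny-all policy is in place (the client is not in
# the ingress-nginx namespace)
kubectl run client --rm -it --image=busybox:1.36 -n default -- \
  wget -qO- --timeout=5 http://web-app
```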
### **RBAC Configuration**
```bash
# Create a service account with read-only access to pods and services
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: homelab-app
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: homelab-app-role
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: homelab-app-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: homelab-app
    namespace: default
roleRef:
  kind: Role
  name: homelab-app-role
  apiGroup: rbac.authorization.k8s.io
EOF
```

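`kubectl auth can-i` makes it easy to verify what the service account actually ended up with, by impersonating it:

```bash
# Should print "yes": the role grants list on pods
kubectl auth can-i list pods \
  --as=system:serviceaccount:default:homelab-app -n default

# Should print "no": the role grants no write verbs
kubectl auth can-i delete pods \
  --as=system:serviceaccount:default:homelab-app -n default
```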
---

## 🔧 Cluster Management

### **Backup and Restore**

#### **etcd Backup**
```bash
# Create backup script
cat <<EOF > /usr/local/bin/etcd-backup.sh
#!/bin/bash
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-\$(date +%Y%m%d-%H%M%S).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Keep only the last 7 days of backups
find /backup -name "etcd-snapshot-*.db" -mtime +7 -delete
EOF

chmod +x /usr/local/bin/etcd-backup.sh

# Schedule daily backups (note: "crontab -" replaces any existing crontab)
echo "0 2 * * * /usr/local/bin/etcd-backup.sh" | crontab -
```

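Restoring is the half people forget to rehearse. On kubeadm clusters etcd runs as a static pod reading `/var/lib/etcd`, so a restore means stopping the control plane, restoring the snapshot to a fresh data directory, and swapping it in. A sketch for a single control plane node, using the PKI paths above; the snapshot filename is illustrative:

```bash
# Stop the control plane by moving the static pod manifests away
sudo mv /etc/kubernetes/manifests /etc/kubernetes/manifests.stopped

# Restore the snapshot into a fresh data directory
sudo ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot-20260101-020000.db \
  --data-dir /var/lib/etcd-restored

# Point etcd at the restored data and restart the control plane
sudo mv /var/lib/etcd /var/lib/etcd.old
sudo mv /var/lib/etcd-restored /var/lib/etcd
sudo mv /etc/kubernetes/manifests.stopped /etc/kubernetes/manifests
```

Multi-master restores need extra flags to rebuild the etcd cluster membership; consult the etcd disaster recovery documentation before attempting one.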
#### **Velero for Application Backup**
```bash
# Install Velero CLI
wget https://github.com/vmware-tanzu/velero/releases/latest/download/velero-linux-amd64.tar.gz
tar -xzf velero-linux-amd64.tar.gz
sudo mv velero-*/velero /usr/local/bin/

# Install Velero server (using MinIO for storage)
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket velero-backups \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=false \
  --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.vish.local:9000

# Create backup schedule
velero schedule create daily-backup --schedule="0 1 * * *"
```

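Scheduled backups are only half the story; run an on-demand backup and a restore drill to make sure the pipeline works end to end. The backup name is illustrative:

```bash
# One-off backup of a single namespace
velero backup create monitoring-test --include-namespaces monitoring

# Check status and logs
velero backup describe monitoring-test
velero backup logs monitoring-test

# Restore it (into the same cluster, or a rebuilt one)
velero restore create --from-backup monitoring-test
```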
### **Cluster Upgrades**
```bash
# Upgrade control plane nodes (one at a time)
# 1. Drain node
kubectl drain k8s-master-01 --ignore-daemonsets --delete-emptydir-data

# 2. Upgrade kubeadm
sudo apt update
sudo apt-mark unhold kubeadm
sudo apt install kubeadm='1.29.x-*'
sudo apt-mark hold kubeadm

# 3. Upgrade cluster
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.29.x

# 4. Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt install kubelet='1.29.x-*' kubectl='1.29.x-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# 5. Uncordon node
kubectl uncordon k8s-master-01

# Repeat for other control plane nodes and workers
```

### **Troubleshooting**
```bash
# Common troubleshooting commands
kubectl get nodes -o wide
kubectl get pods --all-namespaces
kubectl describe node NODE_NAME
kubectl logs -n kube-system POD_NAME

# Check cluster health
kubectl get componentstatuses   # deprecated, but still a quick sanity check
kubectl cluster-info
kubectl get events --sort-by=.metadata.creationTimestamp

# Debug networking from inside the cluster
kubectl run debug --image=nicolaka/netshoot -it --rm -- /bin/bash
```

---

## 📋 Migration Strategy

### **Phase 1: Cluster Setup**
```bash
☐ Plan cluster architecture and resource allocation
☐ Install Kubernetes on all nodes
☐ Configure networking and storage
☐ Install monitoring and logging
☐ Set up backup and disaster recovery
☐ Configure security policies
☐ Test cluster functionality
```

### **Phase 2: Service Migration**
```bash
☐ Identify services suitable for Kubernetes
☐ Convert Docker Compose to Kubernetes manifests
☐ Create Helm charts for complex applications
☐ Set up ingress and SSL certificates
☐ Configure persistent storage
☐ Test service functionality
☐ Update DNS and load balancing
```

### **Phase 3: Production Cutover**
```bash
☐ Migrate non-critical services first
☐ Update monitoring and alerting
☐ Test disaster recovery procedures
☐ Migrate critical services during maintenance window
☐ Update documentation and runbooks
☐ Train team on Kubernetes operations
☐ Decommission old Docker Compose services
```

---

## 🔗 Related Documentation

- [Network Architecture](networking.md) - Network design and VLANs for Kubernetes
- [Ubiquiti Enterprise Setup](ubiquiti-enterprise-setup.md) - Enterprise networking for cluster infrastructure
- [Laptop Travel Setup](laptop-travel-setup.md) - Remote access to the Kubernetes cluster
- [Tailscale Setup Guide](tailscale-setup-guide.md) - VPN access to cluster services
- [Disaster Recovery Guide](../troubleshooting/disaster-recovery.md) - Cluster backup and recovery
- [Security Model](security.md) - Security architecture and policies

---

**💡 Pro Tip**: Start with a small, non-critical service migration to Kubernetes. Learn the platform gradually before moving mission-critical services. Kubernetes has a steep learning curve, but the benefits of container orchestration, scaling, and management are worth the investment for a growing homelab!