# ☸️ Kubernetes Cluster Setup Guide
**🔴 Advanced Guide**

This guide covers deploying and managing a production-ready Kubernetes cluster in your homelab, including high availability, storage, networking, and service deployment.
## 🎯 Kubernetes Architecture for Homelab

### **Cluster Design**

```bash
# Recommended cluster topology:

# Control Plane Nodes (3 nodes for HA)
k8s-master-01: 192.168.10.201 (Concord-NUC)
k8s-master-02: 192.168.10.202 (Homelab-VM)
k8s-master-03: 192.168.10.203 (Chicago-VM)

# Worker Nodes (3+ nodes)
k8s-worker-01: 192.168.10.211 (Bulgaria-VM)
k8s-worker-02: 192.168.10.212 (Guava)
k8s-worker-03: 192.168.10.213 (Setillo)

# Storage Nodes (Ceph/Longhorn)
k8s-storage-01: 192.168.10.221 (Atlantis)
k8s-storage-02: 192.168.10.222 (Calypso)
k8s-storage-03: 192.168.10.223 (Anubis)
```

### **Resource Requirements**

```bash
# Control Plane Nodes (minimum)
CPU: 2 cores
RAM: 4 GB
Storage: 50 GB SSD
Network: 1 Gbps

# Worker Nodes (minimum)
CPU: 4 cores
RAM: 8 GB
Storage: 100 GB SSD
Network: 1 Gbps

# Storage Nodes (recommended)
CPU: 4 cores
RAM: 16 GB
Storage: 500 GB+ SSD + additional storage
Network: 10 Gbps (if available)
```

---
## 🚀 Cluster Installation

### **Method 1: kubeadm (Recommended for Learning)**

#### **Prerequisites on All Nodes**

```bash
# Update system
sudo apt update && sudo apt upgrade -y

# Install required packages
sudo apt install -y apt-transport-https ca-certificates curl gpg

# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Load kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
```
#### **Install Container Runtime (containerd)**

```bash
# Install containerd
sudo apt install -y containerd

# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
```
#### **Install Kubernetes Components**

```bash
# Add Kubernetes repository
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

# Install Kubernetes components and pin their versions
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Enable kubelet
sudo systemctl enable kubelet
```
#### **Initialize First Control Plane Node**

```bash
# On k8s-master-01 (192.168.10.201)
sudo kubeadm init \
  --control-plane-endpoint="k8s-api.vish.local:6443" \
  --upload-certs \
  --apiserver-advertise-address=192.168.10.201 \
  --pod-network-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12

# Configure kubectl for your user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Save the join commands printed by kubeadm init.
# Control plane join command:
kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH \
  --control-plane --certificate-key CERT_KEY

# Worker join command:
kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH
```
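As a sanity check on the CIDR choices above: the `/16` pod network leaves room for 256 per-node `/24` subnets (the controller-manager's default per-node allocation) of 254 pod IPs each. A quick arithmetic sketch:

```shell
# Sizing the 10.244.0.0/16 pod network with /24 per-node subnets
pod_cidr_bits=16
node_bits=24
node_subnets=$(( 1 << (node_bits - pod_cidr_bits) ))   # 2^8 node subnets
pods_per_node=$(( (1 << (32 - node_bits)) - 2 ))       # usable addresses per /24
echo "${node_subnets} node subnets, ${pods_per_node} pod IPs each"
```

Far more capacity than a handful of homelab nodes needs, which is the point: resizing the pod CIDR later is painful.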
#### **Install CNI Plugin (Flannel)**

```bash
# Install Flannel for pod networking
# (Flannel's default pod CIDR is 10.244.0.0/16, matching the kubeadm init above)
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Verify installation
kubectl get pods -n kube-flannel
kubectl get nodes
```
#### **Join Additional Control Plane Nodes**

```bash
# On k8s-master-02 and k8s-master-03
# Use the control plane join command from the kubeadm init output
sudo kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH \
  --control-plane --certificate-key CERT_KEY

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
#### **Join Worker Nodes**

```bash
# On all worker nodes
# Use the worker join command from the kubeadm init output
sudo kubeadm join k8s-api.vish.local:6443 --token TOKEN \
  --discovery-token-ca-cert-hash sha256:HASH
```
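If the `TOKEN`/`HASH` pair from `kubeadm init` gets lost, both can be regenerated: `kubeadm token create --print-join-command` prints a fresh worker join command, and the hash is just a SHA-256 digest of the cluster CA's public key. The sketch below demonstrates the digest pipeline on a throwaway certificate; on a real control-plane node, point it at `/etc/kubernetes/pki/ca.crt` instead:

```shell
# Throwaway CA standing in for /etc/kubernetes/pki/ca.crt (demo only)
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
  -out /tmp/demo-ca.crt -subj "/CN=demo-ca" -days 1 2>/dev/null

# SHA-256 of the DER-encoded public key, the value behind sha256:HASH
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl pkey -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | awk '{print $NF}')
echo "--discovery-token-ca-cert-hash sha256:${hash}"
```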
### **Method 2: k3s (Lightweight Alternative)**

#### **Install k3s Master**

```bash
# On first master node
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-init \
  --disable traefik \
  --disable servicelb \
  --write-kubeconfig-mode 644 \
  --cluster-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12

# Get node token
sudo cat /var/lib/rancher/k3s/server/node-token
```
#### **Join Additional Masters**

```bash
# On additional master nodes
# (CIDR and --disable flags should match the first server)
curl -sfL https://get.k3s.io | sh -s - server \
  --server https://192.168.10.201:6443 \
  --token NODE_TOKEN \
  --disable traefik \
  --disable servicelb \
  --cluster-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp /etc/rancher/k3s/k3s.yaml $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
#### **Join Worker Nodes**

```bash
# On worker nodes
curl -sfL https://get.k3s.io | sh -s - agent \
  --server https://192.168.10.201:6443 \
  --token NODE_TOKEN
```
---

## 🗄️ Storage Configuration

### **Longhorn Distributed Storage**

#### **Install Longhorn**

```bash
# Prerequisite on every node: Longhorn needs the iSCSI initiator
sudo apt install -y open-iscsi

# Add Longhorn Helm repository
helm repo add longhorn https://charts.longhorn.io
helm repo update

# Create namespace
kubectl create namespace longhorn-system

# Install Longhorn
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn" \
  --set defaultSettings.defaultReplicaCount=3 \
  --set defaultSettings.defaultDataLocality="best-effort"

# Verify installation
kubectl get pods -n longhorn-system
kubectl get storageclass
```
#### **Configure Storage Classes**

```bash
# Create storage classes for different use cases
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  diskSelector: "ssd"
  nodeSelector: "storage"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-bulk
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  diskSelector: "hdd"
EOF
```
### **NFS Storage (Alternative)**

#### **Setup NFS Server (on Atlantis)**

```bash
# Install NFS server
sudo apt install -y nfs-kernel-server

# Create NFS export directories
sudo mkdir -p /volume1/k8s-storage/{pv,dynamic}
sudo chown -R nobody:nogroup /volume1/k8s-storage/
sudo chmod -R 777 /volume1/k8s-storage/

# Configure exports
echo "/volume1/k8s-storage 192.168.10.0/24(rw,sync,no_subtree_check,no_root_squash)" | sudo tee -a /etc/exports

# Apply exports
sudo exportfs -ra
sudo systemctl restart nfs-kernel-server
```
#### **Install NFS CSI Driver**

```bash
# Install NFS CSI driver
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system \
  --version v4.5.0

# Create NFS storage class
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: atlantis.vish.local
  share: /volume1/k8s-storage/dynamic
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1
EOF
```
---

## 🌐 Networking Configuration

### **Install Ingress Controller (Nginx)**

```bash
# Add Nginx Ingress Helm repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

# Install Nginx Ingress Controller
# (the LoadBalancer IP below is only assigned once MetalLB, installed in the
# next section, is running)
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=LoadBalancer \
  --set controller.service.loadBalancerIP=192.168.10.240 \
  --set controller.metrics.enabled=true \
  --set controller.podAnnotations."prometheus\.io/scrape"="true" \
  --set controller.podAnnotations."prometheus\.io/port"="10254"

# Verify installation
kubectl get pods -n ingress-nginx
kubectl get svc -n ingress-nginx
```
### **Install MetalLB Load Balancer**

```bash
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml

# Wait for MetalLB to be ready
kubectl wait --namespace metallb-system \
  --for=condition=ready pod \
  --selector=app=metallb \
  --timeout=90s

# Configure IP address pool
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.240-192.168.10.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - homelab-pool
EOF
```
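The pool above spans 11 addresses, and `192.168.10.240` is already pinned to ingress-nginx earlier in this guide, leaving 10 for other `LoadBalancer` services. A quick bash check of the range size:

```shell
# Convert dotted-quad IPs to integers and count addresses in the MetalLB pool
ip2int() { local IFS=.; set -- $1; echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 )); }
pool_size=$(( $(ip2int 192.168.10.250) - $(ip2int 192.168.10.240) + 1 ))
echo "${pool_size} LoadBalancer IPs in homelab-pool"
```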
### **Install Cert-Manager**

```bash
# Add Cert-Manager Helm repository
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Install Cert-Manager
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.13.3 \
  --set installCRDs=true

# Create Let's Encrypt ClusterIssuer
# Note: Let's Encrypt only issues certificates for publicly resolvable domains,
# so HTTP-01 cannot validate internal-only names like *.vish.local. Use a
# DNS-01 solver on a public domain you own, or a private CA issuer, instead.
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@vish.local
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
EOF
```
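For internal-only hostnames, a private CA issuer is the usual fallback. A sketch of generating one with `openssl` — the `homelab-ca` file and secret names here are examples, not part of this guide:

```shell
# Generate a long-lived self-signed CA (example names)
openssl req -x509 -newkey rsa:2048 -nodes -keyout homelab-ca.key \
  -out homelab-ca.crt -subj "/CN=homelab-ca" -days 3650 2>/dev/null

# On the cluster: store it for cert-manager, then create a ClusterIssuer
# whose spec is simply `ca: { secretName: homelab-ca }`
# kubectl -n cert-manager create secret tls homelab-ca \
#   --cert=homelab-ca.crt --key=homelab-ca.key
openssl x509 -in homelab-ca.crt -noout -subject
```

Clients inside the homelab then need `homelab-ca.crt` in their trust store.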
---

## 📊 Monitoring and Observability

### **Install Prometheus Stack**

```bash
# Add Prometheus Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create monitoring namespace
kubectl create namespace monitoring

# Install kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName=longhorn-fast \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi \
  --set grafana.persistence.enabled=true \
  --set grafana.persistence.storageClassName=longhorn-fast \
  --set grafana.persistence.size=10Gi \
  --set grafana.adminPassword="REDACTED_PASSWORD" \
  --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.storageClassName=longhorn-fast \
  --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.resources.requests.storage=10Gi

# Verify installation
kubectl get pods -n monitoring
kubectl get svc -n monitoring
```
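If the Grafana admin password is ever forgotten, it can be read back from the chart's Secret (assuming the default naming, the secret follows the release name: `prometheus-grafana`). Secret data is base64-encoded; the decode step, demonstrated locally with a placeholder value:

```shell
# On the cluster (secret name follows the Helm release name):
# kubectl -n monitoring get secret prometheus-grafana \
#   -o jsonpath='{.data.admin-password}' | base64 -d

# The encode/decode round trip itself:
encoded=$(printf 'example-password' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```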
### **Create Ingress for Monitoring Services**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring-ingress
  namespace: monitoring
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - grafana.k8s.vish.local
    - prometheus.k8s.vish.local
    - alertmanager.k8s.vish.local
    secretName: monitoring-tls
  rules:
  - host: grafana.k8s.vish.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-grafana
            port:
              number: 80
  - host: prometheus.k8s.vish.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-kube-prometheus-prometheus
            port:
              number: 9090
  - host: alertmanager.k8s.vish.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-kube-prometheus-alertmanager
            port:
              number: 9093
EOF
```
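The ingress above references an auth secret named `basic-auth` that still has to be created. ingress-nginx expects an htpasswd-format file stored under the key `auth`; a sketch that builds one with `openssl` (the `admin` user and password are placeholders):

```shell
# htpasswd-style entry using the Apache MD5 scheme (no apache2-utils needed)
printf 'admin:%s\n' "$(openssl passwd -apr1 'CHANGE_ME')" > auth

# Load it into the namespace the ingress lives in:
# kubectl -n monitoring create secret generic basic-auth --from-file=auth
head -c 12 auth   # → admin:$apr1$
```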
### **Install Logging Stack (ELK)**

```bash
# Add Elastic Helm repository
helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch
helm install elasticsearch elastic/elasticsearch \
  --namespace logging \
  --create-namespace \
  --set replicas=3 \
  --set volumeClaimTemplate.storageClassName=longhorn-fast \
  --set volumeClaimTemplate.resources.requests.storage=100Gi

# Install Kibana
helm install kibana elastic/kibana \
  --namespace logging \
  --set service.type=ClusterIP

# Install Filebeat
helm install filebeat elastic/filebeat \
  --namespace logging \
  --set daemonset.enabled=true
```
---

## 🚀 Application Deployment

### **Migrate Docker Compose Services**

#### **Convert Docker Compose to Kubernetes**

```bash
# Install kompose for conversion
curl -L https://github.com/kubernetes/kompose/releases/latest/download/kompose-linux-amd64 -o kompose
chmod +x kompose
sudo mv kompose /usr/local/bin

# Convert existing docker-compose files
cd ~/homelab/Atlantis/uptime-kuma
kompose convert -f docker-compose.yml

# Review and modify the generated manifests:
# add ingress, persistent volumes, etc.
```
#### **Example: Uptime Kuma on Kubernetes**

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: uptime-kuma
  template:
    metadata:
      labels:
        app: uptime-kuma
    spec:
      containers:
      - name: uptime-kuma
        image: louislam/uptime-kuma:1
        ports:
        - containerPort: 3001
        volumeMounts:
        - name: data
          mountPath: /app/data
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: uptime-kuma-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uptime-kuma-data
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn-fast
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  selector:
    app: uptime-kuma
  ports:
  - protocol: TCP
    port: 3001
    targetPort: 3001
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: uptime-kuma
  namespace: monitoring
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - uptime.k8s.vish.local
    secretName: uptime-kuma-tls
  rules:
  - host: uptime.k8s.vish.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: uptime-kuma
            port:
              number: 3001
EOF
```
### **Helm Charts for Complex Applications**

#### **Create Custom Helm Chart**

```bash
# Create new Helm chart
helm create homelab-app

# Directory structure:
homelab-app/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── pvc.yaml
└── charts/

# Example values.yaml for homelab services:
cat <<EOF > homelab-app/values.yaml
replicaCount: 1

image:
  repository: nginx
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: app.k8s.vish.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.k8s.vish.local

persistence:
  enabled: true
  storageClass: longhorn-fast
  size: 10Gi

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
EOF

# Install chart
helm install my-app ./homelab-app
```
---

## 🔒 Security Configuration

### **Pod Security Standards**

```bash
# Enforce Pod Security Standards via namespace labels
# (PodSecurityPolicy was removed in Kubernetes 1.25)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: secure-apps
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
EOF
```
### **Network Policies**

```bash
# Note: Flannel does not enforce NetworkPolicy. To use these policies,
# run a CNI with policy support (e.g. Calico or Cilium) instead.

# Example: deny all traffic by default
# (a default egress deny also blocks DNS; add an egress rule for
# kube-dns if pods need name resolution)
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Allow ingress traffic from the ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 80
EOF
```
### **RBAC Configuration**

```bash
# Create service account for applications
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: homelab-app
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: homelab-app-role
rules:
- apiGroups: [""]
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: homelab-app-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: homelab-app
  namespace: default
roleRef:
  kind: Role
  name: homelab-app-role
  apiGroup: rbac.authorization.k8s.io
EOF
```
---

## 🔧 Cluster Management

### **Backup and Restore**

#### **etcd Backup**

```bash
# Create backup script (etcdctl comes from the etcd-client package)
sudo mkdir -p /backup
cat <<'EOF' | sudo tee /usr/local/bin/etcd-backup.sh
#!/bin/bash
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Keep only last 7 days of backups
find /backup -name "etcd-snapshot-*.db" -mtime +7 -delete
EOF

sudo chmod +x /usr/local/bin/etcd-backup.sh

# Schedule daily backups (append to the crontab rather than overwrite it)
(crontab -l 2>/dev/null; echo "0 2 * * * /usr/local/bin/etcd-backup.sh") | crontab -
```
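The retention rule at the end of the script can be dry-run locally against fake snapshot files — `-mtime +7` matches files last modified more than seven days ago:

```shell
# Simulate a backup directory with one stale and one fresh snapshot
backup_dir=$(mktemp -d)
touch -d '10 days ago' "$backup_dir/etcd-snapshot-stale.db"
touch "$backup_dir/etcd-snapshot-fresh.db"

# Same expression the backup script uses
find "$backup_dir" -name 'etcd-snapshot-*.db' -mtime +7 -delete
ls "$backup_dir"   # only the fresh snapshot survives
```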
#### **Velero for Application Backup**

```bash
# Install Velero CLI (release tarballs are versioned; substitute the
# current release for vX.Y.Z)
wget https://github.com/vmware-tanzu/velero/releases/download/vX.Y.Z/velero-vX.Y.Z-linux-amd64.tar.gz
tar -xzf velero-vX.Y.Z-linux-amd64.tar.gz
sudo mv velero-*/velero /usr/local/bin/

# Install Velero server (using MinIO for storage)
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket velero-backups \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=false \
  --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.vish.local:9000

# Create backup schedule
velero schedule create daily-backup --schedule="0 1 * * *"
```
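The `./credentials-velero` file passed above isn't shown in this guide. Velero's AWS plugin expects an AWS-style credentials file, with MinIO's access and secret keys in place of real AWS ones — the values below are placeholders:

```shell
cat > credentials-velero <<'EOF'
[default]
aws_access_key_id = MINIO_ACCESS_KEY
aws_secret_access_key = MINIO_SECRET_KEY
EOF
grep -c '^aws_' credentials-velero   # → 2
```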
### **Cluster Upgrades**

```bash
# Upgrade control plane nodes (one at a time)
# 1. Drain node
kubectl drain k8s-master-01 --ignore-daemonsets --delete-emptydir-data

# 2. Upgrade kubeadm
sudo apt update
sudo apt-mark unhold kubeadm
sudo apt install -y kubeadm='1.29.x-*'
sudo apt-mark hold kubeadm

# 3. Upgrade cluster
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.29.x

# 4. Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt install -y kubelet='1.29.x-*' kubectl='1.29.x-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# 5. Uncordon node
kubectl uncordon k8s-master-01

# Repeat for the other control plane nodes and workers
# (subsequent nodes run `kubeadm upgrade node` instead of `upgrade apply`)
```
### **Troubleshooting**

```bash
# Common troubleshooting commands
kubectl get nodes -o wide
kubectl get pods --all-namespaces
kubectl describe node NODE_NAME
kubectl logs -n kube-system POD_NAME

# Check cluster health
kubectl get componentstatuses   # deprecated, but still a quick glance
kubectl cluster-info
kubectl get events --sort-by=.metadata.creationTimestamp

# Debug networking
kubectl run debug --image=nicolaka/netshoot -it --rm -- /bin/bash
```
---

## 📋 Migration Strategy

### **Phase 1: Cluster Setup**

```bash
☐ Plan cluster architecture and resource allocation
☐ Install Kubernetes on all nodes
☐ Configure networking and storage
☐ Install monitoring and logging
☐ Set up backup and disaster recovery
☐ Configure security policies
☐ Test cluster functionality
```

### **Phase 2: Service Migration**

```bash
☐ Identify services suitable for Kubernetes
☐ Convert Docker Compose to Kubernetes manifests
☐ Create Helm charts for complex applications
☐ Set up ingress and SSL certificates
☐ Configure persistent storage
☐ Test service functionality
☐ Update DNS and load balancing
```

### **Phase 3: Production Cutover**

```bash
☐ Migrate non-critical services first
☐ Update monitoring and alerting
☐ Test disaster recovery procedures
☐ Migrate critical services during maintenance window
☐ Update documentation and runbooks
☐ Train team on Kubernetes operations
☐ Decommission old Docker Compose services
```

---

## 🔗 Related Documentation

- [Network Architecture](networking.md) - Network design and VLANs for Kubernetes
- [Ubiquiti Enterprise Setup](ubiquiti-enterprise-setup.md) - Enterprise networking for cluster infrastructure
- [Laptop Travel Setup](laptop-travel-setup.md) - Remote access to Kubernetes cluster
- [Tailscale Setup Guide](tailscale-setup-guide.md) - VPN access to cluster services
- [Disaster Recovery Guide](../troubleshooting/disaster-recovery.md) - Cluster backup and recovery
- [Security Model](security.md) - Security architecture and policies

---

**💡 Pro Tip**: Start with a small, non-critical service migration to Kubernetes. Learn the platform gradually before moving mission-critical services. Kubernetes has a steep learning curve, but the benefits of container orchestration, scaling, and management are worth the investment for a growing homelab!