====== Installing K3s + Rancher WebUI ======
* Host configuration:
* VM1 (Master): Ubuntu 24.04 / 2 vCores + 4 GB RAM + 60 GB SSD / 192.168.1.171 (rancher.ichiayi.com)
* VM2 (Worker): Ubuntu 24.04 / 2 vCores + 4 GB RAM + 60 GB SSD / 192.168.1.172
* VM3 (Worker): Ubuntu 24.04 / 2 vCores + 4 GB RAM + 60 GB SSD / 192.168.1.173
===== Installation Procedure =====
==== Prerequisites (all nodes) ====
- Update the system packages: sudo apt update && sudo apt upgrade -y
- Set the hostname and the hosts file
# VM1 (Master)
sudo hostnamectl set-hostname k3s-master-171
# VM2 (Worker)
sudo hostnamectl set-hostname k3s-worker-172
# VM3 (Worker)
sudo hostnamectl set-hostname k3s-worker-173
- Edit /etc/hosts (all nodes)
sudo vi /etc/hosts
e.g. for 192.168.1.171 through 192.168.1.173:
192.168.1.171 k3s-master-171
192.168.1.172 k3s-worker-172
192.168.1.173 k3s-worker-173
- Disable swap (all nodes)
sudo swapoff -a
sudo sed -i '/swap/!b; /^#/b; s/^/#/' /etc/fstab
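- Configure the firewall rules (only if UFW is enabled)
# Master node
sudo ufw allow 6443/tcp # Kubernetes API
sudo ufw allow 2379:2380/tcp # etcd (only used by multi-server setups with embedded etcd)
sudo ufw allow 10250/tcp # Kubelet
sudo ufw allow 8472/udp # Flannel VXLAN (pod traffic between nodes)
sudo ufw allow 80/tcp # Rancher HTTP
sudo ufw allow 443/tcp # Rancher HTTPS
# Worker nodes
sudo ufw allow 10250/tcp # Kubelet
sudo ufw allow 8472/udp # Flannel VXLAN (pod traffic between nodes)
sudo ufw allow 30000:32767/tcp # NodePort Services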
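The hosts entries in the prerequisites have to be added on every node, so it can help to script that step. The following is a minimal sketch and not part of the original procedure: the helper name `add_host_entry` and the `HOSTS_FILE` variable are made up for illustration, and by default it writes to a scratch file so it can be tried safely before being pointed at /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME is not
# already listed, so the step can be re-run without creating duplicates.
HOSTS_FILE="${HOSTS_FILE:-./hosts.test}"   # assumption: set to /etc/hosts (as root) for real use

add_host_entry() {
    ip="$1"; name="$2"
    touch "$HOSTS_FILE"
    # Look for the name as a whole field on any non-comment line.
    if awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
}

add_host_entry 192.168.1.171 k3s-master-171
add_host_entry 192.168.1.172 k3s-worker-172
add_host_entry 192.168.1.173 k3s-worker-173
```

Running the script a second time leaves the file unchanged, which is the point of the check before the append.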
==== Installing K3s ====
=== Master node (VM1) ===
- Install the K3s server
curl -sfL https://get.k3s.io | sh -s - server \
--write-kubeconfig-mode 644 \
--disable traefik
* The bundled Traefik is disabled; ingress-nginx is installed later in this guide as the Ingress controller for Rancher
- Verify the installation
sudo systemctl status k3s
kubectl get nodes
- Get the node token (needed for the workers to join)
sudo cat /var/lib/rancher/k3s/server/node-token
Record this token; the worker nodes use it in the next step
=== Worker nodes (VM2 & VM3) ===
- Install the K3s agent, e.g. with the master node IP address 192.168.1.171 and the token obtained from the master, xxxxxxxxxx
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.171:6443 \
K3S_TOKEN=xxxxxxxxxx sh -
- Verify that the workers have joined; run this on the master node
kubectl get nodes
All three nodes should be listed in the Ready state
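The "all three nodes Ready" check can also be done by counting rather than by eye. This is a small sketch under assumptions: `count_ready_nodes` is a hypothetical helper, and the sample text only imitates the shape of `kubectl get nodes --no-headers` output.

```shell
#!/bin/sh
# Count nodes whose STATUS column is exactly "Ready" in
# `kubectl get nodes --no-headers` output read from stdin.
count_ready_nodes() {
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Sample input shaped like kubectl output (values are illustrative only).
sample='k3s-master-171   Ready      control-plane,master   10m   v1.33.6+k3s1
k3s-worker-172   Ready      <none>                 5m    v1.33.6+k3s1
k3s-worker-173   NotReady   <none>                 1m    v1.33.6+k3s1'

printf '%s\n' "$sample" | count_ready_nodes   # prints 2

# On the master node the same function can be fed live data:
#   kubectl get nodes --no-headers | count_ready_nodes
```

A node shown as `Ready,SchedulingDisabled` is deliberately not counted, since the comparison is against the exact string `Ready`.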
==== Installing Rancher WebUI ====
=== Run on the master node (VM1) ===
- Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
- Set up K3s access (kubeconfig) for kubectl and Helm
# 1. Set a persistent environment variable
echo 'export KUBECONFIG=/etc/rancher/k3s/k3s.yaml' >> ~/.bashrc
# 2. Reload the shell configuration
source ~/.bashrc
# 3. Verify
kubectl version
kubectl get nodes
helm version
helm list -A
- Add the Rancher Helm repository
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update
- Create the Rancher namespace
kubectl create namespace cattle-system
- Install cert-manager (for SSL certificate management)
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--set crds.enabled=true
- Verify the cert-manager installation
kubectl get pods --namespace cert-manager
kubectl get crd | grep cert-manager
Wait until all Pods are in the Running state.
- Install the Nginx Ingress Controller
# Install Nginx Ingress
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.hostNetwork=true \
--set controller.kind=DaemonSet \
--set controller.service.type=ClusterIP
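# Wait for the deployment to finish
kubectl wait --namespace ingress-nginx \
--for=condition=ready pod \
--selector=app.kubernetes.io/component=controller \
--timeout=120s
# Check the Nginx Ingress Pods
kubectl get pods -n ingress-nginx
- Install Rancher, e.g. hostname rancher.ichiayi.com and initial admin password admin@123 (ingress.tls.source=secret makes the Ingress use the tls-rancher-ingress secret, which FAQ 2 below creates)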
helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--set hostname=rancher.ichiayi.com \
--set replicas=1 \
--set ingress.tls.source=secret \
--set bootstrapPassword="admin@123" \
--set ingress.ingressClassName=nginx
- Verify the Rancher deployment status
kubectl -n cattle-system rollout status deploy/rancher
kubectl -n cattle-system get pods
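The install command above uses a fixed example value for `bootstrapPassword`. Where a less guessable initial password is wanted, one could be generated first. This is only a sketch: `gen_bootstrap_password` and `RANCHER_BOOTSTRAP_PASSWORD` are hypothetical names, and it assumes `head -c` and `base64` are available, as they are on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: 18 random bytes -> 24 base64 characters, with "+" and "/"
# swapped for "-" and "_" so the value needs no shell or Helm escaping.
gen_bootstrap_password() {
    head -c 18 /dev/urandom | base64 | tr '+/' '-_' | tr -d '\n'
}

RANCHER_BOOTSTRAP_PASSWORD="$(gen_bootstrap_password)"
echo "generated a ${#RANCHER_BOOTSTRAP_PASSWORD}-character bootstrap password"

# It could then replace the literal in the install command (other flags as above):
#   --set bootstrapPassword="$RANCHER_BOOTSTRAP_PASSWORD"
```

Rancher still asks for a new password at first login, so this value only protects the window between installation and that first login.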
=== Accessing the Rancher WebUI ===
- Open the Rancher URL in a browser, e.g. https://rancher.ichiayi.com
- Log in to Rancher
* Use the initial password: admin@123
* The first login asks you to set a new password
- Check the cluster status
* After logging in you will see the local cluster (the current K3s cluster); open it to manage all nodes, workloads, and services.
==== Common management commands ====
* K3s service management
# Show the service status
sudo systemctl status k3s # Master
sudo systemctl status k3s-agent # Worker
# Restart the service
sudo systemctl restart k3s
sudo systemctl restart k3s-agent
# Stop the service
sudo systemctl stop k3s
sudo systemctl stop k3s-agent
* kubectl commands
# List the nodes
kubectl get nodes -o wide
# List all Pods
kubectl get pods --all-namespaces
# Show the Rancher status
kubectl -n cattle-system get all
* Uninstall
# Master node
/usr/local/bin/k3s-uninstall.sh
# Worker nodes
/usr/local/bin/k3s-agent-uninstall.sh
===== Storage Setup =====
==== NFS Subdir External Provisioner (dynamic provisioning) ====
* An NFS server already provides shared storage for K3s, e.g. nfs - 192.168.1.159
* Every K3s node needs nfs-common installed: sudo apt-get update && sudo apt-get install -y nfs-common
* Install the NFS provisioner (the pathPattern setting gives clean directory names without a UUID)
# Install with Helm
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=192.168.1.159 \
--set nfs.path=/swarmdata \
--set nfs.mountOptions='{nfsvers=4,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14}' \
--set storageClass.name=nfs-client \
--set storageClass.defaultClass=false \
--set storageClass.pathPattern='${.PVC.namespace}/${.PVC.name}' \
--set storageClass.archiveOnDelete=false
=== Providing app1 with persistent storage ===
- Create a dedicated PVC for app1, e.g. app1-pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app1-data-pvc
  namespace: default # change to your namespace
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi # adjust as needed (nominal only; the NFS provisioner does not enforce it)
kubectl apply -f app1-pvc.yaml
- Verify the PV and PVC status
kubectl get pv
kubectl get pvc -n default
e.g.
jonathan@k3s-master-171:~/app1$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pv-nfs-subdir-external-provisioner 10Mi RWO Retain Bound default/pvc-nfs-subdir-external-provisioner 4m10s
pvc-ea1739ec-04dd-4549-952b-490bf07ec186 10Gi RWX Delete Bound default/app1-data-pvc nfs-client 56s
jonathan@k3s-master-171:~/app1$ kubectl get pvc -n default
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
app1-data-pvc Bound pvc-ea1739ec-04dd-4549-952b-490bf07ec186 10Gi RWX nfs-client 63s
pvc-nfs-subdir-external-provisioner Bound pv-nfs-subdir-external-provisioner 10Mi RWO 4m17s
The directory created on the NFS server is <NFS export path>/${.PVC.namespace}/${.PVC.name}, e.g.
swarm-nfs-159:/swarmdata# tree | more
.
├── default
│ └── app1-data-pvc
:
- Deploy the application, e.g. app1-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app1
  namespace: default
  labels:
    app: app1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app1
  template:
    metadata:
      labels:
        app: app1
    spec:
      containers:
        - name: app1
          image: busybox:latest
          command: ["/bin/sh"]
          args:
            - "-c"
            - |
              # Create the test files
              echo "Container started at $(date)" > /data/startup.log
              echo "DATA_DIR: $DATA_DIR" >> /data/startup.log
              # Write a heartbeat every 60 seconds
              while true; do
                echo "Heartbeat: $(date)" >> /data/heartbeat.log
                ls -la /data/ > /data/file-list.txt
                sleep 60
              done
          volumeMounts:
            - name: app1-data
              mountPath: /data
          env:
            - name: DATA_DIR
              value: /data
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "200m"
      volumes:
        - name: app1-data
          persistentVolumeClaim:
            claimName: app1-data-pvc
kubectl apply -f app1-deployment.yaml
- Check the Pod status
kubectl get pods -n default
kubectl describe pod -n default
Check that the files were created correctly in the NFS directory
swarm-nfs-159:/swarmdata/default/app1-data-pvc# ls -lt
total 12
-rw-r--r-- 1 root root 339 Nov 26 11:40 file-list.txt
-rw-r--r-- 1 root root 80 Nov 26 11:40 heartbeat.log
-rw-r--r-- 1 root root 66 Nov 26 11:39 startup.log
swarm-nfs-159:/swarmdata/default/app1-data-pvc# cat startup.log
Container started at Wed Nov 26 03:39:31 UTC 2025
DATA_DIR: /data
swarm-nfs-159:/swarmdata/default/app1-data-pvc# cat heartbeat.log
Heartbeat: Wed Nov 26 03:39:31 UTC 2025
Heartbeat: Wed Nov 26 03:40:31 UTC 2025
swarm-nfs-159:/swarmdata/default/app1-data-pvc# cat file-list.txt
total 16
drwxrwxrwx 2 root root 4096 Nov 26 03:39 .
drwxr-xr-x 1 root root 4096 Nov 26 03:39 ..
-rw-r--r-- 1 root root 0 Nov 26 03:40 file-list.txt
-rw-r--r-- 1 root root 80 Nov 26 03:40 heartbeat.log
-rw-r--r-- 1 root root 66 Nov 26 03:39 startup.log
- Verification and debugging
# Check that the PVC is bound
kubectl get pvc app1-data-pvc -n default
# Look up the Pod name
kubectl get pods -n default | grep app1
# Check the mounts inside the Pod, e.g. app1-584b58d766-qwrqk
kubectl exec -it app1-584b58d766-qwrqk -n default -- df -h
# Test a write
kubectl exec -it app1-584b58d766-qwrqk -n default -- sh -c "echo 'test' > /data/test.txt"
# Confirm on the NFS server
# Check that the file appears at 192.168.1.159:/swarmdata/default/app1-data-pvc/test.txt
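=== Removing the persistent storage set up for app1 ===
- Stop the application that uses the PVC
# First delete the Deployment/Pod that is using the PVC
kubectl delete deployment app1 -n default
# Confirm that the Pod has fully terminated
kubectl get pods -n default | grep app1
# If a StatefulSet or other resources also use it, delete them too
kubectl get all -n default | grep app1
- Delete the PVC
# Delete the PVC
kubectl delete pvc app1-data-pvc -n default
# Check the PVC status
kubectl get pvc -n default
If the PVC is stuck in the Terminating state:
# Check whether a finalizer is blocking the deletion
kubectl get pvc app1-data-pvc -n default -o yaml | grep finalizers
# Force the deletion if necessary (use with caution)
kubectl patch pvc app1-data-pvc -n default -p '{"metadata":{"finalizers":null}}'
- Clean up the files on the NFS server
* Log in to the NFS server and remove the directory that was created for app1, if it is still present
* The files are normally under <NFS export path>/<namespace>/<pvc-name>/, e.g. /swarmdata/default/app1-data-pvc/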
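Because the cleanup step ends in a manual delete on the NFS server, it is worth deriving the directory from its three parts and rejecting anything suspicious before removing it. The sketch below is an illustration only: `pvc_dir` is a hypothetical helper, and it performs a dry run that prints the path instead of deleting it.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that the pathPattern
# ${.PVC.namespace}/${.PVC.name} maps to, refusing anything that could
# resolve outside the export root.
pvc_dir() {
    root="$1"; ns="$2"; pvc="$3"
    for part in "$ns" "$pvc"; do
        case "$part" in
            ''|.|..|*/*) echo "refusing unsafe path component: '$part'" >&2; return 1 ;;
        esac
    done
    case "$root" in
        /?*) ;;   # must be absolute and not just "/"
        *) echo "refusing export root: '$root'" >&2; return 1 ;;
    esac
    printf '%s/%s/%s\n' "${root%/}" "$ns" "$pvc"
}

# Dry run: show what would be removed; nothing is deleted here.
target="$(pvc_dir /swarmdata default app1-data-pvc)" && echo "would remove: $target"
# After reviewing the path, on the NFS server:  rm -rf -- "$target"
```

An empty namespace or PVC name is the case to guard against most, since it would otherwise collapse the path to a parent directory shared by other claims.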
===== FAQ =====
==== 1. How to cleanly remove the Rancher Web UI ====
- Remove it with helm uninstall
helm uninstall rancher -n cattle-system
# This may fail with: Error: uninstallation completed with 1 error(s): 1 error occurred: * job rancher-post-delete failed: BackoffLimitExceeded
- Remove and check with kubectl
# Step 1: delete the stuck post-delete job
kubectl -n cattle-system delete job rancher-post-delete
# Step 2: manually delete all Rancher-related resources
# Delete the deployment / pods
kubectl -n cattle-system delete deployment rancher
kubectl -n cattle-system delete pod -l app=rancher
# Delete the webhook
kubectl -n cattle-system delete deployment rancher-webhook
# Delete all secrets (⚠️ this does not delete the cluster; other workloads are not affected)
kubectl -n cattle-system delete secret --all
# Delete the configmaps
kubectl -n cattle-system delete configmap --all
# Step 3: make sure the namespace is clean
kubectl get all -n cattle-system
# Only the single service created by K3s should remain; delete any other Jobs or Pods
==== 2. How to update the Rancher Web UI SSL certificate ====
* The following uses cert-manager to obtain a Let's Encrypt certificate through Cloudflare DNS validation
- Obtain a Cloudflare API token, e.g. a token with DNS edit permission for ichiayi.com: xxxxxxcfapitkoenxxxxxx
- Create the Cloudflare API token secret
kubectl create secret generic cloudflare-api-token-secret \
--from-literal=api-token=xxxxxxcfapitkoenxxxxxx \
-n cert-manager
- Create the ClusterIssuer, e.g. letsencrypt-cloudflare-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt production server
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@ichiayi.com # change to your email
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token-secret
              key: api-token
Apply the configuration
kubectl apply -f letsencrypt-cloudflare-issuer.yaml
- Create a Certificate for Rancher, e.g. rancher-certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: rancher-tls
  namespace: cattle-system
spec:
  secretName: tls-rancher-ingress # the secret name Rancher uses
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: rancher.ichiayi.com # change to your domain
  dnsNames:
    - rancher.ichiayi.com # change to your domain
Apply the configuration
kubectl apply -f rancher-certificate.yaml
- Verify the certificate status
# Show the Certificate status
kubectl get certificate -n cattle-system
# Show the details
kubectl describe certificate rancher-tls -n cattle-system
# Follow the cert-manager logs
kubectl logs -n cert-manager -l app=cert-manager -f
- Watch the certificate renewal status
kubectl get certificate -n cattle-system -w
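Watching the Certificate shows whether it is Ready, but not how long it remains valid. As a rough sketch, the remaining lifetime can be computed from the `status.notAfter` timestamp that cert-manager records on the Certificate. `days_between` is a hypothetical helper, the sample timestamps are made up, and GNU `date -d` is assumed, as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: whole days from one timestamp to another.
# Relies on GNU date (-d), which is what Ubuntu ships.
days_between() {
    from_s="$(date -u -d "$1" +%s)" || return 1
    to_s="$(date -u -d "$2" +%s)" || return 1
    echo $(( (to_s - from_s) / 86400 ))
}

# Illustrative timestamps in the RFC 3339 form that cert-manager reports.
days_between "2025-11-26T00:00:00Z" "2026-02-24T00:00:00Z"   # prints 90

# Against the live cluster (assumes the Certificate has been issued):
#   not_after="$(kubectl get certificate rancher-tls -n cattle-system -o jsonpath='{.status.notAfter}')"
#   days_between "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$not_after"
```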
=== 2-1 How to create a shared SSL certificate for other services ===
- Create a wildcard DNS record that points to a K3s node IP, e.g. *.k3s.ichiayi.com -> 192.168.1.171
- Reuse the Cloudflare API token secret and ClusterIssuer from above
- Create the wildcard certificate, e.g. *.k3s.ichiayi.com -> k3s-certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-k3s-ichiayi-com
  namespace: default # or the namespace you want to use
spec:
  secretName: wildcard-k3s-ichiayi-com-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: "*.k3s.ichiayi.com"
  dnsNames:
    - "*.k3s.ichiayi.com"
kubectl apply -f k3s-certificate.yaml
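Once issued, the wildcard secret can be referenced from any Ingress in the same namespace. The sketch below writes such a manifest to a file; the Service name `app1`, port 80, and the host `app1.k3s.ichiayi.com` are hypothetical, while the secret name, namespace, and the `nginx` ingress class come from the steps above.

```shell
#!/bin/sh
# Write a sample Ingress manifest. "app1", port 80 and the host name are
# hypothetical; the secret name and namespace match the wildcard Certificate.
cat > app1-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app1
  namespace: default            # must be the namespace that holds the TLS secret
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app1.k3s.ichiayi.com
      secretName: wildcard-k3s-ichiayi-com-tls
  rules:
    - host: app1.k3s.ichiayi.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1      # hypothetical Service
                port:
                  number: 80
EOF
echo "wrote app1-ingress.yaml"
# Apply with:  kubectl apply -f app1-ingress.yaml
```

An Ingress can only read TLS secrets from its own namespace, so a service in another namespace needs its own copy of the Certificate there.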
==== 3. How to back up the Rancher Web UI ====
* Reference - https://ranchermanager.docs.rancher.com/v2.13/how-to-guides/new-user-guides/backup-restore-and-disaster-recovery/back-up-rancher
- Install Rancher Backups from the App Chart catalog in the Web UI \\ {{:tech:螢幕擷取畫面_2025-11-26_122727.png?600|}}
- Under the new Rancher Backups menu entry, click Backups -> Create -> choose the backup target, e.g. StorageClasses -> Edit YAML, and schedule a backup every 8 hours \\ {{:tech:螢幕擷取畫面_2025-11-26_134759.png|}} \\ {{:tech:螢幕擷取畫面_2025-11-26_123038.png?1000}}
==== 4. How to upgrade the Rancher Web UI ====
- Update the Helm repositories: helm repo update
- List the available versions: helm search repo rancher-stable/rancher --versions
- Back up the current configuration: kubectl get all -n cattle-system -o yaml > rancher-backup.yaml
- Run the upgrade
helm upgrade rancher rancher-stable/rancher \
--namespace cattle-system \
--reuse-values
- Verify the upgrade status
kubectl -n cattle-system rollout status deploy/rancher
kubectl -n cattle-system get pods
* The Rancher UI is temporarily unavailable during the upgrade
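The upgrade command above takes whatever chart version the repository currently offers. To pick the target explicitly, the version list can be reduced to its highest entry and passed to `--version`. This is a sketch under assumptions: `latest_chart_version` is a hypothetical helper, the sample rows are illustrative rather than real repository output, and GNU `sort -V` is assumed.

```shell
#!/bin/sh
# Hypothetical helper: read `helm search repo ... --versions` style output
# (header row, chart version in column 2) and print the highest version.
# Uses GNU sort -V for version-aware ordering.
latest_chart_version() {
    awk 'NR > 1 { print $2 }' | sort -V | tail -n 1
}

# Illustrative input only; real output lists the actual published versions.
sample='NAME                    CHART VERSION   APP VERSION   DESCRIPTION
rancher-stable/rancher  2.9.3           v2.9.3        Install Rancher Server
rancher-stable/rancher  2.13.0          v2.13.0       Install Rancher Server
rancher-stable/rancher  2.10.1          v2.10.1       Install Rancher Server'

printf '%s\n' "$sample" | latest_chart_version   # prints 2.13.0

# Live use, then pin the upgrade to the chosen version:
#   v="$(helm search repo rancher-stable/rancher --versions | latest_chart_version)"
#   helm upgrade rancher rancher-stable/rancher --namespace cattle-system --reuse-values --version "$v"
```

Version-aware sorting matters here because a plain text sort would rank 2.9.3 above 2.13.0.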
==== 5. How to enable and disable K3s automatic upgrades ====
=== Enabling K3s automatic upgrades ===
- Install the System Upgrade Controller
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/latest/download/system-upgrade-controller.yaml
- Create the automatic upgrade plans (they track the K3s stable channel and upgrade automatically)
cat <
- Check the upgrade progress
# Show the upgrade plans
kubectl get plans -n system-upgrade
# Show the upgrade jobs
kubectl get jobs -n system-upgrade
# Watch the node status
watch kubectl get nodes
* ++Show the output of these commands|
jonathan@k3s-master-171:~$ kubectl get plans -n system-upgrade
NAME IMAGE CHANNEL VERSION COMPLETE MESSAGE
agent-plan rancher/k3s-upgrade https://update.k3s.io/v1-release/channels/stable False
server-plan rancher/k3s-upgrade https://update.k3s.io/v1-release/channels/stable False
jonathan@k3s-master-171:~$ kubectl get jobs -n system-upgrade
NAME STATUS COMPLETIONS DURATION AGE
apply-agent-plan-on-k3s-worker-173-with-776e91b05dc4d9c78-42442 Running 0/1 27s 27s
apply-server-plan-on-k3s-master-171-with-776e91b05dc4d9c7-b57b4 Running 0/1 27s 27s
jonathan@k3s-master-171:~$ kubectl get jobs -n system-upgrade
NAME STATUS COMPLETIONS DURATION AGE
apply-agent-plan-on-k3s-worker-172-with-776e91b05dc4d9c78-0fa71 Running 0/1 42s 42s
apply-agent-plan-on-k3s-worker-173-with-776e91b05dc4d9c78-42442 Complete 1/1 2m59s 3m42s
apply-server-plan-on-k3s-master-171-with-776e91b05dc4d9c7-b57b4 Complete 1/1 80s 3m42s
jonathan@k3s-master-171:~$ kubectl get jobs -n system-upgrade
NAME STATUS COMPLETIONS DURATION AGE
apply-agent-plan-on-k3s-worker-172-with-776e91b05dc4d9c78-0fa71 Complete 1/1 87s 3m57s
apply-agent-plan-on-k3s-worker-173-with-776e91b05dc4d9c78-42442 Complete 1/1 2m59s 6m57s
apply-server-plan-on-k3s-master-171-with-776e91b05dc4d9c7-b57b4 Complete 1/1 80s 6m57s
++
* ++Show the Rancher Cluster Nodes screens|{{:tech:螢幕擷取畫面_2025-12-13_094446.png|}}\\{{:tech:螢幕擷取畫面_2025-12-13_100652.png|}}++
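The heredoc in the plan-creation step above is cut off in this copy of the page, so the author's manifest cannot be recovered. As a stand-in, the sketch below writes two Plans modelled on the example in the K3s automated-upgrades documentation. The plan names, image, and channel URL are taken from the kubectl output above; the concurrency, cordon, and node selector settings are assumptions and may differ from what was actually applied.

```shell
#!/bin/sh
# Write the two upgrade Plans to a file for review before applying.
# Settings other than the names, image and channel are assumptions.
cat > k3s-upgrade-plans.yaml <<'EOF'
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: In
        values:
          - "true"
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  channel: https://update.k3s.io/v1-release/channels/stable
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: agent-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: DoesNotExist
  prepare:
    args:
      - prepare
      - server-plan
    image: rancher/k3s-upgrade
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  channel: https://update.k3s.io/v1-release/channels/stable
EOF
echo "wrote k3s-upgrade-plans.yaml"
# Apply with:  kubectl apply -f k3s-upgrade-plans.yaml
```

In this form the `prepare` step makes each agent wait until the server plan has finished, which matches the server-first ordering recommended for reboots later on this page.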
=== Disabling K3s automatic upgrades ===
- Delete the Plans (stops all automatic upgrades): kubectl delete plan server-plan agent-plan -n system-upgrade
- Pin to a fixed version, e.g. v1.33.6+k3s1 (new versions are no longer tracked; repeat for agent-plan): kubectl patch plan server-plan -n system-upgrade --type=merge -p '{"spec":{"version":"v1.33.6+k3s1","channel":null}}'
- Delete the whole controller (disables it completely): kubectl delete ns system-upgrade
==== 6. How to send K3s automatic upgrade results to Discord ====
- Obtain a Discord webhook URL, e.g. https://discord.com/api/webhooks/144xxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxV5ffPyEp
- Edit the configuration and deploy
# Download k3s-discord-notifier.yaml
curl -o k3s-discord-notifier.yaml https://raw.githubusercontent.com/tryweb/k3s/refs/heads/main/systools/k3s-discord-notifier.yaml
# Substitute your Discord webhook URL, e.g. https://discord.com/api/webhooks/144xxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxV5ffPyEp
sed -i 's|https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_WEBHOOK_TOKEN|https://discord.com/api/webhooks/144xxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxV5ffPyEp|' k3s-discord-notifier.yaml
# Change the cluster name (optional), e.g. ichiayi K3s
sed -i 's|我的 K3s 叢集|ichiayi K3s|' k3s-discord-notifier.yaml
# Deploy the Discord notifier
kubectl apply -f k3s-discord-notifier.yaml
- Verify the deployment
# Check that the notifier is running
kubectl get deployment -n system-upgrade k3s-upgrade-notifier
# Follow the logs
kubectl logs -n system-upgrade -l app=k3s-upgrade-notifier -f
# Test the Discord upgrade-success notification
cat <
* The Discord channel should show a test notification like this \\ {{:tech:螢幕擷取畫面_2025-12-13_111255.png?1000|}}
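The test command in the verification step above is cut off in this copy of the page, so it is unclear how the author triggered the test notification. Independently of the notifier, the webhook itself can be checked by posting a message to it directly. This is a minimal sketch: `discord_payload` and the `DISCORD_WEBHOOK_URL` variable are hypothetical names, and the message text is made up.

```shell
#!/bin/sh
# Build a minimal Discord webhook payload. Backslashes and double quotes in
# the message are escaped; other control characters are not handled.
discord_payload() {
    escaped="$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
    printf '{"content":"%s"}\n' "$escaped"
}

payload="$(discord_payload 'K3s upgrade notifier: test message')"
printf '%s\n' "$payload"

# Send only when a webhook URL is provided (DISCORD_WEBHOOK_URL is an assumed
# variable name, not something the notifier manifest defines).
if [ -n "${DISCORD_WEBHOOK_URL:-}" ]; then
    curl -fsS -H 'Content-Type: application/json' -d "$payload" "$DISCORD_WEBHOOK_URL"
fi
```

If this message arrives but upgrade notifications do not, the problem is more likely in the notifier deployment than in the webhook.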
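==== 7. How to check the latest K3s stable release ====
* $ curl -s https://update.k3s.io/v1-release/channels/stable
Found.
(the endpoint answers with a redirect; the "Found." link in the response points to the release tag)
* Stable release at the time of writing: **v1.34.5+k3s1**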
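Since the channel endpoint answers with a redirect, the version can be read from the redirect target rather than from the response body. The sketch below assumes the target has the form of a GitHub release URL ending in the tag, which is consistent with the version shown above but is not confirmed by this page; `tag_from_release_url` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: take the release URL that the channel endpoint
# redirects to and print the tag at the end of it ("%2B" decoded to "+").
tag_from_release_url() {
    printf '%s\n' "${1##*/}" | sed 's/%2[Bb]/+/g'
}

tag_from_release_url 'https://github.com/k3s-io/k3s/releases/tag/v1.34.5+k3s1'   # prints v1.34.5+k3s1

# Live use: ask curl for the redirect target instead of the response body.
#   url="$(curl -s -o /dev/null -w '%{redirect_url}' https://update.k3s.io/v1-release/channels/stable)"
#   tag_from_release_url "$url"
```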
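==== 8. How to reboot the K3s cluster hosts ====
* Principle: reboot the server first and wait until its services have recovered, then reboot the workers (agents)
* The [[tech/k3s/k3s-reboot-manager|k3s-reboot-manager.sh]] script can be used to perform the reboot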
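The server-first principle can be expressed as an ordering over the node list. The sketch below is not the linked k3s-reboot-manager.sh script, whose contents are not shown here; it only illustrates the ordering. `reboot_order` is a hypothetical helper, and the sample imitates `kubectl get nodes --no-headers` output.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get nodes --no-headers` output and print
# node names with control-plane (server) nodes first, then the agents.
reboot_order() {
    awk '{ if ($3 ~ /control-plane|master/) print "0 " $1; else print "1 " $1 }' |
        sort -s -k1,1 | cut -d' ' -f2
}

sample='k3s-worker-172   Ready   <none>                 5d   v1.33.6+k3s1
k3s-master-171   Ready   control-plane,master   5d   v1.33.6+k3s1
k3s-worker-173   Ready   <none>                 5d   v1.33.6+k3s1'

printf '%s\n' "$sample" | reboot_order
# prints:
#   k3s-master-171
#   k3s-worker-172
#   k3s-worker-173
```

A real reboot loop would also wait for each node to report Ready again before moving to the next one.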
{{tag>rancher k8s k3s}}