Table of Contents

Rancher Troubleshooting Notes

Diagnosing a Rancher that fails to start properly

  1. Find out which rancher pod is the leader

    $ kubectl describe configMap cattle-controllers -n kube-system
    Name:         cattle-controllers
    Namespace:    kube-system
    Labels:       <none>
    Annotations:  control-plane.alpha.kubernetes.io/leader:
                    {"holderIdentity":"rancher-98d8d5cf5-hbjjv","leaseDurationSeconds":45,"acquireTime":"2021-09-08T06:40:25Z","renewTime":"2021-09-08T07:02:5...
    
    Data
    ====
    Events:  <none>

  2. The holderIdentity field in the annotation shows that the current leader is rancher-98d8d5cf5-hbjjv, so check that pod's logs

    $ kubectl logs rancher-98d8d5cf5-hbjjv -n cattle-system
    2021/09/08 06:38:27 [INFO] Rancher version v2.4.15 (cdb64d640) is starting
    2021/09/08 06:38:27 [INFO] Rancher arguments {ACMEDomains:[] AddLocal:auto Embedded:false HTTPListenPort:80 HTTPSListenPort:443 K8sMode:auto Debug:false Trace:false NoCACerts:false AuditLogPath:/var/log/auditlog/rancher-api-audit.log AuditLogMaxage:10 AuditLogMaxsize:100 AuditLogMaxbackup:10 AuditLevel:0 Features:}
    2021/09/08 06:38:27 [INFO] Listening on /tmp/log.sock
    I0908 06:38:27.719747       6 http.go:122] HTTP2 has been explicitly disabled
    :
    2021/09/08 06:56:18 [ERROR] AppController p-gn54t/test-20210831-master-sq [helm-controller] failed with : Get "https://10.43.0.1:443/apis/project.cattle.io/v3/namespaces/p-gn54t/apprevisions?labelSelector=io.cattle.field%!F(MISSING)appId%!D(MISSING)test-20210831-master-sq&timeout=30s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    2021/09/08 06:57:04 [ERROR] PipelineExecutionController p-gn54t/p-qp9qq-1 [pipeline-execution-controller] failed with : pipeline.project.cattle.io "p-gn54t/p-qp9qq" not found
    2021/09/08 07:01:20 [ERROR] PipelineExecutionController p-gn54t/p-qp9qq-1 [pipeline-execution-controller] failed with : pipeline.project.cattle.io "p-gn54t/p-qp9qq" not found
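
The two steps above can be combined into a small script. This is only a sketch, assuming kubectl is already configured for the cluster and that the ConfigMap and namespaces are the same as in the transcript. The function names `leader_from_annotation` and `show_leader_errors` are made up for illustration and are not part of Rancher or kubectl.

```shell
#!/bin/sh
# Read the leader-election annotation JSON from stdin and print
# the value of its holderIdentity field (the leader pod name).
leader_from_annotation() {
  sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: look up the leader pod, then print only the
# [ERROR] lines from its log. Assumes the cattle-controllers ConfigMap
# lives in kube-system and the rancher pods run in cattle-system.
show_leader_errors() {
  leader=$(kubectl get configmap cattle-controllers -n kube-system \
    -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}' \
    | leader_from_annotation)
  [ -n "$leader" ] || { echo "no leader found" >&2; return 1; }
  echo "leader: $leader"
  kubectl logs "$leader" -n cattle-system | grep -F '[ERROR]'
}
```

After sourcing the file, run `show_leader_errors` to print the leader name followed by its error lines. The parsing is kept in its own function so it can be checked against a saved annotation without touching the cluster.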

Accidentally deleted the pipeline's Jenkins pod

Reinstalling Rancher when it fails to start

How to change the Rancher server URL