When creating a deployment issue, please follow the template below. The more information you provide, the easier it is to get a timely answer. Administrators reserve the right to close issues that do not follow the template.
Keep the post clearly formatted and readable, and use markdown code block syntax to format code blocks.
If you spend only one minute writing a question, you cannot expect someone else to spend half an hour answering it.
Operating system information
Virtual machine, CentOS 7.8, 32 GB RAM
Kubernetes version information
```
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:49:13Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:43:11Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}
```
Container runtime
```
Client: Docker Engine - Community
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep 4 12:35:25 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep 4 12:34:28 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
```
KubeSphere version information
v3.3.2, minimal installation on an existing Kubernetes cluster
What is the problem
After changing multicluster.clusterRole to host in the ClusterConfiguration via the console, the following error is reported after restart:
```
Collecting installation results ...
Task 'multicluster' failed:
******************************************************************************************************************************************************
{
  "counter": 65,
  "created": "2024-01-23T04:14:07.810518",
  "end_line": 67,
  "event": "runner_on_failed",
  "event_data": {
    "duration": 678.907787,
    "end": "2024-01-23T04:14:07.810371",
    "event_loop": null,
    "host": "localhost",
    "ignore_errors": null,
    "play": "localhost",
    "play_pattern": "localhost",
    "play_uuid": "421c99aa-7291-a74f-4a6d-000000000005",
    "playbook": "/kubesphere/playbooks/multicluster.yaml",
    "playbook_uuid": "33cce249-c436-437d-8941-3b783fc8d285",
    "remote_addr": "127.0.0.1",
    "res": {
      "_ansible_no_log": false,
      "attempts": 10,
      "changed": true,
      "cmd": "/usr/local/bin/helm upgrade --install kubefed /kubesphere/kubesphere/kubefed/kubefed -f /kubesphere/kubesphere/kubefed/custom-values-kubefed.yaml --namespace kube-federation-system --wait --timeout 1800s\n",
      "delta": "0:00:04.931908",
      "end": "2024-01-23 12:14:07.736910",
      "invocation": {
        "module_args": {
          "_raw_params": "/usr/local/bin/helm upgrade --install kubefed /kubesphere/kubesphere/kubefed/kubefed -f /kubesphere/kubesphere/kubefed/custom-values-kubefed.yaml --namespace kube-federation-system --wait --timeout 1800s\n",
          "_uses_shell": true,
          "argv": null,
          "chdir": null,
          "creates": null,
          "executable": null,
          "removes": null,
          "stdin": null,
          "stdin_add_newline": true,
          "strip_empty_ends": true,
          "warn": true
        }
      },
      "msg": "non-zero return code",
      "rc": 1,
      "start": "2024-01-23 12:14:02.805002",
      "stderr": "Error: UPGRADE FAILED: failed to create resource: Internal error occurred: failed calling webhook \"kubefedconfigs.core.kubefed.io\": failed to call webhook: Post \"https://kubefed-admission-webhook.kube-federation-system.svc:443/default-kubefedconfig?timeout=10s\": x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubefed-admission-webhook-ca\")",
      "stderr_lines": [
        "Error: UPGRADE FAILED: failed to create resource: Internal error occurred: failed calling webhook \"kubefedconfigs.core.kubefed.io\": failed to call webhook: Post \"https://kubefed-admission-webhook.kube-federation-system.svc:443/default-kubefedconfig?timeout=10s\": x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubefed-admission-webhook-ca\")"
      ],
      "stdout": "",
      "stdout_lines": []
    },
    "resolved_action": "command",
    "role": "ks-multicluster",
    "start": "2024-01-23T04:02:48.902584",
    "task": "Kubefed | Initing kube-federation-system",
    "task_action": "command",
    "task_args": "",
    "task_path": "/kubesphere/installer/roles/ks-multicluster/tasks/main.yml:51",
    "task_uuid": "421c99aa-7291-a74f-4a6d-00000000001f",
    "uuid": "9dd9cfa6-c1f5-4f69-ad74-631e22c2d6a6"
  },
  "parent_uuid": "421c99aa-7291-a74f-4a6d-00000000001f",
  "pid": 27433,
  "runner_ident": "multicluster",
  "start_line": 66,
  "stdout": "fatal: [localhost]: FAILED! => {\"attempts\": 10, \"changed\": true, \"cmd\": \"/usr/local/bin/helm upgrade --install kubefed /kubesphere/kubesphere/kubefed/kubefed -f /kubesphere/kubesphere/kubefed/custom-values-kubefed.yaml --namespace kube-federation-system --wait --timeout 1800s\\n\", \"delta\": \"0:00:04.931908\", \"end\": \"2024-01-23 12:14:07.736910\", \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2024-01-23 12:14:02.805002\", \"stderr\": \"Error: UPGRADE FAILED: failed to create resource: Internal error occurred: failed calling webhook \\\"kubefedconfigs.core.kubefed.io\\\": failed to call webhook: Post \\\"https://kubefed-admission-webhook.kube-federation-system.svc:443/default-kubefedconfig?timeout=10s\\\": x509: certificate signed by unknown authority (possibly because of \\\"crypto/rsa: verification error\\\" while trying to verify candidate authority certificate \\\"kubefed-admission-webhook-ca\\\")\", \"stderr_lines\": [\"Error: UPGRADE FAILED: failed to create resource: Internal error occurred: failed calling webhook \\\"kubefedconfigs.core.kubefed.io\\\": failed to call webhook: Post \\\"https://kubefed-admission-webhook.kube-federation-system.svc:443/default-kubefedconfig?timeout=10s\\\": x509: certificate signed by unknown authority (possibly because of \\\"crypto/rsa: verification error\\\" while trying to verify candidate authority certificate \\\"kubefed-admission-webhook-ca\\\")\"], \"stdout\": \"\", \"stdout_lines\": []}",
  "uuid": "9dd9cfa6-c1f5-4f69-ad74-631e22c2d6a6"
}
```
Two Pods are observed to be in an abnormal state:
```
[root@k8s-master ~]# kubectl get pod -n kube-federation-system
NAME                                          READY   STATUS             RESTARTS        AGE
kubefed-admission-webhook-86f58cbd9b-hwb4w    1/1     Running            0               58m
kubefed-controller-manager-755cb75766-25vjt   0/1     CrashLoopBackOff   16 (60s ago)    58m
kubefed-controller-manager-755cb75766-fjw2g   0/1     CrashLoopBackOff   16 (2m21s ago)  58m
```
Viewing the failing pod's log with `kubectl logs -f` shows:
```
[root@k8s-master ~]# kubectl logs -f kubefed-controller-manager-755cb75766-25vjt -n kube-federation-system
KubeFed controller-manager version: version.Info{Version:"v0.0.1-alpha.0", GitCommit:"unknown", GitTreeState:"unknown", BuildDate:"unknown", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
I0123 05:00:54.610397 1 controller-manager.go:398] FLAG: --add_dir_header="false"
I0123 05:00:54.610461 1 controller-manager.go:398] FLAG: --alsologtostderr="false"
I0123 05:00:54.610468 1 controller-manager.go:398] FLAG: --healthz-addr=":8080"
I0123 05:00:54.610475 1 controller-manager.go:398] FLAG: --help="false"
I0123 05:00:54.610481 1 controller-manager.go:398] FLAG: --kubeconfig=""
I0123 05:00:54.610485 1 controller-manager.go:398] FLAG: --kubefed-config=""
I0123 05:00:54.610489 1 controller-manager.go:398] FLAG: --kubefed-namespace=""
I0123 05:00:54.610493 1 controller-manager.go:398] FLAG: --log-flush-frequency="5s"
I0123 05:00:54.610499 1 controller-manager.go:398] FLAG: --log_backtrace_at=":0"
I0123 05:00:54.610508 1 controller-manager.go:398] FLAG: --log_dir=""
I0123 05:00:54.610513 1 controller-manager.go:398] FLAG: --log_file=""
I0123 05:00:54.610517 1 controller-manager.go:398] FLAG: --log_file_max_size="1800"
I0123 05:00:54.610521 1 controller-manager.go:398] FLAG: --logtostderr="true"
I0123 05:00:54.610525 1 controller-manager.go:398] FLAG: --master=""
I0123 05:00:54.610530 1 controller-manager.go:398] FLAG: --metrics-addr=":9090"
I0123 05:00:54.610534 1 controller-manager.go:398] FLAG: --one_output="false"
I0123 05:00:54.610538 1 controller-manager.go:398] FLAG: --rest-config-burst="100"
I0123 05:00:54.610544 1 controller-manager.go:398] FLAG: --rest-config-qps="50"
I0123 05:00:54.610552 1 controller-manager.go:398] FLAG: --skip_headers="false"
I0123 05:00:54.610557 1 controller-manager.go:398] FLAG: --skip_log_headers="false"
I0123 05:00:54.610561 1 controller-manager.go:398] FLAG: --stderrthreshold="2"
I0123 05:00:54.610569 1 controller-manager.go:398] FLAG: --v="2"
I0123 05:00:54.610573 1 controller-manager.go:398] FLAG: --version="false"
I0123 05:00:54.610577 1 controller-manager.go:398] FLAG: --vmodule=""
W0123 05:00:54.610707 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0123 05:00:54.610952 1 controller-manager.go:428] starting metrics server path /metrics
I0123 05:00:54.702521 1 controller-manager.go:225] Cannot retrieve KubeFedConfig "kube-federation-system/kubefed": kubefedconfigs.core.kubefed.io "kubefed" not found. Default options will be used.
I0123 05:00:54.702556 1 controller-manager.go:328] Creating KubeFedConfig "kube-federation-system/kubefed" with default values
F0123 05:00:54.803728 1 controller-manager.go:299] Error creating KubeFedConfig "kube-federation-system/kubefed": Internal error occurred: failed calling webhook "kubefedconfigs.core.kubefed.io": failed to call webhook: Post "https://kubefed-admission-webhook.kube-federation-system.svc:443/default-kubefedconfig?timeout=10s": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubefed-admission-webhook-ca")
```
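The fatal x509 error above typically means the caBundle registered in the kubefed admission webhook configuration does not match the CA that actually signed the webhook's serving certificate. One way to check is to compare the two CA fingerprints. This is only a sketch: the webhook configuration and secret names below are assumptions (typical kubefed chart names) and should be confirmed first with `kubectl get validatingwebhookconfigurations` and `kubectl get secret -n kube-federation-system`.

```shell
# Assumed names -- confirm against your cluster before running:
WEBHOOK_CFG=validations.core.kubefed.io          # assumed webhook configuration name
SECRET=kubefed-admission-webhook-serving-cert    # assumed serving-cert secret name

# Fingerprint of the CA the API server trusts when calling the webhook:
kubectl get validatingwebhookconfiguration "$WEBHOOK_CFG" \
  -o jsonpath='{.webhooks[0].clientConfig.caBundle}' \
  | base64 -d | openssl x509 -noout -fingerprint

# Fingerprint of the CA stored alongside the webhook's serving certificate:
kubectl get secret -n kube-federation-system "$SECRET" \
  -o jsonpath='{.data.ca\.crt}' \
  | base64 -d | openssl x509 -noout -fingerprint
```

If the two fingerprints differ, the webhook registration is carrying a stale CA; if they match, the certificate chain is consistent and the problem lies elsewhere.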
Additional context: KubeSphere had previously been installed on this cluster once, then uninstalled with the uninstall script and reinstalled. The reinstall started up without any issues, after which the cluster mode was changed to multi-cluster.
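Given the earlier uninstall/reinstall, a plausible cause is that the old kubefed webhook registrations (pointing at the previous installation's CA) survived the uninstall. A commonly suggested remedy, sketched here with assumed resource names that must be verified against what your cluster actually contains, is to delete the stale webhook configurations so the next helm run recreates them with a fresh CA:

```shell
# List the kubefed webhook registrations first; the delete targets below are
# assumed names and must match what this prints:
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep -i kubefed

# Remove the stale registrations left over from the previous install (assumed names):
kubectl delete validatingwebhookconfiguration validations.core.kubefed.io
kubectl delete mutatingwebhookconfiguration mutation.core.kubefed.io

# Re-run the exact helm command the installer uses (copied from the log above):
/usr/local/bin/helm upgrade --install kubefed /kubesphere/kubesphere/kubefed/kubefed \
  -f /kubesphere/kubesphere/kubefed/custom-values-kubefed.yaml \
  --namespace kube-federation-system --wait --timeout 1800s
```

If the controller-manager pods keep crash-looping afterwards, deleting them so they are recreated against the regenerated webhook certificates may also be necessary.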