After installing Kubernetes, coredns and calico-kube-controllers stay in a failed state

Nickcao0813 · January 2, 2022

Operating system information
Physical machines, CentOS 7.9, 8 cores / 32 GB RAM

3 masters / 7 workers

Kubernetes version information
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:10:45Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:04:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

Container runtime

Docker Version

Client:
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:50:40 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:55:09 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.4.9

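KubeSphere version information

KubeSphere is not installed yet. I am using the 3.2.0 release of kk and plan to install in steps:

1. Install only Kubernetes (--with-kubernetes)
2. Install the Kubernetes dashboard
3. Install rook-ceph
4. Install KubeSphere

So far I have completed only the first step, and the cluster already has a problem.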

What is the problem

kubectl describe pod calico-kube-controller -n kube-system

Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  28m                    default-scheduler  Successfully assigned kube-system/calico-kube-controllers-68586cd975-zqb7v to k8s-n0
  Normal   Pulled     27m (x4 over 28m)      kubelet            Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controllers:v3.20.0" already present on machine
  Normal   Created    27m (x4 over 28m)      kubelet            Created container calico-kube-controllers
  Normal   Started    27m (x4 over 28m)      kubelet            Started container calico-kube-controllers
  Warning  Unhealthy  27m (x4 over 28m)      kubelet            Readiness probe failed: Failed to read status file /status/status.json: unexpected end of JSON input
  Warning  BackOff    3m18s (x119 over 28m)  kubelet            Back-off restarting failed container
1. kubectl logs -n kube-system calico-kube-controllers-68586cd975-zqb7v

  2022-01-02 08:02:21.709 [INFO][1] main.go 94: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
  W0102 08:02:21.711702 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
  2022-01-02 08:02:21.712 [INFO][1] main.go 115: Ensuring Calico datastore is initialized
  2022-01-02 08:02:24.720 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get "https://10.233.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.233.0.1:443: connect: no route to host
  2022-01-02 08:02:24.720 [FATAL][1] main.go 120: Failed to initialize Calico datastore error=Get "https://10.233.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.233.0.1:443: connect: no route to host

2. coredns reports a similar error.

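BTW: this is a purely internal-network environment. During installation I set http_proxy and https_proxy to use another host as a forward proxy for downloading the images.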

chauncey · January 4, 2022

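Could you share your config.yaml?

The images were downloaded fine. From the error messages, it looks like coredns cannot connect to kube-apiserver: the network is unreachable.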
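
To narrow down a "no route to host" failure like the one in the calico-kube-controllers log, it can help to first pull out exactly which address the pod failed to reach. The following is only a minimal sketch: the function name unreachable_endpoints is made up for illustration, and it assumes the log uses the same "dial tcp IP:port: connect: no route to host" wording as the log quoted in the question.

```shell
#!/usr/bin/env bash
# Sketch: list the "IP:port" targets that a log reports as unreachable.
# Reads log text on stdin. The function name is hypothetical.

unreachable_endpoints() {
  # Keep only the "dial tcp <ip>:<port>: connect: no route to host" fragments,
  # take the third field ("<ip>:<port>:"), strip the trailing colon,
  # and print each distinct endpoint once.
  grep -Eo 'dial tcp [0-9.]+:[0-9]+: connect: no route to host' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}
```

One possible way to use it is `kubectl logs -n kube-system calico-kube-controllers-68586cd975-zqb7v | unreachable_endpoints`. Running `ip route get 10.233.0.1` on the node that hosts the pod then shows whether that node's kernel has any route it could use for that address.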

Nickcao0813 · January 10, 2022

chauncey

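Solved. It turned out I had made a very simple mistake.

Each machine originally had two network cables, one for the internal network and one for the external network, so the default gateway was configured on the external NIC. When deploying k8s I unplugged all the external cables, which left the nodes without a default route.

After I configured a default gateway on each internal NIC, the problem was resolved.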
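
For anyone hitting the same symptom, a quick per-node check for a missing default route might look like the sketch below. It is an illustration rather than the exact steps used here: the function names are hypothetical, and the input is assumed to be routing-table text in the format printed by `ip route show`, where a default route appears as a line starting with "default".

```shell
#!/usr/bin/env bash
# Sketch: decide whether a routing table (text in `ip route show` format,
# read from stdin) contains a default route. Function names are hypothetical.

has_default_route() {
  # Succeeds (exit status 0) if any line starts with the word "default".
  grep -Eq '^default( |$)'
}

report_default_route() {
  # Print a one-line verdict for the routing table supplied on stdin.
  if has_default_route; then
    echo "default route present"
  else
    echo "no default route"
  fi
}
```

On a node this could be run as `ip route show | report_default_route`. If it prints "no default route", a default route can be added with `ip route add default via <gateway-ip> dev <internal-nic>`, where both values are placeholders for your own network. On CentOS 7 the setting is usually made persistent with a GATEWAY= entry in the interface's ifcfg file.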