• Installation & Deployment
  • [Still a few remaining issues] Urgent help: error when deploying with kk

Posting the cause of the error up front:

Cause: we did not have enough servers, so the HA load balancer was placed on the master nodes. It was my first time setting up HA and I copied the configuration from a forum post, setting the LB port to 6443 as well, which conflicted with the kube-apiserver port.
Many thanks to @chen-shiwei for troubleshooting step by step until it was completely resolved. Thank you for your expertise, your patience, and your answers~

This is extremely urgent, thanks a lot~

Environment

  1. Kubernetes: 1.17.9
  2. KubeSphere: 3.0.0
  3. 3 masters + 2 workers; node specs are shown in the figure below

Error

Full logs (text version):

+---------+------+------+---------+----------+-------+-------+-----------+--------+------------+----------
| name    | sudo | curl | openssl | ebtables | socat | ipset | conntrack | docker | nfs client | ceph clie
+---------+------+------+---------+----------+-------+-------+-----------+--------+------------+----------
| worker2 | y    | y    | y       | y        | y     | y     | y         | y      |            |
| worker1 | y    | y    | y       | y        | y     | y     | y         | y      |            |
| master1 | y    | y    | y       | y        | y     | y     | y         | y      |            |
| master3 | y    | y    | y       | y        | y     | y     | y         | y      |            |
| master2 | y    | y    | y       | y        | y     | y     | y         | y      |            |
+---------+------+------+---------+----------+-------+-------+-----------+--------+------------+----------

This is a simple check of your environment.
Before installation, you should ensure that your machines meet all requirements specified at
https://github.com/kubesphere/kubekey#requirements-and-recommendations

Continue this installation? [yes/no]: yes
INFO[16:23:21 CST] Downloading Installation Files
INFO[16:23:21 CST] Downloading kubeadm ...
INFO[16:23:22 CST] Downloading kubelet ...
INFO[16:23:22 CST] Downloading kubectl ...
INFO[16:23:22 CST] Downloading kubecni ...
INFO[16:23:22 CST] Downloading helm ...
INFO[16:23:22 CST] Configurating operating system ...
[worker2 10.14.6.11] MSG:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
[worker1 10.14.6.10] MSG:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
[master2 10.14.6.8] MSG:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
[master3 10.14.6.9] MSG:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
[master1 10.14.6.6] MSG:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
INFO[16:23:24 CST] Installing docker ...
INFO[16:23:25 CST] Start to download images on all nodes
[master1] Downloading image: kubesphere/etcd:v3.3.12
[worker2] Downloading image: kubesphere/pause:3.1
[master3] Downloading image: kubesphere/etcd:v3.3.12
[worker1] Downloading image: kubesphere/pause:3.1
[master2] Downloading image: kubesphere/etcd:v3.3.12
[master1] Downloading image: kubesphere/pause:3.1
[master2] Downloading image: kubesphere/pause:3.1
[worker1] Downloading image: kubesphere/kube-proxy:v1.17.9
[worker2] Downloading image: kubesphere/kube-proxy:v1.17.9
[master2] Downloading image: kubesphere/kube-apiserver:v1.17.9
[master1] Downloading image: kubesphere/kube-apiserver:v1.17.9
[worker1] Downloading image: coredns/coredns:1.6.9
[worker2] Downloading image: coredns/coredns:1.6.9
[master2] Downloading image: kubesphere/kube-controller-manager:v1.17.9
[master1] Downloading image: kubesphere/kube-controller-manager:v1.17.9
[worker1] Downloading image: kubesphere/k8s-dns-node-cache:1.15.12
[master3] Downloading image: kubesphere/pause:3.1
[worker2] Downloading image: kubesphere/k8s-dns-node-cache:1.15.12
[master2] Downloading image: kubesphere/kube-scheduler:v1.17.9
[master1] Downloading image: kubesphere/kube-scheduler:v1.17.9
[worker1] Downloading image: calico/kube-controllers:v3.15.1
[worker2] Downloading image: calico/kube-controllers:v3.15.1
[master2] Downloading image: kubesphere/kube-proxy:v1.17.9
[master1] Downloading image: kubesphere/kube-proxy:v1.17.9
[worker1] Downloading image: calico/cni:v3.15.1
[master3] Downloading image: kubesphere/kube-apiserver:v1.17.9
[worker2] Downloading image: calico/cni:v3.15.1
[master2] Downloading image: coredns/coredns:1.6.9
[master1] Downloading image: coredns/coredns:1.6.9
[worker1] Downloading image: calico/node:v3.15.1
[worker2] Downloading image: calico/node:v3.15.1
[master2] Downloading image: kubesphere/k8s-dns-node-cache:1.15.12
[master1] Downloading image: kubesphere/k8s-dns-node-cache:1.15.12
[worker1] Downloading image: calico/pod2daemon-flexvol:v3.15.1
[master2] Downloading image: calico/kube-controllers:v3.15.1
[master1] Downloading image: calico/kube-controllers:v3.15.1
[worker2] Downloading image: calico/pod2daemon-flexvol:v3.15.1
[master3] Downloading image: kubesphere/kube-controller-manager:v1.17.9
[master2] Downloading image: calico/cni:v3.15.1
[master1] Downloading image: calico/cni:v3.15.1
[master2] Downloading image: calico/node:v3.15.1
[master1] Downloading image: calico/node:v3.15.1
[master3] Downloading image: kubesphere/kube-scheduler:v1.17.9
[master2] Downloading image: calico/pod2daemon-flexvol:v3.15.1
[master1] Downloading image: calico/pod2daemon-flexvol:v3.15.1
[master3] Downloading image: kubesphere/kube-proxy:v1.17.9
[master3] Downloading image: coredns/coredns:1.6.9
[master3] Downloading image: kubesphere/k8s-dns-node-cache:1.15.12
[master3] Downloading image: calico/kube-controllers:v3.15.1
[master3] Downloading image: calico/cni:v3.15.1
[master3] Downloading image: calico/node:v3.15.1
[master3] Downloading image: calico/pod2daemon-flexvol:v3.15.1
INFO[16:24:04 CST] Generating etcd certs
INFO[16:24:07 CST] Synchronizing etcd certs
INFO[16:24:08 CST] Creating etcd service
INFO[16:24:11 CST] Starting etcd cluster
[master1 10.14.6.6] MSG:
Configuration file already exists
[master2 10.14.6.8] MSG:
Configuration file already exists
[master3 10.14.6.9] MSG:
Configuration file already exists
INFO[16:24:12 CST] Refreshing etcd configuration
INFO[16:24:12 CST] Get cluster status
[master1 10.14.6.6] MSG:
Cluster will be created.
[master2 10.14.6.8] MSG:
Cluster will be created.
[master3 10.14.6.9] MSG:
Cluster will be created.
INFO[16:24:13 CST] Installing kube binaries
Push /root/kubekey/v1.17.9/amd64/kubeadm to 10.14.6.6:/tmp/kubekey/kubeadm   Done
Push /root/kubekey/v1.17.9/amd64/kubelet to 10.14.6.6:/tmp/kubekey/kubelet   Done
Push /root/kubekey/v1.17.9/amd64/kubectl to 10.14.6.6:/tmp/kubekey/kubectl   Done
Push /root/kubekey/v1.17.9/amd64/helm to 10.14.6.6:/tmp/kubekey/helm   Done
Push /root/kubekey/v1.17.9/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.14.6.6:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz   Done
Push /root/kubekey/v1.17.9/amd64/kubeadm to 10.14.6.11:/tmp/kubekey/kubeadm   Done
Push /root/kubekey/v1.17.9/amd64/kubeadm to 10.14.6.9:/tmp/kubekey/kubeadm   Done
Push /root/kubekey/v1.17.9/amd64/kubeadm to 10.14.6.8:/tmp/kubekey/kubeadm   Done
Push /root/kubekey/v1.17.9/amd64/kubeadm to 10.14.6.10:/tmp/kubekey/kubeadm   Done
Push /root/kubekey/v1.17.9/amd64/kubelet to 10.14.6.10:/tmp/kubekey/kubelet   Done
Push /root/kubekey/v1.17.9/amd64/kubectl to 10.14.6.10:/tmp/kubekey/kubectl   Done
Push /root/kubekey/v1.17.9/amd64/kubelet to 10.14.6.11:/tmp/kubekey/kubelet   Done
Push /root/kubekey/v1.17.9/amd64/kubelet to 10.14.6.9:/tmp/kubekey/kubelet   Done
Push /root/kubekey/v1.17.9/amd64/kubelet to 10.14.6.8:/tmp/kubekey/kubelet   Done
Push /root/kubekey/v1.17.9/amd64/helm to 10.14.6.10:/tmp/kubekey/helm   Done
Push /root/kubekey/v1.17.9/amd64/kubectl to 10.14.6.11:/tmp/kubekey/kubectl   Done
Push /root/kubekey/v1.17.9/amd64/kubectl to 10.14.6.9:/tmp/kubekey/kubectl   Done
Push /root/kubekey/v1.17.9/amd64/kubectl to 10.14.6.8:/tmp/kubekey/kubectl   Done
Push /root/kubekey/v1.17.9/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.14.6.10:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz   Done
Push /root/kubekey/v1.17.9/amd64/helm to 10.14.6.9:/tmp/kubekey/helm   Done
Push /root/kubekey/v1.17.9/amd64/helm to 10.14.6.8:/tmp/kubekey/helm   Done
Push /root/kubekey/v1.17.9/amd64/helm to 10.14.6.11:/tmp/kubekey/helm   Done
Push /root/kubekey/v1.17.9/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.14.6.9:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz   Done
Push /root/kubekey/v1.17.9/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.14.6.11:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz   Done
Push /root/kubekey/v1.17.9/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.14.6.8:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz   Done
INFO[16:24:24 CST] Initializing kubernetes cluster
[master1 10.14.6.6] MSG:
[preflight] Running pre-flight checks
W0203 16:24:26.066567    6017 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0203 16:24:26.071949    6017 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
[master1 10.14.6.6] MSG:
[preflight] Running pre-flight checks
W0203 16:24:27.083749    6299 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0203 16:24:27.089160    6299 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
ERRO[16:24:27 CST] Failed to init kubernetes cluster: Failed to exec command: sudo -E /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml"
W0203 16:24:27.253753    6350 defaults.go:186] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0203 16:24:27.254029    6350 validation.go:28] Cannot validate kubelet config - no validator is available
W0203 16:24:27.254053    6350 validation.go:28] Cannot validate kube-proxy config - no validator is available
[init] Using Kubernetes version: v1.17.9
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR Port-6443]: Port 6443 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1  node=10.14.6.6
WARN[16:24:27 CST] Task failed ...
WARN[16:24:27 CST] error: interrupted by error
Error: Failed to init kubernetes cluster: interrupted by error
Usage:
  kk create cluster [flags]

Flags:
  -f, --filename string          Path to a configuration file
  -h, --help                     help for cluster
      --skip-pull-images         Skip pre pull images
      --with-kubernetes string   Specify a supported version of kubernetes
      --with-kubesphere          Deploy a specific version of kubesphere (default v3.0.0)
  -y, --yes                      Skip pre-check of the installation

Global Flags:
      --debug   Print detailed information (default true)

Failed to init kubernetes cluster: interrupted by error

Error logs (screenshot version)
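For anyone hitting the same preflight failure ([ERROR Port-6443]: Port 6443 is in use), a quick way to confirm which process is already listening on 6443 on the failing master (here it turned out to be the HAProxy LB) is something like:

ss -lntp | grep ':6443'      # run on the node named in the error (10.14.6.6); shows the PID/program holding the port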

The problem has been solved.
Cause: we did not have enough servers, so the HA load balancer was placed on the master nodes. It was my first time setting up HA and I copied the configuration from a forum post, setting the LB port to 6443 as well, which conflicted with the kube-apiserver port.

Many thanks to @chen-shiwei for troubleshooting step by step until it was completely resolved. Thank you for your expertise, your patience, and your answers~
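For reference, the fix boils down to giving the load balancer a port other than 6443 in the KubeKey cluster configuration (the file generated by ./kk create config, typically config-sample.yaml). A minimal sketch of the relevant section, matching the HAProxy frontend further down that binds to 8443 and the keepalived VIP 10.14.6.14 (field names follow the kubekey v1alpha1 sample; adjust to your actual file):

  controlPlaneEndpoint:
    domain: lb.kubesphere.local   # KubeKey's default LB domain
    address: "10.14.6.14"         # the keepalived VIP
    port: 8443                    # anything but 6443, since kube-apiserver binds 6443 on the masters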

Screenshot of the successful deployment

heimao0307 changed the title to "[Solved] Urgent help: error when deploying with kk"

After waiting a while it succeeded, but the console still keeps spinning while loading~

I'll give it some more time and check again tonight or tomorrow morning to see how it behaves~
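While waiting, the state of the KubeSphere components can be checked directly instead of refreshing the console. A small sketch using standard kubectl commands:

kubectl get pods -A -o wide | grep -vE 'Running|Completed'   # anything still starting or crashing shows up here
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f   # follow ks-installer progress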

heimao0307 changed the title to "[Still a few remaining issues] Urgent help: error when deploying with kk"

Restarting Docker failed

I'm not sure whether this is because the /etc/docker/daemon.json file was accidentally overwritten.
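If the suspicion is a broken /etc/docker/daemon.json, two quick checks (a sketch; the docker unit log will usually name the exact problem):

journalctl -u docker --no-pager -n 50          # the real error (bad JSON, unknown option, ...) is printed here
python3 -m json.tool /etc/docker/daemon.json   # confirms the file is still valid JSON (use python on older hosts)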

RolandMa1986 Could you advise how to solve the slow-loading problem? The nodes themselves all look healthy; it's just sometimes fast and sometimes slow~
The three master nodes have identical hardware specs.

Could it be related to a badly configured HAProxy?
The HAProxy configuration is as follows:

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /var/run/haproxy-admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    nbproc 1

  
defaults
    log global
    timeout connect 5000
    timeout client 10m
    timeout server 10m

frontend kube-apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    server k8s-master-1 10.14.6.6:6443 check inter 2000 fall 2 rise 2 weight 1
    server k8s-master-2 10.14.6.8:6443 check inter 2000 fall 2 rise 2 weight 1
    server k8s-master-3 10.14.6.9:6443 check inter 2000 fall 2 rise 2 weight 1
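To rule HAProxy in or out for the intermittent slowness, its view of the backends can be inspected through the admin socket defined above, and the LB path can be timed directly. A sketch (socat is already installed per the pre-check table; /healthz is normally reachable anonymously on Kubernetes 1.17):

echo "show stat" | socat stdio /var/run/haproxy-admin.sock | cut -d',' -f1,2,18,19   # backend, server, status, weight
time curl -k https://10.14.6.14:8443/healthz                                         # should return "ok" quickly via the VIP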

The keepalived configuration is as follows:

# keepalived configuration -- master1 node -- 10.14.6.6
global_defs {
    router_id k8s-router-id
}

vrrp_script check-haproxy {
    script "killall -0 haproxy"
    interval 5
    weight -30
}

vrrp_instance VI-kube-master {
    unicast_src_ip 10.14.6.6
    unicast_peer {
        10.14.6.8
        10.14.6.9
    }
    state BACKUP
    nopreempt
    priority 120
    dont_track_primary
    interface em4
    virtual_router_id 68
    advert_int 3
    track_script {
        check-haproxy
    }
    virtual_ipaddress {
        10.14.6.14/24
    }
}

# keepalived configuration -- master2 node -- 10.14.6.8
global_defs {
    router_id k8s-router-id
}

vrrp_script check-haproxy {
    script "killall -0 haproxy"
    interval 5
    weight -30
}

vrrp_instance VI-kube-master {
    unicast_src_ip 10.14.6.8
    unicast_peer {
        10.14.6.6
        10.14.6.9
    }
    state BACKUP
    nopreempt
    priority 120
    dont_track_primary
    interface eno4
    virtual_router_id 68
    advert_int 3
    track_script {
        check-haproxy
    }
    virtual_ipaddress {
        10.14.6.14/24
    }
}

# keepalived configuration -- master3 node -- 10.14.6.9
global_defs {
    router_id k8s-router-id
}

vrrp_script check-haproxy {
    script "killall -0 haproxy"
    interval 5
    weight -30
}

vrrp_instance VI-kube-master {
    unicast_src_ip 10.14.6.9
    unicast_peer {
        10.14.6.6
        10.14.6.8
    }
    state BACKUP
    nopreempt
    priority 120
    dont_track_primary
    interface em1
    virtual_router_id 68
    advert_int 3
    track_script {
        check-haproxy
    }
    virtual_ipaddress {
        10.14.6.14/24
    }
}
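All three instances use the same priority (120) with nopreempt, so the VIP should stay put once claimed; it is still worth confirming that it is not flapping between nodes. A quick sketch, run on each master:

ip -4 addr | grep 10.14.6.14                 # only the node currently holding the VIP prints a line
journalctl -u keepalived --no-pager -n 50    # repeated MASTER/BACKUP transitions would point at the check-haproxy script or VRRP traffic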

What is the definition of "slow loading"? In other words, what exactly counts as slow? What are the concrete symptoms: is the frontend page slow to render, or are button actions slow to respond?

    lan-liang Thanks for your reply. An earlier post above also has this morning's login test; the login timed out.

    Sometimes it's fast and sometimes slow, so I'd like to know what is causing it~

      heimao0307 First test whether cross-node pod-to-pod connectivity works on this cluster. A simple way is to scale both ks-console and ks-apiserver down to 1 replica, as sketched below.
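A minimal sketch of that suggestion, assuming the components live in the default kubesphere-system namespace:

kubectl -n kubesphere-system scale deployment ks-console --replicas=1
kubectl -n kubesphere-system scale deployment ks-apiserver --replicas=1
kubectl -n kubesphere-system get pods -o wide     # note which nodes the remaining pods land on

If the console only times out when ks-console and ks-apiserver end up on different nodes, that points at cross-node pod networking (Calico) rather than HAProxy.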