This summary is based on teacher Lei Fengyang's tutorial and on guidance from several experts in the KubeSphere developer community, combined with multiple installation attempts. I wrote these notes purely as a reference for other first-time installers.

System and hardware configuration

OS: Windows 10 Professional
CPU: Core i5
RAM: 24 GB

Installation order

  1. VirtualBox
  2. Vagrant (to create three CentOS 7 virtual machines)
  3. Docker
  4. Kubeadm, Kubelet, Kubectl
  5. k8s
  6. Helm, Tiller, PV
  7. KubeSphere

1. VirtualBox


1> Download VirtualBox-6.0.10-132072-Win.exe
2> Install it

2. Vagrant


1> Download vagrant_2.2.5_x86_64.msi
2> Install Vagrant
3> Create the three virtual machines
Create a Vagrantfile with the following content:

Vagrant.configure("2") do |config|
   (1..3).each do |i|
        config.vm.define "k8s-node#{i}" do |node|
            node.vm.box = "centos/7"
            node.vm.hostname="k8s-node#{i}"
            node.vm.network "private_network", ip: "192.168.56.#{99+i}", netmask: "255.255.255.0"
            # node.vm.synced_folder "~/Documents/vagrant/share", "/home/vagrant/share"
            node.vm.provider "virtualbox" do |v|
                v.name = "k8s-node#{i}"
                v.memory = 4096
                v.cpus = 4
            end
        end
   end
end

Type cmd in the Explorer address bar to open a prompt in the Vagrantfile's directory, then run vagrant up to download the box and create the machines:

D:\TecDoc\Note-K8S\config>vagrant up
Bringing machine 'k8s-node1' up with 'virtualbox' provider...
Bringing machine 'k8s-node2' up with 'virtualbox' provider...
Bringing machine 'k8s-node3' up with 'virtualbox' provider...
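
Once the boxes are up, a few Vagrant commands are handy for managing them later (a quick reference; run them from the same Vagrantfile directory):

vagrant status            # should list k8s-node1/2/3 as "running"
vagrant halt              # shut all three VMs down
vagrant destroy -f        # delete them entirely and start over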

Enable SSH password access
Enter the virtual machine
D:\TecDoc\Note-K8S\config>vagrant ssh k8s-node1
Switch to the root account
[vagrant@k8s-node1 ]$ su root
Password: [you will be prompted here; the password is vagrant]
Enable password authentication

[root@k8s-node1 vagrant]# vi /etc/ssh/sshd_config

Change PasswordAuthentication no to yes
Restart the service

service sshd restart

Exit (run exit twice: once to leave root, once to leave the SSH session)

exit
exit

Repeat the same configuration on k8s-node2 and k8s-node3.

Configure the network adapters
1> Open VirtualBox and shut down the three VMs. Go to File → Preferences → Network, add a NAT Network, and click OK.
2> For each of the three VMs, open Settings → Network, set Adapter 1's "Attached to" to the NAT Network you just created, then expand Advanced and refresh the MAC address.
3> Restart the three VMs, connect to them with Xshell, and broadcast ping www.baidu.com to all sessions; the pings should succeed.

Disable the firewall
Connect to all three VMs with Xshell and turn on send-to-all-sessions mode.
Disable the firewall:

systemctl stop firewalld
systemctl disable firewalld

Disable SELinux

sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0

Disable the swap partition

sed -ri 's/.*swap.*/#&/' /etc/fstab
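
Note that the sed above only comments swap out of /etc/fstab so it stays off after the next reboot; to turn swap off in the running session as well, you can additionally run:

swapoff -a
free -m    # the Swap line should now show 0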

Configure cluster host names
Connect to the three VMs with Xshell and enable send-to-all-sessions mode.
vi /etc/hosts
Add the IP-to-hostname mappings of all three VMs on every node:

10.0.2.5 k8s-node1
10.0.2.4 k8s-node2
10.0.2.15 k8s-node3

Note: the IPs your VMs end up with may differ; adjust the entries to match your actual addresses.
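
To confirm the address each node actually got on the NAT network, check the interface directly (eth0 is an assumption here; the interface name can differ on your boxes):

ip addr show eth0 | grep "inet "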

3. Docker


Turn on send-to-all-sessions mode and make sure every step finishes on all three VMs.

Check the kernel version (uname -r); Docker requires kernel 3.10 or later.

Add the Docker yum repository:

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
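
If the box reports that yum-config-manager is not found, it is provided by the yum-utils package; install that first:

sudo yum install -y yum-utils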

Install Docker

sudo yum install -y docker-ce

Verify:

[root@k8s-node1 docker]# docker version
Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b
 Built:             Wed Mar 11 01:27:04 2020
 OS/Arch:           linux/amd64
 Experimental:      false
 

Configure a registry mirror (this is my personal Aliyun accelerator address):

sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://g9obgf7p.mirror.aliyuncs.com"]
}
EOF

Reload the daemon configuration:

sudo systemctl daemon-reload

Restart Docker:

sudo systemctl restart docker

Enable start on boot:

systemctl enable docker
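
To confirm the mirror and the restart took effect, docker info lists the configured registry mirrors:

sudo docker info | grep -A 1 "Registry Mirrors"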

Common problems
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
Fix: this is usually caused by a malformed configuration in /etc/docker/daemon.json; delete (or correct) the file and restart Docker.

4. Kubeadm, Kubelet, Kubectl


Configure the yum repository:

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Check that the packages are visible from the repository:

yum list|grep kube

Install:

yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3

Enable start on boot:

systemctl enable kubelet
systemctl start kubelet
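
A quick sanity check that all three binaries are installed at 1.17.3:

kubeadm version
kubelet --version
kubectl version --client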

5. k8s (install the master)


Run the following only on k8s-node1.

Create a download script named master_images.sh and upload it to the master (k8s-node1); its content is:

#!/bin/bash
images=(
	kube-apiserver:v1.17.3
    kube-proxy:v1.17.3
	kube-controller-manager:v1.17.3
	kube-scheduler:v1.17.3
	coredns:1.6.5
	etcd:3.4.3-0
    pause:3.1
)

for imageName in ${images[@]} ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
#   docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName  k8s.gcr.io/$imageName
done

Make it executable:

chmod +x master_images.sh

Run the script to pull the images:

./master_images.sh
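
After the script finishes, the seven images should be present locally:

docker images | grep registry.cn-hangzhou.aliyuncs.com/google_containers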

Initialize the master node (10.0.2.5 is the address of k8s-node1). By default kubeadm pulls images from k8s.gcr.io; here we point it at the Aliyun registry and pin the version, the service CIDR, and the pod network CIDR:

kubeadm init \
--apiserver-advertise-address=10.0.2.5 \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version v1.17.3 \
--service-cidr=10.96.0.0/16 \
--pod-network-cidr=10.244.0.0/16

On success, the last few lines of output look like this. (Note: save this output; the token and sha256 hash are needed when joining the other nodes to the master.)

[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.2.5:6443 --token 8fh3fy.wlm5532mhidj51o9 \
    --discovery-token-ca-cert-hash sha256:db7357833e141fa031b0cfdec5f43d5ef01963b479a27aa7bb9ccfac48a333bc
    

Following the output above:
Step 1: copy the admin kubeconfig

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Step 2: install the flannel network plugin
Upload kube-flannel.yml to the /mydata directory on the master; its content is:

---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  seLinux:
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-ppc64le
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - ppc64le
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-s390x
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - s390x
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg

Apply the configuration:

[root@k8s-node1 mydata]# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created

Verify (namespaces, pods, nodes)
Check the namespaces:

[root@k8s-node1 mydata]# kubectl get ns
NAME              STATUS   AGE
default           Active   9m23s
kube-node-lease   Active   9m24s
kube-public       Active   9m24s
kube-system       Active   9m24s

Check all pods. (Note: wait until every pod, including flannel, reaches STATUS Running.)

[root@k8s-node1 mydata]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   coredns-7f9c544f75-jqtj2            1/1     Running   0          9m39s
kube-system   coredns-7f9c544f75-wccn7            1/1     Running   0          9m39s
kube-system   etcd-k8s-node1                      1/1     Running   0          9m35s
kube-system   kube-apiserver-k8s-node1            1/1     Running   0          9m35s
kube-system   kube-controller-manager-k8s-node1   1/1     Running   0          9m35s
kube-system   kube-flannel-ds-amd64-xftpp         1/1     Running   0          2m9s
kube-system   kube-proxy-mjkhw                    1/1     Running   0          9m39s
kube-system   kube-scheduler-k8s-node1            1/1     Running   0          9m34s

Check the nodes:

[root@k8s-node1 mydata]# kubectl get nodes
NAME        STATUS   ROLES    AGE   VERSION
k8s-node1   Ready    master   12m   v1.17.3

Join k8s-node2 and k8s-node3 to the cluster
Run the following command on k8s-node2 and k8s-node3 respectively:

kubeadm join 10.0.2.5:6443 --token 0ppodt.z1cti482pzo1f3l9 \
    --discovery-token-ca-cert-hash sha256:e50cb03684f8cce41b40db7455292b2c82465b1783e4311a586fe3df13b19d06
Note: 10.0.2.5:6443 is the master's address, and the token was generated when the master was initialized. If the token has expired since then (kubeadm tokens are valid for 24 hours by default), generate a new join command:
kubeadm token create --ttl 0 --print-join-command
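
After both joins succeed, check from the master that the new nodes have registered (they stay NotReady until the flannel pods on them are Running):

kubectl get nodes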

Common problems

ImagePullBackOff or CrashLoopBackOff
[root@k8s-node1 mydata]# kubectl get pods --all-namespaces
kube-system   kube-flannel-ds-amd64-hpt7c         0/1     Init:ImagePullBackOff
Cause: usually a network failure that keeps the image from being pulled.
Fix:
Download flanneld-v0.11.0-amd64.docker from https://github.com/coreos/flannel/releases
Upload it to the affected node (k8s-node3) and load it into the local Docker image store:
[root@k8s-node3 k8s]# docker load < flanneld-v0.11.0-amd64.docker
Verify:
docker images
On the master, run kubectl get pod -n kube-system -o wide; the pod on the previously failing node should now show Running.

6. Helm, Tiller, PV


Run the following on k8s-node1.
Reference: https://kubesphere.com.cn/docs/zh-CN/installation/prerequisites/
Install Helm and Tiller
Download helm-v2.16.3-linux-amd64.tar.gz from https://github.com/helm/helm/releases/tag/v2.16.3. I tried several other versions and the installs did not go smoothly; this version worked without issues.
Upload it to the server, extract it, and copy helm and tiller to /usr/local/bin/:

[root@k8s-node1 linux-amd64]# cp helm /usr/local/bin/
[root@k8s-node1 linux-amd64]# cp tiller /usr/local/bin/

Verify:

helm help

Create the RBAC permissions
Create helm_rbac.yaml with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system

Apply it:

[root@k8s-node1 k8s]# kubectl apply -f helm_rbac.yaml
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created

Initialize Helm/Tiller:

helm init --service-account=tiller --tiller-image=registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.3 --history-max 300

Verify Helm/Tiller:

kubectl -n kube-system get pods|grep tiller

If the status is ImagePullBackOff, the image was not pulled successfully and you need to pull it manually.
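
Since the helm init command above already points at the Aliyun Tiller image, one workaround is to pull that exact image manually on each node (same tag as in the init command):

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.3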

Check that Tiller is deployed in the cluster:

[root@k8s-node1 local]# kubectl get pod -n kube-system -l app=helm
NAME                             READY   STATUS    RESTARTS   AGE
tiller-deploy-7b76b656b5-m4k2x   1/1     Running   0          94s

Install PV (persistent storage)
Check whether the master node has a taint:

[root@k8s-node1 local]# kubectl describe node k8s-node1|grep Taint
Taints:             node-role.kubernetes.io/master:NoSchedule

Remove the taint (note: add it back only after KubeSphere has been installed):

[root@k8s-node1 local]# kubectl taint nodes k8s-node1 node-role.kubernetes.io/master:NoSchedule-
node/k8s-node1 untainted

Install OpenEBS
Create the namespace:

[root@k8s-node1 local]# kubectl create ns openebs
namespace/openebs created

Install:

[root@k8s-node1 local]# helm install --namespace openebs --name openebs stable/openebs --version 1.5.0

Verify once the installation completes (this can take a long time, sometimes overnight; it took me four attempts):

[root@k8s-node1 local]# kubectl get pods -n openebs
NAMESPACE NAME                                         READY   STATUS    RESTARTS   AGE
openebs   openebs-admission-server-5cf6864fbf-q8spk    1/1     Running   0          6m38s
openebs   openebs-apiserver-bc55cd99b-5hw6j            1/1     Running   0          6m38s
openebs   openebs-localpv-provisioner-85ff89dd44-f289w 1/1     Running   0          6m38s
openebs   openebs-ndm-8bt9l                            1/1     Running   0          6m38s
openebs   openebs-ndm-cpkv4                            1/1     Running   0          6m38s
openebs   openebs-ndm-m85s7                            1/1     Running   0          6m38s
openebs   openebs-ndm-operator-87df44d9-plkb5          1/1     Running   1          6m38s
openebs   openebs-provisioner-7f86c6bb64-cmz7n         1/1     Running   0          6m38s
openebs   openebs-snapshot-operator-54b9c886bf-zcsjg   2/2     Running   0          6m38s

Set openebs-hostpath as the default StorageClass:

[root@k8s-node1 local]# kubectl patch storageclass openebs-hostpath -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/openebs-hostpath patched

Verify:

[root@k8s-node1 local]# kubectl get sc
NAME                         PROVISIONER                                                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
openebs-device               openebs.io/local                                           Delete          WaitForFirstConsumer   false                  7m20s
openebs-hostpath (default)   openebs.io/local                                           Delete          WaitForFirstConsumer   false                  7m20s
openebs-jiva-default         openebs.io/provisioner-iscsi                               Delete          Immediate              false                  7m21s
openebs-snapshot-promoter    volumesnapshot.external-storage.k8s.io/snapshot-promoter   Delete          Immediate              false                  7m20s
openebs-hostpath is now marked as default.

7. KubeSphere


References:
https://github.com/kubesphere/kubesphere
https://github.com/kubesphere/kubesphere/blob/master/README_zh.md
https://kubesphere.com.cn/docs/zh-CN/installation/install-on-k8s/

Minimal installation
Download https://raw.githubusercontent.com/kubesphere/ks-installer/master/kubesphere-minimal.yaml in advance and upload it to the server:

[root@k8s-node1 k8s]# kubectl apply -f kubesphere-minimal.yaml 
namespace/kubesphere-system created
configmap/ks-installer created
serviceaccount/ks-installer created
clusterrole.rbac.authorization.k8s.io/ks-installer created
clusterrolebinding.rbac.authorization.k8s.io/ks-installer created
deployment.apps/ks-installer created

Watch the installation log in real time
Note: the log only becomes available once the ks-installer pod has been created and is Running.

kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
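
If the logs command errors out because the pod is not ready yet, check the installer pod's status first:

kubectl get pod -n kubesphere-system -l app=ks-install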

The following output at the end indicates the installation has finished:

Start installing monitoring
**************************************************
task monitoring status is successful
total: 1     completed:1
**************************************************
#####################################################
###              Welcome to KubeSphere!           ###
#####################################################
Console: http://10.0.2.5:30880
Account: admin
Password: P@88w0rd
NOTES:
  1. After logging into the console, please check the
     monitoring status of service components in
     the "Cluster Status". If the service is not
     ready, please wait patiently. You can start
     to use when all components are ready.
  2. Please modify the default password after login.
#####################################################

Check the status of all pods; every one must be Running, otherwise the installation has not finished yet:

[root@k8s-node1 k8s]# kubectl get pods --all-namespaces

Once every pod is Running, open the KubeSphere UI at IP:30880; the default cluster administrator account is admin / P@88w0rd.
Check the CRDs:

kubectl get crd

Check the workspaces:

kubectl get workspaces

Finally, after the installation succeeds, add the taint back:

[root@k8s-node1 local]# kubectl taint nodes k8s-node1 node-role.kubernetes.io/master=:NoSchedule
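
Verify the taint is back the same way it was checked before removal:

kubectl describe node k8s-node1 | grep Taint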

Common problems during installation

1. Pod status is ImagePullBackOff / CrashLoopBackOff / Pending

This means the image pull failed for network reasons, or the pod could not be created; all you can do is wait. If it still has not completed after a day, the installation probably cannot proceed normally. Fix: reinstall.

2. workspace failed, which halts the installation

Explanation: this log appears in the final stage of the KubeSphere installation. It indicates the workspace failed to install; similar failures can occur for the system workspace, the KubeSphere configuration, and other components.
Fix: rerun the installer (kubectl rollout restart deploy -n kubesphere-system ks-installer)

3. prometheus failed, so all monitoring figures in the console show 0

Explanation: the installation succeeds and you can log in to the console, but all monitoring data is 0. If you watched the installation log closely, you probably saw messages about the related components failing to install. You can check the prometheuses resources with the following command:

[root@k8s-node1 ~]# kubectl get prometheuses -n kubesphere-monitoring-system
No resources found in kubesphere-monitoring-system namespace.

Fix: reinstall.

    chuanning Thanks! This technical post is very detailed. Awesome 🐂
    A small suggestion: some parts of the article still have code that is not styled as code blocks; formatting them would improve the reading experience.

      After finishing the installation, opening the console shows:
      500 Internal Privoxy Error
      Privoxy encountered an error while processing your request:
      Could not load template file no-server-data or one of its included components.
      Please contact your proxy administrator.
      If you are the proxy administrator, please put the required file(s) in the (confdir)/templates directory. The location of the (confdir) directory is specified in the main Privoxy config file. (It's typically the Privoxy install directory.)

        zhangyuxiansen2017
        You need to check whether all the pods are actually Running. In my first two installs, ks-account and ks-apigateway were often not Running, which made every page return a 500 error after login. I suggest checking the pod status first, for example:

        [root@k8s-node1 ~]# kubectl get pods --all-namespaces
        NAMESPACE                      NAME                                           READY   STATUS    RESTARTS   AGE
        kube-system                    coredns-7f9c544f75-2lj7z                       1/1     Running   1          14d
        kube-system                    coredns-7f9c544f75-7zwbr                       1/1     Running   1          14d
        kube-system                    etcd-k8s-node1                                 1/1     Running   1          14d
        kube-system                    kube-apiserver-k8s-node1                       1/1     Running   1          14d
        kube-system                    kube-controller-manager-k8s-node1              1/1     Running   5          14d
        kube-system                    kube-flannel-ds-amd64-2smws                    1/1     Running   1          14d
        kube-system                    kube-flannel-ds-amd64-5k4jk                    1/1     Running   0          14d
        kube-system                    kube-flannel-ds-amd64-krmg4                    1/1     Running   0          14d
        kube-system                    kube-proxy-5wvhk                               1/1     Running   0          14d
        kube-system                    kube-proxy-gvm8b                               1/1     Running   1          14d
        kube-system                    kube-proxy-qm6lh                               1/1     Running   0          14d
        kube-system                    kube-scheduler-k8s-node1                       1/1     Running   4          14d
        kube-system                    tiller-deploy-7b76b656b5-xxz62                 1/1     Running   0          9d
        kubesphere-controls-system     default-http-backend-5d464dd566-nbcs5          1/1     Running   4          8d
        kubesphere-controls-system     kubectl-admin-6c664db975-qkfxf                 1/1     Running   0          6d16h
        kubesphere-monitoring-system   kube-state-metrics-566cdbcb48-xfhbn            4/4     Running   0          6d16h
        kubesphere-monitoring-system   node-exporter-ptb8p                            2/2     Running   0          6d16h
        kubesphere-monitoring-system   node-exporter-skr7x                            2/2     Running   0          6d16h
        kubesphere-monitoring-system   node-exporter-tdmzt                            2/2     Running   0          6d16h
        kubesphere-monitoring-system   prometheus-k8s-0                               3/3     Running   1          34h
        kubesphere-monitoring-system   prometheus-k8s-system-0                        3/3     Running   1          34h
        kubesphere-monitoring-system   prometheus-operator-6b97679cfd-4rhln           1/1     Running   0          6d16h
        kubesphere-system              ks-account-794477c45d-n2q6n                    1/1     Running   0          34h
        kubesphere-system              ks-apigateway-546d4df545-g4q7c                 1/1     Running   0          34h
        kubesphere-system              ks-apiserver-689675d48c-69n67                  1/1     Running   0          34h
        kubesphere-system              ks-console-86599887c-g4xwt                     1/1     Running   0          34h
        kubesphere-system              ks-controller-manager-6fd9b6d999-nj6cb         1/1     Running   0          34h
        kubesphere-system              ks-installer-7557594789-ltkzl                  1/1     Running   0          34h
        kubesphere-system              openldap-0                                     1/1     Running   0          8d
        kubesphere-system              redis-6fd6c6d6f9-ps5t4                         1/1     Running   0          8d
        openebs                        openebs-admission-server-5cf6864fbf-dvlbw      1/1     Running   0          9d
        openebs                        openebs-apiserver-bc55cd99b-5rdxh              1/1     Running   4          9d
        openebs                        openebs-localpv-provisioner-85ff89dd44-p9td4   1/1     Running   5          9d
        openebs                        openebs-ndm-5wnvr                              1/1     Running   0          9d
        openebs                        openebs-ndm-operator-87df44d9-482hh            1/1     Running   1          9d
        openebs                        openebs-ndm-p89wd                              1/1     Running   0          9d
        openebs                        openebs-ndm-swpgb                              1/1     Running   0          9d
        openebs                        openebs-provisioner-7f86c6bb64-5xmdd           1/1     Running   5          9d
        openebs                        openebs-snapshot-operator-54b9c886bf-s5x9m     2/2     Running   3          9d
        
          10 days later

          On my first install, monitoring failed in the final stage: there were no prometheus-k8s-0, prometheus-k8s-system-0, or openldap-0 pods, and ks-account stayed stuck at Init:1/2.
          I tried reinstalling (kubectl rollout restart deploy -n kubesphere-system ks-installer); the first two pods appeared, but openldap-0 is still missing, and there are now two ks-account pods, both stuck at Init:1/2.

          chuanning Feynman

            Feynman This run was done with the taint removed.

            In an earlier attempt I installed with the taint still in place; openldap-0 showed up but stayed Pending. Only later did I realize the taint should stay removed until the installation fully completes, so this time it was left removed the whole way through.

            11 days later


            Today I hit this problem while installing kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3 with yum; could one of the experts take a look?

            Feynman This is urgent!!!!! ks-account is stuck at Init:0/2 during installation. Whether it is a DNS problem or something else, can someone give a complete solution? It has been seven days, I have checked everything and still cannot solve it. Please give a walkthrough.

              xinghai There are many possible reasons ks-account fails to start; without knowing the specific problem, we do not have a one-size-fits-all solution. Could you open a new thread and post the ks-account Pod logs and events?

                xinghai Use kubectl logs to view the ks-account Pod logs, and kubectl describe to look at the Pod's events.

                  xinghai

                  No need to panic; follow moderator Feynman's suggestion:

                  1. Open a new thread
                  2. Use kubectl logs to view the ks-account Pod logs, and describe to check the Pod's events