Notes from installing KubeSphere in a customer environment
The above covers the 2.1.0 online installation; 2.1.1 works the same way.
Complete walkthrough of an automated offline KubeSphere installation
The above covers the 2.1.1 offline installation.

The tutorials above have been used across several cloud environments: Alibaba X-Dragon (Shenlong) Cloud, Tencent Cloud, Huawei Cloud, and the company's internal VM environment. The vendor-specific tweaks each cloud needs are covered as well, so the write-up is fairly complete.

Thanks also to the QingCloud platform, which lets someone who has only just taken over k8s maintain six cloud environments at once, probably more later, so the pressure is real; I'll keep relying on QingCloud. I'd love to drop an emoji here, but the editing experience of this forum could use some work. Haha!

PS: since I'm still new to this, some steps may be redundant, but for the same reason the notes are fairly thorough. The article was written a while ago, so some later optimizations may not be reflected yet; I will keep improving it at the original address.

Below is the complete 2.1.0 installation process in the customer environment:

[TOC]

Notes from installing KubeSphere in a customer environment

Notes when requesting servers

  • All k8s machines and the NFS machine need Internet access to download packages during installation; access can be cut off afterwards
  • Make sure the 10.233 IP range is not in use; the k8s cluster needs it for virtual IPs
  • The k8s machines' own IP range (e.g. 192.168.3.xxx) must be mutually reachable and must also be able to reach kube_pods_subnet: 10.233.64.0/18 and kube_service_addresses: 10.233.0.0/18; on Huawei Cloud this requires removing the communication verification between all k8s hosts, and other vendors have similar security policies
  • Reserve the .222 address of the k8s IP range for the master load balancer. On X-Dragon Cloud this IP must be attached to a NIC but not bound to an instance; on H3C Cloud no NIC is needed, but MAC binding must be removed on the three master nodes or only one of them will be able to ping the .222 address; on Huawei Cloud a dedicated virtual IP has to be requested
  • Set the k8s machine passwords to their final values before handing them over, and avoid very special characters such as ! and $; some special characters make the installation fail
  • SSH on every k8s machine must allow login by account and password, the SSH timeout must be removed, and hopping to other machines over SSH has to be fast, otherwise you will hit timeout problems. If all else fails, set up passwordless SSH; it is a painful chore (well, I tried it and it works)
  • Give every server a 150 GB root partition
  • CPU, memory and OS must meet the requirements: at least 4 cores and 16 GB RAM recommended, OS CentOS Linux release 7.7.1908 (Core)
  • Ask the customer whether they already have cloud storage; if so, use it, which saves you from building your own storage and handling backups
  • Make sure the hardware clock and system time are identical and correct on every machine (see the sketch after this list), and also remember to fix the timezone in images: https://blog.csdn.net/aixiaoyang168/article/details/88341082
  • Tencent Cloud note: do not mount /var/lib/docker or /var/lib/kubelet on Tencent CFS storage; in my tests installing with those mounts caused problems, and removing the mounts made them go away.
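For the time check above, here is a minimal sketch that compares system time, hardware clock and timezone across the machines; the host list is only an example and root SSH access is assumed:

# compare system time, hardware clock and timezone on every host; adjust the list to your environment
HOSTS="192.168.100.57 192.168.100.56 192.168.100.58 192.168.100.61"
for h in $HOSTS; do
  echo "== $h =="
  ssh root@$h 'date; hwclock --show; timedatectl | grep "Time zone"'
done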

Background

Install k8s on the customer's servers, using the KubeSphere installer scripts.
Resource situation:

  • The servers are Alibaba X-Dragon (Shenlong) Cloud, a bare-metal cloud that differs somewhat from the other vendors' clouds
  • The OS is CentOS 7.7 64-bit
  • Server resources:
| Role | Server IP | Notes |
| --- | --- | --- |
| nfs2 | 192.168.100.68 | NFS server |
| master1 | 192.168.100.57 | master node |
| master2 | 192.168.100.56 | master node |
| master3 | 192.168.100.58 | master node |
| node1 | 192.168.100.61 | worker node |
| node2 | 192.168.100.60 | worker node |
| node3 | 192.168.100.62 | worker node |
| node4 | 192.168.100.59 | worker node |
| node5 | 192.168.100.71 | worker node |
| node6 | 192.168.100.72 | worker node |

Pre-installation server verification and configuration

  • Verification script: lists the Linux release, CPU count, disk and memory information (also see the batch pre-flight sketch at the end of this section)
    cat /etc/redhat-release && cat /proc/cpuinfo |grep "processor"|wc -l && cat /proc/cpuinfo |grep "physical id"|sort |uniq|wc -l && fdisk -l|grep /dev/sda | head -n 1 && cat /proc/meminfo | grep MemTotal
    The master used as the installation task machine should have at least 4 cores and 8 GB RAM; the other two masters can be 2-core / 8 GB machines, and the worker nodes should have at least 8 GB. Also, if you do not have many node machines and resources are tight, fewer nodes with more memory each (e.g. 32 GB) are preferable.
    Below is the output on node1:

    CentOS Linux release 7.7.1908 (Core)
    4
    4
    MemTotal:       16262128 kB
  • Check that the servers can SSH into one another, and remove the SSH timeout

    # login check
    ssh xxx.xxx.xxx.xxx
    # remove the ssh timeout
    comment out the export TMOUT=300 line at the bottom of /etc/profile
    and make it take effect immediately:
    source /etc/profile
  • Check that the firewall is disabled (it was)
    firewall-cmd --state

  • Check that the servers can reach the Internet (they could)
    ping baidu.com

  • View and, if needed, adjust the network configuration; normally nothing to change here, the server provider will have set it up for you.

    # how to generate a NIC uuid
    uuidgen eth0
    # view the network configuration
    cd /etc/sysconfig/network-scripts
    cat ifcfg-ens160
    # node1's configuration
    [root@node1 network-scripts]# cat ifcfg-eth0 
    TYPE="Ethernet"
    PROXY_METHOD="none"
    BROWSER_ONLY="no"
    BOOTPROTO="static"
    DEFROUTE="yes"
    IPV4_FAILURE_FATAL="no"
    IPV6INIT="yes"
    IPV6_AUTOCONF="yes"
    IPV6_DEFROUTE="yes"
    IPV6_FAILURE_FATAL="no"
    IPV6_ADDR_GEN_MODE="stable-privacy"
    NAME="eth0"
    UUID="04d22cbb-d66d-407d-9c96-9283a669d271"
    DEVICE="eth0"
    ONBOOT="yes"
    IPADDR="192.168.100.61"
    PREFIX="24"
    GATEWAY="192.168.100.1"
    IPV6_PRIVACY="no"
    DNS1="8.8.8.8"
    DNS2="144.144.144.144"
    # restart the network service
    service network restart
  • Network speed test

cd /home/tools
wget https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest.py
chmod a+rx speedtest.py
mv speedtest.py /usr/local/bin/speedtest-cli
chown root:root /usr/local/bin/speedtest-cli

speedtest-cli
  • Check disk information
# list the disks
fdisk -l
# show partitions and mount points
lsblk
# to grow the root partition, see the Linux topic on disk expansion
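To avoid repeating the checks above machine by machine, a small batch pre-flight sketch can run them over SSH; the host list and working SSH access are assumptions, and it only collects the information already gathered above:

# run the basic checks on every node in one go
HOSTS="192.168.100.57 192.168.100.56 192.168.100.58 192.168.100.61 192.168.100.60 192.168.100.62 192.168.100.59 192.168.100.71 192.168.100.72"
for h in $HOSTS; do
  echo "===== $h ====="
  ssh root@$h 'cat /etc/redhat-release; nproc; free -m | grep Mem; df -h /; firewall-cmd --state'
done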

Other things to make sure of

  • Besides the OS firewall, cloud servers usually sit behind an external security group that is configured in the cloud console. This environment was particularly special: in the end the provider disabled the NIC security group at the virtualization layer and removed the NIC flow-table policy before the 192.* and 10.* networks could reach each other.
  • Make sure the 10.233 range is not already in use, that the 192.* hosts can ping IPs in the 10.233 range, and that the servers can ping one another (see the sketch after this list).
  • A multi-master install needs a reserved virtual IP for the master load balancer, 192.168.100.222 in my case. On this cloud you add a NIC in the console and assign this IP to it, but do not bind it to an instance (an X-Dragon Cloud peculiarity; in the company lab environment no NIC binding was required).
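A quick sanity check for the 10.233 requirement above, run on each node; the grep pattern and the ping target are only examples:

# nothing on the host should already use the 10.233 cluster ranges
ip route | grep '10\.233\.' && echo "WARNING: a 10.233 route already exists" || echo "no existing 10.233 routes"
ip addr  | grep '10\.233\.' && echo "WARNING: a 10.233 address is already assigned" || echo "no existing 10.233 addresses"
# and the nodes should reach each other on the 192 network, e.g. the nfs machine
ping -c 2 192.168.100.68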

NFS installation and connectivity checks between the k8s nodes and NFS

Setting up the NFS server

hostnamectl set-hostname nfs2
mkdir /home/nfs2 -p
yum install nfs-utils rpcbind -y
# edit the rpcbind socket configuration
vi /etc/systemd/system/sockets.target.wants/rpcbind.socket
# comment out all the :111 lines
#ListenStream=0.0.0.0:111
#ListenDatagram=0.0.0.0:111
#ListenStream=[::]:111
#ListenDatagram=[::]:111

# add the shared export
vi /etc/exports
add the following lines
#share /home/nfs2 by noteshare for bingbing at 2020-1-17
/home/nfs2 192.168.100.0/24(rw,sync,no_root_squash)

# reload and (re)start the services
systemctl daemon-reload
systemctl restart rpcbind.socket
systemctl restart rpcbind
systemctl start nfs
systemctl start nfs-server
# enable at boot
systemctl enable nfs-server

# check the listening ports
[root@vm ~]# netstat -tnulp|grep rpc
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      26754/rpcbind       
tcp        0      0 0.0.0.0:20048           0.0.0.0:*               LISTEN      21385/rpc.mountd    
tcp        0      0 0.0.0.0:36435           0.0.0.0:*               LISTEN      26764/rpc.statd     
tcp6       0      0 :::111                  :::*                    LISTEN      26754/rpcbind       
tcp6       0      0 :::20048                :::*                    LISTEN      21385/rpc.mountd    
tcp6       0      0 :::51189                :::*                    LISTEN      26764/rpc.statd     
udp        0      0 0.0.0.0:20048           0.0.0.0:*                           21385/rpc.mountd    
udp        0      0 0.0.0.0:111             0.0.0.0:*                           26754/rpcbind       
udp        0      0 0.0.0.0:49757           0.0.0.0:*                           26764/rpc.statd     
udp        0      0 0.0.0.0:639             0.0.0.0:*                           26754/rpcbind       
udp        0      0 127.0.0.1:659           0.0.0.0:*                           26764/rpc.statd     
udp6       0      0 :::20048                :::*                                21385/rpc.mountd    
udp6       0      0 :::111                  :::*                                26754/rpcbind       
udp6       0      0 :::639                  :::*                                26754/rpcbind       
udp6       0      0 :::54093                :::*                                26764/rpc.statd
# next, check the port mappings (the nfs ports do not show up until the nfs service is started)
[root@vm ~]# rpcinfo -p localhost 
   program vers proto   port  service
   100000    4   tcp    111  portmapper
   100000    3   tcp    111  portmapper
   100000    2   tcp    111  portmapper
   100000    4   udp    111  portmapper
   100000    3   udp    111  portmapper
   100000    2   udp    111  portmapper
   100024    1   udp  49757  status
   100024    1   tcp  36435  status

# verify the export
[root@nfs2 ~]# showmount -e localhost
Export list for localhost:
/home/nfs2 192.168.100.0/24

# client-side check: pick another machine, e.g. 192.168.100.57, to verify from; if it cannot connect, open the firewall ports (see the port-opening requirements section below)
yum install nfs-utils rpcbind -y
/sbin/rpcbind
[root@node2 ~]# showmount -e 192.168.100.56
Export list for 192.168.100.56:
/home/nfs2 192.168.100.0/24
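Beyond showmount, a more direct client-side check is to actually mount the export once and write a test file; a minimal sketch, where the mount point /mnt/nfstest is just an example:

# on any k8s node: mount the export, write a file, then clean up
yum install -y nfs-utils
mkdir -p /mnt/nfstest
mount -t nfs 192.168.100.68:/home/nfs2 /mnt/nfstest
touch /mnt/nfstest/write-test && ls -l /mnt/nfstest
umount /mnt/nfstest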

Firewall and security-group port requirements (not used this time)

The internal firewall is disabled; it is mainly the security group that needs configuring.
Below is a template I put together. (These port rules were not used this time because the security group was simply switched off; I will verify them in a later test. The list is based on what the official documentation provides and has not been fully verified yet.)

![port](http://www.itnoteshare.com/articlePic/getArticlePic.htm?fileName=25_386_37ebfb65-2f71-4978-aafd-87f98dd2f12a.jpg "port")
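If you keep firewalld enabled and rely on port rules instead of switching the security group off, the rules would be applied roughly as below. This is only an illustration using the ports that actually appear in this article (apiserver 6443, HAProxy 8443, console 30880, and the NFS-related ports 111/2049/20048); the complete list should come from the official port requirements:

# example only: open the ports referenced in this article, then reload
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=8443/tcp
firewall-cmd --permanent --add-port=30880/tcp
firewall-cmd --permanent --add-port=111/tcp --add-port=111/udp
firewall-cmd --permanent --add-port=2049/tcp
firewall-cmd --permanent --add-port=20048/tcp
firewall-cmd --reload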

Cluster installation

master1 is the task machine

  • Install keepalived + haproxy on the 3 master machines; the steps are identical on all of them, shown below for master1
yum install -y keepalived && yum install -y haproxy

# edit /etc/haproxy/haproxy.cfg

global
	log /dev/log    local0
	log /dev/log    local1 notice
	chroot /var/lib/haproxy
	#stats socket /run/haproxy/admin.sock mode 660 level admin
	stats timeout 30s
	user haproxy
	group haproxy
	daemon
	nbproc 1

defaults
	log     global
	timeout connect 5000
	timeout client  50000
	timeout server  50000

listen kube-master
	bind 0.0.0.0:8443
	mode tcp
	option tcplog
	balance roundrobin
	server master1 192.168.100.57:6443  check inter 10000 fall 2 rise 2 weight 1
	server master2 192.168.100.56:6443  check inter 10000 fall 2 rise 2 weight 1
	server master3 192.168.100.58:6443  check inter 10000 fall 2 rise 2 weight 1

# edit /etc/keepalived/keepalived.conf; change interface eth0 below to match your NIC

global_defs {
    router_id lb-backup
}

vrrp_instance VI-kube-master {
    state MASTER
    priority 110
    dont_track_primary
    interface eth0
    virtual_router_id 90
    advert_int 3
    virtual_ipaddress {
        192.168.100.222
    }
}

# enable at boot and start the services; do this on all 3 masters
systemctl enable keepalived && systemctl restart keepalived && systemctl enable haproxy && systemctl restart haproxy
# from each of the 3 masters, ping the virtual IP 192.168.100.222; it must respond (a short verification sketch follows)
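A quick way to confirm the VIP and HAProxy are working before continuing (run on the masters):

# the VIP should answer from every master
ping -c 2 192.168.100.222
# the VIP should be bound on exactly one master at a time
ip addr | grep 192.168.100.222
# haproxy should be listening on 8443 on every master
ss -tnlp | grep haproxy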
  • On the master1 task machine, download the KubeSphere installer, configure it and run the installation
mkdir /home/tools/kubesphere -p && cd /home/tools/kubesphere && curl -L https://kubesphere.io/download/stable/v2.1.0 > installer.tar.gz && tar -zxf installer.tar.gz
  • Edit common.yaml and hosts.ini in the installer downloaded above; below are the files I used for this installation

common.yaml

# 
# Copyright 2018 The KubeSphere Authors.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#     http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# 
# KubeSphere Installer Sample Configuration File
#
# Note that below sample configuration could be reference to install
# both Kubernetes and KubeSphere together.
# For the users who want to install KubeSphere upon an existing Kubernetes cluster
# please visit https://github.com/kubesphere/ks-installer for more information

######################### Kubernetes #########################

# The supported Kubernetes to install. Note: not all
# Kubernetes versions are supported by KubeSphere, visit
# https://kubesphere.io/docs to get full support list.
kube_version: v1.15.5

# The supported etcd to install. Note: not all
# etcd versions are supported by KubeSphere, visit
# https://kubesphere.io/docs to get full support list.
etcd_version: v3.2.18

# Configure a cron job to backup etcd data, which is running on etcd host machines.
# Period of running backup etcd job, the unit is minutes.
# 30 as default, means backup etcd every 30 minutes.
etcd_backup_period: 30

# How many backup replicas to keep.
# 5 means to keep latest 5 backups, older ones will be deleted by order.
keep_backup_number: 5

# The location to store etcd backups files on etcd host machines.
etcd_backup_dir: "/var/backups/kube_etcd"

## Add other registry.
docker_registry_mirrors:
  - https://docker.mirrors.ustc.edu.cn
  - https://registry.docker-cn.com
  - https://mirror.aliyuncs.com
  - https://wixr7yss.mirror.aliyuncs.com
  - http://xxx.xxx.xxx.xxx:30280
  - http://harbor.powerdnoteshareata.com.cn:30280

docker_insecure_registries:
  - 192.168.100.57:5000
  - xxx.xxx.xxx.xxx:30280
  - harbor.noteshare.com.cn:30280

# Kubernetes network plugin. Note that calico and flannel
# are recommended plugins, which are tested and verified by KubeSphere.
kube_network_plugin: calico

# A valid CIDR range for Kubernetes services,
# 1. should not overlap with node subnet
# 2. should not overlap with Kubernetes pod subnet
kube_service_addresses: 10.233.0.0/18

# A valid CIDR range for Kubernetes pod subnet,
# 1. should not overlap with node subnet
# 2. should not overlap with Kubernetes services subnet
kube_pods_subnet: 10.233.64.0/18

# Kube-proxy proxyMode configuration, either ipvs, or iptables
kube_proxy_mode: ipvs

# Maximum pods allowed to run on every node.
kubelet_max_pods: 110

# Enable nodelocal dns cache
enable_nodelocaldns: true

# HA(Highly Available) loadbalancer example config
# apiserver_loadbalancer_domain_name: "lb.kubesphere.local"
loadbalancer_apiserver:
  address: 192.168.100.222
  port: 8443

######################### Common Storage #########################

# This section will configure storage to use in Kubernetes.
# For full supported storage list, please check
# https://docs.kubesphere.io/v2.1/zh-CN/installation/storage-configuration

# LOCAL VOLUME
# KubeSphere will use local volume as storage by default.
# This is just for demostration and testing purpose, and highly not
# recommended in production environment.
# For production environment, please change to other storage type.
local_volume_enabled: false
local_volume_is_default_class: false
local_volume_storage_class: local


# CEPH RBD
# KubeSphere can use an existing ceph as backend storage service.
# change to true to use ceph,
# MUST disable other storage types in configuration file.
ceph_rbd_enabled: false
ceph_rbd_is_default_class: false
ceph_rbd_storage_class: rbd

# Ceph rbd monitor endpoints, for example
#
# ceph_rbd_monitors:
#   - 172.24.0.1:6789
#   - 172.24.0.2:6789
#   - 172.24.0.3:6789
ceph_rbd_monitors:
  - SHOULD_BE_REPLACED

# ceph admin account name
ceph_rbd_admin_id: admin

# ceph admin secret, for example,
# ceph_rbd_admin_secret: AQAnwihbXo+uDxAAD0HmWziVgTaAdai90IzZ6Q==
ceph_rbd_admin_secret: TYPE_ADMIN_ACCOUNT_HERE
ceph_rbd_pool: rbd
ceph_rbd_user_id: admin
# e.g. ceph_rbd_user_secret: AQAnwihbXo+uDxAAD0HmWziVgTaAdai90IzZ6Q==
ceph_rbd_user_secret: TYPE_ADMIN_SECRET_HERE
ceph_rbd_fsType: ext4
ceph_rbd_imageFormat: 1

# Additional ceph configurations
# ceph_rbd_imageFeatures: layering


# NFS CONFIGURATION
# KubeSphere can use existing nfs service as backend storage service.
# change to true to use nfs.
nfs_client_enabled: true
nfs_client_is_default_class: true

# Hostname of the NFS server(ip or hostname)
nfs_server: 192.168.100.68

# Basepath of the mount point
nfs_path: /home/nfs2/k8sdata
nfs_vers3_enabled: false
nfs_archiveOnDelete: false


# GLUSTERFS CONFIGURATION
# change to true to use glusterfs as backend storage service.
# for more detailed configuration, please check
# https://docs.kubesphere.io/v2.1/zh-CN/installation/storage-configuration
glusterfs_provisioner_enabled: false
glusterfs_provisioner_is_default_class: false
glusterfs_provisioner_storage_class: glusterfs
glusterfs_provisioner_restauthenabled: true
# e.g. glusterfs_provisioner_resturl: http://192.168.0.4:8080
glusterfs_provisioner_resturl: SHOULD_BE_REPLACED
# e.g. glusterfs_provisioner_clusterid: 6a6792ed25405eaa6302da99f2f5e24b
glusterfs_provisioner_clusterid: SHOULD_BE_REPLACED
glusterfs_provisioner_restuser: admin
glusterfs_provisioner_secretName: heketi-secret
glusterfs_provisioner_gidMin: 40000
glusterfs_provisioner_gidMax: 50000
glusterfs_provisioner_volumetype: replicate:2
# e.g. jwt_admin_key: 123456
jwt_admin_key: SHOULD_BE_REPLACED


######################### KubeSphere #########################

# Version of KubeSphere
ks_version: v2.1.0

# KubeSphere console port, range 30000-32767,
# but 30180/30280/30380 are reserved for internal service
console_port: 30880

# Enable Multi users login.
# false means allowing only one active session to access KubeSphere for same account.
# Duplicated login action will cause previous session invalid and 
# that user will be logged out by force.
enable_multi_login: true

# devops/openpitrix/notification/alerting components depend on
# mysql/minio/etcd/openldap/redis to store credentials.
# Configure parameters below to set how much storage they use.
mysql_volume_size: 20Gi
minio_volume_size: 20Gi
etcd_volume_size: 20Gi
openldap_volume_size: 2Gi
redis_volume_size: 2Gi


# MONITORING CONFIGURATION
# monitoring is a MUST required component for KubeSphere,
# monitoring deployment configuration
# prometheus replicas numbers,
# 2 means better availability, but more resource consumption
prometheus_replicas: 2

# prometheus pod memory requests
prometheus_memory_request: 400Mi

# prometheus storage size,
# 20Gi means every prometheus replica consumes 20Gi storage
prometheus_volume_size: 20Gi

# whether to install a grafana
grafana_enabled: false

# LOGGING CONFIGURATION
# logging is an optional component when installing KubeSphere, and
# Kubernetes builtin logging APIs will be used if logging_enabled is set to false. 
# Builtin logging only provides limited functions, so recommend to enable logging.
logging_enabled: true
elasticsearch_master_replicas: 1
elasticsearch_data_replicas: 2
logsidecar_replicas: 2
elasticsearch_volume_size: 50Gi
log_max_age: 30
elk_prefix: logstash
kibana_enabled: false
#external_es_url: SHOULD_BE_REPLACED
#external_es_port: SHOULD_BE_REPLACED

# DEVOPS CONFIGURATION
# Devops is an optional component for KubeSphere.
devops_enabled: false
jenkins_memory_lim: 8Gi
jenkins_memory_req: 4Gi
jenkins_volume_size: 8Gi
jenkinsJavaOpts_Xms: 3g
jenkinsJavaOpts_Xmx: 6g
jenkinsJavaOpts_MaxRAM: 8g
sonarqube_enabled: false
#sonar_server_url: SHOULD_BE_REPLACED
#sonar_server_token: SHOULD_BE_REPLACED

# Following components are all optional for KubeSphere,
# Which could be turned on to install it before installation or later by updating its value to true
openpitrix_enabled: true
metrics_server_enabled: false
servicemesh_enabled: true
notification_enabled: true
alerting_enabled: true

# Harbor is an optional component for KubeSphere.
# Which could be turned on to install it before installation or later by updating its value to true
harbor_enabled: false
harbor_domain: harbor.devops.kubesphere.local
# GitLab is an optional component for KubeSphere.
# Which could be turned on to install it before installation or later by updating its value to true
gitlab_enabled: false
gitlab_hosts_domain: devops.kubesphere.local


# Container Engine Acceleration
# Use nvidia gpu acceleration in containers
# KubeSphere currently support Nvidia GPU V100 P100 1060 1080 1080Ti
# The driver version is 387.26,cuda is 9.1
# nvidia_accelerator_enabled: true
# nvidia_gpu_nodes:
#   - kube-gpu-001

hosts.ini

; Parameters:
;  ansible_connection: connection type to the target machine
;  ansible_host: the host name of the target machine
;  ip: ip address of the target machine
;  ansible_user: the default user name for ssh connection
;  ansible_ssh_pass: the password for ssh connection
;  ansible_become_pass: the privilege escalation password to grant access
;  ansible_port: the ssh port number, if not 22

; If installer is ran as non-root user who has sudo privilege, refer to the following sample configuration:
; e.g 
;  master ansible_connection=local  ip=192.168.0.5  ansible_user=ubuntu  ansible_become_pass=Qcloud@123 
;  node1  ansible_host=192.168.0.6  ip=192.168.0.6  ansible_user=ubuntu  ansible_become_pass=Qcloud@123
;  node2  ansible_host=192.168.0.8  ip=192.168.0.8  ansible_user=ubuntu  ansible_become_pass=Qcloud@123

; As recommended as below sample configuration, use root account by default to install


[all]
master1 		ansible_connection=local  		ip=192.168.100.57
master2  		ansible_host=192.168.100.56  	ip=192.168.100.56  	ansible_ssh_pass=noteshare@568
master3  		ansible_host=192.168.100.58  	ip=192.168.100.58  	ansible_ssh_pass=noteshare@568
node1  			ansible_host=192.168.100.61  	ip=192.168.100.61  	ansible_ssh_pass=noteshare@568
node2  			ansible_host=192.168.100.60  	ip=192.168.100.60  	ansible_ssh_pass=noteshare@568
node3  			ansible_host=192.168.100.62  	ip=192.168.100.62  	ansible_ssh_pass=noteshare@568
node4  			ansible_host=192.168.100.59  	ip=192.168.100.59  	ansible_ssh_pass=noteshare@568
node5  			ansible_host=192.168.100.71  	ip=192.168.100.71  	ansible_ssh_pass=noteshare@568
node6  			ansible_host=192.168.100.72  	ip=192.168.100.72  	ansible_ssh_pass=noteshare@568

[kube-master]
master1
master2
master3

[kube-node]
node1
node2
node3
node4
node5
node6

[etcd]
master1
master2
master3

[k8s-cluster:children]
kube-node
kube-master
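Before running install.sh it can save time to confirm that the password in hosts.ini really works from master1 to every host. A minimal sketch using sshpass (on CentOS 7 sshpass comes from the EPEL repository; the password below is the one from the inventory above):

yum install -y sshpass
for ip in 192.168.100.56 192.168.100.58 192.168.100.61 192.168.100.60 192.168.100.62 192.168.100.59 192.168.100.71 192.168.100.72; do
  sshpass -p 'noteshare@568' ssh -o StrictHostKeyChecking=no root@$ip hostname || echo "SSH to $ip FAILED"
done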
  • Run the installation
    Run the install on the master1 node:
    cd /home/tools/kubesphere/kubesphere-all-v2.1.0/scripts
    ./install.sh
    2
    yes
    (answer 2 to pick the multi-node install in the menu, then yes to confirm) ... and wait
    If it fails, just run the installation again; rerunning it another 2-3 times often gets through, since some failures are only caused by timeouts.
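Once install.sh reports success, a quick sanity check from master1 might look like this (30880 is the console_port set in common.yaml; the installer prints the console address and default account at the end of a successful run):

# every node should be Ready
kubectl get nodes -o wide
# nothing should be stuck outside Running/Completed
kubectl get pods --all-namespaces | grep -vE 'Running|Completed'
# the console should answer on any node IP
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.100.57:30880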

Troubleshooting installation issues

  • Issue 1: FAILED - RETRYING: KubeSphere Waiting for ks-console (30 retries left), and it stays stuck there.

    Answer: check CPU and memory; a single-machine deployment currently needs at least 8 cores and 16 GB. If free -m shows a lot of memory tied up in buff/cache, release it with echo 3 > /proc/sys/vm/drop_caches.
  • Issue 2:
    The namespace page pops up the error:
    Internal Server Error
    rpc error: code = Internal desc = describe resources failed: Error 1146: Table 'cluster.cluster' doesn't exist
    Fix:
    check the jobs with
    kubectl get job -n openpitrix-system -o wide
    then recreate every job showing 0/1 (i.e. the ones that failed to complete), for example:
    kubectl -n openpitrix-system get job openpitrix-task-db-ctrl-job -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | kubectl replace --force -f -
    That resolved the issue.

  • Issue 3:
    Error message:

2020-02-27 19:14:12,895 p=28985 u=root |  FAILED - RETRYING: download_file | Download item (1 retries left).
2020-02-27 19:16:35,899 p=28985 u=root |  An exception occurred during task execution. To see the full traceback, use -vvv. The error was: SSLError: ('The read operation timed out',)
2020-02-27 19:16:35,900 p=28985 u=root |  fatal: [node1 -> 10.16.3.17]: FAILED! => {
    "attempts": 4, 
    "changed": false
}

MSG:

failed to create temporary content file: ('The read operation timed out',)

2020-02-27 19:16:42,724 p=28985 u=root |  An exception occurred during task execution. To see the full traceback, use -vvv. The error was: SSLError: ('The read operation timed out',)
2020-02-27 19:16:42,725 p=28985 u=root |  fatal: [node3 -> 10.16.3.19]: FAILED! => {
    "attempts": 4, 
    "changed": false
}

MSG:

failed to create temporary content file: ('The read operation timed out',)

The image downloads are too slow; configure the Aliyun image accelerator.
Add it to common.yaml under
docker_registry_mirrors:

  • Aliyun accelerator address: log in to the Aliyun console, open the container image registry service and copy the address from the "Image Accelerator" page
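After adding the accelerator address and rerunning the installer, one way to check that Docker on the nodes actually picked the mirrors up:

# the configured mirrors are listed near the bottom of docker info
docker info | grep -A 5 'Registry Mirrors'
# the mirrors are typically written to the docker daemon config as well
cat /etc/docker/daemon.json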

  • Issue 4
    Uninstalling kept failing with the error below. It is caused by SSH connection timeouts. I asked the network admins to fix it and they said they couldn't, so fine, no more shortcuts: I set up passwordless SSH on every machine (see the sketch below), and it did work. I really didn't want to do this, it is exhausting.

    Timeout (12s) waiting for privilege escalation prompt
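    The passwordless SSH setup mentioned above, in its minimal form (run on master1; the host list is an example, adjust to your environment):

    # generate a key once on the task machine (skip if ~/.ssh/id_rsa already exists)
    ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
    # push it to every other machine (you are prompted for each password once)
    for ip in 192.168.100.56 192.168.100.58 192.168.100.61 192.168.100.60 192.168.100.62 192.168.100.59 192.168.100.71 192.168.100.72; do
      ssh-copy-id root@$ip
    done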
  • Issue 5
    get-pip.py failed on some nodes; the error message is below

2020-02-28 22:39:39,675 p=80528 u=root |  fatal: [node5]: FAILED! => {
    "changed": true, 
    "cmd": "sudo python /tmp/pip/get-pip.py", 
    "delta": "0:02:20.094835", 
    "end": "2020-02-28 22:39:39.655732", 
    "rc": 2, 
    "start": "2020-02-28 22:37:19.560897"
}

STDOUT:

Collecting pip
  Downloading https://files.pythonhosted.org/packages/54/0c/d01aa759fdc501a58f431eb594a17495f15b88da142ce14b5845662c13f3/pip-20.0.2-py2.py3-none-any.whl (1.4MB)


STDERR:

Fix: after that installation run, log in to the affected machine and run sudo python /tmp/pip/get-pip.py manually, then rerun the installation directly without uninstalling first.

Other auxiliary verification commands used

# list the IPVS virtual servers created by kube-proxy (kube_proxy_mode is ipvs)
ipvsadm
# show the routing table, including the routes added by Calico
ip route