Because the post is limited to 65,535 characters, a full description is not possible and some of the content below has been trimmed.
Environment information
Physical environment
- Five virtual machines on a single physical server:
- Two master nodes, 4C/8G;
- Three worker nodes, 8C/16G;
Operating system information
System installation information
Docker version
Client: Docker Engine - Community
Version: 20.10.12
API version: 1.41
Server: Docker Engine - Community
Engine:
Version: 20.10.12
API version: 1.41 (minimum version 1.12)
containerd:
Version: 1.4.12
runc:
Version: 1.0.2
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:10:45Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:04:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
KubeSphere version and installation method
Version: v3.2.1
Installation method:
Installed with kk (all-in-one) on a clean CentOS system; the kk configuration is as follows:
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: k8s-test-master1, address: 192.168.10.10, internalAddress: 192.168.10.10, user: root, password: "root"}
  - {name: k8s-test-master2, address: 192.168.10.11, internalAddress: 192.168.10.11, user: root, password: "root"}
  - {name: k8s-test-worker1, address: 192.168.10.12, internalAddress: 192.168.10.12, user: root, password: "root"}
  - {name: k8s-test-worker2, address: 192.168.10.13, internalAddress: 192.168.10.13, user: root, password: "root"}
  - {name: k8s-test-worker3, address: 192.168.10.14, internalAddress: 192.168.10.14, user: root, password: "root"}
  roleGroups:
    etcd:
    - k8s-test-master1
    - k8s-test-master2
    control-plane:
    - k8s-test-master1
    - k8s-test-master2
    worker:
    - k8s-test-worker1
    - k8s-test-worker2
    - k8s-test-worker3
  controlPlaneEndpoint:
    internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.5
    clusterName: cluster.local
    autoRenewCerts: true
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    multusCNI:
      enabled: false
  registry:
    plainHTTP: false
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: ["https://registry.docker-cn.com", "https://docker.mirrors.ustc.edu.cn", "http://hub-mirror.c.163.com"]
  addons: []
---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.2.1
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  namespace_override: ""
  etcd:
    monitoring: true
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    redis:
      enabled: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:
    enabled: true
  auditing:
    enabled: true
  devops:
    enabled: true
    jenkinsMemoryLim: 2Gi
    jenkinsMemoryReq: 1500Mi
    jenkinsVolumeSize: 8Gi
    jenkinsJavaOpts_Xms: 512m
    jenkinsJavaOpts_Xmx: 512m
    jenkinsJavaOpts_MaxRAM: 2g
  events:
    enabled: true
  logging:
    enabled: true
    containerruntime: docker
    logsidecar:
      enabled: true
      replicas: 2
  metrics_server:
    enabled: true
  monitoring:
    storageClass: ""
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: calico
    topology:
      type: weave-scope
  openpitrix:
    store:
      enabled: true
  servicemesh:
    enabled: true
  kubeedge:
    enabled: true
    cloudCore:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      cloudhubPort: "10000"
      cloudhubQuicPort: "10001"
      cloudhubHttpsPort: "10002"
      cloudstreamPort: "10003"
      tunnelPort: "10004"
      cloudHub:
        advertiseAddress:
        - "192.168.10.10"
        - "192.168.10.11"
        nodeLimit: "800"
      service:
        cloudhubNodePort: "30000"
        cloudhubQuicNodePort: "30001"
        cloudhubHttpsNodePort: "30002"
        cloudstreamNodePort: "30003"
        tunnelNodePort: "30004"
    edgeWatcher:
      nodeSelector: {"node-role.kubernetes.io/worker": ""}
      tolerations: []
      edgeWatcherAgent:
        nodeSelector: {"node-role.kubernetes.io/worker": ""}
        tolerations: []
KubeEdge version
v1.7.2
Problem description
Background
The edge node is another virtual machine on the same server, used for edge-node testing.
Tolerations were added so that the iptables component can run on the master nodes, which fixed being unable to run log/exec against edge nodes from KubeSphere:
kubectl edit iptables -n kubeedge
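For reference, the toleration that kind of edit adds looks roughly like the following. This is an illustrative sketch of the relevant fragment, not the exact object from this cluster:

```yaml
# Illustrative only: tolerate the master taint so the iptables pods
# may also be scheduled onto master nodes.
spec:
  template:
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
```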
The following script was then run on a master node to patch the workloads landing on the edge node, so that pods with broad tolerations (calico, kube-proxy, and so on) are no longer scheduled onto it:
#!/bin/bash
# Add a nodeAffinity rule to every workload currently running on the edge
# node so its pods are no longer scheduled onto nodes carrying the
# node-role.kubernetes.io/edge label.
# (NodeSelectorPatchJson is kept for reference but not applied below.)
NodeSelectorPatchJson='{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/master": "","node-role.kubernetes.io/worker": ""}}}}}'
NoSchedulePatchJson='{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"node-role.kubernetes.io/edge","operator":"DoesNotExist"}]}]}}}}}}}'
edgenode="edge1"
if [ -n "$1" ]; then
    edgenode="$1"
fi
namespaces=($(kubectl get pods -A -o wide | egrep -i "$edgenode" | awk '{print $1}'))
pods=($(kubectl get pods -A -o wide | egrep -i "$edgenode" | awk '{print $2}'))
length=${#namespaces[@]}
for ((i=0; i<length; i++)); do
    ns=${namespaces[$i]}
    pod=${pods[$i]}
    # "Controlled By" yields the owning workload, e.g. DaemonSet/calico-node
    resources=$(kubectl -n "$ns" describe pod "$pod" | grep "Controlled By" | awk '{print $3}')
    echo "Patching for ns: $ns, resources: $resources"
    kubectl -n "$ns" patch "$resources" --type merge --patch "$NoSchedulePatchJson"
    sleep 1
done
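To make the selection logic in the script concrete, here is a local dry run of the same egrep/awk pipeline over captured sample output. The sample rows are fabricated for illustration; against a real cluster the input would come from `kubectl get pods -A -o wide`:

```shell
# Fabricated sample of `kubectl get pods -A -o wide` output:
sample='kube-system   calico-node-x2kfp   1/1   Running   0   3d   192.168.10.141   edge1   <none>   <none>
kube-system   kube-proxy-8zwtc    1/1   Running   0   3d   192.168.10.141   edge1   <none>   <none>'

edgenode="edge1"
# Same extraction as the script: column 1 = namespace, column 2 = pod name.
namespaces=($(echo "$sample" | egrep -i "$edgenode" | awk '{print $1}'))
pods=($(echo "$sample" | egrep -i "$edgenode" | awk '{print $2}'))

for ((i=0; i<${#namespaces[@]}; i++)); do
    echo "${namespaces[$i]}/${pods[$i]}"
done
# → kube-system/calico-node-x2kfp
# → kube-system/kube-proxy-8zwtc
```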
Log, exec, and metrics now all work successfully against the edge node.
Problem 1: cloudcore errors after installing KubeSphere
At this point no edge node had joined yet, so I am not sure whether these errors matter. The cloudcore logs:
W0527 10:09:10.897412 1 validation.go:168] TLSTunnelPrivateKeyFile does not exist in /etc/kubeedge/certs/server.key, will load from secret
W0527 10:09:10.897547 1 validation.go:171] TLSTunnelCertFile does not exist in /etc/kubeedge/certs/server.crt, will load from secret
W0527 10:09:10.897556 1 validation.go:174] TLSTunnelCAFile does not exist in /etc/kubeedge/ca/rootCA.crt, will load from secret
I0527 10:09:10.897566 1 server.go:73] Version: v1.7.2
W0527 10:09:10.897580 1 client_config.go:608] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0527 10:09:11.910684 1 module.go:34] Module cloudhub registered successfully
I0527 10:09:11.922923 1 module.go:34] Module edgecontroller registered successfully
I0527 10:09:11.923136 1 module.go:34] Module devicecontroller registered successfully
I0527 10:09:11.923769 1 module.go:34] Module synccontroller registered successfully
I0527 10:09:11.924170 1 module.go:34] Module cloudStream registered successfully
W0527 10:09:11.924183 1 module.go:37] Module router is disabled, do not register
W0527 10:09:11.924188 1 module.go:37] Module dynamiccontroller is disabled, do not register
I0527 10:09:11.924299 1 core.go:24] Starting module cloudhub
I0527 10:09:11.924344 1 core.go:24] Starting module edgecontroller
I0527 10:09:11.924389 1 core.go:24] Starting module devicecontroller
I0527 10:09:11.924408 1 upstream.go:121] start upstream controller
I0527 10:09:11.924428 1 core.go:24] Starting module synccontroller
I0527 10:09:11.924451 1 core.go:24] Starting module cloudStream
I0527 10:09:11.924497 1 downstream.go:870] Start downstream devicecontroller
I0527 10:09:11.925257 1 downstream.go:566] start downstream controller
E0527 10:09:11.998958 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.Device: failed to list *v1alpha2.Device: the server could not find the requested resource (get devices.devices.kubeedge.io)
***E0527 10:09:12.096627 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.DeviceModel: failed to list *v1alpha2.DeviceModel: the server could not find the requested resource (get devicemodels.devices.kubeedge.io)***
I0527 10:09:12.124506 1 server.go:243] Ca and CaKey don't exist in local directory, and will read from the secret
I0527 10:09:12.126786 1 server.go:247] Ca and CaKey don't exist in the secret, and will be created by CloudCore
I0527 10:09:12.201854 1 server.go:288] CloudCoreCert and key don't exist in local directory, and will read from the secret
I0527 10:09:12.203132 1 server.go:292] CloudCoreCert and key don't exist in the secret, and will be signed by CA
I0527 10:09:12.207185 1 tunnelserver.go:136] Succeed in loading TunnelCA from CloudHub
I0527 10:09:12.207531 1 tunnelserver.go:149] Succeed in loading TunnelCert and Key from CloudHub
I0527 10:09:12.207700 1 tunnelserver.go:169] Prepare to start tunnel server ...
I0527 10:09:12.209109 1 streamserver.go:280] Prepare to start stream server ...
I0527 10:09:12.210027 1 signcerts.go:100] Succeed to creating token
I0527 10:09:12.210065 1 server.go:44] start unix domain socket server
I0527 10:09:12.210225 1 uds.go:71] listening on: //var/lib/kubeedge/kubeedge.sock
I0527 10:09:12.210407 1 server.go:64] Starting cloudhub websocket server
***E0527 10:09:13.124710 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.DeviceModel: failed to list *v1alpha2.DeviceModel: the server could not find the requested resource (get devicemodels.devices.kubeedge.io)
E0527 10:09:13.450656 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.Device: failed to list *v1alpha2.Device: the server could not find the requested resource (get devices.devices.kubeedge.io)***
I0527 10:09:13.924726 1 upstream.go:63] Start upstream devicecontroller
***E0527 10:09:15.639730 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.DeviceModel: failed to list *v1alpha2.DeviceModel: the server could not find the requested resource (get devicemodels.devices.kubeedge.io)
E0527 10:09:16.432358 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.Device: failed to list *v1alpha2.Device: the server could not find the requested resource (get devices.devices.kubeedge.io)***
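The repeated `devices.devices.kubeedge.io` / `devicemodels.devices.kubeedge.io` errors above say the API server cannot find the KubeEdge Device CRDs, so one thing worth checking is whether those CRDs are installed at all. The cluster command is shown in a comment; the filtering itself is demonstrated on fabricated output:

```shell
# On a master node, check for the KubeEdge device CRDs; if this prints
# nothing, the Device/DeviceModel CRDs are missing, which would match
# the "could not find the requested resource" errors above:
#   kubectl get crd | grep 'devices.kubeedge.io'
# The same filter, demonstrated on fabricated `kubectl get crd` output:
crds='devicemodels.devices.kubeedge.io   2022-05-27T02:00:00Z
devices.devices.kubeedge.io        2022-05-27T02:00:00Z'
echo "$crds" | grep -c 'devices.kubeedge.io'
# → 2
```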
Problem 2: errors when the edge node joins
The join command generated by KubeSphere when adding an edge node was modified to use the internal IP, then executed on the edge node to join the cluster:
arch=$(uname -m); curl -LO https://kubeedge.pek3b.qingstor.com/bin/v1.7.2/$arch/keadm-v1.7.2-linux-$arch.tar.gz && tar xvf keadm-v1.7.2-linux-$arch.tar.gz && chmod +x keadm && ./keadm join --kubeedge-version=1.7.2 --region=zh --cloudcore-ipport=192.168.10.40:30000 --quicport 30001 --certport 30002 --tunnelport 30004 --edgenode-name edge1 --edgenode-ip 192.168.10.141 --token e467ec90405bd002fcbda2594c62683f0e2ed3694ccd6e07439fb1d8be94572e.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2NTM3MDM3NTJ9.nhjRfMZdj17wzLd8zsnHto9aEqGESTEwM2BWT-mQxxk --with-edge-taint
After running the command the edge node joins the cluster successfully, but its logs show several errors, and I am not sure whether they matter:
Error 1:
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998159 6149 core.go:24] Starting module edgemesh
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998240 6149 core.go:24] Starting module metaManager
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998307 6149 core.go:24] Starting module edgestream
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998390 6149 core.go:24] Starting module twin
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998480 6149 core.go:24] Starting module edged
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998537 6149 edged.go:290] Starting edged...
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998642 6149 http.go:40] tlsConfig InsecureSkipVerify true
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998502 6149 process.go:113] Begin to sync sqlite
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998554 6149 core.go:24] Starting module websocket
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.999086 6149 core.go:24] Starting module eventbus
***May 27 02:27:33 k8s-test-edge2 edgecore[6149]: E0527 02:27:33.998678 6149 csi_plugin.go:226] kubernetes.io/csi: CSIDriverLister not found on KubeletVolumeHost***
May 27 02:27:33 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.999471 6149 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000315 6149 client.go:86] parsed scheme: "unix"
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000332 6149 client.go:86] scheme "unix" not registered, fallback to default scheme
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000382 6149 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000389 6149 clientconn.go:948] ClientConn switching balancer to "pick_first"
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000823 6149 common.go:96] start connect to mqtt server with client id: hub-client-sub-1653618454
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.000855 6149 common.go:98] client hub-client-sub-1653618454 isconnected: false
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:33.998396 6149 log.go:181] DEBUG: Installed strategy plugin: [RoundRobin].
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001331 6149 log.go:181] DEBUG: ConfigurationFactory Initiated
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001354 6149 log.go:181] INFO: Configuration files: []
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001403 6149 log.go:181] WARN: empty configurtion from [FileSource]
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001420 6149 log.go:181] INFO: invoke dynamic handler:FileSource
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001463 6149 log.go:181] INFO: archaius init success
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.001847 6149 log.go:181] INFO: create new watcher
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.003303 6149 client.go:150] finish hub-client sub
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.003342 6149 common.go:96] start connect to mqtt server with client id: hub-client-pub-1653618454
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.003357 6149 common.go:98] client hub-client-pub-1653618454 isconnected: false
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.010648 6149 client.go:166] finish hub-client pub
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.010671 6149 eventbus.go:63] Init Sub And Pub Client for externel mqtt broker tcp://127.0.0.1:1883 successfully
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.010707 6149 client.go:91] edge-hub-cli subscribe topic to $hw/events/upload/#
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.010924 6149 client.go:91] edge-hub-cli subscribe topic to $hw/events/device/+/state/update
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.011051 6149 client.go:91] edge-hub-cli subscribe topic to $hw/events/device/+/twin/+
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.011163 6149 client.go:91] edge-hub-cli subscribe topic to $hw/events/node/+/membership/get
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.011272 6149 client.go:91] edge-hub-cli subscribe topic to SYS/dis/upload_records
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.011406 6149 client.go:91] edge-hub-cli subscribe topic to +/user/#
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.011506 6149 client.go:99] list edge-hub-cli-topics status, no record, skip sync
***May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.014687 6149 proxy.go:143] [EdgeMesh] open file /run/edgemesh-iptables err: open /run/edgemesh-iptables: no such file or directory***
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.018287 6149 proxy.go:95] [EdgeMesh] chain EDGE-MESH not exists
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.024132 6149 proxy.go:103] [EdgeMesh] inbound rule -p tcp -d 9.251.0.0/16 -i docker0 -j EDGE-MESH not exists
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.034342 6149 certmanager.go:159] Certificate rotation is enabled.
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.034366 6149 websocket.go:51] Websocket start to connect Access
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.035274 6149 proxy.go:111] [EdgeMesh] outbound rule -p tcp -d 9.251.0.0/16 -o docker0 -j EDGE-MESH not exists
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.058096 6149 proxy.go:119] [EdgeMesh] dnat rule -p tcp -j DNAT --to-destination 172.17.0.1:40001 not exists
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.058842 6149 ws.go:46] dial wss://192.168.10.40:30000/e632aba927ea4ac2b575ec1603d56f10/edge1/events successfully
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.059074 6149 websocket.go:93] Websocket connect to cloud access successful
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.059368 6149 process.go:513] node connection event occur: cloud_connected
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: W0527 02:27:34.059464 6149 eventbus.go:148] Action not found
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.059541 6149 process.go:513] node connection event occur: cloud_connected
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.059601 6149 process.go:282] DeviceTwin receive msg
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.059671 6149 process.go:66] Send msg to the CommModule module in twin
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.111086 6149 cpu_manager.go:184] [cpumanager] starting with none policy
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.111469 6149 cpu_manager.go:185] [cpumanager] reconciling every 0s
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.111640 6149 state_mem.go:36] [cpumanager] initializing new in-memory state store
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.113127 6149 policy_none.go:43] [cpumanager] none policy: Start
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.139178 6149 record.go:19] Normal NodeAllocatableEnforced Updated Node Allocatable limit across pods
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.139511 6149 volume_manager.go:265] Starting Kubelet Volume Manager
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.141891 6149 desired_state_of_world_populator.go:139] Desired state populator starts to run
***May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.153820 6149 imitator.go:222] failed to unmarshal message content to unstructured obj: Object 'Kind' is missing in '{"metadata":{"name":"edge1","creationTimestamp":null,"labels":{"kubernetes.io/arch":"amd64","kubernetes.io/hostname":"edge1","kubernetes.io/os":"linux","node-role.kubernetes.io/agent":"","node-role.kubernetes.io/edge":""}},"spec":{},"status":{"daemonEndpoints":{"kubeletEndpoint":{"Port":0}},"nodeInfo":{"machineID":"","systemUUID":"","bootID":"","kernelVersion":"","osImage":"","containerRuntimeVersion":"","kubeletVersion":"","kubeProxyVersion":"","operatingSystem":"","architecture":""}}}'***
Error 2:
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156121 6149 status_manager.go:53] Starting to sync pod status with apiserver
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156307 6149 edged.go:890] start pod addition queue work 0
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156592 6149 edged.go:890] start pod addition queue work 1
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156692 6149 edged.go:890] start pod addition queue work 2
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156755 6149 edged.go:890] start pod addition queue work 3
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156819 6149 edged.go:890] start pod addition queue work 4
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156895 6149 edged.go:356] starting plugin manager
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.156896 6149 server.go:35] starting to listen read-only on 127.0.0.1:10350
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.157240 6149 plugin_manager.go:114] Starting Kubelet Plugin Manager
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.158731 6149 server.go:425] Adding debug handlers to kubelet server.
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.159897 6149 edged_status.go:390] Attempting to register node edge1
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160798 6149 cpu_manager.go:184] [cpumanager] starting with none policy
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160809 6149 cpu_manager.go:185] [cpumanager] reconciling every 1s
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160823 6149 state_mem.go:36] [cpumanager] initializing new in-memory state store
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160924 6149 state_mem.go:88] [cpumanager] updated default cpuset: ""
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160931 6149 state_mem.go:96] [cpumanager] updated cpuset assignments: "map[]"
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160941 6149 policy_none.go:43] [cpumanager] none policy: Start
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.160957 6149 edged.go:368] starting syncPod
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.170464 6149 edged_status.go:409] Successfully registered node edge1
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.179008 6149 edged_status.go:198] Sync VolumesInUse: []
***May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.211687 6149 imitator.go:222] failed to unmarshal message content to unstructured obj: json: cannot unmarshal array into Go value of type map[string]interface {}***
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.258100 6149 listener.go:316] [EdgeMesh] update services: 50 resource: namespace/servicelist/service
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: I0527 02:27:34.258392 6149 listener.go:327] [EdgeMesh] update svc kubesphere-logging-system.ks-events-ruler in cache
***May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.259566 6149 imitator.go:222] failed to unmarshal message content to unstructured obj: json: cannot unmarshal array into Go value of type map[string]interface {}
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.263035 6149 imitator.go:222] failed to unmarshal message content to unstructured obj: json: cannot unmarshal array into Go value of type map[string]interface {}
May 27 02:27:34 k8s-test-edge2 edgecore[6149]: E0527 02:27:34.359786 6149 imitator.go:222] failed to unmarshal message content to unstructured obj: Object 'Kind' is missing in 'null'***
Problem 3: pods fail to run on the edge node
Description:
The application runs successfully when created inside the main cluster;
When scheduled onto the edge node instead, it fails to run:
cloudcore logs:
I0527 10:27:37.900475 1 session.go:125] Add a new apiserver connection APIServer_MetricsConnection MessageID 3 in to Tunnel session [edge1]
I0527 10:27:37.908200 1 containermetrics_connection.go:117] APIServer_MetricsConnection MessageID 3 find edge peer done, so stop this connection
I0527 10:27:37.908216 1 containermetrics_connection.go:93] APIServer_MetricsConnection MessageID 3 end successful
I0527 10:27:37.908223 1 session.go:133] Delete a apiserver connection APIServer_MetricsConnection MessageID 3 from Tunnel session [edge1]
I0527 10:27:37.908227 1 streamserver.go:189] Delete APIServer_MetricsConnection MessageID 3 from Tunnel session [edge1]
I0527 10:27:40.126371 1 upstream.go:88] Dispatch message: 5f765cde-43e3-4fad-b4a9-4dbe0d4fa21f
W0527 10:27:40.127293 1 upstream.go:92] Parse message: 5f765cde-43e3-4fad-b4a9-4dbe0d4fa21f resource type with error: unknown resource
***E0527 10:27:49.999931 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.DeviceModel: failed to list *v1alpha2.DeviceModel: the server could not find the requested resource (get devicemodels.devices.kubeedge.io)
E0527 10:28:02.091135 1 reflector.go:127] github.com/kubeedge/kubeedge/cloud/pkg/client/informers/externalversions/factory.go:119: Failed to watch *v1alpha2.Device: failed to list *v1alpha2.Device: the server could not find the requested resource (get devices.devices.kubeedge.io)***
I0527 10:28:27.573267 1 session.go:125] Add a new apiserver connection APIServer_LogsConnection MessageID 4 in to Tunnel session [edge1]
I0527 10:28:27.577977 1 containerlog_connection.go:116] APIServer_LogsConnection MessageID 4 find edge peer done, so stop this connection
I0527 10:28:27.577988 1 containerlog_connection.go:92] APIServer_LogsConnection MessageID 4 end successful
I0527 10:28:27.577995 1 session.go:133] Delete a apiserver connection APIServer_LogsConnection MessageID 4 from Tunnel session [edge1]
I0527 10:28:27.577999 1 streamserver.go:139] Delete APIServer_LogsConnection MessageID 4 from Tunnel session [edge1]
***E0527 10:28:29.251566 1 objectsync.go:38] failed to get obj(gvr:/, Resource=,namespace:default,name:pvc-a1e452be-a4a5-451a-8d1f-3569a8289835), default "pvc-a1e452be-a4a5-451a-8d1f-3569a8289835" is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot get resource "default" in API group "" at the cluster scope
E0527 10:28:29.257808 1 objectsync.go:38] failed to get obj(gvr:/, Resource=,namespace:edge-edgex1,name:consul-config), edge-edgex1 "consul-config" is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot get resource "edge-edgex1" in API group "" at the cluster scope
E0527 10:28:29.262112 1 objectsync.go:38] failed to get obj(gvr:/, Resource=,namespace:edge-edgex1,name:consul-data), edge-edgex1 "consul-data" is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot get resource "edge-edgex1" in API group "" at the cluster scope
E0527 10:28:29.274941 1 objectsync.go:38] failed to get obj(gvr:/, Resource=,namespace:edge-edgex1,name:db-data), edge-edgex1 "db-data" is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot get resource "edge-edgex1" in API group "" at the cluster scope
E0527 10:28:29.303132 1 objectsync.go:38] failed to get obj(gvr:/, Resource=,namespace:default,name:pvc-ec504cbf-a4b7-4a04-9b38-932c50bf2607), default "pvc-ec504cbf-a4b7-4a04-9b38-932c50bf2607" is forbidden: User "system:serviceaccount:kubeedge:cloudcore" cannot get resource "default" in API group "" at the cluster scope***
edgecore logs on the edge node:
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.184467 6149 record.go:24] Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.184817 6149 record.go:24] Warning MissingClusterDNS pod: "edgex-core-metadata-78788c8c48-2cmkz_edge-edgex1(eb4a0914-f8c4-4a23-9ea4-b8c077dc9a7d)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.184953 6149 edged.go:1015] consume added pod [edgex-core-metadata-78788c8c48-2cmkz] successfully
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.189078 6149 edged.go:900] worker [2] get pod addition item [edgex-support-notifications-598f7f85d-q6bfw]
***May 27 03:03:09 k8s-test-edge2 edgecore[6149]: E0527 03:03:09.189313 6149 edged.go:903] consume pod addition backoff: Back-off consume pod [edgex-support-notifications-598f7f85d-q6bfw] addition error, backoff: [20s]***
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.189413 6149 edged.go:905] worker [2] backoff pod addition item [edgex-support-notifications-598f7f85d-q6bfw] failed, re-add to queue
May 27 03:03:09 k8s-test-edge2 edgecore[6149]: I0527 03:03:09.344129 6149 edged_volumes.go:54] Using volume plugin "kubernetes.io/empty-dir" to mount wrapped_kube-api-access-5pbcc
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.007517 6149 edged.go:900] worker [3] get pod addition item [edgex-support-scheduler-5f9c499574-4jj64]
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.007606 6149 edged.go:968] start to consume added pod [edgex-support-scheduler-5f9c499574-4jj64]
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.008306 6149 record.go:24] Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.008359 6149 record.go:24] Warning MissingClusterDNS pod: "edgex-support-scheduler-5f9c499574-4jj64_edge-edgex1(da0167dc-a3cb-4760-842e-ee4601a139f7)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.009382 6149 record.go:24] Warning BackOff Back-off restarting failed container
***May 27 03:03:10 k8s-test-edge2 edgecore[6149]: E0527 03:03:10.009558 6149 edged.go:919] worker [3] handle pod addition item [edgex-support-scheduler-5f9c499574-4jj64] failed: failed to "StartContainer" for "edgex-support-scheduler" with CrashLoopBackOff: "back-off 5m0s restarting failed container=edgex-support-scheduler pod=edgex-support-scheduler-5f9c499574-4jj64_edge-edgex1(da0167dc-a3cb-4760-842e-ee4601a139f7)", re-add to queue***
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.146028 6149 edged_volumes.go:54] Using volume plugin "kubernetes.io/empty-dir" to mount wrapped_kube-api-access-zv6l9
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.720984 6149 edged.go:900] worker [1] get pod addition item [edgex-core-data-5bb8bcc584-95zzk]
***May 27 03:03:10 k8s-test-edge2 edgecore[6149]: E0527 03:03:10.721058 6149 edged.go:903] consume pod addition backoff: Back-off consume pod [edgex-core-data-5bb8bcc584-95zzk] addition error, backoff: [1m20s]***
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.721118 6149 edged.go:905] worker [1] backoff pod addition item [edgex-core-data-5bb8bcc584-95zzk] failed, re-add to queue
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.727149 6149 edged.go:900] worker [4] get pod addition item [edgex-core-command-6fb8d849bc-q9qvr]
***May 27 03:03:10 k8s-test-edge2 edgecore[6149]: E0527 03:03:10.727166 6149 edged.go:903] consume pod addition backoff: Back-off consume pod [edgex-core-command-6fb8d849bc-q9qvr] addition error, backoff: [20s]***
May 27 03:03:10 k8s-test-edge2 edgecore[6149]: I0527 03:03:10.727184 6149 edged.go:905] worker [4] backoff pod addition item [edgex-core-command-6fb8d849bc-q9qvr] failed, re-add to queue
May 27 03:03:11 k8s-test-edge2 edgecore[6149]: I0527 03:03:11.327897 6149 edged_status.go:198] Sync VolumesInUse: []
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.945156 6149 edged.go:900] worker [0] get pod addition item [edgex-sys-mgmt-agent-76698f698-g2n7w]
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.945856 6149 edged.go:968] start to consume added pod [edgex-sys-mgmt-agent-76698f698-g2n7w]
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.946594 6149 record.go:24] Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.946878 6149 record.go:24] Warning MissingClusterDNS pod: "edgex-sys-mgmt-agent-76698f698-g2n7w_edge-edgex1(5e5563dd-1d8e-48eb-aab6-708f4c12d1d9)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.949592 6149 record.go:19] Normal Pulled Container image "edgexfoundry/sys-mgmt-agent:2.1.0" already present on machine
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.949665 6149 edged_pods.go:321] container: edge-edgex1/edgex-sys-mgmt-agent-76698f698-g2n7w/edgex-sys-mgmt-agent podIP: "172.17.0.13" creating hosts mount: true
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.949693 6149 edged_pods.go:403] Pod "edgex-sys-mgmt-agent-76698f698-g2n7w_edge-edgex1(5e5563dd-1d8e-48eb-aab6-708f4c12d1d9)" container "edgex-sys-mgmt-agent" mount "system-claim0" has propagation "PROPAGATION_HOST_TO_CONTAINER"
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.949721 6149 edged_pods.go:403] Pod "edgex-sys-mgmt-agent-76698f698-g2n7w_edge-edgex1(5e5563dd-1d8e-48eb-aab6-708f4c12d1d9)" container "edgex-sys-mgmt-agent" mount "kube-api-access-f7s2p" has propagation "PROPAGATION_HOST_TO_CONTAINER"
May 27 03:03:18 k8s-test-edge2 edgecore[6149]: I0527 03:03:18.978162 6149 record.go:19] Normal Created Created container edgex-sys-mgmt-agent
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: W0527 03:03:19.053832 6149 dns.go:125] [EdgeMesh] failed to resolve dns: get from real dns
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: I0527 03:03:19.058828 6149 record.go:19] Normal Started Started container edgex-sys-mgmt-agent
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: I0527 03:03:19.058926 6149 edged.go:1015] consume added pod [edgex-sys-mgmt-agent-76698f698-g2n7w] successfully
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: E0527 03:03:19.067279 6149 dns.go:290] [EdgeMesh] service edgex-core-consul is not found in this cluster
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: W0527 03:03:19.067297 6149 dns.go:125] [EdgeMesh] failed to resolve dns: get from real dns
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: W0527 03:03:19.067311 6149 dns.go:125] [EdgeMesh] failed to resolve dns: get from real dns
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: I0527 03:03:19.069355 6149 edged_volumes.go:54] Using volume plugin "kubernetes.io/empty-dir" to mount wrapped_kube-api-access-f7s2p
***May 27 03:03:19 k8s-test-edge2 edgecore[6149]: E0527 03:03:19.078343 6149 dns.go:290] [EdgeMesh] service edgex-core-consul is not found in this cluster***
May 27 03:03:19 k8s-test-edge2 edgecore[6149]: W0527 03:03:19.078520 6149 dns.go:125] [EdgeMesh] failed to resolve dns: get from real dns
May 27 03:03:20 k8s-test-edge2 edgecore[6149]: I0527 03:03:20.721558 6149 edged.go:900] worker [1] get pod addition item [edgex-sys-mgmt-agent-76698f698-g2n7w]
May 27 03:03:20 k8s-test-edge2 edgecore[6149]: I0527 03:03:20.721595 6149 edged.go:968] start to consume added pod [edgex-sys-mgmt-agent-76698f698-g2n7w]
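The repeated MissingClusterDNS warnings in the edgecore log above come from edged running without a cluster DNS address configured. In KubeEdge, edgecore.yaml exposes DNS settings on the edged module; the excerpt below is a placeholder sketch (the values are assumptions to verify against your v1.7.x config, not a confirmed fix):

```yaml
# /etc/kubeedge/config/edgecore.yaml (excerpt) — placeholder values
modules:
  edged:
    clusterDNS: ""      # set to a DNS address reachable from the edge side
    clusterDomain: ""   # e.g. cluster.local
```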
Problem 4: log, exec, and metrics against the edge node fail after the physical host reboots
- After rebooting the server that hosts both the KubeSphere VMs and the edge-node VM, the edge node is still visible in KubeSphere, but log, exec, and metrics no longer work; even CPU and memory figures are blank, and running log or exec against pods on the edge node fails with: ip:port connect refused;
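Since the failures all surface as connect refused, a first check after the reboot is whether cloudcore's NodePorts are reachable again. A minimal probe using bash's built-in /dev/tcp is sketched below; the address and ports follow the install configuration above, so adjust them to your environment (the demo call probes a local port merely to show the output format):

```shell
# Probe one TCP port; prints "open" or "refused/unreachable".
check_port() {
    if (echo > "/dev/tcp/$1/$2") 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 refused/unreachable"
    fi
}

# From the edge node, probe the cloudhub/cert/cloudstream/tunnel NodePorts:
#   for p in 30000 30002 30003 30004; do check_port 192.168.10.40 "$p"; done
check_port 127.0.0.1 9   # demo against a local port with no listener
```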