• New Version Upgrade
  • A Stress-Free Transition! The Complete Guide to Smoothly Upgrading KubeSphere from v3.4.x to v4.x

This article provides the complete procedure for upgrading KubeSphere from v3.4.x to v4.x and will help you get through the upgrade smoothly. Note in particular that if your KubeSphere version is earlier than v3.4.x, you must first upgrade it to v3.4.x before continuing with this upgrade.

The upgrade consists of the following three main steps, which must be performed in order:

  1. Upgrade the host cluster and migrate the extension data.
  2. Upgrade the member clusters and migrate the extension data.
  3. Upgrade the gateways.

Download the Upgrade Scripts

Download the upgrade scripts on both the host cluster and the member clusters:

curl -LO https://github.com/kubesphere/ks-installer/archive/refs/tags/v4.1.3.tar.gz
tar -xzvf v4.1.3.tar.gz
cd ks-installer-4.1.3/scripts

Modify the Upgrade Configuration File

Adjust the upgrade configuration file on the host cluster and all member clusters. The file is located at scripts/ks-core-values.yaml, and its upgrade section contains the upgrade-related settings. Before upgrading, confirm which components need to be upgraded and their configuration; upgrade.config.jobs.<name>.enabled controls whether a given component is upgraded. A sample configuration is shown below:

upgrade:
  enabled: true
  image:
    registry: ""
    repository: kubesphere/ks-upgrade
    tag: "v4.1.3"
    pullPolicy: IfNotPresent
  persistenceVolume:
    name: ks-upgrade
    storageClassName: ""
    accessMode: ReadWriteOnce
    size: 5Gi
  config:
    storage:
      local:
        path: /tmp/ks-upgrade
    validator:
      ksVersion:
        enabled: true
      extensionsMuseum:
        enabled: true
        namespace: kubesphere-system
        name: extensions-museum
        syncInterval: 0
        watchTimeout: 30m
    jobs:
      network:            
        enabled: true        # whether to enable and upgrade this extension
        priority: 100
        extensionRef:        # version information of the extension; must match the information in extensions-museum
          name: "network"
          version: "1.1.0" 
        dynamicOptions: {
          "rerun": "false"
        }
      gateway:
        enabled: true
        priority: 90
        extensionRef:
          name: "gateway"
          version: "1.0.5"

For more configuration details, see the Extension Upgrade Configuration Guide.

Check Cluster Status

Before upgrading the clusters, run the scripts/pre-check.sh script below to check the cluster status and verify that the upgrade prerequisites are met.

bash pre-check.sh

Upgrade the Host Cluster

Run the following commands to upgrade the host cluster, install the extensions, and migrate the data.

# Specify the image registry address
# export IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
# Specify the extension image registry address
# export EXTENSION_IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
bash upgrade.sh host | tee host-upgrade.log

After running the upgrade command, you can run the following command in a new terminal window to watch the status changes of the Pods in the kubesphere-system namespace in real time:

watch kubectl get pod -n kubesphere-system

Upgrade the Member Clusters

Run the following commands to upgrade the member clusters, install the extension agents, and migrate the data.

# Specify the image registry address
# export IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
# Specify the extension image registry address
# export EXTENSION_IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
bash upgrade.sh member | tee member-upgrade.log

After running the upgrade command, you can run the following command in a new terminal window to watch the status changes of the Pods in the kubesphere-system namespace in real time:

watch kubectl get pod -n kubesphere-system

After a member cluster has been upgraded successfully, run the following command on the host cluster to remove the member cluster's taint so that the extension agents can be scheduled onto the member cluster.

kubectl get clusters.cluster.kubesphere.io <MEMBER_CLUSTER_NAME> -o json | jq 'del(.status.conditions[] | select(.type=="Schedulable"))' | kubectl apply -f -
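To confirm that the restriction has been lifted, you can re-inspect the cluster object (a quick check using the same tools as the command above); the Schedulable entry should no longer appear among the conditions:

kubectl get clusters.cluster.kubesphere.io <MEMBER_CLUSTER_NAME> -o json | jq '.status.conditions[].type'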

Upgrade the Gateways

Upgrading a gateway restarts its Nginx Ingress Controller, which interrupts the services that depend on that gateway. Perform the upgrade during off-peak hours.

Before starting the upgrade, check the gateway instances deployed in the cluster and their status with the following command:

helm -n kubesphere-controls-system list -a

Upgrade the gateways one at a time with the following command:

# Specify the image registry address
# export IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
bash upgrade.sh gateway kubesphere-router-<NAMESPACE> | tee gateway-upgrade.log

Alternatively, upgrade all gateways at once with the following command:

# Specify the image registry address
# export IMAGE_REGISTRY=swr.cn-southwest-2.myhuaweicloud.com/ks
bash upgrade.sh gateway all | tee gateway-upgrade.log

After the gateway upgrade is complete, check the deployment status of the gateways with the following command:

helm -n kubesphere-controls-system list -a
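You can also confirm that the gateway workloads themselves are back up (a simple supplementary check; kubesphere-controls-system is the namespace used by the helm commands above):

kubectl get pods -n kubesphere-controls-system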

Complete the Upgrade

After the upgrade, make sure all services are running normally and check the health of the system. You can verify this with the following command:

for ns in $(kubectl get namespaces -l kubesphere.io/workspace=system-workspace -o jsonpath='{.items[*].metadata.name}'); do
    kubectl get pods -n $ns --no-headers --ignore-not-found | grep -vE 'Running|Completed'
done

Make sure all Pods are in the Running state. If there are any problems, check the logs of the affected containers to troubleshoot.
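For example, a quick way to inspect a problematic Pod (the namespace and Pod name below are placeholders taken from the output of the loop above) is:

kubectl -n <NAMESPACE> describe pod <POD_NAME>
kubectl -n <NAMESPACE> logs <POD_NAME>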

By following the smooth upgrade steps in this article, you should be able to upgrade your platform from v3.4.x to v4.x without trouble. If you run into any difficulties during the upgrade or have further needs, the KubeSphere community always welcomes your participation and feedback. Through continuous optimization and updates, we are committed to providing every user with a more stable and efficient cloud-native platform.

Thank you for choosing KubeSphere. We look forward to you getting an even better experience and better results on the new platform!

Special Reminder: Product Lifecycle Management Policy

When upgrading, we recommend paying attention to the KubeSphere product lifecycle management policy. It lays out the end-of-life plan for each product version, ensuring that the version you run continues to meet current market needs and technical standards. Understanding the support and update policy for each version helps you plan system updates and version migrations in time and avoid the potential risks of running a version that is no longer maintained.

I'm getting an error during the upgrade; these resources are not managed by Helm:

upgrade.go:142: [debug] preparing upgrade for ks-core
upgrade.go:150: [debug] performing update for ks-core
Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: GlobalRole "anonymous" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "ks-core"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kubesphere-system"
helm.go:84: [debug] GlobalRole "anonymous" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "ks-core"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kubesphere-system"
rendered manifests contain a resource that already exists. Unable to continue with update
helm.sh/helm/v3/pkg/action.(*Upgrade).performUpgrade
	helm.sh/helm/v3/pkg/action/upgrade.go:301
helm.sh/helm/v3/pkg/action.(*Upgrade).RunWithContext
	helm.sh/helm/v3/pkg/action/upgrade.go:151
main.newUpgradeCmd.func2
	helm.sh/helm/v3/cmd/helm/upgrade.go:199
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.5.0/command.go:872
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.5.0/command.go:990
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.5.0/command.go:918
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:250
runtime.goexit
	runtime/asm_arm64.s:1172
UPGRADE FAILED
main.newUpgradeCmd.func2
	helm.sh/helm/v3/cmd/helm/upgrade.go:201
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.5.0/command.go:872
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.5.0/command.go:990
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.5.0/command.go:918
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:250
runtime.goexit
	runtime/asm_arm64.s:1172

    Is there a script that can handle all of these resources that are not managed by Helm in one go?

    My cluster is on TKE; that is where I'm doing the upgrade.

    cat host-upgrade.log | grep "apply CRDs" -A 20
    
    apply CRDs
    customresourcedefinition.apiextensions.k8s.io/applications.app.k8s.io configured
    customresourcedefinition.apiextensions.k8s.io/applicationreleases.application.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/applications.application.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/applicationversions.application.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/categories.application.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/repos.application.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/clusters.cluster.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/labels.cluster.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/apiservices.extensions.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/extensionentries.extensions.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/jsbundles.extensions.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/reverseproxies.extensions.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/ingressclassscopes.gateway.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/builtinroles.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/categories.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/clusterrolebindings.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/clusterroles.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/globalrolebindings.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/globalroles.iam.kubesphere.io unchanged
    customresourcedefinition.apiextensions.k8s.io/groupbindings.iam.kubesphere.io unchanged

      xingxing122

      Check why this command was not executed: https://github.com/kubesphere/ks-installer/blob/release-4.1/scripts/upgrade.sh#L180-L185. Is the script content the same?

      helm template -s templates/prepare-upgrade-job.yaml -n kubesphere-system --release-name \
              --set upgrade.prepare=true,upgrade.image.registry=$IMAGE_REGISTRY,upgrade.image.tag=$KS_UPGRADE_TAG \
              $EXTENSION_REGISTRY_ARG \
              --set global.imageRegistry=$IMAGE_REGISTRY,global.tag=$TAG \
              -f ks-core-values.yaml \
              $chart --dry-run=server | kubectl -n kubesphere-system apply --wait -f - && kubectl -n kubesphere-system wait --for=condition=complete --timeout=600s job/prepare-upgrade

      During a normal upgrade there should be log output like the following:

      apply CRDs
      configmap/ks-upgrade-prepare-config created
      job.batch/prepare-upgrade created
      persistentvolumeclaim/ks-upgrade created
      job.batch/prepare-upgrade condition met
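      One way to check whether this step ran in your environment (just a suggestion, assuming the log file name used earlier in this thread) is to search the upgrade log for the prepare-upgrade job:

      grep -n "prepare-upgrade" host-upgrade.log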

      Then there is definitely a problem with my upgrade; the script did not execute this part. That's a bit strange. The upgrade is now halfway done, how do I run the upgrade again?

      It's stuck here. I'd like to restart the upgrade from scratch, but that doesn't seem to work either; it's stuck.

        hongming And after interrupting it? I see the pod is in Pending state. How do I keep troubleshooting the error?

          hongming The script is identical. I compared it against the script downloaded as described in the documentation.

          xingxing122 upgrade.sh can be run repeatedly. You haven't modified the script, have you? Don't remove set -e; on a normal run it will definitely reach the command I listed above.

            hongming I previously used patch to add the Helm labels; I'll try removing them. This is what I ran:

            for r in globalroles.iam.kubesphere.io globalrolebindings.iam.kubesphere.io workspacetemplates.tenant.kubesphere.io clusterroles.iam.kubesphere.io clusterrolebindings.iam.kubesphere.io; do \
            kubectl get $r -o name | xargs -I{} kubectl label {} app.kubernetes.io/managed-by=Helm --overwrite; \
            kubectl get $r -o name | xargs -I{} kubectl annotate {} meta.helm.sh/release-name=ks-core meta.helm.sh/release-namespace=kubesphere-system --overwrite; \
            done

            The script has not been modified. I'll remove the labels I added and run the script again.

            The command was executed this time, but it looks like it failed with an error:

            apply CRDs
            Error: invalid argument "server" for "--dry-run" flag: strconv.ParseBool: parsing "server": invalid syntax
            error: no objects passed to apply
            customresourcedefinition.apiextensions.k8s.io/applications.app.k8s.io configured
            customresourcedefinition.apiextensions.k8s.io/applicationreleases.application.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/applications.application.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/applicationversions.application.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/categories.application.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/repos.application.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/clusters.cluster.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/labels.cluster.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/apiservices.extensions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/extensionentries.extensions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/jsbundles.extensions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/reverseproxies.extensions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/ingressclassscopes.gateway.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/builtinroles.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/categories.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/clusterrolebindings.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/clusterroles.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/globalrolebindings.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/globalroles.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/groupbindings.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/groups.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/loginrecords.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/rolebindings.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/roles.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/roletemplates.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/users.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/workspacerolebindings.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/workspaceroles.iam.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/categories.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/extensions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/extensionversions.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/installplans.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/repositories.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/serviceaccounts.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/resourcequotas.quota.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/provisionercapabilities.storage.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/storageclasscapabilities.storage.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/workspaces.tenant.kubesphere.io unchanged
            customresourcedefinition.apiextensions.k8s.io/workspacetemplates.tenant.kubesphere.io unchanged
            review your upgrade values.yaml and make sure the extension configs matches the extension you published, you have 10 seconds before upgrade starts.
            upgrade.go:142: [debug] preparing upgrade for ks-core
            upgrade.go:150: [debug] performing update for ks-core
            Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: GlobalRole "anonymous" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "ks-core"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kubesphere-system"
            helm.go:84: [debug] GlobalRole "anonymous" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "ks-core"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kubesphere-system"
            rendered manifests contain a resource that already exists. Unable to continue with update
            helm.sh/helm/v3/pkg/action.(*Upgrade).performUpgrade
            	helm.sh/helm/v3/pkg/action/upgrade.go:301
            helm.sh/helm/v3/pkg/action.(*Upgrade).RunWithContext
            	helm.sh/helm/v3/pkg/action/upgrade.go:151
            main.newUpgradeCmd.func2
            	helm.sh/helm/v3/cmd/helm/upgrade.go:199
            github.com/spf13/cobra.(*Command).execute
            	github.com/spf13/cobra@v1.5.0/command.go:872
            github.com/spf13/cobra.(*Command).ExecuteC
            	github.com/spf13/cobra@v1.5.0/command.go:990
            github.com/spf13/cobra.(*Command).Execute
            	github.com/spf13/cobra@v1.5.0/command.go:918
            main.main
            	helm.sh/helm/v3/cmd/helm/helm.go:83
            runtime.main
            	runtime/proc.go:250
            runtime.goexit
            	runtime/asm_arm64.s:1172
            UPGRADE FAILED
            main.newUpgradeCmd.func2
            	helm.sh/helm/v3/cmd/helm/upgrade.go:201
            github.com/spf13/cobra.(*Command).execute
            	github.com/spf13/cobra@v1.5.0/command.go:872
            github.com/spf13/cobra.(*Command).ExecuteC
            	github.com/spf13/cobra@v1.5.0/command.go:990
            github.com/spf13/cobra.(*Command).Execute
            	github.com/spf13/cobra@v1.5.0/command.go:918
            main.main
            	helm.sh/helm/v3/cmd/helm/helm.go:83
            runtime.main
            	runtime/proc.go:250
            runtime.goexit
            	runtime/asm_arm64.s:1172

            Doesn't the script need some improvement? It also seems to be an issue with the Helm version, right?

              xingxing122

              Yes, Helm 3.13+ is required. We will adjust the script.
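              (For reference, you can check the Helm version on the machine running upgrade.sh before retrying; the --dry-run=server flag the script relies on is only accepted by newer Helm releases:)

              helm version --short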

              Please let me know when it's updated. Thanks, the community really delivers.