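Both argocd and Jenkins deployed successfully and their pages load fine, but because of the agent error above, DevOps is unusable.
On 4.1.2, installing the DevOps cluster agent fails with: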
2024-11-01T14:27:16.634541258+08:00 Error: Get "https://10.211.0.1:443/api/v1/namespaces/argocd/services/devops-agent-argocd-repo-server": dial tcp 10.211.0.1:443: connect: connection refused - error from a previous attempt: http2: server sent GOAWAY and closed the connection; LastStreamID=1961, ErrCode=NO_ERROR, debug=""
2024-11-01T14:27:16.634657666+08:00 helm.go:84: [debug] Get "https://10.211.0.1:443/api/v1/namespaces/argocd/services/devops-agent-argocd-repo-server": dial tcp 10.211.0.1:443: connect: connection refused - error from a previous attempt: http2: server sent GOAWAY and closed the connection; LastStreamID=1961, ErrCode=NO_ERROR, debug=""
Re-running the task fails with:
2024-11-01T14:46:41.573531713+08:00 WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: kube.config
2024-11-01T14:46:41.573601428+08:00 WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: kube.config
2024-11-01T14:46:41.598947422+08:00 history.go:56: [debug] getting history for release devops-agent
2024-11-01T14:46:42.350201444+08:00 upgrade.go:155: [debug] preparing upgrade for devops-agent
2024-11-01T14:46:42.681432355+08:00 Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
2024-11-01T14:46:42.681513867+08:00 helm.go:84: [debug] another operation (install/upgrade/rollback) is in progress
2024-11-01T14:46:42.681539322+08:00 helm.sh/helm/v3/pkg/action.init
2024-11-01T14:46:42.681559948+08:00 helm.sh/helm/v3/pkg/action/action.go:52
2024-11-01T14:46:42.681580350+08:00 runtime.doInit1
2024-11-01T14:46:42.681627968+08:00 runtime/proc.go:6735
2024-11-01T14:46:42.681647336+08:00 runtime.doInit
2024-11-01T14:46:42.681696425+08:00 runtime/proc.go:6702
2024-11-01T14:46:42.681729478+08:00 runtime.main
2024-11-01T14:46:42.681745982+08:00 runtime/proc.go:249
2024-11-01T14:46:42.681760621+08:00 runtime.goexit
2024-11-01T14:46:42.681774903+08:00 runtime/asm_amd64.s:1650
2024-11-01T14:46:42.681788437+08:00 UPGRADE FAILED
2024-11-01T14:46:42.681803309+08:00 main.newUpgradeCmd.func2
2024-11-01T14:46:42.681817144+08:00 helm.sh/helm/v3/cmd/helm/upgrade.go:229
2024-11-01T14:46:42.681830576+08:00 github.com/spf13/cobra.(*Command).execute
2024-11-01T14:46:42.681845488+08:00 github.com/spf13/cobra@v1.8.0/command.go:983
2024-11-01T14:46:42.681862769+08:00 github.com/spf13/cobra.(*Command).ExecuteC
2024-11-01T14:46:42.681880411+08:00 github.com/spf13/cobra@v1.8.0/command.go:1115
2024-11-01T14:46:42.681895093+08:00 github.com/spf13/cobra.(*Command).Execute
2024-11-01T14:46:42.681908660+08:00 github.com/spf13/cobra@v1.8.0/command.go:1039
2024-11-01T14:46:42.681922206+08:00 main.main
2024-11-01T14:46:42.681935909+08:00 helm.sh/helm/v3/cmd/helm/helm.go:83
2024-11-01T14:46:42.681949478+08:00 runtime.main
2024-11-01T14:46:42.681966315+08:00 runtime/proc.go:267
2024-11-01T14:46:42.681984208+08:00 runtime.goexit
2024-11-01T14:46:42.682003073+08:00 runtime/asm_amd64.s:1650
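Same here.
I ran into the same problem. Does anyone know how to fix it?
Try restarting containerd, restarting kubelet, and deleting kube-proxy so that it gets recreated. As far as I remember, that is how I eventually fixed it.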
Picked up JAVA_TOOL_OPTIONS: -XX:InitialRAMPercentage=70 -XX:MaxRAMPercentage=70 -Dhudson.slaves.NodeProvisioner.initialDelay=20 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85 -Dhudson.model.LoadStatistics.clock=5000 -Dhudson.model.LoadStatistics.decay=0.2 -Dhudson.slaves.NodeProvisioner.recurrencePeriod=5000 -Dhudson.security.csrf.DefaultCrumbIssuer.EXCLUDE_SESSION_ID=true -Dhudson.plugins.git.GitStatus.NOTIFY_COMMIT_ACCESS_CONTROL=disabled -Dio.jenkins.plugins.casc.ConfigurationAsCode.initialDelay=10000 -Djenkins.install.runSetupWizard=false -XX:+AlwaysPreTouch -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+ParallelRefProcEnabled -XX:+DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions
Running from: /usr/share/jenkins/jenkins.war
webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
2024-11-30 09:40:04.451+0000 [id=1] INFO org.eclipse.jetty.util.log.Log#initialized: Logging initialized @1046ms to org.eclipse.jetty.util.log.JavaUtilLog
2024-11-30 09:40:04.562+0000 [id=1] INFO winstone.Logger#logInternal: Beginning extraction from war file
2024-11-30 09:49:03.230+0000 [id=1] WARNING o.e.j.s.handler.ContextHandler#setContextPath: Empty contextPath
2024-11-30 09:49:03.305+0000 [id=1] INFO org.eclipse.jetty.server.Server#doStart: jetty-9.4.45.v20220203; built: 2022-02-03T09:14:34.105Z; git: 4a0c91c0be53805e3fcffdcdcc9587d5301863db; jvm 11.0.16+8
2024-11-30 09:49:03.703+0000 [id=1] INFO o.e.j.w.StandardDescriptorProcessor#visitServlet: NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
2024-11-30 09:49:03.747+0000 [id=1] INFO o.e.j.s.s.DefaultSessionIdManager#doStart: DefaultSessionIdManager workerName=node0
2024-11-30 09:49:03.748+0000 [id=1] INFO o.e.j.s.s.DefaultSessionIdManager#doStart: No SessionScavenger set, using defaults
2024-11-30 09:49:03.749+0000 [id=1] INFO o.e.j.server.session.HouseKeeper#startScavenging: node0 Scavenging every 660000ms
2024-11-30 09:49:04.474+0000 [id=1] INFO hudson.WebAppMain#contextInitialized: Jenkins home directory: /var/jenkins_home found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
2024-11-30 09:49:06.756+0000 [id=1] INFO o.e.j.s.handler.ContextHandler#doStart: Started w.@4acb2510{Jenkins v2.346.3,/,file:///var/jenkins_home/war/,AVAILABLE}{/var/jenkins_home/war}
2024-11-30 09:49:06.798+0000 [id=1] INFO o.e.j.server.AbstractConnector#doStart: Started ServerConnector@260e86a1{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2024-11-30 09:49:06.799+0000 [id=1] INFO org.eclipse.jetty.server.Server#doStart: Started @543396ms
2024-11-30 09:49:06.800+0000 [id=23] INFO winstone.Logger#logInternal: Winstone Servlet Engine running: controlPort=disabled
2024-11-30 09:49:08.801+0000 [id=30] INFO jenkins.InitReactorRunner$1#onAttained: Started initialization
2024-11-30 09:58:36.286+0000 [id=28] WARNING hudson.ClassicPluginStrategy#createClassJarFromWebInfClasses: Created /var/jenkins_home/plugins/job-dsl/WEB-INF/lib/classes.jar; update plugin to a version created with a newer harness
2024-11-30 10:19:20.847+0000 [id=29] WARNING hudson.ClassicPluginStrategy#createClassJarFromWebInfClasses: Created /var/jenkins_home/plugins/node-iterator-api/WEB-INF/lib/classes.jar; update plugin to a version created with a newer harness
2024-11-30 10:19:26.289+0000 [id=29] INFO jenkins.InitReactorRunner$1#onAttained: Listed all plugins
2024-11-30 10:19:36.544+0000 [id=29] INFO jenkins.InitReactorRunner$1#onAttained: Prepared all plugins
2024-11-30 10:19:36.565+0000 [id=29] INFO jenkins.InitReactorRunner$1#onAttained: Started all plugins
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.codehaus.groovy.vmplugin.v7.Java7$1 (file:/var/jenkins_home/war/WEB-INF/lib/groovy-all-2.4.21.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.vmplugin.v7.Java7$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
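This is part of the Jenkins startup log. Between 2024-11-30 09:49:08.801 and 2024-11-30 09:58:36.286 alone, initialization took about nine and a half minutes, and "Started all plugins" is not logged until 10:19:36, roughly half an hour after initialization began. That is far too long: if the probe timeout is too short, the pod will never finish starting, so I simply removed the probe. Maybe my second-hand server is just underpowered, but I doubt other people's servers are much better, since most servers these days are virtual machines. Mine has an Intel Xeon E5-2698 v4, 20 cores / 40 threads.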
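To make the timing argument concrete, here is a minimal Python sketch that measures the gaps between the log timestamps quoted above and compares them with a probe's time budget. The probe values (`initial_delay=90`, `period=10`, `failure_threshold=12`) are hypothetical and are not taken from the actual devops-jenkins chart, and the budget formula is only a rough approximation of how long Kubernetes waits before restarting the container.

```python
from datetime import datetime

# Timestamp layout used by the Jenkins log lines above (without the +0000 suffix).
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two Jenkins log timestamps."""
    delta = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    return delta.total_seconds()


def probe_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Rough time a container gets before a failing probe triggers a restart.

    Approximation: initial delay plus one period per allowed failure.
    """
    return initial_delay + period * failure_threshold


if __name__ == "__main__":
    # Timestamps copied from the log: "Started initialization",
    # the first plugin warning, and "Started all plugins".
    init_start = "2024-11-30 09:49:08.801"
    first_plugin_warning = "2024-11-30 09:58:36.286"
    plugins_started = "2024-11-30 10:19:36.565"

    gap = elapsed_seconds(init_start, first_plugin_warning)
    total = elapsed_seconds(init_start, plugins_started)

    # Hypothetical probe settings, for illustration only.
    budget = probe_budget_seconds(initial_delay=90, period=10, failure_threshold=12)

    print(f"gap before first plugin warning: {gap / 60:.1f} min")
    print(f"time until all plugins started:  {total / 60:.1f} min")
    print(f"probe budget: {budget} s, startup fits: {total <= budget}")
```

With any budget of a few minutes, a startup of this length cannot pass the probe, which is consistent with the pod never becoming ready until the probe was removed or relaxed.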
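I ran into the same problem too. Has anyone found a solution?
On top of that, the devops component in my Extensions Center has been stuck in the "Installing" state, so I cannot do anything else with it.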
I gave this devops thing 12G of memory and it still gets OOMKilled. What a pain.
tudan110 Really? Is it the devops-jenkins pod? Normally 6G is enough. Is there anything abnormal in the Jenkins log?
Case solved, this is most likely the cause.
The JVM options set in the devops-jenkins pod are:
-XX:InitialRAMPercentage=70 -XX:MaxRAMPercentage=70 -Dhudson.slaves.NodeProvisioner.initialDelay=20 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85 -Dhudson.model.LoadStatistics.clock=5000 -Dhudson.model.LoadStatistics.decay=0.2 -Dhudson.slaves.NodeProvisioner.recurrencePeriod=5000 -Dhudson.security.csrf.DefaultCrumbIssuer.EXCLUDE_SESSION_ID=true -Dhudson.plugins.git.GitStatus.NOTIFY_COMMIT_ACCESS_CONTROL=disabled -Dio.jenkins.plugins.casc.ConfigurationAsCode.initialDelay=10000 -Djenkins.install.runSetupWizard=false -XX:+AlwaysPreTouch -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+ParallelRefProcEnabled -XX:+DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions
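In other words, at startup the JVM takes 70% of the system's memory ceiling. The base for that percentage is most likely the total memory of the node the pod is scheduled on. For example, on a node with 32G of memory the JVM would take 32 × 0.7 = 22.4GB at startup, while the pod's limit is the default 6GB, so Kubernetes OOM-kills it right away. That is why setting the limit to 12G does not help either.
As for the fix: change these settings.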
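The arithmetic in this post can be sketched in a few lines of Python. This is only an illustration of the poster's reasoning, assuming (as the post does) that the percentage is applied to the node's total memory. Whether the JVM actually sees the node memory or the container limit depends on its container support, so the base is left as a parameter. The function names are made up for this example.

```python
def heap_gib(base_memory_gib: float, ram_percentage: float) -> float:
    """Heap size implied by -XX:InitialRAMPercentage / -XX:MaxRAMPercentage.

    base_memory_gib is whatever the JVM treats as "total RAM"
    (assumed here to be the node memory, following the post).
    """
    return base_memory_gib * ram_percentage / 100.0


def exceeds_limit(base_memory_gib: float, ram_percentage: float, pod_limit_gib: float) -> bool:
    """True if the heap alone would already be larger than the pod memory limit."""
    return heap_gib(base_memory_gib, ram_percentage) > pod_limit_gib


if __name__ == "__main__":
    node_memory_gib = 32.0   # example node size from the post
    ram_percentage = 70.0    # from -XX:InitialRAMPercentage=70 / -XX:MaxRAMPercentage=70

    heap = heap_gib(node_memory_gib, ram_percentage)
    print(f"heap implied by {ram_percentage:.0f}% of {node_memory_gib:.0f}G: {heap:.1f}G")

    # Limits mentioned in the thread: the default 6G and the 12G that was tried.
    for limit in (6.0, 12.0):
        verdict = "would be OOMKilled" if exceeds_limit(node_memory_gib, ram_percentage, limit) else "fits"
        print(f"pod limit {limit:.0f}G: {verdict}")
```

Because the options also include `-XX:+AlwaysPreTouch`, the heap pages are touched during JVM startup, so the initial heap is resident right away instead of growing gradually. Under the assumption above, either the percentages or the pod limit has to change so that the implied heap, plus non-heap overhead, stays below the limit.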