我用pipeline测试一个机器学习代码执行,部署到k8s使用的Job,编辑好pipeline后,第一次执行没问题。然后,我测试重复运行,最后一步发布应用到k8s就报错了。我试了一下官网的java的例子,是可以重运行的,没问题。
下面是我自己写的最简单的Job对应的yaml文件:
kind: Job
apiVersion: batch/v1
metadata:
name: machine-learning-demo
namespace: demo-project
spec:
activeDeadlineSeconds: 300
backoffLimit: 5
template:
spec:
containers:
- name: machine-learning
image: $REGISTRY/$DOCKERHUB_NAMESPACE/$APP_NAME:SNAPSHOT-$BUILD_NUMBER
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 500m
memory: 500Mi
imagePullPolicy: IfNotPresent
restartPolicy: OnFailure
下面是pipeline最后一步,发布到k8s的报错信息。
Starting Kubernetes deployment
Loading configuration: /home/jenkins/agent/workspace/project-oOrEv42rm51B/machine-learning-demo/deploy/job/deploy.yaml
ERROR: ERROR: java.lang.RuntimeException: io.kubernetes.client.openapi.ApiException: Unprocessable Entity
hudson.remoting.ProxyException: java.lang.RuntimeException: io.kubernetes.client.openapi.ApiException: Unprocessable Entity
at com.microsoft.jenkins.kubernetes.wrapper.ResourceManager.handleApiException(ResourceManager.java:193)
at com.microsoft.jenkins.kubernetes.wrapper.V1ResourceManager$JobUpdater.applyResource(V1ResourceManager.java:508)
at com.microsoft.jenkins.kubernetes.wrapper.V1ResourceManager$JobUpdater.applyResource(V1ResourceManager.java:484)
at com.microsoft.jenkins.kubernetes.wrapper.ResourceManager$ResourceUpdater.createOrApply(ResourceManager.java:97)
at com.microsoft.jenkins.kubernetes.wrapper.KubernetesClientWrapper.handleResource(KubernetesClientWrapper.java:289)
at com.microsoft.jenkins.kubernetes.wrapper.KubernetesClientWrapper.apply(KubernetesClientWrapper.java:256)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.doCall(DeploymentCommand.java:172)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.call(DeploymentCommand.java:124)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand$DeploymentTask.call(DeploymentCommand.java:106)
at hudson.remoting.UserRequest.perform(UserRequest.java:212)
at hudson.remoting.UserRequest.perform(UserRequest.java:54)
at hudson.remoting.Request$2.run(Request.java:369)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
at java.lang.Thread.run(Thread.java:748)
Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 172.20.200.170/172.20.200.170:58078
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
at hudson.remoting.Channel.call(Channel.java:957)
at hudson.FilePath.act(FilePath.java:1160)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand.execute(DeploymentCommand.java:68)
at com.microsoft.jenkins.kubernetes.command.DeploymentCommand.execute(DeploymentCommand.java:45)
at com.microsoft.jenkins.azurecommons.command.CommandService.runCommand(CommandService.java:88)
at com.microsoft.jenkins.azurecommons.command.CommandService.execute(CommandService.java:96)
at com.microsoft.jenkins.azurecommons.command.CommandService.executeCommands(CommandService.java:75)
at com.microsoft.jenkins.azurecommons.command.BaseCommandContext.executeCommands(BaseCommandContext.java:77)
at com.microsoft.jenkins.kubernetes.KubernetesDeploy.perform(KubernetesDeploy.java:42)
at com.microsoft.jenkins.azurecommons.command.SimpleBuildStepExecution.run(SimpleBuildStepExecution.java:54)
at com.microsoft.jenkins.azurecommons.command.SimpleBuildStepExecution.run(SimpleBuildStepExecution.java:35)
at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: hudson.remoting.ProxyException: io.kubernetes.client.openapi.ApiException: Unprocessable Entity
at io.kubernetes.client.openapi.ApiClient.handleResponse(ApiClient.java:979)
at io.kubernetes.client.openapi.ApiClient.execute(ApiClient.java:895)
at io.kubernetes.client.openapi.apis.BatchV1Api.replaceNamespacedJobWithHttpInfo(BatchV1Api.java:1828)
at io.kubernetes.client.openapi.apis.BatchV1Api.replaceNamespacedJob(BatchV1Api.java:1802)
at com.microsoft.jenkins.kubernetes.wrapper.V1ResourceManager$JobUpdater.applyResource(V1ResourceManager.java:505)
... 16 more
Api call failed with code 422, detailed message: {
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "Job.batch \"machine-learning-demo\" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:\"\", GenerateName:\"\", Namespace:\"\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\"controller-uid\":\"9509b29b-760a-4466-9aaf-dee49c75961d\", \"job-name\":\"machine-learning-demo\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume(nil), InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:\"machine-learning\", Image:\"docker.io/ygtao/machine-learning-pipeline-demo:SNAPSHOT-14\", Command:[]string(nil), Args:[]string(nil), WorkingDir:\"\", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList{\"cpu\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500m\", Format:\"DecimalSI\"}, \"memory\":resource.Quantity{i:resource.int64Amount{value:524288000, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500Mi\", Format:\"BinarySI\"}}, Requests:core.ResourceList{\"cpu\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500m\", Format:\"DecimalSI\"}, \"memory\":resource.Quantity{i:resource.int64Amount{value:524288000, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500Mi\", Format:\"BinarySI\"}}}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:\"/dev/termination-log\", TerminationMessagePolicy:\"File\", ImagePullPolicy:\"IfNotPresent\", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, RestartPolicy:\"OnFailure\", TerminationGracePeriodSeconds:(*int64)(0xc0097ff2e8), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:\"ClusterFirst\", NodeSelector:map[string]string(nil), ServiceAccountName:\"\", AutomountServiceAccountToken:(*bool)(nil), NodeName:\"\", SecurityContext:(*core.PodSecurityContext)(0xc017bec0e0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:\"\", Subdomain:\"\", Affinity:(*core.Affinity)(nil), SchedulerName:\"default-scheduler\", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:\"\", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), EnableServiceLinks:(*bool)(nil)}}: field is immutable",
"reason": "Invalid",
"details": {
"name": "machine-learning-demo",
"group": "batch",
"kind": "Job",
"causes": [
{
"reason": "FieldValueInvalid",
"message": "Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:\"\", GenerateName:\"\", Namespace:\"\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\"controller-uid\":\"9509b29b-760a-4466-9aaf-dee49c75961d\", \"job-name\":\"machine-learning-demo\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume(nil), InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:\"machine-learning\", Image:\"docker.io/ygtao/machine-learning-pipeline-demo:SNAPSHOT-14\", Command:[]string(nil), Args:[]string(nil), WorkingDir:\"\", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList{\"cpu\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500m\", Format:\"DecimalSI\"}, \"memory\":resource.Quantity{i:resource.int64Amount{value:524288000, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500Mi\", Format:\"BinarySI\"}}, Requests:core.ResourceList{\"cpu\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500m\", Format:\"DecimalSI\"}, \"memory\":resource.Quantity{i:resource.int64Amount{value:524288000, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"500Mi\", Format:\"BinarySI\"}}}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:\"/dev/termination-log\", TerminationMessagePolicy:\"File\", ImagePullPolicy:\"IfNotPresent\", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, RestartPolicy:\"OnFailure\", TerminationGracePeriodSeconds:(*int64)(0xc0097ff2e8), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:\"ClusterFirst\", NodeSelector:map[string]string(nil), ServiceAccountName:\"\", AutomountServiceAccountToken:(*bool)(nil), NodeName:\"\", SecurityContext:(*core.PodSecurityContext)(0xc017bec0e0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:\"\", Subdomain:\"\", Affinity:(*core.Affinity)(nil), SchedulerName:\"default-scheduler\", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:\"\", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), EnableServiceLinks:(*bool)(nil)}}: field is immutable",
"field": "spec.template"
}
]
},
"code": 422
}
Kubernetes deployment ended with HasError
看报错信息说的是我PodTemplate中有变量值非法,但我实在没看出来有啥问题。路过的大佬帮忙瞅瞅,感激不尽