PyTorchJob

Reference documentation for PyTorchJob

Packages:

kubeflow.org

Package v1beta2 is the v1beta2 version of the API.

Resource Types:

PyTorchJob

PyTorchJob represents the configuration of a PyTorchJob.

Field Description
apiVersion
string
kubeflow.org/v1beta2
kind
string
PyTorchJob
metadata
Kubernetes meta/v1.ObjectMeta

Standard object’s metadata.

Refer to the Kubernetes API documentation for the fields of metadata.
spec
PyTorchJobSpec

Specification of the desired behavior of the PyTorchJob.



activeDeadlineSeconds
int64
(Optional)

Specifies the duration in seconds, relative to startTime, that the job may be active before the system tries to terminate it. The value must be a positive integer. This setting applies only to pods whose restartPolicy is OnFailure or Always.

backoffLimit
int32
(Optional)

Optional number of retries before marking this job failed.

cleanPodPolicy
common/v1beta2.CleanPodPolicy

CleanPodPolicy defines the policy for killing pods after the PyTorchJob has succeeded. Defaults to Running.

ttlSecondsAfterFinished
int32

TTLSecondsAfterFinished is the TTL, in seconds, for cleaning up finished PyTorchJobs (a temporary mechanism until Kubernetes adds a cleanup controller). Cleanup may take an extra ReconcilePeriod seconds, because reconcile is called periodically. Defaults to infinite.

pytorchReplicaSpecs
map[github.com/kubeflow/pytorch-operator/pkg/apis/pytorch/v1beta2.PyTorchReplicaType]*github.com/kubeflow/tf-operator/pkg/apis/common/v1beta2.ReplicaSpec

PyTorchReplicaSpecs is a map from PyTorchReplicaType to PyTorchReplicaSpec that specifies the PyTorch replicas to run. For example: { "Master": PyTorchReplicaSpec, "Worker": PyTorchReplicaSpec }

status
common/v1beta2.JobStatus

Most recently observed status of the PyTorchJob. This data may not be up to date. Populated by the system. Read-only.
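
As a rough illustration of how the fields above fit together, the following Python sketch builds a PyTorchJob manifest as a plain dictionary and prints it as JSON. The job name, image, numeric values, and the helper name build_pytorchjob are hypothetical. The replicas, restartPolicy, and template keys, and the container name pytorch, are assumptions based on the common ReplicaSpec type, whose fields are not listed on this page.

```python
import json


def build_pytorchjob(name, image, workers=1):
    """Return a PyTorchJob manifest as a dict (illustrative sketch only)."""

    def replica(count):
        # Assumption: these keys mirror common/v1beta2.ReplicaSpec,
        # which this page references but does not document.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {"containers": [{"name": "pytorch", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1beta2",
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            # All values below are hypothetical examples.
            "activeDeadlineSeconds": 3600,
            "backoffLimit": 3,
            "cleanPodPolicy": "Running",
            "ttlSecondsAfterFinished": 600,
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            },
        },
    }


# Hypothetical job name and image, for illustration only.
manifest = build_pytorchjob("example-job", "example.com/train:latest", workers=2)
print(json.dumps(manifest, indent=2))
```

The status field is omitted because it is populated by the system and is read-only.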

PyTorchJobSpec

(Appears on: PyTorchJob)

PyTorchJobSpec is a desired state description of the PyTorchJob.

Field Description
activeDeadlineSeconds
int64
(Optional)

Specifies the duration in seconds, relative to startTime, that the job may be active before the system tries to terminate it. The value must be a positive integer. This setting applies only to pods whose restartPolicy is OnFailure or Always.

backoffLimit
int32
(Optional)

Optional number of retries before marking this job failed.

cleanPodPolicy
common/v1beta2.CleanPodPolicy

CleanPodPolicy defines the policy for killing pods after the PyTorchJob has succeeded. Defaults to Running.

ttlSecondsAfterFinished
int32

TTLSecondsAfterFinished is the TTL, in seconds, for cleaning up finished PyTorchJobs (a temporary mechanism until Kubernetes adds a cleanup controller). Cleanup may take an extra ReconcilePeriod seconds, because reconcile is called periodically. Defaults to infinite.

pytorchReplicaSpecs
map[github.com/kubeflow/pytorch-operator/pkg/apis/pytorch/v1beta2.PyTorchReplicaType]*github.com/kubeflow/tf-operator/pkg/apis/common/v1beta2.ReplicaSpec

PyTorchReplicaSpecs is a map from PyTorchReplicaType to PyTorchReplicaSpec that specifies the PyTorch replicas to run. For example: { "Master": PyTorchReplicaSpec, "Worker": PyTorchReplicaSpec }
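
The timing semantics of activeDeadlineSeconds and ttlSecondsAfterFinished can be sketched in Python as below. This is an illustration of the descriptions above, not the operator's code. The function names, the completion_time parameter, the 30-second reconcile period, and the choice to treat the deadline as reached at exactly the configured duration are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def deadline_exceeded(start_time, active_deadline_seconds, now):
    """True once the job has been active for the configured duration."""
    if active_deadline_seconds is None:
        return False  # field is optional; no deadline when unset
    return now - start_time >= timedelta(seconds=active_deadline_seconds)


def cleanup_window(completion_time, ttl_seconds, reconcile_period_seconds):
    """Return (earliest, latest) cleanup times, or None for an infinite TTL."""
    if ttl_seconds is None:
        return None  # default: infinite, the job is never cleaned up
    earliest = completion_time + timedelta(seconds=ttl_seconds)
    # Cleanup may lag by an extra reconcile period (hypothetical value).
    latest = earliest + timedelta(seconds=reconcile_period_seconds)
    return earliest, latest


start = datetime(2021, 4, 30, 12, 0, tzinfo=timezone.utc)
before = deadline_exceeded(start, 3600, start + timedelta(minutes=59))
after = deadline_exceeded(start, 3600, start + timedelta(minutes=61))
window = cleanup_window(start, 600, 30)
print(before, after, window)
```

With a 600-second TTL and an assumed 30-second reconcile period, the sketch places cleanup between 10 minutes and 10 minutes 30 seconds after completion.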

PyTorchReplicaType (string alias)

PyTorchReplicaType is the type for PyTorchReplica.
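
Because PyTorchReplicaType is a string alias, the keys of pytorchReplicaSpecs are plain strings. A minimal Python sketch of checking those keys before submitting a job follows. It assumes that "Master" and "Worker", the two values shown in the example on this page, are the only accepted types, which this page does not state. The names KNOWN_REPLICA_TYPES and unknown_replica_types are hypothetical.

```python
# Assumption: only the two replica types shown in this page's example are valid.
KNOWN_REPLICA_TYPES = {"Master", "Worker"}


def unknown_replica_types(replica_specs):
    """Return the sorted keys of replica_specs that are not known replica types."""
    return sorted(set(replica_specs) - KNOWN_REPLICA_TYPES)


ok = unknown_replica_types({"Master": {}, "Worker": {}})
bad = unknown_replica_types({"Master": {}, "PS": {}})
print(ok, bad)
```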


Generated with gen-crd-api-reference-docs on git commit e775742.


Last modified 30.04.2021: fix broken link (#2672) (c1aba76b)