MPIJob
Reference documentation for MPIJob
Out of date
This guide contains outdated information pertaining to Kubeflow 1.0. This guide needs to be updated for Kubeflow 1.1.
Packages:
kubeflow.org
Package v1alpha2 is the v1alpha2 version of the API.
Resource Types:
MPIJob
Represents an MPIJob resource.
Field | Description |
---|---|
apiVersion string | kubeflow.org/v1alpha2 |
kind string | MPIJob |
metadata Kubernetes meta/v1.ObjectMeta | Standard Kubernetes object’s metadata. Refer to the Kubernetes API documentation for the fields of the metadata field. |
spec MPIJobSpec | Specification of the desired state of the MPIJob. |
status common/v1.JobStatus | Most recently observed status of the MPIJob. Read-only (modified by the system). |
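The top-level fields in the table above can be sketched as a plain Python dictionary that serializes to a manifest. This is only an illustration: the helper name `make_mpijob`, the job name, and the namespace are hypothetical, and the `spec` body is a placeholder for an MPIJobSpec.

```python
import json


def make_mpijob(name, spec, namespace="default"):
    """Wrap an MPIJobSpec-shaped dict in the top-level MPIJob fields.

    `make_mpijob` is a hypothetical helper, not part of any Kubeflow SDK.
    """
    return {
        "apiVersion": "kubeflow.org/v1alpha2",
        "kind": "MPIJob",
        # Only `name` and `namespace` are set here; ObjectMeta has more fields.
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
        # `status` is intentionally omitted: it is read-only and is
        # populated by the system, not by the author of the manifest.
    }


# Placeholder spec; a real MPIJobSpec would also carry mpiReplicaSpecs.
job = make_mpijob("example-mpijob", spec={"slotsPerWorker": 1})
manifest_json = json.dumps(job, indent=2)
print(manifest_json)
```

Because JSON is a subset of YAML, the printed document can be saved to a file and submitted like any other manifest.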
MPIJobSpec
(Appears on: MPIJob)
MPIJobSpec describes the desired state of the MPIJob.
Field | Description |
---|---|
activeDeadlineSeconds int64 | (Optional) Specifies the duration (in seconds) since startTime during which the job can remain active before it is terminated. Must be a positive integer. This setting applies only to pods where restartPolicy is OnFailure or Always. |
backoffLimit int32 | (Optional) Number of retries before marking this job as failed. |
cleanPodPolicy common/v1.CleanPodPolicy | Defines the policy for cleaning up pods after the MPIJob completes. Defaults to None. |
slotsPerWorker int32 | (Optional) Specifies the number of slots per worker used in hostfile. Defaults to 1. |
mainContainer string | (Optional) Specifies the name of the main container that executes the MPI code. |
runPolicy common/v1.RunPolicy | (Optional) Encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active. |
mpiReplicaSpecs map[github.com/kubeflow/mpi-operator/pkg/apis/kubeflow/v1alpha2.MPIReplicaType]*github.com/kubeflow/tf-operator/pkg/apis/common/v1.ReplicaSpec | A map of MPIReplicaType (type) to ReplicaSpec (value). Specifies the MPI cluster configuration. For example, { “Launcher”: MPIReplicaSpec, “Worker”: MPIReplicaSpec, } |
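As a rough sketch of how the MPIJobSpec fields fit together, the following Python builds a spec with one Launcher and several Workers. The helper names, the container name `mpi`, and the image reference are hypothetical. The `replicas` and `template` keys come from the common ReplicaSpec type rather than from this page, so treat them as assumptions. The slot total assumes every worker is listed in the hostfile with `slotsPerWorker` slots.

```python
def make_spec(workers, slots_per_worker=1, image="example.invalid/mpi-app:latest"):
    """Build an MPIJobSpec-shaped dict (hypothetical helper)."""

    def replica(count):
        # Assumed ReplicaSpec shape: a replica count plus a pod template.
        return {
            "replicas": count,
            "template": {
                "spec": {"containers": [{"name": "mpi", "image": image}]}
            },
        }

    return {
        "slotsPerWorker": slots_per_worker,  # documented default is 1
        "cleanPodPolicy": "None",            # documented default
        "mainContainer": "mpi",              # container that runs the MPI code
        "mpiReplicaSpecs": {
            "Launcher": replica(1),
            "Worker": replica(workers),
        },
    }


def total_slots(spec):
    """Slots across all workers, assuming each contributes slotsPerWorker."""
    worker_count = spec["mpiReplicaSpecs"]["Worker"]["replicas"]
    return worker_count * spec.get("slotsPerWorker", 1)


spec = make_spec(workers=4, slots_per_worker=2)
print(total_slots(spec))  # 4 workers x 2 slots each = 8
```

With this arithmetic, a launcher command that starts one MPI process per slot would request 8 processes for the spec above.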
MPIReplicaType (string alias)
MPIReplicaType is the type for MPIReplica. Can be one of “Launcher” or “Worker”.
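Since MPIReplicaType is a string alias with two permitted values, a client-side check of the keys in mpiReplicaSpecs can catch typos before a manifest is submitted. The function below is a hypothetical sketch, not the operator's own validation, and it assumes the comparison is case-sensitive.

```python
# The two values this page documents for MPIReplicaType.
VALID_REPLICA_TYPES = {"Launcher", "Worker"}


def invalid_replica_types(mpi_replica_specs):
    """Return, sorted, the keys that are not a documented MPIReplicaType."""
    return sorted(set(mpi_replica_specs) - VALID_REPLICA_TYPES)


# "worker" (lower case) is reported because the check is case-sensitive.
bad_keys = invalid_replica_types({"Launcher": {}, "worker": {}})
print(bad_keys)
```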
Last modified 03.08.2020: Added outdated banner to non-index docs unchanged in last 30d (#2072) (e56f3650)