proceed.model.Step
- class proceed.model.Step(name=None, description=None, image=None, command=<factory>, volumes=<factory>, working_dir=None, progress_file=None, match_done=<factory>, match_in=<factory>, match_out=<factory>, match_summary=<factory>, environment=<factory>, gpus=None, user=None, network_mode=None, mac_address=None, shm_size=None, privileged=False, X11=False)
Specifies a container-based processing step.
Most Step attributes are optional, but name is required in order to distinguish steps from each other, and image is required in order to actually run anything.
- Parameters:
name (str)
description (str)
image (str)
command (list[str])
volumes (dict[str, str | dict[str, str]])
working_dir (str)
progress_file (str)
match_done (list[str])
match_in (list[str])
match_out (list[str])
match_summary (list[str])
environment (dict[str, str])
gpus (bool | list[str | int])
user (str)
network_mode (str)
mac_address (str)
shm_size (str)
privileged (bool)
X11 (bool)
- description: str = None
Any description to save along with the step.
The step description is not used during pipeline execution. It’s provided as a convenience to support user documentation, notes-to-self, audits, etc.
Unlike code comments or YAML comments, the description is saved as part of the ExecutionRecord.
- image: str = None
The tag or id of the container image to run from (required).
The image is the most important part of each step! It provides the step’s executables, dependencies, and basic environment.
The image may be a human-readable tag of the form group/name:version (like on Docker Hub) or a unique id (like the IMAGE ID output of docker images).
    steps:
      - name: human readable example
        image: mathworks/matlab:r2022b
      - name: image id example
        image: d209dd14c3c4
- command: list[str]
The command to run inside the container.
The step command is passed to the entrypoint executable of the image. To use the default cmd of the image, omit this command.
The command should be given as a list of string arguments. The list form makes it clear which argument is which and avoids confusion around spaces and quotes. All command elements will be converted to strings with str().
    steps:
      - name: command example
        image: ubuntu
        command: ["echo", "hello world"]
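To illustrate why the list form avoids confusion around spaces, here is a minimal Python sketch. The helper name `as_command_args` is hypothetical and not part of Proceed; it only mirrors the str() conversion described above.

```python
def as_command_args(command):
    """Convert every command element to a string with str() (hypothetical helper)."""
    return [str(element) for element in command]


# The list form keeps "hello world" as a single argument, space included.
list_form = as_command_args(["echo", "hello world"])

# Naively splitting one string on whitespace would produce two separate arguments.
naive_split = "echo hello world".split()

# Non-string elements, such as numbers parsed from YAML, become strings.
mixed = as_command_args(["sleep", 5, 0.5])
```

With the list form the container receives exactly two arguments, whereas the naive split yields three.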
- volumes: dict[str, str | dict[str, str]]
Host directories to make available inside the step’s container.
This is a key-value mapping from host absolute paths to container absolute paths. The keys are strings (host absolute paths). The values are strings (container absolute paths) or detailed key-value mappings.
    steps:
      - name: volumes example
        volumes:
          /host/simple: /simple
          /host/read-only: {bind: /read-only, mode: ro}
          /host/read-write: {bind: /read-write, mode: rw}
The detailed style lets you specify the container path to bind as well as the read/write permissions.
- bind
the container absolute path to bind (where the host dir will show up inside the container)
- mode
the read/write permission to give the container: rw for read plus write (the default), ro for read only
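The two value styles can be thought of as shorthand and long form of the same thing. The following sketch expands the simple style into the detailed style; the function `normalize_volumes` is a hypothetical illustration, and whether Proceed normalizes volumes this way internally is an assumption.

```python
def normalize_volumes(volumes, default_mode="rw"):
    """Expand simple "host: container" entries into the detailed bind/mode form.

    Hypothetical helper for illustration; not part of the Proceed API.
    """
    normalized = {}
    for host_path, value in volumes.items():
        if isinstance(value, str):
            # Simple style: the value is just the container path.
            normalized[host_path] = {"bind": value, "mode": default_mode}
        else:
            # Detailed style: fill in the default mode when it is omitted.
            normalized[host_path] = {
                "bind": value["bind"],
                "mode": value.get("mode", default_mode),
            }
    return normalized


example = normalize_volumes(
    {
        "/host/simple": "/simple",
        "/host/read-only": {"bind": "/read-only", "mode": "ro"},
        "/host/no-mode": {"bind": "/no-mode"},
    }
)
```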
- working_dir: str = None
A working directory path within the container – the initial shell pwd or Python os.getcwd().
- progress_file: str = None
File to create when the step starts, and rename to <progress_file>.done when the step succeeds.
This is an optional marker file that Proceed can use to indicate progress through the step and to decide whether the step is already complete and can be skipped.
Step.progress_file should be a file path on the host, unlike Step.match_done, Step.match_in, and Step.match_out, which are patterns to match within Step.volumes.
Proceed will create Step.progress_file when starting to execute a step. If the step completes with a nonzero exit code, Proceed will append an error message to the file. If the step completes with a zero exit code, Proceed will append a success message to the file and rename the file, adding the suffix .done.
When <progress_file>.done already exists the step will be skipped. This is intended as a convenience to avoid redundant processing. To make a step run unconditionally, omit Step.progress_file and match_done.
For example, say Step.progress_file is given as progress.txt. When beginning the step, Proceed will create progress.txt. When the step succeeds Proceed will append a success message to progress.txt and rename the file to progress.txt.done. Next time the step runs, if progress.txt.done still exists, the step will be skipped.
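The marker-file lifecycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `run_with_progress_file`, its return values, and the message texts are hypothetical, and `run_step` stands in for actually running the container and returning its exit code.

```python
import tempfile
from pathlib import Path


def run_with_progress_file(progress_file, run_step):
    """Return "skipped", "succeeded", or "failed" (hypothetical helper)."""
    progress_path = Path(progress_file)
    done_path = Path(str(progress_path) + ".done")
    if done_path.exists():
        # A previous run already succeeded, so skip the step.
        return "skipped"
    # Create the marker file when the step starts.
    with progress_path.open("a") as f:
        f.write("started\n")
    exit_code = run_step()
    # Append an outcome message (the wording here is made up).
    with progress_path.open("a") as f:
        if exit_code == 0:
            f.write("succeeded\n")
        else:
            f.write(f"failed with exit code {exit_code}\n")
    if exit_code != 0:
        return "failed"
    # On success, rename the marker by adding the .done suffix.
    progress_path.rename(done_path)
    return "succeeded"


with tempfile.TemporaryDirectory() as temp_dir:
    marker = Path(temp_dir, "progress.txt")
    first = run_with_progress_file(marker, lambda: 1)   # nonzero exit: marker stays
    second = run_with_progress_file(marker, lambda: 0)  # zero exit: marker renamed
    third = run_with_progress_file(marker, lambda: 0)   # .done exists: skipped
    done_exists = Path(temp_dir, "progress.txt.done").exists()
```

Deleting the `.done` file is enough to make the sketch run the step again, which matches the skipping rule in the text.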
- match_done: list[str]
File matching patterns to search for, before deciding to run the step.
This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.
If any matches are found, these files will be noted in the ExecutionRecord, along with their content digests, and the step will be skipped. This is intended as a convenience to avoid redundant processing. To make a step run unconditionally, omit Step.progress_file and match_done.
    steps:
      - name: match done example
        match_done:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
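As a rough illustration of searching each volume with the same glob patterns and recording content digests, consider the sketch below. The function `match_files`, the returned structure, and the choice of SHA-256 are assumptions made for this example; the source does not say which digest algorithm or record layout Proceed uses.

```python
import hashlib
import tempfile
from pathlib import Path


def match_files(volume_dirs, patterns):
    """Map each volume dir to {relative path: digest} for files matching any pattern.

    Hypothetical helper; SHA-256 is an assumed digest algorithm.
    """
    matches = {}
    for volume_dir in volume_dirs:
        root = Path(volume_dir)
        found = {}
        for pattern in patterns:
            # Every volume is searched with the same list of patterns.
            for path in sorted(root.glob(pattern)):
                if path.is_file():
                    digest = hashlib.sha256(path.read_bytes()).hexdigest()
                    found[path.relative_to(root).as_posix()] = "sha256:" + digest
        if found:
            matches[str(root)] = found
    return matches


with tempfile.TemporaryDirectory() as volume:
    text_dir = Path(volume, "any", "text")
    text_dir.mkdir(parents=True)
    (text_dir / "result.txt").write_text("done")
    done_matches = match_files([volume], ["any/text/*.txt", "missing/*.csv"])
    # With match_done semantics, any match at all means the step is skipped.
    should_skip = bool(done_matches)
    no_matches = match_files([volume], ["missing/*.csv"])
```

The same search would serve match_in, match_out, and match_summary; only match_done uses the result to decide whether to skip.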
- match_in: list[str]
File matching patterns to search for, before running the step.
This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_in is intended to support audits by accounting for the input files that went into a step, along with their content digests. Unlike match_done, match_in does not affect step execution.
    steps:
      - name: match in example
        match_in:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- match_out: list[str]
File matching patterns to search for, after running the step.
This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_out is intended to support audits by accounting for the output files that came from a step, along with their content digests. Unlike match_done, match_out does not affect step execution.
    steps:
      - name: match out example
        match_out:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- match_summary: list[str]
File matching patterns to search for, after running the step, to include when summarizing results.
This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_summary is intended to enrich pipeline execution summaries with custom columns. See StepResult.files_summary for how matched files are treated. Unlike match_done, match_summary does not affect step execution.
    steps:
      - name: match summary example
        match_summary:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- environment: dict[str, str]
Environment variables to set inside the step’s container.
This is a key-value mapping from environment variable names to values. The keys and values are both strings.
    steps:
      - name: environment example
        environment:
          MLM_LICENSE_FILE: /license.lic
          foo: bar
- gpus: bool | list[str | int] = None
Which GPU devices to request.
When gpus is True, request GPU device support similar to docker run --gpus all.
When gpus is a list, the list elements will be treated as specific GPU device IDs or indexes to request.
See Docker resource constraints.
    steps:
      - name: all gpus
        gpus: true

    steps:
      - name: no gpus
        gpus: false

    steps:
      - name: one specific gpu by string id
        gpus: ['GPU-3a23c669-1f69-c64e-cf85-44e9b07e7a2a']

    steps:
      - name: two specific gpus by numeric index
        gpus: [0, 2]
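One plausible way to interpret the three kinds of gpus value is sketched below. The function `gpu_device_request` is hypothetical, and the dictionary keys (count, device_ids, capabilities) are modeled on the fields of a Docker device request; whether Proceed builds its request exactly this way is an assumption.

```python
def gpu_device_request(gpus):
    """Translate a Step.gpus value into a device request dict, or None.

    Hypothetical helper for illustration; not part of the Proceed API.
    """
    if gpus is None or gpus is False:
        # No GPU support requested.
        return None
    if gpus is True:
        # A count of -1 asks for all available devices, like `--gpus all`.
        return {"count": -1, "capabilities": [["gpu"]]}
    # Otherwise treat the value as a list of device ids or indexes.
    return {"device_ids": [str(gpu) for gpu in gpus], "capabilities": [["gpu"]]}


all_gpus = gpu_device_request(True)
no_gpus = gpu_device_request(False)
by_index = gpu_device_request([0, 2])
```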
- user: str = None
User (and group) to run as in the container, instead of the container default (usually root).
When user is omitted or None the container will run with the default user and group specified in the image. This is usually root, or sometimes an image-specific user and group.
When user is provided it must be a string user name or int uid, with group/gid optional, as follows:
- self or self:group
The special user name self means run with the uid of the current user on the Docker host. Optionally, this can be followed by a group name or gid as in self:group. When this group is a string name it must exist on the Docker host and will be converted to a host gid.
- user or user:group
Other string user names and group names are used as-is and must exist within the image / container.
- uid or uid:gid
Integer uids and gids don’t have to exist within the image / container. It’s probably helpful if they exist on the Docker host.
    steps:
      - name: default/root user example

    steps:
      - name: host current user and group example
        user: self

    steps:
      - name: existing container user example
        user: container-user

    steps:
      - name: integer uid and gid example
        user: 1234:5678
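The three user styles can be summarized with a small resolver. This is a sketch only: `resolve_user` is a hypothetical name, and the host uid and group table are passed in as arguments so the example stays portable. On a Unix host they might come from `os.getuid()` and the `grp` module. How Proceed treats a plain self with respect to the group is not stated here, so the sketch returns only the uid in that case.

```python
def resolve_user(user, host_uid, host_gids):
    """Return the user string to pass to the container, or None for the image default.

    Hypothetical helper: host_uid is the current host user's uid and
    host_gids maps host group names to gids.
    """
    if user is None:
        return None
    name, _, group = str(user).partition(":")
    if name != "self":
        # Ordinary names and integer ids are used as-is.
        return str(user)
    if not group:
        return str(host_uid)
    # A numeric group is used directly; a group name is converted to a host gid.
    gid = group if group.isdigit() else host_gids[group]
    return f"{host_uid}:{gid}"


host_groups = {"staff": 20}
as_self = resolve_user("self", 1000, host_groups)
as_self_group = resolve_user("self:staff", 1000, host_groups)
as_is = resolve_user("1234:5678", 1000, host_groups)
```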
- network_mode: str = None
How to configure the container’s network environment.
When provided, this should be one of the following network modes:
- bridge
create an isolated network environment for the container (default)
- none
disable networking for the container
- container:<name|id>
reuse the network of another container, by name or id
- host
make the container’s network environment just like the host’s
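A pipeline author might want to check a network mode before running anything. The check below is a minimal sketch based only on the four modes listed above; `is_valid_network_mode` is a hypothetical helper, and Proceed or Docker may accept or reject values differently.

```python
import re

# The three fixed mode names from the list above.
SIMPLE_MODES = {"bridge", "none", "host"}


def is_valid_network_mode(network_mode):
    """Check a value against the listed network modes (hypothetical helper)."""
    if network_mode is None:
        # Omitted: the container runtime's default applies.
        return True
    if network_mode in SIMPLE_MODES:
        return True
    # container:<name|id> needs a non-empty name or id after the colon.
    return re.fullmatch(r"container:.+", network_mode) is not None
```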
- mac_address: str = None
Arbitrary MAC address to set in the container.
Perhaps surprisingly, containers can have arbitrary MAC “hardware” addresses.
    steps:
      - name: mac address example
        mac_address: aa:bb:cc:dd:ee:ff
- shm_size: str = None
Max size of the /dev/shm shared memory in-memory file system.
Docker defaults /dev/shm to 64 megabytes. Steps that need more can use shm_size to increase this limit. Integer values will be treated as bytes, for example 1000. Values with a unit suffix will use the unit given, for example 10b (bytes), 10k, 10m, or 10g.
    steps:
      - name: more-shm
        shm_size: 2g
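To make the size notation concrete, the sketch below converts an shm_size value to bytes. The function `shm_size_to_bytes` is hypothetical, and it assumes binary multiples (1k = 1024 bytes), which is the usual Docker convention; the actual parsing is done by Docker, not by this code.

```python
# Assumed binary multiples for each unit suffix.
UNIT_FACTORS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def shm_size_to_bytes(shm_size):
    """Convert values like 1000, "10k", or "2g" to a byte count (hypothetical helper)."""
    text = str(shm_size).strip().lower()
    if text.isdigit():
        # Plain integers are treated as bytes.
        return int(text)
    number, unit = text[:-1], text[-1:]
    if unit not in UNIT_FACTORS or not number.isdigit():
        raise ValueError(f"unrecognized shm_size: {shm_size!r}")
    return int(number) * UNIT_FACTORS[unit]


default_bytes = shm_size_to_bytes("64m")
example_bytes = shm_size_to_bytes("2g")
```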
- privileged: bool = False
Whether the step’s container should run with elevated privileges and device access.
This defaults to False. Please only set privileged to True temporarily, for troubleshooting!
    steps:
      - name: elevated-privileged
        privileged: True
- X11: bool = False
Whether to set up the container as an X11 client app with DISPLAY access.
This defaults to False, assuming most steps are noninteractive. Set X11 to True to set up the container as an X11 GUI client app with DISPLAY access. This will modify the container environment in a few ways:
- DISPLAY
Proceed will set the DISPLAY environment variable in the step container to match the host environment.
- /tmp/.X11-unix
If the /tmp/.X11-unix directory exists on the host Proceed will add this to the step’s Step.volumes. This lets the step container access local Unix sockets for connecting to a local X server.
- Step.network_mode host
Proceed will set the step’s network mode to host. This lets the step container access TCP sockets for connecting to a remote/proxied X server as with ssh -X or ssh -Y.
- XAUTHORITY
Proceed will set up the XAUTHORITY environment variable and .Xauthority cookie file based on the host environment. If the XAUTHORITY variable is set in the host environment Proceed will use this file path to locate the cookie file. Otherwise Proceed will use the default cookie file path which is the current host user’s $HOME/.Xauthority. If the cookie file exists on the host Proceed will add it to the step’s Step.volumes. Proceed will bind the cookie file to a fixed, known path in the container like /var/.Xauthority. Proceed will set the XAUTHORITY environment variable in the container to the same known path. Using a fixed path for the cookie file should avoid any dependency on the container user or HOME configuration (or lack thereof). All of this lets the step container authenticate with a remote/proxied X server as with ssh -X or ssh -Y.
    steps:
      - name: x11-gui-client
        X11: True
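The X11 adjustments above can be summarized as a function from the host environment to a set of container overrides. The sketch below is an illustration under assumptions, not Proceed's implementation: `x11_overrides` is a hypothetical name, `/tmp/.X11-unix` is assumed to be bound at the same path in the container, XAUTHORITY is assumed to be set only when the cookie file exists, and `path_exists` is injected so the example does not depend on the real file system.

```python
import posixpath

# Fixed cookie path in the container, following the example path in the text.
CONTAINER_XAUTHORITY = "/var/.Xauthority"


def x11_overrides(host_env, host_home, path_exists):
    """Compute environment, volumes, and network mode for an X11 step (hypothetical)."""
    environment = {}
    volumes = {}
    if "DISPLAY" in host_env:
        # Match the host's DISPLAY inside the container.
        environment["DISPLAY"] = host_env["DISPLAY"]
    if path_exists("/tmp/.X11-unix"):
        # Assumed: local X sockets are bound at the same path in the container.
        volumes["/tmp/.X11-unix"] = "/tmp/.X11-unix"
    # Prefer the host's XAUTHORITY, else fall back to $HOME/.Xauthority.
    cookie = host_env.get("XAUTHORITY") or posixpath.join(host_home, ".Xauthority")
    if path_exists(cookie):
        volumes[cookie] = CONTAINER_XAUTHORITY
        environment["XAUTHORITY"] = CONTAINER_XAUTHORITY
    return {"environment": environment, "volumes": volumes, "network_mode": "host"}


local = x11_overrides({"DISPLAY": ":0"}, "/home/me", lambda path: True)
custom = x11_overrides(
    {"DISPLAY": "localhost:10.0", "XAUTHORITY": "/run/user/1000/xauth"},
    "/home/me",
    lambda path: path == "/run/user/1000/xauth",
)
```

Binding the cookie to one fixed container path is what removes the dependency on the container user's HOME, as the XAUTHORITY entry explains.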