proceed.model.Step
- class proceed.model.Step(name=None, description=None, image=None, command=<factory>, volumes=<factory>, working_dir=None, match_done=<factory>, match_in=<factory>, match_out=<factory>, match_summary=<factory>, environment=<factory>, gpus=None, user=None, network_mode=None, mac_address=None, shm_size=None, privileged=False, X11=False)
Specifies a container-based processing step.
Most Step attributes are optional, but name is required in order to distinguish steps from each other, and image is required in order to actually run anything.
- Parameters:
name (str)
description (str)
image (str)
command (list[str])
volumes (dict[str, str | dict[str, str]])
working_dir (str)
match_done (list[str])
match_in (list[str])
match_out (list[str])
match_summary (list[str])
environment (dict[str, str])
gpus (str | bool)
user (str | int)
network_mode (str)
mac_address (str)
shm_size (str)
privileged (bool)
X11 (bool)
- description: str = None
Any description to save along with the step.
The step description is not used during pipeline execution. It’s provided as a convenience to support user documentation, notes-to-self, audits, etc.
Unlike code comments or YAML comments, the description is saved as part of the ExecutionRecord.
- image: str = None
The tag or id of the container image to run from (required).
The image is the most important part of each step! It provides the step’s executables, dependencies, and basic environment.
The image may be a human-readable tag of the form group/name:version (like on Docker Hub) or a unique id (like the IMAGE ID output of docker images).
    steps:
      - name: human readable example
        image: mathworks/matlab:r2022b
      - name: image id example
        image: d209dd14c3c4
- command: list[str]
The command to run inside the container.
The step command is passed to the entrypoint executable of the image. To use the default cmd of the image, omit this command.
The command should be given as a list of string arguments. The list form makes it clear which argument is which and avoids confusion around spaces and quotes.
    steps:
      - name: command example
        image: ubuntu
        command: ["echo", "hello world"]
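To illustrate why the list form avoids confusion, here is a minimal Python sketch (standard library only, not part of Proceed) comparing an explicit argument list with two ways of splitting a single command string. The variable names are chosen for illustration.

```python
import shlex

# The explicit list form: each element is exactly one argument.
command_as_list = ["echo", "hello world"]

# A plain string has to be split somehow, and the result depends on the rule used.
# Naive whitespace splitting breaks "hello world" into two arguments.
naive_split = "echo hello world".split()

# Shell-style splitting keeps the argument together, but only if it was quoted correctly.
shell_split = shlex.split('echo "hello world"')
```

With the list form there is no splitting step at all, so there is nothing to quote or escape.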
- volumes: dict[str, str | dict[str, str]]
Host directories to make available inside the step’s container.
This is a key-value mapping from host absolute paths to container absolute paths. The keys are strings (host absolute paths). The values are strings (container absolute paths) or detailed key-value mappings.
    steps:
      - name: volumes example
        volumes:
          /host/simple: /simple
          /host/read-only: {bind: /read-only, mode: ro}
          /host/read-write: {bind: /read-write, mode: rw}
The detailed style lets you specify the container path to bind as well as the read/write permissions.
- bind
the container absolute path to bind (where the host dir will show up inside the container)
- mode
the read/write permission to give the container: rw for read plus write (the default), ro for read only
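The relationship between the simple and detailed styles can be sketched in Python. This is an illustration under stated assumptions, not Proceed's implementation: the helper name normalize_volumes is hypothetical, and it assumes a simple string value is equivalent to a detailed mapping with mode rw (the documented default).

```python
def normalize_volumes(volumes):
    """Expand simple "host: container" entries into the detailed {bind, mode} form.

    Hypothetical helper for illustration only.
    """
    normalized = {}
    for host_path, value in volumes.items():
        if isinstance(value, str):
            # Simple style: the value is the container path, mode defaults to rw.
            normalized[host_path] = {"bind": value, "mode": "rw"}
        else:
            # Detailed style: keep bind, fill in the default mode if it was omitted.
            normalized[host_path] = {"bind": value["bind"], "mode": value.get("mode", "rw")}
    return normalized


example_volumes = {
    "/host/simple": "/simple",
    "/host/read-only": {"bind": "/read-only", "mode": "ro"},
    "/host/read-write": {"bind": "/read-write", "mode": "rw"},
}
normalized_volumes = normalize_volumes(example_volumes)
```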
- working_dir: str = None
A working directory path within the container: the initial shell pwd or Python os.getcwd().
- match_done: list[str]
File matching patterns to search for, before deciding to run the step.
This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.
If any matches are found, these files will be noted in the ExecutionRecord, along with their content digests, and the step will be skipped. This is intended as a convenience to avoid redundant processing. To make a step run unconditionally, omit match_done.
    steps:
      - name: match done example
        match_done:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
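The search-then-skip behavior described above might look roughly like the following Python sketch. It is not Proceed's code: the function names match_files and should_skip are hypothetical, and the use of SHA-256 for the content digest is an assumption, since this page does not name the digest algorithm.

```python
import hashlib
from pathlib import Path


def match_files(volume_dirs, patterns):
    """Search each volume directory with every glob pattern.

    Returns a dict mapping each matched file path to a content digest.
    Hypothetical helper; the sha256 digest is an assumption.
    """
    matches = {}
    for volume_dir in volume_dirs:
        for pattern in patterns:
            for path in sorted(Path(volume_dir).glob(pattern)):
                if path.is_file():
                    digest = hashlib.sha256(path.read_bytes()).hexdigest()
                    matches[str(path)] = "sha256:" + digest
    return matches


def should_skip(volume_dirs, match_done):
    """A step is skipped only when match_done is given and at least one file matches."""
    return bool(match_done) and bool(match_files(volume_dirs, match_done))
```

The same search could serve match_in, match_out, and match_summary, with the difference that only match_done feeds into the decision to skip.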
- match_in: list[str]
File matching patterns to search for, before running the step.
This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_in is intended to support audits by accounting for the input files that went into a step, along with their content digests. Unlike match_done, match_in does not affect step execution.
    steps:
      - name: match in example
        match_in:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- match_out: list[str]
File matching patterns to search for, after running the step.
This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_out is intended to support audits by accounting for the output files that came from a step, along with their content digests. Unlike match_done, match_out does not affect step execution.
    steps:
      - name: match out example
        match_out:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- match_summary: list[str]
File matching patterns to search for, after running the step, to include when summarizing results.
This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.
Any matches found will be noted in the ExecutionRecord. match_summary is intended to enrich pipeline execution summaries with custom columns. See StepResult.files_summary for how matched files are treated. Unlike match_done, match_summary does not affect step execution.
    steps:
      - name: match summary example
        match_summary:
          - one/specific/file.txt
          - any/text/*.txt
          - any/text/any/subdir/**/*.txt
- environment: dict[str, str]
Environment variables to set inside the step’s container.
This is a key-value mapping from environment variable names to values. The keys and values are both strings.
    steps:
      - name: environment example
        environment:
          MLM_LICENSE_FILE: /license.lic
          foo: bar
- gpus: str | bool = None
Whether or not to request GPU device support.
When gpus is True / truthy, request GPU device support similar to the Docker run --gpus all resource request. Note: the empty string "" will be treated as False.
    steps:
      - name: gpus example
        gpus: true
- user: str | int = None
User (and group) to run as in the container, instead of the container default (usually root).
When user is omitted or None the container will run with the default user and group specified in the image. This is usually root, or sometimes an image-specific user and group.
When user is provided it must be a string user name or int uid, with group/gid optional, as follows:
- self or self:group
The special user name self means run with the uid of the current user on the Docker host. Optionally, this can be followed by a group name or gid as in self:group. When this group is a string name it must exist on the Docker host and will be converted to a host gid.
- user or user:group
Other string user names and group names are used as-is and must exist within the image / container.
- uid or uid:gid
Integer uids and gids don’t have to exist within the image / container. It’s probably helpful if they exist on the Docker host.
    steps:
      - name: default/root user example

    steps:
      - name: host current user and group example
        user: self

    steps:
      - name: existing container user example
        user: container-user

    steps:
      - name: integer uid and gid example
        user: 1234:5678
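The three user forms above can be summarized with a short Python sketch. This is an illustration of the documented rules, not Proceed's implementation: the function resolve_user and its host_uid and lookup_host_gid parameters are hypothetical names. On a POSIX host, the caller might pass os.getuid() and a lookup based on grp.getgrnam.

```python
def resolve_user(user, host_uid, lookup_host_gid):
    """Turn a step's user value into the string passed to the container runtime.

    Hypothetical helper for illustration only.
    host_uid: uid of the current user on the Docker host.
    lookup_host_gid: callable mapping a host group name to its gid.
    """
    if user is None:
        # Omitted: run as the default user and group from the image.
        return None
    name, separator, group = str(user).partition(":")
    if name != "self":
        # Container user names and integer uids / gids are used as-is.
        return str(user)
    if group and not group.isdigit():
        # A group name given with "self" is converted to a host gid.
        group = str(lookup_host_gid(group))
    return str(host_uid) + separator + group


# Example host group table, made up for this sketch.
example_groups = {"video": 44}
resolved_self = resolve_user("self:video", 1000, example_groups.get)
```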
- network_mode: str = None
How to configure the container’s network environment.
When provided, this should be one of the following network modes:
- bridge
create an isolated network environment for the container (default)
- none
disable networking for the container
- container:<name|id>
reuse the network of another container, by name or id
- host
make the container’s network environment just like the host’s
- mac_address: str = None
Arbitrary MAC address to set in the container.
Perhaps surprisingly, containers can have arbitrary MAC “hardware” addresses.
    steps:
      - name: mac address example
        mac_address: aa:bb:cc:dd:ee:ff
- shm_size: str = None
Max size of the /dev/shm shared memory in-memory-file-system.
Docker defaults /dev/shm to 64 megabytes. Steps that need more can use shm_size to increase this limit. Integer values will be treated as bytes, for example 1000. Values with a unit suffix are interpreted in that unit, for example 10b, 10k, 10m, or 10g.
    steps:
      - name: more-shm
        shm_size: 2g
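To make the size notation concrete, here is a minimal Python sketch that converts a shm_size value into bytes. It is an illustration, not how Proceed or Docker parse the value: the helper shm_size_to_bytes is hypothetical, and it assumes that k, m, and g are binary multiples (1024-based), which should be checked against the Docker documentation.

```python
# Assumed unit factors: binary multiples. This is an assumption of the sketch.
UNIT_FACTORS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def shm_size_to_bytes(shm_size):
    """Convert values like 1000, "10k", or "2g" to a number of bytes.

    Hypothetical helper for illustration only.
    """
    text = str(shm_size).strip().lower()
    if text and text[-1] in UNIT_FACTORS:
        # Value with a unit suffix: scale the leading integer by the unit.
        return int(text[:-1]) * UNIT_FACTORS[text[-1]]
    # Plain integer: already a number of bytes.
    return int(text)


default_shm_bytes = shm_size_to_bytes("64m")
larger_shm_bytes = shm_size_to_bytes("2g")
```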
- privileged: bool = False
Whether the step’s container should run with elevated privileges and device access.
This defaults to False. Please only set privileged to True temporarily, for troubleshooting!
    steps:
      - name: elevated-privileged
        privileged: True
- X11: bool = False
Whether to set up the container as an X11 client app with DISPLAY access.
This defaults to False, assuming most steps are noninteractive. Set X11 to True to set up the container as an X11 GUI client app with DISPLAY access. This will modify the container environment in a few ways:
- DISPLAY
Proceed will set the DISPLAY environment variable in the step container to match the host environment.
- /tmp/.X11-unix
If the /tmp/.X11-unix directory exists on the host Proceed will add this to the step’s Step.volumes. This lets the step container access local Unix sockets for connecting to a local X server.
- Step.network_mode host
Proceed will set the step’s network mode to host. This lets the step container access TCP sockets for connecting to a remote/proxied X server as with ssh -X or ssh -Y.
- XAUTHORITY
Proceed will set up the XAUTHORITY environment variable and .Xauthority cookie file based on the host environment. If the XAUTHORITY variable is set in the host environment Proceed will use this file path to locate the cookie file. Otherwise Proceed will use the default cookie file path which is the current host user’s $HOME/.Xauthority. If the cookie file exists on the host Proceed will add it to the step’s Step.volumes. Proceed will bind the cookie file to a fixed, known path in the container like /var/.Xauthority. Proceed will set the XAUTHORITY environment variable in the container to the same known path. Using a fixed path for the cookie file should avoid any dependency on the container user or HOME configuration (or lack thereof). All of this lets the step container authenticate with a remote/proxied X server as with ssh -X or ssh -Y.
    steps:
      - name: x11-gui-client
        X11: True
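The X11 adjustments listed above can be sketched as a single Python function. This is a hedged illustration, not Proceed's implementation: the function x11_settings and its parameters are hypothetical, binding /tmp/.X11-unix at the same path inside the container is an assumption, and so is setting XAUTHORITY only when the cookie file exists.

```python
import os
import posixpath

# Fixed container path for the cookie file, following the "/var/.Xauthority" example above.
CONTAINER_COOKIE_PATH = "/var/.Xauthority"
X11_SOCKET_DIR = "/tmp/.X11-unix"


def x11_settings(host_env, host_home, path_exists=os.path.exists):
    """Compute extra environment, volumes, and network mode for an X11 step.

    Hypothetical helper for illustration only.
    host_env: mapping of host environment variables.
    host_home: the current host user's home directory.
    path_exists: injectable check, so the sketch can be tried without real files.
    """
    environment = {}
    volumes = {}
    if "DISPLAY" in host_env:
        # Match the host's DISPLAY inside the container.
        environment["DISPLAY"] = host_env["DISPLAY"]
    if path_exists(X11_SOCKET_DIR):
        # Local Unix sockets for a local X server (same container path is assumed).
        volumes[X11_SOCKET_DIR] = X11_SOCKET_DIR
    # Prefer the host's XAUTHORITY path, else fall back to $HOME/.Xauthority.
    cookie_path = host_env.get("XAUTHORITY") or posixpath.join(host_home, ".Xauthority")
    if path_exists(cookie_path):
        volumes[cookie_path] = CONTAINER_COOKIE_PATH
        environment["XAUTHORITY"] = CONTAINER_COOKIE_PATH
    # Host networking allows TCP connections to a remote/proxied X server.
    return {"environment": environment, "volumes": volumes, "network_mode": "host"}
```

Because the cookie always lands at one fixed container path, the result does not depend on which user the container runs as or whether HOME is set there.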