proceed.model.Step

class proceed.model.Step(name=None, description=None, image=None, command=<factory>, volumes=<factory>, working_dir=None, match_done=<factory>, match_in=<factory>, match_out=<factory>, match_summary=<factory>, environment=<factory>, gpus=None, user=None, network_mode=None, mac_address=None, shm_size=None, privileged=False, X11=False)

Specifies a container-based processing step.

Most Step attributes are optional, but name is required in order to distinguish steps from each other, and image is required in order to actually run anything.

Parameters:
  • name (str)

  • description (str)

  • image (str)

  • command (list[str])

  • volumes (dict[str, str | dict[str, str]])

  • working_dir (str)

  • match_done (list[str])

  • match_in (list[str])

  • match_out (list[str])

  • match_summary (list[str])

  • environment (dict[str, str])

  • gpus (str | bool)

  • user (str | int)

  • network_mode (str)

  • mac_address (str)

  • shm_size (str)

  • privileged (bool)

  • X11 (bool)
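The `<factory>` defaults in the signature above are how Python dataclasses display a `default_factory`. The following minimal sketch shows what that implies for list and dict attributes: each instance gets its own fresh container. `StepSketch` is a hypothetical stand-in with a few of the same attribute names, not the real proceed.model.Step.

```python
from dataclasses import dataclass, field


@dataclass
class StepSketch:
    """Hypothetical stand-in for illustration only, not proceed.model.Step."""
    name: str = None
    image: str = None
    # default_factory gives every instance its own new list or dict.
    command: list = field(default_factory=list)
    environment: dict = field(default_factory=dict)


first = StepSketch(name="first", image="ubuntu")
second = StepSketch(name="second", image="ubuntu")

# Changing one instance's list does not affect the other instance.
first.command.append("echo")
print(first.command)
print(second.command)
```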

name: str = None

Any name for the step, unique within a Pipeline (required).

description: str = None

Any description to save along with the step.

The step description is not used during pipeline execution. It’s provided as a convenience to support user documentation, notes-to-self, audits, etc.

Unlike code comments or YAML comments, the description is saved as part of the ExecutionRecord.

image: str = None

The tag or id of the container image to run from (required).

The image is the most important part of each step! It provides the step’s executables, dependencies, and basic environment.

The image may be a human-readable tag of the form group/name:version (like on Docker Hub) or a unique id (like the IMAGE ID output of docker images).

steps:
  - name: human readable example
    image: mathworks/matlab:r2022b
  - name: image id example
    image: d209dd14c3c4
command: list[str]

The command to run inside the container.

The step command is passed to the entrypoint executable of the image. To use the default cmd of the image, omit this command.

The command should be given as a list of string arguments. The list form makes it clear which argument is which and avoids confusion around spaces and quotes.

steps:
  - name: command example
    image: ubuntu
    command: ["echo", "hello world"]
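To see why the list form is less ambiguous, the following minimal sketch uses Python's standard shlex module to show how a single command string would be split differently depending on quoting. It is illustrative only and makes no claim about how Proceed handles commands internally.

```python
import shlex

# A single string has to be split into arguments, and the result depends on quoting.
unquoted = shlex.split('echo hello world')
quoted = shlex.split('echo "hello world"')

# The list form states each argument directly, so no splitting rules apply.
as_list = ["echo", "hello world"]

print(unquoted)
print(quoted)
print(quoted == as_list)
```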
volumes: dict[str, str | dict[str, str]]

Host directories to make available inside the step’s container.

This is a key-value mapping from host absolute paths to container absolute paths. The keys are strings (host absolute paths). The values are strings (container absolute paths) or detailed key-value mappings.

steps:
  - name: volumes example
    volumes:
      /host/simple: /simple
      /host/read-only: {bind: /read-only, mode: ro}
      /host/read-write: {bind: /read-write, mode: rw}

The detailed style lets you specify the container path to bind as well as the read/write permissions.

bind

the container absolute path to bind (where the host dir will show up inside the container)

mode

the read/write permission to give the container: rw for read plus write (the default), ro for read only
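The relationship between the simple and detailed styles can be sketched as a small normalization function. This is a sketch under assumptions: `normalize_volumes` is a hypothetical helper, not part of Proceed, and it only encodes the rule stated above that the mode defaults to rw.

```python
def normalize_volumes(volumes):
    """Expand simple "host: container" entries into the detailed bind/mode style."""
    normalized = {}
    for host_path, value in volumes.items():
        if isinstance(value, str):
            # Simple style: the value is the container path, mode defaults to rw.
            normalized[host_path] = {"bind": value, "mode": "rw"}
        else:
            # Detailed style: keep bind as given, fill in the default mode if absent.
            normalized[host_path] = {
                "bind": value["bind"],
                "mode": value.get("mode", "rw"),
            }
    return normalized


volumes = {
    "/host/simple": "/simple",
    "/host/read-only": {"bind": "/read-only", "mode": "ro"},
    "/host/read-write": {"bind": "/read-write", "mode": "rw"},
}
normalized = normalize_volumes(volumes)
for host_path, detail in normalized.items():
    print(host_path, detail)
```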

working_dir: str = None

A working directory path within the container – the initial shell pwd or Python os.getcwd().

match_done: list[str]

File matching patterns to search for, before deciding to run the step.

This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.

If any matches are found, these files will be noted in the ExecutionRecord, along with their content digests, and the step will be skipped. This is intended as a convenience to avoid redundant processing. To make a step run unconditionally, omit match_done.

steps:
  - name: match done example
    match_done:
      - one/specific/file.txt
      - any/text/*.txt
      - any/text/any/subdir/**/*.txt
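The search-then-skip behavior described above can be approximated with the Python standard library. This is a minimal sketch, not Proceed's implementation: `find_matches` and `should_skip` are hypothetical helpers, and the SHA-256 digest format is an assumption.

```python
import hashlib
import tempfile
from pathlib import Path


def find_matches(volume_dirs, patterns):
    """Search each volume directory with each glob pattern, returning {path: digest}."""
    matches = {}
    for volume_dir in volume_dirs:
        for pattern in patterns:
            for path in sorted(Path(volume_dir).glob(pattern)):
                if path.is_file():
                    # Assumed digest format for this sketch: "sha256:<hex>".
                    digest = hashlib.sha256(path.read_bytes()).hexdigest()
                    matches[str(path)] = f"sha256:{digest}"
    return matches


def should_skip(volume_dirs, match_done):
    """Skip the step when any match_done pattern matches a file in any volume."""
    return len(find_matches(volume_dirs, match_done)) > 0


with tempfile.TemporaryDirectory() as volume_dir:
    patterns = ["any/text/*.txt", "any/text/any/subdir/**/*.txt"]
    before = should_skip([volume_dir], patterns)

    # Simulate an earlier run that left an output file behind.
    done_file = Path(volume_dir, "any", "text", "result.txt")
    done_file.parent.mkdir(parents=True)
    done_file.write_text("done")

    after = should_skip([volume_dir], patterns)
    found = find_matches([volume_dir], patterns)

print(before, after, len(found))
```

With an empty pattern list nothing can match, which corresponds to omitting match_done so that the step always runs.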
match_in: list[str]

File matching patterns to search for, before running the step.

This is a list of glob patterns to search for before running the step. Each of the step’s volumes will be searched with the same list of patterns.

Any matches found will be noted in the ExecutionRecord. match_in is intended to support audits by accounting for the input files that went into a step, along with their content digests. Unlike match_done, match_in does not affect step execution.

steps:
  - name: match in example
    match_in:
      - one/specific/file.txt
      - any/text/*.txt
      - any/text/any/subdir/**/*.txt
match_out: list[str]

File matching patterns to search for, after running the step.

This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.

Any matches found will be noted in the ExecutionRecord. match_out is intended to support audits by accounting for the output files that came from a step, along with their content digests. Unlike match_done, match_out does not affect step execution.

steps:
  - name: match out example
    match_out:
      - one/specific/file.txt
      - any/text/*.txt
      - any/text/any/subdir/**/*.txt
match_summary: list[str]

File matching patterns to search for, after running the step, to include when summarizing results.

This is a list of glob patterns to search for after running the step. Each of the step’s volumes will be searched with the same list of patterns.

Any matches found will be noted in the ExecutionRecord. match_summary is intended to enrich pipeline execution summaries with custom columns. See StepResult.files_summary for how matched files are treated. Unlike match_done, match_summary does not affect step execution.

steps:
  - name: match summary example
    match_summary:
      - one/specific/file.txt
      - any/text/*.txt
      - any/text/any/subdir/**/*.txt
environment: dict[str, str]

Environment variables to set inside the step’s container.

This is a key-value mapping from environment variable names to values. The keys and values are both strings.

steps:
  - name: environment example
    environment:
      MLM_LICENSE_FILE: /license.lic
      foo: bar
gpus: str | bool = None

Whether or not to request GPU device support.

When gpus is True / truthy, request GPU device support similar to the Docker run --gpus all resource request. Note: the empty string "" will be treated as False.

steps:
  - name: gpus example
    gpus: true
user: str | int = None

User (and group) to run as in the container, instead of the container default (usually root).

When user is omitted or None the container will run with the default user and group specified in the image. This is usually root, or sometimes an image-specific user and group.

When user is provided it must be a string user name or int uid, with group/gid optional, as follows:

self or self:group

The special user name self means run with the uid of the current user on the Docker host. Optionally, this can be followed by a group name or gid as in self:group. When this group is a string name it must exist on the Docker host and will be converted to a host gid.

user or user:group

Other string user names and group names are used as-is and must exist within the image / container.

uid or uid:gid

Integer uids and gids don’t have to exist within the image / container. It’s probably helpful if they exist on the Docker host.

steps:
  - name: default/root user example

steps:
  - name: host current user and group example
    user: self

steps:
  - name: existing container user example
    user: container-user

steps:
  - name: integer uid and gid example
    user: 1234:5678
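The rules above for self, named users, and integer ids can be sketched as a small resolver. This is a sketch under assumptions, not Proceed's code: `resolve_user` and its `host_uid` and `host_group_to_gid` parameters are hypothetical, and by default it falls back to os.getuid() and grp.getgrnam() on a Unix-like host.

```python
import os


def resolve_user(user, host_uid=None, host_group_to_gid=None):
    """Turn a step's user value into a "user[:group]" string, or None for the default."""
    if user is None:
        return None  # Use the default user and group specified in the image.

    name, _, group = str(user).partition(":")
    if name != "self":
        # Other names and integer ids are passed through as-is.
        return str(user)

    # "self" means the uid of the current user on the host.
    uid = os.getuid() if host_uid is None else host_uid
    if not group:
        return str(uid)
    if group.isdigit():
        return f"{uid}:{group}"

    # A group name is looked up on the host and converted to a host gid.
    if host_group_to_gid is None:
        import grp
        gid = grp.getgrnam(group).gr_gid
    else:
        gid = host_group_to_gid(group)
    return f"{uid}:{gid}"


print(resolve_user(None))
print(resolve_user("container-user"))
print(resolve_user("1234:5678"))
print(resolve_user("self:staff", host_uid=1000, host_group_to_gid=lambda name: 50))
```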
network_mode: str = None

How to configure the container’s network environment.

When provided, this should be one of the following network modes:

bridge

create an isolated network environment for the container (default)

none

disable networking for the container

container:<name|id>

reuse the network of another container, by name or id

host

make the container’s network environment just like the host’s
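A pipeline author who wants to check a network_mode value before running might use something like the sketch below. `is_valid_network_mode` is a hypothetical helper that only encodes the four modes listed above. Whether Proceed performs a check like this is not stated here.

```python
def is_valid_network_mode(network_mode):
    """Check a value against the network modes listed in this section."""
    if network_mode is None:
        return True  # Omitted: the container uses its default network setup.
    if network_mode in ("bridge", "none", "host"):
        return True
    # container:<name|id> needs a non-empty name or id after the colon.
    prefix, _, target = network_mode.partition(":")
    return prefix == "container" and target != ""


for mode in [None, "bridge", "none", "host", "container:my-db", "container:", "bogus"]:
    print(repr(mode), is_valid_network_mode(mode))
```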

mac_address: str = None

Arbitrary MAC address to set in the container.

Perhaps surprisingly, containers can have arbitrary MAC “hardware” addresses.

steps:
  - name: mac address example
    mac_address: aa:bb:cc:dd:ee:ff
shm_size: str = None

Max size of the /dev/shm shared memory in-memory-file-system.

Docker defaults /dev/shm to 64 megabytes. Steps that need more can use shm_size to increase this limit. Plain integer values will be treated as bytes, for example 1000. Values with a unit suffix will be scaled by that unit, for example 10b (bytes), 10k, 10m, or 10g.

steps:
  - name: more-shm
    shm_size: 2g
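To make the unit suffixes concrete, here is a minimal sketch of converting a shm_size value to bytes. `shm_size_to_bytes` is a hypothetical helper, and it assumes binary multiples (1k = 1024 bytes), as Docker tooling conventionally uses. The actual parsing is done by Docker, not by this code.

```python
def shm_size_to_bytes(shm_size):
    """Convert values like 1000, "10b", "10k", "10m", or "2g" to a byte count."""
    # Assumption for this sketch: units are binary multiples of 1024.
    units = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = str(shm_size).strip().lower()
    if text.isdigit():
        return int(text)  # Plain integers are already bytes.
    number, suffix = text[:-1], text[-1:]
    if suffix not in units or not number.isdigit():
        raise ValueError(f"Unrecognized shm_size: {shm_size!r}")
    return int(number) * units[suffix]


for value in [1000, "10b", "10k", "64m", "2g"]:
    print(value, shm_size_to_bytes(value))
```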
privileged: bool = False

Whether the step’s container should run with elevated privileges and device access.

This defaults to False. Please only set privileged to True temporarily, for troubleshooting!

steps:
  - name: elevated-privileged
    privileged: True
X11: bool = False

Whether to set up the container as an X11 client app with DISPLAY access.

This defaults to False, assuming most steps are noninteractive. Set X11 to True to set up the container as an X11 GUI client app with DISPLAY access. This will modify the container environment in a few ways:

DISPLAY

Proceed will set the DISPLAY environment variable in the step container to match the host environment.

/tmp/.X11-unix

If the /tmp/.X11-unix directory exists on the host, Proceed will add it to the step’s Step.volumes. This lets the step container access local Unix sockets for connecting to a local X server.

Step.network_mode host

Proceed will set the step’s network mode to host. This lets the step container access TCP sockets for connecting to a remote/proxied X server as with ssh -X or ssh -Y.

XAUTHORITY

Proceed will set up the XAUTHORITY environment variable and .Xauthority cookie file based on the host environment. If the XAUTHORITY variable is set in the host environment Proceed will use this file path to locate the cookie file. Otherwise Proceed will use the default cookie file path which is the current host user’s $HOME/.Xauthority. If the cookie file exists on the host Proceed will add it to the step’s Step.volumes. Proceed will bind the cookie file to a fixed, known path in the container like /var/.Xauthority. Proceed will set the XAUTHORITY environment variable in the container to the same known path. Using a fixed path for the cookie file should avoid any dependency on the container user or HOME configuration (or lack thereof). All of this lets the step container authenticate with a remote/proxied X server as with ssh -X or ssh -Y.

steps:
  - name: x11-gui-client
    X11: True
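The four X11 adjustments described above can be summarized in a short sketch. This is not Proceed's implementation: `x11_settings` and its parameters are hypothetical. Mounting /tmp/.X11-unix at the same path in the container, and setting XAUTHORITY only when the cookie file exists, are both assumptions of this sketch.

```python
import os


def x11_settings(host_env, path_exists=os.path.exists, container_cookie="/var/.Xauthority"):
    """Collect the environment, volumes, and network mode for an X11 client step."""
    environment = {}
    volumes = {}

    # DISPLAY: match the host environment.
    if "DISPLAY" in host_env:
        environment["DISPLAY"] = host_env["DISPLAY"]

    # /tmp/.X11-unix: share local X server sockets if the directory exists.
    # Assumption: the container path is the same as the host path.
    if path_exists("/tmp/.X11-unix"):
        volumes["/tmp/.X11-unix"] = "/tmp/.X11-unix"

    # XAUTHORITY: locate the cookie file from the host variable, else $HOME/.Xauthority.
    host_cookie = host_env.get("XAUTHORITY") or os.path.join(
        host_env.get("HOME", ""), ".Xauthority"
    )
    if path_exists(host_cookie):
        # Bind the cookie to a fixed container path, independent of the container user.
        volumes[host_cookie] = container_cookie
        environment["XAUTHORITY"] = container_cookie

    # Host networking allows TCP connections to a remote/proxied X server.
    return {"environment": environment, "volumes": volumes, "network_mode": "host"}


# Example with a fake host: both the socket directory and the default cookie exist.
fake_paths = {"/tmp/.X11-unix", "/home/me/.Xauthority"}
settings = x11_settings(
    {"DISPLAY": ":0", "HOME": "/home/me"},
    path_exists=lambda path: path in fake_paths,
)
print(settings)
```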