queue

caput.pipeline.runner.queue(configfile: os.PathLike, submit: bool = False, lint: bool = True, profile: bool = False, profiler: str | None = 'cProfiler', psutil: bool = False, overwrite: Literal['never', 'failed'] = 'never', email: str | None = None, mailtype: str | None = None)

Queue a pipeline on a cluster from the given configfile.

This queues the job, using parameters from the cluster section of the submitted YAML file.

Parameters:

configfile (os.PathLike): Path to a .yaml pipeline config file.
submit (bool, optional): If True, the job will be submitted to the scheduler. Otherwise, the job directory and files will be created but not submitted. Default is False.
lint (bool, optional): If True, lint the configfile before creating any job files. Default is True.
profile (bool, optional): If True, use a profiler to monitor the time and resource usage of the pipeline job. Default is False.
profiler ({"cprofile", "pyinstrument"}, optional): Which profiler to use if profile is True. Default is cprofile.
psutil (bool, optional): If True, use psutil to monitor the memory use of the pipeline job. Default is False.
overwrite ({"never", "failed"}, optional): How to handle job directories which already exist. If "failed", only jobs which have reported FAILED will be re-queued. Default is "never".
email (str | None, optional): Email address for job status notifications. Default is None.
mailtype (str | None, optional): Types of job events for which to send email notifications. These are typically specific to the queue system used. Default is None.
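
As a rough illustration of how the parameters above fit together, the sketch below wraps `queue` in a small helper that creates the job directory and files without submitting them. The helper name `prepare_job` and the path `pipeline.yaml` are hypothetical; the keyword arguments are taken from the signature above. caput is imported inside the helper so that the function can be defined even where caput is not installed.

```python
from pathlib import Path


def prepare_job(configfile, submit=False):
    """Hypothetical helper: lint the config, then create (and optionally submit) the job."""
    # Imported lazily so this module loads even without caput installed.
    from caput.pipeline.runner import queue

    queue(
        Path(configfile),
        submit=submit,      # False: create the job directory and files only
        lint=True,          # lint the config before creating any job files
        overwrite="never",  # leave existing job directories alone
    )
```

Calling `prepare_job("pipeline.yaml")` would leave the generated job files in place for inspection, while `prepare_job("pipeline.yaml", submit=True)` would also hand the job to the scheduler.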

Cluster Config
~~~~~~~~~~~~~~

There are several *required* keys:
``nodes``: The number of nodes to run the job on.
``time``: The length of time to request for the job. Must be a string that the queueing system understands.
``directory``: The directory to place the output in.
There are many *optional* keys that control more functionality:
``system``: The name of the cluster that we are running on. If this is a known system (currently gpc, cedar, fir), more relevant defaults are used.
``queue_sys``: The queue system to run on. Either pbs or slurm.
``queue``: The queue to submit to. Only used for PBS.
``ompnum``: The number of OpenMP threads to use.
``pernode``: Number of processes to run on each node.
``mem``: How much memory to reserve per node.
``account``: The account to submit the job against. Only used on SLURM.
``ppn``: Only used for PBS. Should typically be equal to the number of processors on a node.
``venv``: Path to a virtual environment to load before running.
``module_list``: Only used for slurm. A list of module environments to load before running a job. If set, a module purge will occur before loading the specified modules. Sticky modules like StdEnv/* on Cedar and Fir will not get purged, and should not be specified. If not set, the current environment is used.
``module_path``: Only used for slurm. A list of module paths to use. May be required to load modules.
``temp_directory``: If set, save the output to a temporary location while running, then move it to the final location if the job finishes successfully. The move may be slow if the temporary and final directories are not on the same filesystem.
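
To make the key lists above concrete, here is a minimal sketch of a cluster section written as a Python mapping, with a check for the three required keys. Only the key names come from the lists above. The values, the `REQUIRED_KEYS` constant and the `missing_cluster_keys` helper are hypothetical and are not part of caput, which may validate the section differently.

```python
REQUIRED_KEYS = ("nodes", "time", "directory")


def missing_cluster_keys(cluster):
    """Return the required cluster keys that are absent from the mapping."""
    return [key for key in REQUIRED_KEYS if key not in cluster]


# Hypothetical values; only the key names are taken from the documentation.
cluster = {
    "nodes": 2,
    "time": "01:00:00",  # must be a string the queueing system understands
    "directory": "/path/to/output",
    "ompnum": 4,         # OpenMP threads
    "pernode": 8,        # processes per node
    "venv": "/path/to/venv",
}

print(missing_cluster_keys(cluster))       # []
print(missing_cluster_keys({"nodes": 1}))  # ['time', 'directory']
```

A check like this could be run before calling `queue`, so that a missing required key is reported before any job files are created.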