queue

caput.pipeline.runner.queue(configfile: os.PathLike, submit: bool = False, lint: bool = True, profile: bool = False, profiler: str | None = 'cProfiler', psutil: bool = False, overwrite: Literal['never', 'failed'] = 'never', email: str | None = None, mailtype: str | None = None)

Queue a pipeline on a cluster from the given configfile.

This queues the job, using parameters from the cluster section of the submitted YAML file.

Parameters:
configfile : os.PathLike

Path to a .yaml pipeline config file.

submit : bool, optional

If True, the job will be submitted to the scheduler. Otherwise, the job directory and files will be created but not submitted. Default is False.

lint : bool, optional

If True, lint the configfile before creating any job files. Default is True.

profile : bool, optional

If True, use a profiler to monitor the time and resource usage of the pipeline job. Default is False.

profiler : {“cprofile”, “pyinstrument”}, optional

Which profiler to use if profile is True. Default is cprofile.

psutil : bool, optional

If True, use psutil to monitor the memory use of the pipeline job. Default is False.

overwrite : {“never”, “failed”}, optional

How to handle job directories which already exist. If “failed”, only jobs which have reported FAILED will be re-queued. Default is “never”.

email : str | None, optional

Email address for job status notifications. Default is None.

mailtype : str | None, optional

Types of job events for which to send email notifications. These are typically specific to the queue system used. Default is None.
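
As a minimal sketch of a dry run, assuming caput is installed: the call below uses only keyword arguments from the signature above, so the job directory and files are created but nothing is submitted. The wrapper name `queue_dry_run` and the `DRY_RUN_KWARGS` dict are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Values chosen for illustration; the keys come from the documented signature.
DRY_RUN_KWARGS = {
    "submit": False,       # create the job directory and files, do not submit
    "lint": True,          # lint the config before creating any job files
    "overwrite": "never",  # documented default for existing job directories
}


def queue_dry_run(configfile):
    """Hypothetical wrapper: prepare a job from `configfile` without submitting."""
    # Imported lazily so this sketch can be loaded where caput is not installed.
    from caput.pipeline.runner import queue

    queue(Path(configfile), **DRY_RUN_KWARGS)
```

Once the generated job files look right, passing `submit=True` instead should hand the job to the scheduler, as described for the `submit` parameter.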

Cluster Config
~~~~~~~~~~~~~~
There are several *required* keys:

``nodes``

The number of nodes to run the job on.

``time``

The time length of the job. Must be a string that the queueing system understands.

``directory``

The directory to place the output in.
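
The three required keys can be checked before queueing with a small helper. This is a hypothetical pre-check written for illustration, not caput's own validation; the name `missing_required_keys` and the example values are assumptions.

```python
# The three keys listed as required in the cluster section.
REQUIRED_KEYS = ("nodes", "time", "directory")


def missing_required_keys(cluster):
    """Return the required cluster keys absent from `cluster`, in listed order."""
    return [key for key in REQUIRED_KEYS if key not in cluster]


# Placeholder values; "directory" is left out on purpose.
cluster = {"nodes": 4, "time": "1:00:00"}
print(missing_required_keys(cluster))  # prints ['directory']
```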

There are many *optional* keys that control more functionality:

``system``

The name of the cluster that we are running on. If this is a known system (currently gpc, cedar, fir), more relevant defaults are used.

``queue_sys``

The queue system to run on. Either pbs or slurm.

``queue``

The queue to submit to. Only used for PBS.

``ompnum``

The number of OpenMP threads to use.

``pernode``

Number of processes to run on each node.

``mem``

How much memory to reserve per node.

``account``

The account to submit the job against. Only used on SLURM.

``ppn``

Only used for PBS. Should typically be equal to the number of processors on a node.

``venv``

Path to a virtual environment to load before running.

``module_list``

Only used for slurm. A list of module environments to load before running a job. If set, a module purge will occur before loading the specified modules. Sticky modules like StdEnv/* on Cedar and Fir will not get purged, and should not be specified. If not set, the current environment is used.

``module_path``

Only used for slurm. A list of module paths to use. May be required to load modules.

``temp_directory``

If set, save the output to a temporary location while running, then move it to the final location if the job finishes successfully. This may be slow if the temporary and final directories are not on the same filesystem.
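
To make the keys above concrete, the sketch below writes an illustrative cluster section to a YAML file. It assumes the cluster section is a top-level `cluster:` mapping, and every value is a placeholder rather than a recommended setting. A complete config file would also need the pipeline definition, which is omitted here. The helper name `write_example_config` is hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: the cluster section is a top-level `cluster:` mapping.
# All values below are placeholders, not recommended settings.
EXAMPLE_CONFIG = """\
cluster:
  nodes: 2
  time: "1:00:00"
  directory: /path/to/output
  ompnum: 4
  pernode: 8
  temp_directory: /path/to/scratch
"""


def write_example_config(directory):
    """Write the example config into `directory` and return its path."""
    path = Path(directory) / "example_config.yaml"
    path.write_text(EXAMPLE_CONFIG)
    return path


# Write the file into a throwaway directory and show its contents.
with tempfile.TemporaryDirectory() as tmp:
    config_path = write_example_config(tmp)
    print(config_path.read_text())
```

The `time` string in particular must be in whatever format the target queueing system accepts, so the value shown is only an example.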