JobSubmitDescription
pyslurm.JobSubmitDescription
Submit Description for a Slurm Job.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
**kwargs | Any | Any valid Attribute this object has. | required |
Attributes:

Name | Type | Description |
---|---|---|
name | str | Name of the Job, same as -J/--job-name from sbatch. |
account | str | Account of the job, same as -A/--account from sbatch. |
user_id | Union[str, int] | Run the job as a different User, same as --uid from sbatch. This requires root privileges. You can specify either the name or the numeric uid of the User. |
group_id | Union[str, int] | Run the job as a different Group, same as --gid from sbatch. This requires root privileges. You can specify either the name or the numeric gid of the Group. |
priority | int | Specific priority the Job will receive. Same as --priority from sbatch. You can achieve the behaviour of sbatch's --hold option by specifying a priority of 0. |
site_factor | int | Site Factor of the Job. Only used when updating an existing Job. |
wckey | str | WCKey to use with the Job, same as --wckey from sbatch. |
array | str | Job Array specification, same as -a/--array from sbatch. |
batch_constraints | str | Batch Features of a Job, same as --batch from sbatch. |
begin_time | str | Defer allocation until the specified time, same as --begin from sbatch. |
clusters | Union[list, str] | Clusters the job may run on, same as -M/--clusters from sbatch. |
cluster_constraints | str | Comma-separated string of cluster constraints for the job. This is the same as --cluster-constraint from sbatch. |
comment | str | Arbitrary job comment, same as --comment from sbatch. |
admin_comment | str | Arbitrary job admin comment. Only used when updating an existing job. |
requires_contiguous_nodes | bool | Whether allocated Nodes are required to form a contiguous set. Same as --contiguous from sbatch. |
cores_reserved_for_system | int | Count of cores reserved for the system, not usable by the Job. Same as -S/--core-spec from sbatch. Mutually exclusive with threads_reserved_for_system. |
threads_reserved_for_system | int | Count of threads reserved for the system, not usable by the Job. Same as --thread-spec from sbatch. Mutually exclusive with cores_reserved_for_system. |
working_directory | str | Working directory for the Job. Defaults to the directory the job was submitted from. Same as -D/--chdir from sbatch. |
cpu_frequency | Union[dict, str] | CPU Frequency for the Job, same as --cpu-freq from sbatch. It can be given as a dict, as an sbatch-style string, as a standalone Governor string without any min or max, or as a specific, fixed frequency. See the examples below the table, and the sbatch documentation for --cpu-freq. |
nodes | Union[dict, str, int] | Amount of nodes needed for the job. This is the same as -N/--nodes from sbatch. It can be given as a min/max dict, as a plain int when no range is needed, or as an sbatch-style range string. See the examples below the table. |
deadline | str | Deadline specification for the Job, same as --deadline from sbatch. |
delay_boot_time | Union[str, int] | Delay boot specification for the Job, same as --delay-boot from sbatch. |
dependencies | Union[dict, str] | Dependencies for the Job, same as -d/--dependency from sbatch. |
excluded_nodes | Union[list, str] | Exclude specific nodes for this Job. This is the same as -x/--exclude from sbatch. |
required_nodes | Union[list, str] | Specific list of nodes required for the Job. This is the same as -w/--nodelist from sbatch. |
constraints | str | Required node features for the Job. This is the same as -C/--constraint from sbatch. |
kill_on_node_fail | bool | Whether the Job should be killed if one of its Nodes fails. This is the same as -k/--no-kill from sbatch. |
licenses | Union[list, str] | A list of licenses for the Job. This is the same as -L/--licenses from sbatch. |
mail_user | Union[list, str] | List of email addresses for notifications. This is the same as --mail-user from sbatch. |
mail_types | Union[list, str] | List of mail flags. This is the same as --mail-type from sbatch. |
mcs_label | str | An MCS Label for the Job. This is the same as --mcs-label from sbatch. |
memory_per_cpu | Union[str, int] | Memory required per allocated CPU. The default unit is Mebibytes; unit suffixes like K|M|G|T may also be used. This is the same as --mem-per-cpu from sbatch. Mutually exclusive with memory_per_node and memory_per_gpu. See the memory examples below the table. |
memory_per_node | Union[str, int] | Memory required per whole node. The default unit is Mebibytes; unit suffixes like K|M|G|T may also be used. This is the same as --mem from sbatch. Mutually exclusive with memory_per_cpu and memory_per_gpu. See the memory examples below the table. |
memory_per_gpu | Union[str, int] | Memory required per GPU. The default unit is Mebibytes; unit suffixes like K|M|G|T may also be used. This is the same as --mem-per-gpu from sbatch. Mutually exclusive with memory_per_node and memory_per_cpu. See the memory examples below the table. |
network | str | Network types for the Job. This is the same as --network from sbatch. |
nice | int | Adjusted scheduling priority for the Job. This is the same as --nice from sbatch. |
log_files_open_mode | str | Mode in which standard_output and standard_error log files should be opened. This is the same as --open-mode from sbatch. Valid options are "append" and "truncate". |
overcommit | bool | If the resources should be overcommitted. This is the same as -O/--overcommit from sbatch. |
partitions | Union[list, str] | A list of partitions the Job may use. This is the same as -p/--partition from sbatch. |
accounting_gather_frequency | Union[dict, str] | Interval at which accounting info is gathered. This is the same as --acctg-freq from sbatch. It can be given as a dict or as a single string; see the examples below the table. |
qos | str | Quality of Service for the Job. This is the same as -q/--qos from sbatch. |
requires_node_reboot | bool | Force the allocated nodes to reboot before the job starts. This is the same as --reboot from sbatch. |
is_requeueable | bool | If the Job is eligible for requeuing. This is the same as --requeue from sbatch. |
reservations | Union[list, str] | A list of possible reservations the Job can use. This is the same as --reservation from sbatch. |
script | str | Absolute path to, or content of, the batch script. You can either specify a path to a script file, which will then be loaded, or pass the script directly as a string. If the script is passed as a string, providing arguments to it (see script_args) is not supported. |
script_args | str | Arguments passed to the batch script. You can only set arguments if a file path was specified for script. |
environment | Union[dict, str] | Environment variables to be set for the Job. This is the same as --export from sbatch. |
resource_sharing | str | Controls resource sharing with other Jobs. This property combines the functionality of --oversubscribe and --exclusive from sbatch. Allowed values are "oversubscribe" (or "yes"), "no" (or "exclusive"), "user", and "mcs". |
distribution | str | Task distribution for the Job, same as --distribution from sbatch. |
time_limit | str | The time limit for the job. This is the same as -t/--time from sbatch. |
time_limit_min | str | A minimum time limit for the Job. This is the same as --time-min from sbatch. |
container | str | Path to an OCI container bundle. This is the same as --container from sbatch. |
cpus_per_task | int | The amount of cpus required for each task. This is the same as -c/--cpus-per-task from sbatch. Mutually exclusive with cpus_per_gpu. |
cpus_per_gpu | int | The amount of cpus required for each allocated GPU. This is the same as --cpus-per-gpu from sbatch. Mutually exclusive with cpus_per_task. |
sockets_per_node | int | Restrict the Job to nodes with at least this many sockets. This is the same as --sockets-per-node from sbatch. |
cores_per_socket | int | Restrict the Job to nodes with at least this many cores per socket. This is the same as --cores-per-socket from sbatch. |
threads_per_core | int | Restrict the Job to nodes with at least this many threads per core. This is the same as --threads-per-core from sbatch. |
gpus | Union[dict, str, int] | GPUs for the Job to be allocated in total. This is the same as -G/--gpus from sbatch. Specifying the type of the GPU is optional; the value can be given as a dict, as an sbatch-style string, or as a plain count. See the examples below the table. |
gpus_per_socket | Union[dict, str, int] | GPUs for the Job to be allocated per socket. This is the same as --gpus-per-socket from sbatch. Specifying the type of the GPU is optional. Note that setting gpus_per_socket also requires setting sockets_per_node. See the examples below the table. |
gpus_per_task | Union[dict, str, int] | GPUs for the Job to be allocated per task. This is the same as --gpus-per-task from sbatch. Specifying the type of the GPU is optional. Note that setting gpus_per_task also requires setting a task count, e.g. ntasks. See the examples below the table. |
gres_per_node | Union[dict, str] | Generic resources to be allocated per node. This is the same as --gres from sbatch. You should also use this option if you want to specify GPUs per node (--gpus-per-node). Specifying the type (by separating GRES name and type with a colon) is optional. See the examples below the table. |
gpu_binding | str | Specify GPU binding for the Job. This is the same as --gpu-bind from sbatch. |
ntasks | int | Maximum amount of tasks for the Job. This is the same as -n/--ntasks from sbatch. |
ntasks_per_node | int | Amount of tasks to be invoked on each node. This is the same as --ntasks-per-node from sbatch. |
ntasks_per_socket | int | Maximum amount of tasks to be invoked on each socket. This is the same as --ntasks-per-socket from sbatch. |
ntasks_per_core | int | Maximum amount of tasks to be invoked on each core. This is the same as --ntasks-per-core from sbatch. |
ntasks_per_gpu | int | Amount of tasks to be invoked per GPU. This is the same as --ntasks-per-gpu from sbatch. |
switches | Union[dict, str, int] | Maximum amount of leaf switches desired, optionally with a maximum waiting time for them. This is the same as --switches from sbatch. It can be given as a dict or as a single sbatch-style string; see the examples below the table. |
signal | Union[dict, str] | Warn signal to be sent to the Job. This is the same as --signal from sbatch. The signal can be specified either by its name, e.g. "SIGKILL", or as a number, e.g. 9. It can be given as a dict or as a single sbatch-style string; see the examples below the table. |
standard_in | str | Path to a file acting as standard_in for the batch-script. This is the same as -i/--input from sbatch. |
standard_error | str | Path to a file acting as standard_error for the batch-script. This is the same as -e/--error from sbatch. |
standard_output | str | Path to a file the Job's standard_output is written to. This is the same as -o/--output from sbatch. |
kill_on_invalid_dependency | bool | Kill the job if it has an invalid dependency. This is the same as --kill-on-invalid-dep from sbatch. |
spreads_over_nodes | bool | Spread the Job over as many nodes as possible. This is the same as --spread-job from sbatch. |
use_min_nodes | bool | Prefer the minimum amount of nodes specified. This is the same as --use-min-nodes from sbatch. |
gres_binding | str | Generic resource task binding options. This is part of the --gres-flags option from sbatch. Possible values are "enforce-binding" and "disable-binding". |
gres_tasks_per_sharing | str | Task sharing options for shared GRES. This is part of the --gres-flags option from sbatch. Possible values are "one-task-per-sharing" and "multiple-tasks-per-sharing". |
temporary_disk_per_node | Union[str, int] | Amount of temporary disk space needed per node. This is the same as --tmp from sbatch. You can specify unit suffixes like K|M|G|T (multiples of 1024); if no unit is given, the value is assumed to be in Mebibytes. See the examples below the table. |
get_user_environment | Union[str, bool, int] | TODO |
min_cpus_per_node | str | Set the minimum amount of CPUs required per Node. This is the same as --mincpus from sbatch. |
wait_all_nodes | bool | Controls when the execution of the command begins. If True, the Job begins execution only after all nodes in the allocation are ready. If False (the default), execution starts as soon as the allocation is made, without waiting for the nodes to be ready (i.e. booted). |
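Examples:

The dict and string forms referenced in the table are sketched below. The exact dict keys are assumptions modeled on the corresponding sbatch options, not verified against the pyslurm source. Starting with cpu_frequency, which accepts a dict, a standalone Governor string, or a fixed frequency:

```python
import pyslurm

desc = pyslurm.JobSubmitDescription()

# As a dict (keys assumed): min/max frequency plus an optional governor
desc.cpu_frequency = {"min": 1000000, "max": 3500000, "governor": "Performance"}

# As a standalone Governor string, without any min or max
desc.cpu_frequency = "Performance"

# As a specific, fixed frequency
desc.cpu_frequency = "3500000"
```

The snippets that follow reuse this `desc` instance.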
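A sketch for nodes; the dict keys "min" and "max" are an assumption:

```python
# As a dict, providing min/max nodes (keys assumed)
desc.nodes = {"min": 3, "max": 5}

# As a plain int when no range is needed
desc.nodes = 3

# As an sbatch-style range string
desc.nodes = "3-5"
```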
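A sketch for the memory attributes. Since the three are mutually exclusive, a real submission would set only one of them:

```python
# Default unit is Mebibytes; suffixes like K|M|G|T are also accepted
desc.memory_per_cpu = "2G"     # 2 Gibibytes per allocated CPU
# desc.memory_per_node = 2048  # 2048 Mebibytes per whole node
# desc.memory_per_gpu = "16G"  # 16 Gibibytes per GPU
```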
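A sketch for accounting_gather_frequency; the dict keys are assumed to match the --acctg-freq datatypes (task, energy, network, filesystem):

```python
# As a dict (keys assumed), with the interval in seconds
desc.accounting_gather_frequency = {"energy": 60, "task": 30}

# As a single sbatch-style string
desc.accounting_gather_frequency = "energy=60,task=30"
```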
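A sketch for gpus; keying the dict by GPU type is an assumption modeled on sbatch's [type:]count syntax:

```python
# As a dict, keyed by GPU type (form assumed)
desc.gpus = {"tesla": 1, "volta": 5}

# As an sbatch-style string
desc.gpus = "tesla:1,volta:5"

# As a plain count, if the type of the GPU doesn't matter
desc.gpus = 2
```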
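A sketch for gpus_per_socket, following the same three forms:

```python
# As a dict, keyed by GPU type (form assumed)
desc.gpus_per_socket = {"tesla": 1}

# As an sbatch-style string
desc.gpus_per_socket = "tesla:1"

# As a plain count, if the type doesn't matter
desc.gpus_per_socket = 2

# gpus_per_socket also requires a socket count to be set
desc.sockets_per_node = 2
```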
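A sketch for gpus_per_task, which additionally needs a task count:

```python
desc.gpus_per_task = {"tesla": 1}  # as a dict (form assumed)
desc.gpus_per_task = "tesla:1"     # as an sbatch-style string
desc.gpus_per_task = 2             # as a plain count

# gpus_per_task also requires a task count to be set, e.g.:
desc.ntasks = 4
```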
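A sketch for gres_per_node; the dict form with "name:type" keys is an assumption:

```python
# As a dict, GRES name and type separated by a colon (form assumed)
desc.gres_per_node = {"gpu:tesla": 2}

# As an sbatch-style string
desc.gres_per_node = "gpu:tesla:2"

# GPU GRES without a specific type
desc.gres_per_node = "gpu:2"
```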
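A sketch for switches; the dict keys are assumptions, while the string follows sbatch's count[@max-time] format:

```python
# As a dict (keys assumed)
desc.switches = {"count": 5, "max_wait_time": "01:00:00"}

# As a single string (sbatch-style)
desc.switches = "5@01:00:00"
```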
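A sketch for signal; the dict keys are assumptions. Both variants request a "SIGKILL" signal 120 seconds before the Job's time limit is reached:

```python
# As a dict (keys assumed)
desc.signal = {"signal": "SIGKILL", "time": 120}

# As a string (sbatch-style); the signal number works too, e.g. "9@120"
desc.signal = "SIGKILL@120"
```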
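A sketch for temporary_disk_per_node:

```python
desc.temporary_disk_per_node = "2T"  # 2 Tebibytes per node
desc.temporary_disk_per_node = 2048  # no unit given: 2048 Mebibytes
```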
load_environment(overwrite=False)
method descriptor
Load values of attributes provided through the environment.
Note: Instead of SBATCH_, pyslurm uses PYSLURM_JOBDESC_ as a prefix to identify environment variables which should be used to set attributes.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
overwrite | bool | If set to True, attributes that already have a value will be overwritten with the value found in the environment. If False, existing values are kept. | False |
Examples:

Let's say you want to set the name of the Job, its Account name, and specify that the Job cannot be requeued. For that, you need to have the following environment variables set:
# Format is: PYSLURM_JOBDESC_{ATTRIBUTE_NAME}
export PYSLURM_JOBDESC_ACCOUNT="myaccount"
export PYSLURM_JOBDESC_NAME="myjobname"
export PYSLURM_JOBDESC_IS_REQUEUEABLE="False"
As you can see above, boolean values should be given as the literal strings "False" or "True". In Python, you can then load them as sketched below:
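A minimal sketch, assuming the environment variables above are set:

```python
import pyslurm

desc = pyslurm.JobSubmitDescription()
desc.load_environment()

# The attributes were filled in from the PYSLURM_JOBDESC_* variables
print(desc.name)            # "myjobname"
print(desc.account)         # "myaccount"
print(desc.is_requeueable)  # False
```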
load_sbatch_options(overwrite=False)
method descriptor
Load values from #SBATCH options in the batch script.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
overwrite | bool | If set to True, attributes that already have a value will be overwritten with the value found in the #SBATCH options. If False, existing values are kept. | False |