Job
Note
This supersedes the pyslurm.slurmdb_job class, which will be removed in a future release.
pyslurm.db.Job
A Slurm Database Job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | An Integer representing a Job-ID. | 0 |
cluster | str | Name of the Cluster for this Job. Default is the name of the local Cluster. | None |
Other Parameters:
Name | Type | Description |
---|---|---|
admin_comment | str | Admin comment for the Job. |
comment | str | Comment for the Job |
wckey | str | Name of the WCKey for this Job |
derived_exit_code | int | Highest exit code of all the Job steps |
extra | str | Arbitrary string that can be stored with a Job. |
Attributes:
Name | Type | Description |
---|---|---|
steps | JobSteps | Steps this Job has |
stats | JobStatistics | Utilization statistics of this Job |
account | str | Account of the Job. |
admin_comment | str | Admin comment for the Job. |
num_nodes | int | Number of nodes this Job has allocated (if it is running) or requested (if it is still pending). |
array_id | int | The master Array-Job ID. |
array_tasks_parallel | int | Max number of array tasks allowed to run simultaneously. |
array_task_id | int | Array Task ID of this Job if it is an Array-Job. |
array_tasks_waiting | str | Array Tasks that are still waiting. |
association_id | int | ID of the Association this Job runs in. |
block_id | str | Name of the block used (for BlueGene Systems) |
cluster | str | Cluster this Job belongs to |
constraints | str | Constraints of the Job |
container | str | Path to OCI Container bundle |
db_index | int | Unique database index of the Job in the job table |
derived_exit_code | int | Highest exit code of all the Job steps |
derived_exit_code_signal | int | Signal of the derived exit code |
comment | str | Comment for the Job |
elapsed_time | int | Number of seconds elapsed for the Job |
eligible_time | int | When the Job became eligible to run, as a unix timestamp |
end_time | int | When the Job ended, as a unix timestamp |
extra | str | Arbitrary string that can be stored with a Job. |
exit_code | int | Exit code of the job script or salloc. |
exit_code_signal | int | Signal of the exit code for this Job. |
failed_node | str | Name of the failed node that caused the Job to get killed. |
group_id | int | ID of the group for this Job |
group_name | str | Name of the group for this Job |
id | int | ID of the Job |
name | str | Name of the Job |
mcs_label | str | MCS Label of the Job |
nodelist | str | Nodes this Job is using |
partition | str | Name of the Partition for this Job |
priority | int | Priority for the Job |
qos | str | Name of the Quality of Service for the Job |
cpus | int | Number of CPUs the Job has/had allocated, or, if the Job is still pending, the number requested. |
memory | int | Amount of memory the Job requested in total, in Mebibytes |
reservation | str | Name of the Reservation for this Job |
script | str | The batch script for this Job. Note: Only available if the "with_script" condition was given |
start_time | int | Time when the Job started, as a unix timestamp |
state | str | State of the Job |
state_reason | str | Last reason a Job was blocked from running |
cancelled_by | str | Name of the User who cancelled this Job |
submit_time | int | Time the Job was submitted, as a unix timestamp |
submit_command | str | Full command issued to submit the Job |
suspended_time | int | Number of seconds the Job was suspended |
system_comment | str | Arbitrary System comment for the Job |
time_limit | int | Time limit of the Job in minutes |
user_id | int | UID of the User this Job belongs to |
user_name | str | Name of the User this Job belongs to |
wckey | str | Name of the WCKey for this Job |
working_directory | str | Working directory of the Job |
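The time attributes above are plain integers (unix timestamps and seconds). The following is a minimal sketch of turning them into datetime objects and a queue wait time. The helper names are hypothetical and not part of pyslurm, and the sketch assumes that an unset timestamp shows up as None or 0, which is worth checking against your pyslurm version.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def to_datetime(unix_ts):
    """Convert a unix timestamp to an aware UTC datetime, or None if unset."""
    # Assumption: an unset time is reported as None or 0.
    if not unix_ts:
        return None
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc)


def queue_wait_seconds(job):
    """Seconds between submission and start, or None if either time is unset."""
    if not job.submit_time or not job.start_time:
        return None
    return job.start_time - job.submit_time


# Stand-in object using the attribute names from the table; the values are made up.
example = SimpleNamespace(
    submit_time=1_700_000_000, start_time=1_700_000_090, end_time=None
)
print(queue_wait_seconds(example))    # 90
print(to_datetime(example.end_time))  # None
```

Any object with submit_time and start_time attributes works here, so the same helpers could be applied to a loaded pyslurm.db.Job.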
load(job_id, cluster=None, with_script=False, with_env=False)
staticmethod
Load the information for a specific Job from the Database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | ID of the Job to be loaded. | required |
cluster | str | Name of the Cluster to search in. Default is the local Cluster. | None |
with_script | bool | Whether the Job-Script should also be loaded. Mutually exclusive with with_env. | False |
with_env | bool | Whether the Job Environment should also be loaded. Mutually exclusive with with_script. | False |
Returns:
Type | Description |
---|---|
Job | Returns a new Database Job instance |
Raises:
Type | Description |
---|---|
RPCError | If requesting the information for the database Job was not successful. |
Examples:
By default, attributes like script and environment are not populated. You must explicitly request one of them to be loaded by passing with_script=True or with_env=True.
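Since with_script and with_env are described as mutually exclusive, getting both the script and the environment takes two lookups. A minimal sketch of that pattern follows. The function name and its loader parameter are hypothetical (loader exists only so the function can be exercised without a slurmdbd); by default it falls back to pyslurm.db.Job.load.

```python
def load_script_and_environment(job_id, cluster=None, loader=None):
    """Return (script, environment) for one Job using two separate lookups."""
    if loader is None:
        # Imported lazily: pyslurm needs a working Slurm installation.
        import pyslurm
        loader = pyslurm.db.Job.load

    # One request per optional payload, because the two flags cannot be combined.
    job_with_script = loader(job_id, cluster=cluster, with_script=True)
    job_with_env = loader(job_id, cluster=cluster, with_env=True)
    return job_with_script.script, job_with_env.environment
```

On a system with a reachable slurmdbd, calling load_script_and_environment(10000) would issue both lookups for a Job with ID 10000; the ID is only an illustration.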
modify(changes, db_connection=None)
method descriptor
Modify a Slurm database Job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
changes | Job | Another pyslurm.db.Job object that contains all the changes to apply. Check the "Other Parameters" of the pyslurm.db.Job class to see which values can be changed. | required |
db_connection | Connection | A slurmdbd connection. See pyslurm.db.Jobs.modify for more info on this parameter. | None |
Raises:
Type | Description |
---|---|
RPCError | When modifying the Job failed. |
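As a rough sketch of how the instance method might be used: build a second Job object that carries only the changed values, then pass it to modify. The function name and the make_changes parameter are hypothetical (make_changes exists only so the sketch can be exercised without Slurm); by default it uses pyslurm.db.Job.

```python
def set_job_comment(job, text, make_changes=None):
    """Apply a new comment to an already loaded database Job."""
    if make_changes is None:
        # Imported lazily: pyslurm needs a working Slurm installation.
        import pyslurm
        make_changes = pyslurm.db.Job

    # The "changes" object only carries the values that should be updated.
    changes = make_changes(comment=text)
    # No db_connection is passed here; see pyslurm.db.Jobs.modify for
    # what that means for committing the change.
    job.modify(changes)
    return changes
```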
to_dict()
method descriptor
Convert Database Job information to a dictionary.
Returns:
Type | Description |
---|---|
dict | Database Job information as dict |
Examples:
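One common use of to_dict is serialising a Job for logging or export. The sketch below is an illustration under assumptions: job_to_json is a hypothetical helper, and default=str is used because the exact value types in the returned dictionary are not specified here.

```python
import json


def job_to_json(job):
    """Serialise the dictionary form of a database Job to a JSON string."""
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(job.to_dict(), indent=2, default=str)
```

Any object with a to_dict() method can be passed, so a Job returned by pyslurm.db.Job.load should work the same way.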
pyslurm.db.Jobs
Bases: pyslurm.xcollections.MultiClusterMap
A Multi Cluster collection of pyslurm.db.Job objects.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
jobs | Union[list[int], dict[int, Job], str] | Jobs to initialize this collection with. | None |
Attributes:
Name | Type | Description |
---|---|---|
stats | JobStatistics | Utilization statistics of this Job Collection |
cpus | int | Total amount of cpus requested. |
nodes | int | Total amount of nodes requested. |
memory | int | Total amount of requested memory in Mebibytes. |
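Beyond the built-in totals, the collection can be summarised by hand. The following sketch counts Jobs per state; count_by_state is a hypothetical helper, and it assumes the collection offers a dict-like values() that yields Job objects, which is worth verifying for the multi-cluster map in your pyslurm version.

```python
from collections import Counter


def count_by_state(jobs):
    """Count how many Jobs in a mapping-like collection are in each state."""
    # Assumption: jobs.values() yields objects with a "state" attribute.
    return Counter(job.state for job in jobs.values())
```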
calc_stats()
method descriptor
(Re)Calculate Statistics for the Job Collection.
load(db_filter=None, db_connection=None)
staticmethod
Load Jobs from the Slurm Database
Implements the slurmdb_jobs_get RPC.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
db_filter | JobFilter | A search filter that the slurmdbd will apply when retrieving Jobs from the database. | None |
db_connection | Connection | An open database connection. By default if none is specified, one will be opened automatically. | None |
Returns:
Type | Description |
---|---|
Jobs | A Collection of database Jobs. |
Raises:
Type | Description |
---|---|
RPCError | When getting the Jobs from the Database was not successful |
Examples:
Without a Filter the default behaviour applies, which is simply retrieving all Jobs from the same day:
>>> import pyslurm
>>> db_jobs = pyslurm.db.Jobs.load()
>>> print(db_jobs)
pyslurm.db.Jobs({1: pyslurm.db.Job(1), 2: pyslurm.db.Job(2)})
>>> print(db_jobs[1])
pyslurm.db.Job(1)
A Job Filter restricts the result, for example to Jobs that belong to specific Accounts.
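A minimal sketch of loading Jobs for a set of Accounts follows. The function name and the make_filter and loader parameters are hypothetical (they exist only so the function can be exercised without a slurmdbd); by default it uses pyslurm.db.JobFilter and pyslurm.db.Jobs.load. It assumes JobFilter accepts an accounts keyword with a list of account names.

```python
def load_jobs_for_accounts(accounts, make_filter=None, loader=None):
    """Load only those database Jobs that belong to the given accounts."""
    if make_filter is None or loader is None:
        # Imported lazily: pyslurm needs a working Slurm installation.
        import pyslurm
        make_filter = make_filter or pyslurm.db.JobFilter
        loader = loader or pyslurm.db.Jobs.load

    # Assumption: JobFilter takes the account names via "accounts".
    db_filter = make_filter(accounts=list(accounts))
    return loader(db_filter)
```

For instance, load_jobs_for_accounts(["acc1", "acc2"]) would request only Jobs charged to those two (made-up) account names.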
modify(db_filter, changes, db_connection=None)
staticmethod
Modify Slurm database Jobs.
Implements the slurmdb_job_modify RPC.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
db_filter | Union[JobFilter, Jobs] | A filter to decide which Jobs should be modified. | required |
changes | Job | Another pyslurm.db.Job object that contains all the changes to apply. Check the "Other Parameters" of the pyslurm.db.Job class to see which values can be changed. | required |
db_connection | Connection | A Connection to the slurmdbd. If no connection is supplied, one is created internally, and changes that the slurmdbd accepts are committed automatically. If you provide your own Connection instance (which must already be open), successful changes are only staged: when this function returns, they have not yet been applied. You are then responsible for committing or rolling back the changes using the respective methods on the connection object. This gives you a chance to see which Jobs were modified before you commit. | None |
Returns:
Type | Description |
---|---|
list[int] | A list of Jobs that were modified |
Raises:
Type | Description |
---|---|
RPCError | When a failure modifying the Jobs occurred. |
Examples:
In its simplest form, you can do something like this:
>>> import pyslurm
>>>
>>> db_filter = pyslurm.db.JobFilter(ids=[9999])
>>> changes = pyslurm.db.Job(comment="A comment for the job")
>>> modified_jobs = pyslurm.db.Jobs.modify(db_filter, changes)
>>> print(modified_jobs)
[9999]
In the above example, the changes will be automatically committed if successful. You can however also control this manually by providing your own connection object:
>>> import pyslurm
>>>
>>> db_conn = pyslurm.db.Connection.open()
>>> db_filter = pyslurm.db.JobFilter(ids=[9999])
>>> changes = pyslurm.db.Job(comment="A comment for the job")
>>> modified_jobs = pyslurm.db.Jobs.modify(
...     db_filter, changes, db_conn)
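At this point the changes are staged but not yet committed. You can first examine which Jobs have been modified by inspecting modified_jobs, and then commit the changes with db_conn.commit(). You can also explicitly roll back the changes with db_conn.rollback() instead of committing, so they will not become active.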
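The review-then-decide flow with your own connection can be sketched as a small helper. The function name and the approve and modify parameters are hypothetical (modify exists only so the helper can be exercised without a slurmdbd); by default it calls pyslurm.db.Jobs.modify. The sketch assumes the connection object provides commit() and rollback().

```python
def modify_with_review(db_conn, db_filter, changes, approve, modify=None):
    """Stage a modification, then commit or roll back based on a review callback."""
    if modify is None:
        # Imported lazily: pyslurm needs a working Slurm installation.
        import pyslurm
        modify = pyslurm.db.Jobs.modify

    # With an explicit connection, the changes are staged, not committed.
    modified_ids = modify(db_filter, changes, db_conn)

    # approve() receives the list of modified Job IDs and returns True or False.
    if approve(modified_ids):
        db_conn.commit()
        return modified_ids

    db_conn.rollback()
    return []
```

A review callback such as `lambda ids: ids == [9999]` would commit only if exactly the expected Job was touched and roll back otherwise.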