
Job

Note

This supersedes the pyslurm.slurmdb_job class, which will be removed in a future release.

pyslurm.db.Job

A Slurm Database Job.

Parameters:

Name Type Description Default
job_id int

An Integer representing a Job-ID.

0
cluster str

Name of the Cluster for this Job. Default is the name of the local Cluster.

None

Other Parameters:

Name Type Description
admin_comment str

Admin comment for the Job.

comment str

Comment for the Job

wckey str

Name of the WCKey for this Job

derived_exit_code int

Highest exit code of all the Job steps

extra str

Arbitrary string that can be stored with a Job.

Attributes:

Name Type Description
steps JobSteps

Steps this Job has

stats JobStatistics

Utilization statistics of this Job

account str

Account of the Job.

admin_comment str

Admin comment for the Job.

num_nodes int

Number of nodes allocated to this Job (if it is running) or requested (if it is still pending).

array_id int

The master Array-Job ID.

array_tasks_parallel int

Max number of array tasks allowed to run simultaneously.

array_task_id int

Array Task ID of this Job if it is an Array-Job.

array_tasks_waiting str

Array Tasks that are still waiting.

association_id int

ID of the Association this job runs in.

block_id str

Name of the block used (for BlueGene Systems)

cluster str

Cluster this Job belongs to

constraints str

Constraints of the Job

container str

Path to OCI Container bundle

db_index int

Unique database index of the Job in the job table

derived_exit_code int

Highest exit code of all the Job steps

derived_exit_code_signal int

Signal of the derived exit code

comment str

Comment for the Job

elapsed_time int

Elapsed time of the Job, in seconds

eligible_time int

When the Job became eligible to run, as a unix timestamp

end_time int

When the Job ended, as a unix timestamp

extra str

Arbitrary string that can be stored with a Job.

exit_code int

Exit code of the job script or salloc.

exit_code_signal int

Signal of the exit code for this Job.

failed_node str

Name of the failed node that caused the job to get killed.

group_id int

ID of the group for this Job

group_name str

Name of the group for this Job

id int

ID of the Job

name str

Name of the Job

mcs_label str

MCS Label of the Job

nodelist str

Nodes this Job is using

partition str

Name of the Partition for this Job

priority int

Priority for the Job

qos str

Name of the Quality of Service for the Job

cpus int

Number of CPUs allocated to the Job or, if the Job is still pending, the number of CPUs requested.

memory int

Amount of memory the Job requested in total, in Mebibytes

reservation str

Name of the Reservation for this Job

script str

The batch script for this Job. Note: Only available if the "with_script" condition was given

start_time int

Time when the Job started, as a unix timestamp

state str

State of the Job

state_reason str

Last reason a Job was blocked from running

cancelled_by str

Name of the User who cancelled this Job

submit_time int

Time the Job was submitted, as a unix timestamp

submit_command str

Full command issued to submit the Job

suspended_time int

Number of seconds the Job was suspended

system_comment str

Arbitrary System comment for the Job

time_limit int

Time limit of the Job in minutes

user_id int

UID of the User this Job belongs to

user_name str

Name of the User this Job belongs to

wckey str

Name of the WCKey for this Job

working_directory str

Working directory of the Job
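The time-related attributes above are plain integers in different units (unix timestamps, seconds, and minutes), so a small helper can make them easier to read. The following is a minimal sketch, not part of pyslurm: `summarize_times` is a hypothetical helper, and a `SimpleNamespace` with made-up values stands in for a loaded Job so the snippet runs without a Slurm database.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def summarize_times(job):
    """Derive readable timing values from the integer time attributes of a job-like object."""
    submitted = datetime.fromtimestamp(job.submit_time, tz=timezone.utc)
    started = datetime.fromtimestamp(job.start_time, tz=timezone.utc)
    return {
        "submitted": submitted,
        "started": started,
        # Time spent waiting in the queue before the Job started.
        "queue_wait": started - submitted,
        # elapsed_time is documented in seconds, time_limit in minutes.
        "elapsed": timedelta(seconds=job.elapsed_time),
        "time_limit": timedelta(minutes=job.time_limit),
    }


# Stand-in for a Job returned by pyslurm.db.Job.load(); values are made up.
fake_job = SimpleNamespace(
    submit_time=1_700_000_000,
    start_time=1_700_000_600,
    elapsed_time=5_400,
    time_limit=120,
)
summary = summarize_times(fake_job)
print(summary["queue_wait"])  # 0:10:00
print(summary["elapsed"] / summary["time_limit"])  # 0.75
```

This sketch assumes the Job has already started. A pending Job may not have a meaningful start_time, and that case is not handled here.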

load(job_id, cluster=None, with_script=False, with_env=False) staticmethod

Load the information for a specific Job from the Database.

Parameters:

Name Type Description Default
job_id int

ID of the Job to be loaded.

required
cluster str

Name of the Cluster to search in. Default is the local Cluster.

None
with_script bool

Whether the Job-Script should also be loaded. Mutually exclusive with with_env.

False
with_env bool

Whether the Job Environment should also be loaded. Mutually exclusive with with_script.

False

Returns:

Type Description
Job

Returns a new Database Job instance

Raises:

Type Description
RPCError

If requesting the information for the database Job was not successful.

Examples:

>>> import pyslurm
>>> db_job = pyslurm.db.Job.load(10000)

In the above example, attributes like script and environment are not populated. You must explicitly request one of them to be loaded:

>>> import pyslurm
>>> db_job = pyslurm.db.Job.load(10000, with_script=True)
>>> print(db_job.script)
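Because with_script and with_env are mutually exclusive, getting both pieces of information takes two separate loads. The sketch below illustrates one way to do that. `load_script_and_env` is a hypothetical helper, and `fake_loader` is a stand-in that only mimics the documented keyword arguments so the snippet runs without a slurmdbd.

```python
from types import SimpleNamespace


def load_script_and_env(job_id, loader, cluster=None):
    """Load a Job twice: once with its script, once with its environment.

    `loader` is expected to accept the same arguments as
    pyslurm.db.Job.load (assumption: keyword names as documented above).
    """
    job_with_script = loader(job_id, cluster=cluster, with_script=True)
    job_with_env = loader(job_id, cluster=cluster, with_env=True)
    return job_with_script, job_with_env


# Fake loader (not pyslurm): rejects the combination that the
# documentation describes as mutually exclusive.
def fake_loader(job_id, cluster=None, with_script=False, with_env=False):
    if with_script and with_env:
        raise ValueError("with_script and with_env are mutually exclusive")
    script = "#!/bin/bash\nhostname\n" if with_script else None
    return SimpleNamespace(id=job_id, script=script)


script_job, env_job = load_script_and_env(10000, fake_loader)
print(script_job.script)
```

With pyslurm installed, passing `pyslurm.db.Job.load` as `loader` should give the same flow against a real database. Each call may raise RPCError.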

modify(changes, db_connection=None) method descriptor

Modify a Slurm database Job.

Parameters:

Name Type Description Default
changes Job

Another pyslurm.db.Job object that contains all the changes to apply. Check the Other Parameters of the pyslurm.db.Job class to see which properties can be modified.

required
db_connection Connection

A slurmdbd connection. See pyslurm.db.Jobs.modify for more info on this parameter.

None

Raises:

Type Description
RPCError

When modifying the Job failed.

to_dict() method descriptor

Convert Database Job information to a dictionary.

Returns:

Type Description
dict

Database Job information as dict

Examples:

>>> import pyslurm
>>> myjob = pyslurm.db.Job.load(10000)
>>> myjob_dict = myjob.to_dict()
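A dictionary is convenient for export. The following sketch serializes such a dictionary to JSON and optionally keeps only selected keys. `job_dict_to_json` is a hypothetical helper, and the example dictionary is made up. It assumes the keys returned by to_dict() mirror the attribute names documented above.

```python
import json


def job_dict_to_json(job_dict, keys=None):
    """Serialize a Job dictionary to JSON, optionally keeping only selected keys."""
    if keys is not None:
        job_dict = {k: job_dict[k] for k in keys if k in job_dict}
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(job_dict, indent=2, sort_keys=True, default=str)


# Stand-in for the result of pyslurm.db.Job.load(10000).to_dict();
# the keys follow the documented attribute names, the values are made up.
example = {"id": 10000, "name": "train", "state": "COMPLETED", "cpus": 8}
print(job_dict_to_json(example, keys=["id", "state"]))
```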

pyslurm.db.Jobs

Bases: pyslurm.xcollections.MultiClusterMap

A Multi Cluster collection of pyslurm.db.Job objects.

Parameters:

Name Type Description Default
jobs Union[list[int], dict[int, Job], str]

Jobs to initialize this collection with.

None

Attributes:

Name Type Description
stats JobStatistics

Utilization statistics of this Job Collection

cpus int

Total number of CPUs requested.

nodes int

Total number of nodes requested.

memory int

Total amount of requested memory in Mebibytes.

calc_stats() method descriptor

(Re)Calculate Statistics for the Job Collection.

load(db_filter=None, db_connection=None) staticmethod

Load Jobs from the Slurm Database.

Implements the slurmdb_jobs_get RPC.

Parameters:

Name Type Description Default
db_filter JobFilter

A search filter that the slurmdbd will apply when retrieving Jobs from the database.

None
db_connection Connection

An open database connection. If none is specified, one is opened automatically.

None

Returns:

Type Description
Jobs

A Collection of database Jobs.

Raises:

Type Description
RPCError

When getting the Jobs from the Database was not successful

Examples:

Without a filter, the default behaviour applies, which retrieves all Jobs from the current day:

>>> import pyslurm
>>> db_jobs = pyslurm.db.Jobs.load()
>>> print(db_jobs)
pyslurm.db.Jobs({1: pyslurm.db.Job(1), 2: pyslurm.db.Job(2)})
>>> print(db_jobs[1])
pyslurm.db.Job(1)

With a JobFilter, only Jobs that belong to specific accounts are returned:

>>> import pyslurm
>>> accounts = ["acc1", "acc2"]
>>> db_filter = pyslurm.db.JobFilter(accounts=accounts)
>>> db_jobs = pyslurm.db.Jobs.load(db_filter)
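Once Jobs are loaded, their attributes can be aggregated with ordinary Python. The sketch below sums CPUs per account. `cpus_per_account` is a hypothetical helper, and the `SimpleNamespace` objects are made-up stand-ins for pyslurm.db.Job instances so the snippet runs without a Slurm database.

```python
from collections import defaultdict
from types import SimpleNamespace


def cpus_per_account(jobs):
    """Sum the `cpus` attribute of job-like objects, grouped by `account`."""
    totals = defaultdict(int)
    for job in jobs:
        # Treat a missing CPU count as zero.
        totals[job.account] += job.cpus or 0
    return dict(totals)


# Stand-ins for loaded Jobs; only the attributes used above are set.
fake_jobs = [
    SimpleNamespace(account="acc1", cpus=4),
    SimpleNamespace(account="acc2", cpus=16),
    SimpleNamespace(account="acc1", cpus=8),
]
print(cpus_per_account(fake_jobs))  # {'acc1': 12, 'acc2': 16}
```

With a real collection, something like `cpus_per_account(db_jobs.values())` should work, assuming `values()` yields the individual Job objects across all clusters.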

modify(db_filter, changes, db_connection=None) staticmethod

Modify Slurm database Jobs.

Implements the slurmdb_job_modify RPC.

Parameters:

Name Type Description Default
db_filter Union[JobFilter, Jobs]

A filter to decide which Jobs should be modified.

required
changes Job

Another pyslurm.db.Job object that contains all the changes to apply. Check the Other Parameters of the pyslurm.db.Job class to see which properties can be modified.

required
db_connection Connection

A Connection to the slurmdbd. If no connection is supplied, one is created internally, and any changes the slurmdbd accepts are committed automatically.

If you provide your own Connection instance (which must already be open), successful changes are only staged: when this function returns, they have not yet been applied. You are then responsible for committing or rolling back the changes with the respective methods on the connection object. This gives you a chance to see which Jobs were modified before you commit.

None

Returns:

Type Description
list[int]

A list of the IDs of the Jobs that were modified

Raises:

Type Description
RPCError

When a failure modifying the Jobs occurred.

Examples:

In its simplest form, you can do something like this:

>>> import pyslurm
>>>
>>> db_filter = pyslurm.db.JobFilter(ids=[9999])
>>> changes = pyslurm.db.Job(comment="A comment for the job")
>>> modified_jobs = pyslurm.db.Jobs.modify(db_filter, changes)
>>> print(modified_jobs)
[9999]

In the above example, the changes will be automatically committed if successful. You can however also control this manually by providing your own connection object:

>>> import pyslurm
>>>
>>> db_conn = pyslurm.db.Connection.open()
>>> db_filter = pyslurm.db.JobFilter(ids=[9999])
>>> changes = pyslurm.db.Job(comment="A comment for the job")
>>> modified_jobs = pyslurm.db.Jobs.modify(
...             db_filter, changes, db_conn)

Now you can first examine which Jobs have been modified:

>>> print(modified_jobs)
[9999]

And then you can actually commit the changes:

>>> db_conn.commit()

Alternatively, you can explicitly roll back these changes instead of committing them, so they never become active:

>>> db_conn.rollback()
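The review-then-commit flow described above can be wrapped in a small helper. This is a minimal sketch under stated assumptions, not pyslurm API: `modify_with_review` and its `approve` callback are hypothetical, and `FakeConnection` only records calls so the snippet runs without a slurmdbd.

```python
def modify_with_review(modify, db_conn, db_filter, changes, approve):
    """Apply changes on a caller-owned connection, then commit or roll back.

    `modify` is expected to behave like pyslurm.db.Jobs.modify, and
    `approve` is a hypothetical callback receiving the modified Job IDs.
    """
    try:
        modified = modify(db_filter, changes, db_conn)
    except Exception:
        # Design choice of this sketch: leave nothing staged on failure.
        db_conn.rollback()
        raise
    if approve(modified):
        db_conn.commit()
        return modified
    db_conn.rollback()
    return []


class FakeConnection:
    """Records commit()/rollback() calls; stands in for a slurmdbd connection."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


conn = FakeConnection()
result = modify_with_review(
    lambda db_filter, changes, db_conn: [9999],  # fake Jobs.modify
    conn, db_filter=None, changes=None,
    approve=lambda ids: ids == [9999],
)
print(result, conn.calls)  # [9999] ['commit']
```

With pyslurm installed, `modify` would be `pyslurm.db.Jobs.modify` and `db_conn` an already opened connection, as in the example above.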