JobStep

Note

This supersedes the pyslurm.jobstep class, which will be removed in a future release.

pyslurm.JobStep

A Slurm Jobstep

Parameters:

Name Type Description Default
job_id Union[Job, int]

The Job this Step belongs to.

0
step_id Union[int, str]

Step-ID for this JobStep object.

0

Other Parameters:

Name Type Description
time_limit int

Time limit in Minutes for this step.

Attributes:

Name Type Description
stats JobStepStatistics

Real-time statistics of a Step. Before you can access the stats data for a Step, you have to call the load_stats method of a Step instance or the Jobs collection.

pids dict[str, list]

Current Process-IDs of the Step, organized by node name. Before you can access the pids data, you have to call the load_stats method of a Step instance or the Jobs collection.

id Union[str, int]

The id for this step.

job_id int

The id for the Job this step belongs to.

name str

Name of the step.

user_id int

User ID who owns this step.

user_name str

Name of the User who owns this step.

time_limit int

Time limit in Minutes for this step.

network str

Network specification for the step.

cpu_frequency_min Union[str, int]

Minimum CPU-Frequency requested.

cpu_frequency_max Union[str, int]

Maximum CPU-Frequency requested.

cpu_frequency_governor Union[str, int]

CPU-Frequency Governor requested.

reserved_ports str

Reserved ports for the step.

cluster str

Name of the cluster this step runs on.

srun_host str

Name of the host srun was executed on.

srun_process_id int

Process ID of the srun command.

container str

Path to the OCI container bundle.

allocated_nodes str

Nodes the step is using.

start_time int

Time this step started, as unix timestamp.

run_time int

Seconds this step has been running for.

run_time_remaining int

Number of seconds the step has left before it hits the time_limit.

elapsed_cpu_time int

Amount of CPU-Time used by the step so far. This is the run_time multiplied by the number of CPUs allocated.

partition str

Name of the partition this step runs in.

state str

State the step is in.

cpus int

Number of CPUs this step uses in total.

ntasks int

Number of tasks this step uses.

distribution dict

Task distribution specification for the step.

command str

Command that was specified with srun.

slurm_protocol_version int

Slurm protocol version in use.
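The derived timing attributes above (elapsed_cpu_time and run_time_remaining) can be illustrated with plain arithmetic. This is a minimal sketch that does not call pyslurm; the helper names are hypothetical, and the handling of a step that has already exceeded its limit (clamping to zero) is an assumption, not documented behavior.

```python
def elapsed_cpu_time(run_time: int, cpus: int) -> int:
    """CPU-seconds used so far: wall-clock seconds times allocated CPUs."""
    return run_time * cpus


def run_time_remaining(run_time: int, time_limit: int) -> int:
    """Seconds left before the limit is hit.

    time_limit is in minutes, as documented. Clamping at zero is an
    assumption made for this sketch.
    """
    return max(time_limit * 60 - run_time, 0)


# Hypothetical step: running for 90 seconds on 4 CPUs, 5-minute limit.
print(elapsed_cpu_time(90, 4))    # 360
print(run_time_remaining(90, 5))  # 210
```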

cancel() method descriptor

Cancel a Job step.

Implements the slurm_kill_job_step RPC.

Raises:

Type Description
RPCError

When cancelling the Job step was not successful.

Examples:

>>> import pyslurm
>>> pyslurm.JobStep(9999, 1).cancel()

load(job_id, step_id) staticmethod

Load information for a specific job step.

Implements the slurm_get_job_steps RPC.

Parameters:

Name Type Description Default
job_id Union[Job, int]

ID of the Job the Step belongs to.

required
step_id Union[int, str]

Step-ID for the Step to be loaded.

required

Returns:

Type Description
JobStep

Returns a new JobStep instance.

Raises:

Type Description
RPCError

When retrieving Step information from the slurmctld was not successful.

Examples:

>>> import pyslurm
>>> jobstep = pyslurm.JobStep.load(9999, 1)

load_stats() method descriptor

Load realtime stats for this Step.

Calling this method returns the live statistics of the step, and additionally populates the stats and pids attributes of the instance.

Returns:

Type Description
JobStepStatistics

The statistics of the Step.

Raises:

Type Description
RPCError

When retrieving the stats for the Step failed.

Examples:

>>> import pyslurm
>>> step = pyslurm.JobStep.load(9999, 1)
>>> stats = step.load_stats()
>>>
>>> # Print the CPU Time Used
>>> print(stats.total_cpu_time)
>>>
>>> # Print the Process-IDs for the Step, organized by hostname
>>> print(step.pids)
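Since pids is documented as a dict[str, list] keyed by node name, it can be summarized with ordinary dict operations. The sketch below uses made-up data in that shape rather than a live step, and count_pids is a hypothetical helper, not part of pyslurm.

```python
# Hypothetical data in the documented shape: node name -> list of PIDs.
pids = {"node001": [4021, 4022, 4030], "node002": [3977]}


def count_pids(pids_by_node: dict[str, list]) -> dict[str, int]:
    """Return the number of processes per node."""
    return {node: len(node_pids) for node, node_pids in pids_by_node.items()}


per_node = count_pids(pids)
total = sum(per_node.values())

print(per_node)  # {'node001': 3, 'node002': 1}
print(total)     # 4
```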

modify(changes) method descriptor

Modify a job step.

Implements the slurm_update_step RPC.

Parameters:

Name Type Description Default
changes JobStep

Another JobStep object that contains all the changes to apply. Check the Other Parameters of the JobStep class to see which properties can be modified.

required

Raises:

Type Description
RPCError

When updating the JobStep was not successful.

Examples:

>>> import pyslurm
>>>
>>> # Setting the new time-limit to 20 days
>>> changes = pyslurm.JobStep(time_limit="20-00:00:00")
>>> pyslurm.JobStep(9999, 1).modify(changes)
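The example passes the limit as a days-hours:minutes:seconds string, while the time_limit attribute is documented in minutes. The following sketch shows how such a string corresponds to minutes. It is not pyslurm's own parser: time_limit_to_minutes is a hypothetical helper, it only handles the "D-HH:MM:SS" and "HH:MM:SS" forms, and rounding leftover seconds down is an assumption.

```python
def time_limit_to_minutes(spec: str) -> int:
    """Convert 'days-hours:minutes:seconds' (days optional) to whole minutes."""
    # rpartition yields an empty days part when no '-' is present.
    days, _, clock = spec.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_seconds = (int(days or 0) * 24 + hours) * 3600 + minutes * 60 + seconds
    # Assumption for this sketch: leftover seconds are dropped.
    return total_seconds // 60


print(time_limit_to_minutes("20-00:00:00"))  # 28800
print(time_limit_to_minutes("01:30:00"))     # 90
```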

send_signal(signal) method descriptor

Send a signal to a running Job step.

Implements the slurm_signal_job_step RPC.

Parameters:

Name Type Description Default
signal Union[str, int]

Any valid signal to send to the Job step. Can be either a str like "SIGUSR1", or simply an int.

required

Raises:

Type Description
RPCError

When sending the signal was not successful.

Examples:

Specifying the signal as a string:

>>> import pyslurm
>>> pyslurm.JobStep(9999, 1).send_signal("SIGUSR1")

or passing in a numeric signal:

>>> pyslurm.JobStep(9999, 1).send_signal(9)
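To see how a signal name relates to its number, Python's standard signal module can be used. This sketch only illustrates the two accepted input forms; to_signal_number is a hypothetical helper and does not reflect how pyslurm converts the value internally.

```python
import signal
from typing import Union


def to_signal_number(sig: Union[str, int]) -> int:
    """Return the numeric value for a signal given by name or by number."""
    if isinstance(sig, int):
        return int(sig)
    # Raises KeyError for names the platform does not define.
    return signal.Signals[sig.upper()].value


print(to_signal_number("SIGTERM"))  # 15
print(to_signal_number(9))          # 9
```

Signal numbers such as the one for SIGUSR1 differ between platforms, which is one reason to prefer the string form when the name is known.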

to_dict() method descriptor

JobStep information formatted as a dictionary.

Returns:

Type Description
dict

JobStep information as dict

pyslurm.JobSteps

Bases: builtins.dict

A dict of pyslurm.JobStep objects for a given Job.

Raises:

Type Description
RPCError

When getting the Job steps from the slurmctld failed.

load(job) staticmethod

Load the Job Steps from the system.

Parameters:

Name Type Description Default
job Union[Job, int]

The Job for which the Steps should be loaded.

required

Returns:

Type Description
JobSteps

JobSteps of the Job

Examples:

>>> import pyslurm
>>> steps = pyslurm.JobSteps.load(1)
>>> print(steps)
pyslurm.JobSteps({'batch': pyslurm.JobStep('batch')})
>>> print(steps["batch"])
pyslurm.JobStep('batch')

load_all() staticmethod

Loads all the steps in the system.

Returns:

Type Description
dict

A dict where every JobID (key) is mapped to an instance of its JobSteps (value).
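Because the result is a dict of dicts (JobID to JobSteps, and JobSteps itself being a dict of steps), walking every step takes two nested loops. The sketch below uses plain dicts and strings as stand-ins for the real objects; flatten_steps is a hypothetical helper, and the sample IDs are invented.

```python
# Stand-in for the documented return shape: job ID -> {step ID -> step}.
# Plain strings replace real JobStep objects in this sketch.
all_steps = {
    101: {"batch": "step-a", 0: "step-b"},
    102: {0: "step-c"},
}


def flatten_steps(steps_by_job):
    """Yield (job_id, step_id, step) for every step of every job."""
    for job_id, steps in steps_by_job.items():
        for step_id, step in steps.items():
            yield job_id, step_id, step


rows = list(flatten_steps(all_steps))
print(len(rows))  # 3
```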