scontrol - Used to view and modify Slurm configuration and state.
scontrol [OPTIONS...] [COMMAND...]
scontrol is used to view or modify Slurm configuration including: job, job step, node, partition, reservation, and overall system configuration. Most of the commands can only be executed by user root. If an attempt to view or modify configuration information is made by an unauthorized user, an error message will be printed and the requested action will not occur. If no command is entered on the execute line, scontrol will operate in an interactive mode and prompt for input. It will continue prompting for input and executing commands until explicitly terminated. If a command is entered on the execute line, scontrol will execute that command and terminate. All commands and options are case-insensitive, although node names, partition names, and reservation names are case-sensitive (node names "LX" and "lx" are distinct). All commands and options can be abbreviated to the extent that the specification is unique. A modified Slurm configuration can be written to a file using the scontrol write config command. The resulting file will be named using the convention "slurm.conf.<datetime>" and located in the same directory as the original "slurm.conf" file. The directory containing the original slurm.conf must be writable for this to occur.
-a, --all
When the show command is used, then display all partitions,
their jobs and job steps. This causes information to be
displayed about partitions that are configured as hidden and
partitions that are unavailable to the user's group.
-d, --details
Causes the show command to provide additional details where
available. Repeating the option more than once (e.g., "-dd")
will cause the show job command to also list the batch script,
if the job was a batch job.
-h, --help
Print a help message describing the usage of scontrol.
--hide Do not display information about hidden partitions, their jobs
and job steps. This is the default behavior: neither partitions
that are configured as hidden nor those unavailable to the
user's group are displayed.
-M, --clusters=<string>
The cluster to issue commands to. Only one cluster name may be
specified.
-o, --oneliner
Print information one line per record.
-Q, --quiet
Print no warning or informational messages, only fatal error
messages.
-v, --verbose
Print detailed event logging. Multiple -v's will further
increase the verbosity of logging. By default only errors will
be displayed.
-V, --version
Print version information and exit.
COMMANDS
all Show all partitions, their jobs and job steps. This causes
information to be displayed about partitions that are configured
as hidden and partitions that are unavailable to the user's group.
abort Instruct the Slurm controller to terminate immediately and
generate a core file. See "man slurmctld" for information about
where the core file will be written.
checkpoint CKPT_OP ID
Perform a checkpoint activity on the job step(s) with the
specified identification. ID can be used to identify a specific
job (e.g. "<job_id>", which applies to all of its existing
steps) or a specific job step (e.g. "<job_id>.<step_id>").
Acceptable values for CKPT_OP include:
able Test if presently not disabled, report start time if
checkpoint in progress
create Create a checkpoint and continue the job or job step
disable Disable future checkpoints
enable Enable future checkpoints
error Report the result for the last checkpoint request,
error code and message
restart Restart execution of the previously checkpointed job
or job step
requeue Create a checkpoint and requeue the batch job,
combines vacate and restart operations
vacate Create a checkpoint and terminate the job or job
step
Acceptable options include:
MaxWait=<seconds> Maximum time for checkpoint to be written.
Default value is 10 seconds. Valid with
create and vacate options only.
ImageDir=<directory_name>
Location of checkpoint file. Valid with
create, vacate and restart options only.
This value takes precedence over any
--checkpoint-dir value specified at job
submission time.
StickToNodes If set, resume job on the same nodes as
previously used. Valid with the restart
option only.
cluster CLUSTER_NAME
The cluster to issue commands to. Only one cluster name may be
specified.
create SPECIFICATION
Create a new partition or reservation. See the full list of
parameters below. Include the tag "res" to create a reservation
without specifying a reservation name.
completing
Display all jobs in a COMPLETING state along with associated
nodes in either a COMPLETING or DOWN state.
delete SPECIFICATION
Delete the entry with the specified SPECIFICATION. The two
SPECIFICATION choices are PartitionName=<name> and
Reservation=<name>. On dynamically laid out BlueGene systems
BlockName=<name> also works. Reservations and partitions should
have no associated jobs at the time of their deletion (modify
the jobs first). If the specified partition is in use, the
request is denied.
details
Causes the show command to provide additional details where
available. Job information will include CPUs and NUMA memory
allocated on each node. Note that on computers with
hyperthreading enabled and Slurm configured to allocate cores,
each listed CPU represents one physical core. Each hyperthread
on that core can be allocated a separate task, so a job's CPU
count and task count may differ. See the --cpu_bind and
--mem_bind option descriptions in srun man pages for more
information. The details option is currently only supported for
the show job command. To also list the batch script for batch
jobs, in addition to the details, use the script option
described below instead of this option.
errnumstr ERRNO
Given a Slurm error number, return a descriptive string.
exit Terminate the execution of scontrol. This is an independent
command with no options meant for use in interactive mode.
help Display a description of scontrol options and commands.
hide Do not display partition, job or job step information for
partitions that are configured as hidden or partitions that are
unavailable to the user's group. This is the default behavior.
hold job_list
Prevent a pending job from being started (sets its priority
to 0). Use the release command to permit the job to be
scheduled. The job_list argument is a comma separated list of
job IDs OR "jobname=" with the job's name, which will attempt to
hold all jobs having that name. Note that when a job is held by
a system administrator using the hold command, only a system
administrator may release the job for execution (also see the
uhold command). When the job is held by its owner, it may also
be released by the job's owner.
notify job_id message
Send a message to standard error of the salloc or srun command
or batch job associated with the specified job_id.
oneliner
Print information one line per record.
pidinfo proc_id
Print the Slurm job id and scheduled termination time
corresponding to the supplied process id, proc_id, on the
current node. This will work only with processes on the node on
which scontrol is run, and only for those processes spawned by
Slurm and their descendants.
listpids [job_id[.step_id]] [NodeName]
Print a listing of the process IDs in a job step (if
JOBID.STEPID is provided), or all of the job steps in a job (if
job_id is provided), or all of the job steps in all of the jobs
on the local node (if job_id is not provided or job_id is "*").
This will work only with processes on the node on which scontrol
is run, and only for those processes spawned by Slurm and their
descendants. Note that some Slurm configurations (ProctrackType
value of pgid or aix) are unable to identify all processes
associated with a job or job step.
Note that the NodeName option is only really useful when you
have multiple slurmd daemons running on the same host machine.
Multiple slurmd daemons on one host are, in general, only used
by Slurm developers.
ping Ping the primary and secondary slurmctld daemon and report if
they are responding.
quiet Print no warning or informational messages, only fatal error
messages.
quit Terminate the execution of scontrol.
reboot_nodes [NodeList]
Reboot all nodes in the system when they become idle using the
RebootProgram as configured in Slurm's slurm.conf file. Accepts
an optional list of nodes to reboot. By default all nodes are
rebooted. NOTE: This command does not prevent additional jobs
from being scheduled on these nodes, so many jobs can be
executed on the nodes prior to them being rebooted. You can
explicitly drain the nodes in order to reboot nodes as soon as
possible, but the nodes must also explicitly be returned to
service after being rebooted. You can alternately create an
advanced reservation to prevent additional jobs from being
initiated on nodes to be rebooted. NOTE: Nodes will be placed
in a state of "MAINT" until rebooted and returned to service
with a normal state. Alternately the node's state "MAINT" may
be cleared by using the scontrol command to set the node state
to "RESUME", which clears the "MAINT" flag.
reconfigure
Instruct all Slurm daemons to re-read the configuration file.
This command does not restart the daemons. This mechanism would
be used to modify configuration parameters (Epilog, Prolog,
SlurmctldLogFile, SlurmdLogFile, etc.). The Slurm controller
(slurmctld) forwards the request to all other daemons (slurmd
daemon on each compute node). Running jobs continue execution.
Most configuration parameters can be changed by just running
this command; however, Slurm daemons should be shut down and
restarted if any of these parameters are to be changed:
AuthType, BackupAddr, BackupController, ControlAddr,
ControlMach, PluginDir, StateSaveLocation, SlurmctldPort or
SlurmdPort. The slurmctld daemon must be restarted if nodes are
added to or removed from the cluster.
release job_list
Release a previously held job to begin execution. The job_list
argument is a comma separated list of job IDs OR "jobname=" with
the job's name, which will attempt to release all jobs having
that name. Also see hold.
requeue job_list
Requeue a running, suspended or finished Slurm batch job into
pending state. The job_list argument is a comma separated list
of job IDs.
requeuehold job_list
Requeue a running, suspended or finished Slurm batch job into
pending state, moreover the job is put in held state (priority
zero). The job_list argument is a comma separated list of job
IDs. A held job can be released using scontrol to reset its
priority (e.g. "scontrol release <job_id>"). The command
accepts the following option:
State=SpecialExit
The "SpecialExit" keyword specifies that the job has to
be put in a special state JOB_SPECIAL_EXIT. The
"scontrol show job" command will display the JobState as
SPECIAL_EXIT, while the "squeue" command will display it as SE.
resume job_list
Resume a previously suspended job. The job_list argument is a
comma separated list of job IDs. Also see suspend.
NOTE: A suspended job releases its CPUs for allocation to other
jobs. Resuming a previously suspended job may result in
multiple jobs being allocated the same CPUs, which could trigger
gang scheduling with some configurations or severe degradation
in performance with other configurations. Use of the scancel
command to send SIGSTOP and SIGCONT signals would stop a job
without releasing its CPUs for allocation to other jobs and
would be a preferable mechanism in many cases. Use with
caution.
schedloglevel LEVEL
Enable or disable scheduler logging. LEVEL may be "0", "1",
"disable" or "enable". "0" has the same effect as "disable". "1"
has the same effect as "enable". This value is temporary and
will be overwritten when the slurmctld daemon reads the
slurm.conf configuration file (e.g. when the daemon is restarted
or scontrol reconfigure is executed) if the SlurmSchedLogLevel
parameter is present.
script Causes the show job command to list the batch script for batch
jobs in addition to the detail information described under the
details option above.
setdebug LEVEL
Change the debug level of the slurmctld daemon. LEVEL may be an
integer value between zero and nine (using the same values as
SlurmctldDebug in the slurm.conf file) or the name of the most
detailed message type to be printed: "quiet", "fatal", "error",
"info", "verbose", "debug", "debug2", "debug3", "debug4", or
"debug5". This value is temporary and will be overwritten
whenever the slurmctld daemon reads the slurm.conf configuration
file (e.g. when the daemon is restarted or scontrol reconfigure
is executed).
setdebugflags [+|-]FLAG
Add or remove DebugFlags of the slurmctld daemon. See "man
slurm.conf" for a list of supported DebugFlags. NOTE: Changing
the value of some DebugFlags will have no effect without
restarting the slurmctld daemon, which would set DebugFlags
based upon the contents of the slurm.conf configuration file.
show ENTITY ID
Display the state of the specified entity with the specified
identification. ENTITY may be aliases, assoc_mgr, burstbuffer,
config, daemons, frontend, job, node, partition, powercap,
reservation, slurmd, step, topology, hostlist, hostlistsorted or
hostnames (also block or submp on BlueGene systems). ID can be
used to identify a specific element of the identified entity:
job ID, node name, partition name, reservation name, or job step
ID for job, node, partition, or step respectively. For an
ENTITY of topology, the ID may be a node or switch name. If one
node name is specified, all switches connected to that node (and
their parent switches) will be shown. If more than one node
name is specified, only switches that connect to all named nodes
will be shown. aliases will return all NodeName values
associated to a given NodeHostname (useful to get the list of
virtual nodes associated with a real node in a configuration
where multiple slurmd daemons execute on a single compute node).
assoc_mgr displays the current contents of the slurmctld's
internal cache for users, associations and/or qos. The ID may be
users=<user1>,[...,<userN>], accounts=<acct1>,[...,<acctN>],
qos=<qos1>,[...,<qosN>] and/or flags=<users,assoc,qos>, used to
filter the desired section to be displayed. If no flags are
specified, all sections are displayed. burstbuffer displays the
current status of the BurstBuffer plugin. config displays
parameter names from the configuration files in mixed case (e.g.
SlurmdPort=7003) while derived parameter names are in upper
case only (e.g. SLURM_VERSION). hostnames takes an optional
hostlist expression as input and writes a list of individual
host names to standard output (one per line). If no hostlist
expression is supplied, the contents of the SLURM_NODELIST
environment variable are used. For example "tux[1-3]" is mapped
to "tux1","tux2" and "tux3" (one hostname per line). hostlist
takes a list of host names and prints the hostlist expression
for them (the inverse of hostnames). hostlist can also take the
absolute pathname of a file (beginning with the character '/')
containing a list of hostnames. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). All other ID values must identify a single
element. The job step ID is of the form "job_id.step_id", (e.g.
"1234.1"). slurmd reports the current status of the slurmd
daemon executing on the same node from which the scontrol
command is executed (the local host). It can be useful to
diagnose problems. By default hostlist does not sort the node
list or make it unique (e.g. tux2,tux1,tux2 = tux[2,1-2]). If
you want a sorted list use hostlistsorted (e.g. tux2,tux1,tux2
= tux[1-2,2]). By default, all elements of the entity type
specified are printed. For an ENTITY of job, if the job does
not specify sockets-per-node, cores-per-socket or
threads-per-core then it will display '*' in the ReqS:C:T=*:*:* field.
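The mapping that show hostnames performs on a simple node range expression can be illustrated offline. The following Python sketch is not Slurm's implementation and does not call scontrol; the function name expand_hostlist is hypothetical, and it only handles a single bracketed group of comma separated numbers and ranges. Like hostlist, it neither sorts the result nor removes duplicates.

```python
import re


def expand_hostlist(expr):
    """Expand a simple expression such as "tux[1-3]" into host names.

    Illustrative only: supports one prefix followed by one bracketed
    group, e.g. "lx[10-20]" or "tux[2,1-2]".
    """
    match = re.fullmatch(r"([^\[\],]+)\[([0-9,\-]+)\]", expr)
    if match is None:
        # No bracketed group: treat the input as a plain comma separated list.
        return [name for name in expr.split(",") if name]
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        high = high or low          # a single number is a one-element range
        width = len(low)            # preserve zero padding such as "01"
        for number in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{number:0{width}d}")
    return hosts


if __name__ == "__main__":
    # One host name per line, as show hostnames writes them.
    for host in expand_hostlist("tux[1-3]"):
        print(host)
```

The order of the bracketed items is preserved, so "tux[2,1-2]" yields tux2, tux1, tux2, which corresponds to the unsorted hostlist example above.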
shutdown OPTION
Instruct Slurm daemons to save current state and terminate. By
default, the Slurm controller (slurmctld) forwards the request
to all other daemons (slurmd daemon on each compute node). An
OPTION of slurmctld or controller results in only the slurmctld
daemon being shut down and the slurmd daemons remaining active.
suspend job_list
Suspend a running job. The job_list argument is a comma
separated list of job IDs. Use the resume command to resume its
execution. User processes must stop on receipt of SIGSTOP
signal and resume upon receipt of SIGCONT for this operation to
be effective. Not all architectures and configurations support
job suspension. If a suspended job is requeued, it will be
placed in a held state.
takeover
Instruct Slurm's backup controller (slurmctld) to take over
system control. Slurm's backup controller requests control from
the primary and waits for its termination. After that, it
switches from backup mode to controller mode. If the primary
controller cannot be contacted, it directly switches to
controller mode. This can be used to speed up the Slurm
controller fail-over mechanism when the primary node is down.
This can be used to minimize disruption if the computer
executing the primary Slurm controller is scheduled down.
(Note: Slurm's primary controller will take the control back at
startup.)
top job_id
Move the specified job ID to the top of the queue of jobs
belonging to the identical user ID, partition name, account, and
QOS. Any job not matching all of those fields will not be
affected. Only jobs submitted to a single partition will be
affected. This operation changes the order of jobs by adjusting
job nice values. The net effect on that user's throughput will
be negligible to slightly negative. This operation may be
disabled by the system administrator by including the option
"disable_user_top" in the SchedulerParameters configuration
parameter.
uhold job_list
Prevent a pending job from being started (sets its priority to
0). The job_list argument is a space separated list of job IDs
or job names. Use the release command to permit the job to be
scheduled. This command is designed for a system administrator
to hold a job so that the job owner may release it rather than
requiring the intervention of a system administrator (also see
the hold command).
update SPECIFICATION
Update job, step, node, partition, powercapping or reservation
configuration per the supplied specification. SPECIFICATION is
in the same format as the Slurm configuration file and the
output of the show command described above. It may be desirable
to execute the show command (described above) on the specific
entity you want to update, then use cut-and-paste tools to enter
updated configuration values to the update. Note that while most
configuration values can be changed using this command, not all
can be changed using this mechanism. In particular, the hardware
configuration of a node or the physical addition or removal of
nodes from the cluster may only be accomplished through editing
the Slurm configuration file and executing the reconfigure
command (described above).
verbose
Print detailed event logging. This includes time-stamps on data
structures, record counts, etc.
version
Display the version number of scontrol being executed.
wait_job job_id
Wait until a job and all of its nodes are ready for use or the
job has entered some termination state. This option is
particularly useful in the Slurm Prolog or in the batch script
itself if nodes are powered down and restarted automatically as
needed.
write config
Write the current configuration to a file with the naming
convention of "slurm.conf.<datetime>" in the same directory as
the original slurm.conf file.
!! Repeat the last command executed.
SPECIFICATIONS FOR UPDATE COMMAND, JOBS
Account=<account>
Account name to be changed for this job's resource use. Value
may be cleared with blank data value, "Account=".
ArrayTaskThrottle=<count>
Specify the maximum number of tasks in a job array that can
execute at the same time. Set the count to zero in order to
eliminate any limit. The task throttle count for a job array is
reported as part of its ArrayTaskId field, preceded with a
percent sign. For example "ArrayTaskId=1-10%2" indicates the
maximum number of running tasks is limited to 2.
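As an illustration of how the throttle appears in the ArrayTaskId field, the following Python sketch splits a value such as "1-10%2" into the task specification and the throttle count. The function name split_array_task_id is hypothetical and the sketch only handles the textual form shown above; it does not query Slurm.

```python
def split_array_task_id(field):
    """Split an ArrayTaskId value into (task_spec, throttle).

    throttle is None when no percent sign is present in the field.
    """
    task_spec, separator, limit = field.partition("%")
    throttle = int(limit) if separator else None
    return task_spec, throttle


if __name__ == "__main__":
    print(split_array_task_id("1-10%2"))
```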
BurstBuffer=<spec>
Burst buffer specification to be changed for this job's resource
use. Value may be cleared with blank data value,
"BurstBuffer=". Format is burst buffer plugin specific.
Conn-Type=<type>
Reset the node connection type. Supported only on IBM BlueGene
systems. Possible values are "MESH", "TORUS" and "NAV" (mesh
else torus).
Contiguous=<yes|no>
Set the job's requirement for contiguous (consecutive) nodes to
be allocated. Possible values are "YES" and "NO". Only the
Slurm administrator or root can change this parameter.
Deadline=<time_spec>
It accepts times of the form HH:MM:SS to specify a deadline to a
job at a specific time of day (seconds are optional). You may
also specify midnight, noon, fika (3 PM) or teatime (4 PM) and
you can have a time-of-day suffixed with AM or PM for a deadline
in the morning or the evening. You can specify a deadline for
the job with a date of the form MMDDYY or MM/DD/YY or MM.DD.YY,
or a date and time as YYYY-MM-DD[THH:MM[:SS]]. You can also
give times like now + count time-units, where the time-units can
be minutes, hours, days, or weeks and you can tell Slurm to put
a deadline for tomorrow with the keyword tomorrow. The
specified deadline must be later than the current time. Only
pending jobs can have the deadline updated. Only the Slurm
administrator or root can change this parameter.
Dependency=<dependency_list>
Defer job's initiation until specified job dependency
specification is satisfied. Cancel dependency with an empty
dependency_list (e.g. "Dependency="). <dependency_list> is of
the form <type:job_id[:job_id][,type:job_id[:job_id]]>. Many
jobs can share the same dependency and these jobs may even
belong to different users.
after:job_id[:jobid...]
This job can begin execution after the specified jobs
have begun execution.
afterany:job_id[:jobid...]
This job can begin execution after the specified jobs
have terminated.
afternotok:job_id[:jobid...]
This job can begin execution after the specified jobs
have terminated in some failed state (non-zero exit code,
node failure, timed out, etc).
afterok:job_id[:jobid...]
This job can begin execution after the specified jobs
have successfully executed (ran to completion with an
exit code of zero).
singleton
This job can begin execution after any previously
launched jobs sharing the same job name and user have
terminated.
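The structure of a dependency_list can be checked before it is passed to the update command. The following Python sketch parses the form described above into (type, job IDs) pairs. It is only an illustration: the name parse_dependency is hypothetical, only the five dependency types listed here are recognized, and job IDs are kept as strings without checking that the jobs exist.

```python
KNOWN_TYPES = {"after", "afterany", "afternotok", "afterok", "singleton"}


def parse_dependency(spec):
    """Parse "type:job_id[:job_id][,type:job_id...]" into a list of pairs.

    An empty string (as in "Dependency=") yields an empty list.
    """
    dependencies = []
    if not spec:
        return dependencies
    for item in spec.split(","):
        kind, *job_ids = item.split(":")
        if kind not in KNOWN_TYPES:
            raise ValueError(f"unknown dependency type: {kind!r}")
        if kind == "singleton" and job_ids:
            raise ValueError("singleton takes no job IDs")
        if kind != "singleton" and not job_ids:
            raise ValueError(f"{kind} requires at least one job ID")
        dependencies.append((kind, job_ids))
    return dependencies


if __name__ == "__main__":
    print(parse_dependency("afterok:123:124,singleton"))
```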
EligibleTime=<time_spec>
See StartTime.
ExcNodeList=<nodes>
Set the job's list of excluded nodes. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). Value may be cleared with blank data value,
"ExcNodeList=".
Features=<features>
Set the job's required node features. The list of features may
include multiple feature names separated by ampersand (AND)
and/or vertical bar (OR) operators. For example:
Features="opteron&video" or Features="fast|faster". In the
first example, only nodes having both the feature "opteron" AND
the feature "video" will be used. There is no mechanism to
specify that you want one node with feature "opteron" and
another node with feature "video" in case no node has both
features. If only one of a set of possible options should be
used for all allocated nodes, then use the OR operator and
enclose the options within square brackets. For example:
"Features=[rack1|rack2|rack3|rack4]" might be used to specify
that all nodes must be allocated on a single rack of the
cluster, but any of those four racks can be used. A request can
also specify the number of nodes needed with some feature by
appending an asterisk and count after the feature name. For
example "Features=graphics*4" indicates that at least four
allocated nodes must have the feature "graphics." Constraints
with node counts may only be combined with AND operators. Value
may be cleared with blank data value, for example "Features=".
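The meaning of the AND and OR operators can be illustrated by testing an expression against the feature set of a single node. The following Python sketch is a simplified model, not Slurm's matching logic: the name node_matches is hypothetical, and square brackets, node counts and expressions that mix both operators are deliberately out of scope.

```python
def node_matches(expression, node_features):
    """Return True if a node with node_features satisfies expression.

    Supports only "a&b&..." (all required) or "a|b|..." (any accepted).
    """
    if "&" in expression and "|" in expression:
        raise ValueError("mixed operators are outside the scope of this sketch")
    if "|" in expression:
        return any(name in node_features for name in expression.split("|"))
    return all(name in node_features for name in expression.split("&"))


if __name__ == "__main__":
    # A node must have both features to satisfy "opteron&video".
    print(node_matches("opteron&video", {"opteron", "video"}))
    print(node_matches("opteron&video", {"opteron"}))
    # Either feature is sufficient for "fast|faster".
    print(node_matches("fast|faster", {"faster"}))
```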
Geometry=<geo>
Reset the required job geometry. On Blue Gene the value should
be three digits separated by "x" or ",". The digits represent
the allocation size in X, Y and Z dimensions (e.g. "2x3x4").
Gres=<list>
Specifies a comma delimited list of generic consumable
resources. The format of each entry on the list is
"name[:count[*cpu]]". The name is that of the consumable
resource. The count is the number of those resources with a
default value of 1. The specified resources will be allocated
to the job on each node allocated unless "*cpu" is appended, in
which case the resources will be allocated on a per cpu basis.
The available generic consumable resources are configurable by
the system administrator. A list of available generic
consumable resources will be printed and the command will exit
if the option argument is "help". Examples of use include
"Gres=gpus:2*cpu,disk=40G" and "Gres=help".
JobId=<job_list>
Identify the job(s) to be updated. The job_list may be a comma
separated list of job IDs. Either JobId or JobName is required.
Licenses=<name>
Specification of licenses (or other resources available on all
nodes of the cluster) as described in salloc/sbatch/srun man
pages.
MinCPUsNode=<count>
Set the job's minimum number of CPUs per node to the specified
value.
MinMemoryCPU=<megabytes>
Set the job's minimum real memory required per allocated CPU to
the specified value. Either MinMemoryCPU or MinMemoryNode may be
set, but not both.
MinMemoryNode=<megabytes>
Set the job's minimum real memory required per node to the
specified value. Either MinMemoryCPU or MinMemoryNode may be
set, but not both.
MinTmpDiskNode=<megabytes>
Set the job's minimum temporary disk space required per node to
the specified value. Only the Slurm administrator or root can
change this parameter.
JobName=<name>
Identify the name of jobs to be modified or set the job's name
to the specified value. When used to identify jobs to be
modified, all jobs belonging to all users are modified unless
the UserID option is used to identify a specific user. Either
JobId or JobName is required.
Nice[=delta]
Adjust job's priority by the specified value. Default value is
100. The adjustment range is from -10000 (highest priority) to
10000 (lowest priority). Nice value changes are not additive,
but overwrite any prior nice value and are applied to the job's
base priority. Only privileged users, Slurm administrator or
root, can specify a negative adjustment.
NodeList=<nodes>
Change the nodes allocated to a running job to shrink its size.
The specified list of nodes must be a subset of the nodes
currently allocated to the job. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). After a job's allocation is reduced, subsequent
srun commands must explicitly specify node and task counts which
are valid for the new allocation.
NumCPUs=<min_count>[-<max_count>]
Set the job's minimum and optionally maximum count of CPUs to be
allocated.
NumNodes=<min_count>[-<max_count>]
Set the job's minimum and optionally maximum count of nodes to
be allocated. If the job is already running, use this to
specify a node count less than currently allocated and resources
previously allocated to the job will be relinquished. After a
job's allocation is reduced, subsequent srun commands must
explicitly specify node and task counts which are valid for the
new allocation. Also see the NodeList parameter above.
NumTasks=<count>
Set the job's count of required tasks to the specified value.
OverSubscribe=<yes|no>
Set the job's ability to share compute resources (i.e.
individual CPUs) with other jobs. Possible values are "YES" and
"NO". This option can only be changed for pending jobs.
Partition=<name>
Set the job's partition to the specified value.
Priority=<number>
Set the job's priority to the specified value. Note that a job
priority of zero prevents the job from ever being scheduled. By
setting a job's priority to zero it is held. Set the priority
to a non-zero value to permit it to run. Explicitly setting a
job's priority clears any previously set nice value and removes
the priority/multifactor plugin's ability to manage a job's
priority. In order to restore the priority/multifactor plugin's
ability to manage a job's priority, hold and then release the
job. Only the Slurm administrator or root can increase a job's
priority.
QOS=<name>
Set the job's QOS (Quality Of Service) to the specified value.
Value may be cleared with blank data value, "QOS=".
ReqNodeList=<nodes>
Set the job's list of required nodes. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). Value may be cleared with blank data value,
"ReqNodeList=".
Requeue=<0|1>
Stipulates whether a job should be requeued after a node
failure: 0 for no, 1 for yes.
ReservationName=<name>
Set the job's reservation to the specified value. Value may be
cleared with blank data value, "ReservationName=".
Rotate=<yes|no>
Permit the job's geometry to be rotated. Possible values are
"YES" and "NO".
Shared=<yes|no>
See OverSubscribe option above.
StartTime=<time_spec>
Set the job's earliest initiation time. It accepts times of the
form HH:MM:SS to run a job at a specific time of day (seconds
are optional). (If that time is already past, the next day is
assumed.) You may also specify midnight, noon, fika (3 PM) or
teatime (4 PM) and you can have a time-of-day suffixed with AM
or PM for running in the morning or the evening. You can also
say what day the job will be run, by specifying a date of the
form MMDDYY or MM/DD/YY or MM.DD.YY, or a date and time as
YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now +
count time-units, where the time-units can be minutes, hours,
days, or weeks and you can tell Slurm to run the job today with
the keyword today and to run the job tomorrow with the keyword
tomorrow.
Notes on date/time specifications:
- although the 'seconds' field of the HH:MM:SS time
specification is allowed by the code, note that the poll time of
the Slurm scheduler is not precise enough to guarantee dispatch
of the job on the exact second. The job will be eligible to
start on the next poll following the specified time. The exact
poll interval depends on the Slurm scheduler (e.g., 60 seconds
with the default sched/builtin).
- if no time (HH:MM:SS) is specified, the default is
(00:00:00).
- if a date is specified without a year (e.g., MM/DD) then the
current year is assumed, unless the combination of MM/DD and
HH:MM:SS has already passed for that year, in which case the
next year is used.
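The "now + count time-units" form described for StartTime can be illustrated with a small calculation. The following Python sketch resolves such a specification against an explicit reference time. It is not Slurm's parser: the name resolve_relative_time is hypothetical, and the exact spellings and spacing that scontrol accepts are an assumption here (singular or plural units, optional spaces around the plus sign).

```python
import re
from datetime import datetime, timedelta


def resolve_relative_time(spec, now):
    """Resolve "now + <count> <unit>" relative to the datetime `now`.

    Units are limited to those named above: minutes, hours, days, weeks.
    """
    match = re.fullmatch(
        r"now\s*\+\s*(\d+)\s*(minute|hour|day|week)s?",
        spec.strip().lower(),
    )
    if match is None:
        raise ValueError(f"unsupported time specification: {spec!r}")
    count, unit = int(match.group(1)), match.group(2)
    # timedelta accepts minutes=, hours=, days= and weeks= keyword arguments.
    return now + timedelta(**{unit + "s": count})


if __name__ == "__main__":
    reference = datetime(2020, 1, 1, 12, 0, 0)
    print(resolve_relative_time("now + 2 hours", reference))
```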
Switches=<count>[@<max-time-to-wait>]
When a tree topology is used, this defines the maximum count of
switches desired for the job allocation. If Slurm finds an
allocation containing more switches than the count specified,
the job remains pending until it either finds an allocation with
the desired switch count or the time limit expires. By default there
is no switch count limit and no time limit delay. Set the count
to zero in order to clear any previously set count (disabling
the limit). The job's maximum time delay may be limited by the
system administrator using the SchedulerParameters configuration
parameter with the max_switch_wait parameter option. Also see
wait-for-switch.
TimeLimit=<time>
The job's time limit. Output format is
[days-]hours:minutes:seconds or "UNLIMITED". Input format (for
the update command) is minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes or
days-hours:minutes:seconds. Time resolution is one minute and
second values are rounded up to the next minute. If changing
the time limit of a job, either specify a new time limit value
or precede the time with a "+" or "-" to increment or decrement
the current time limit (e.g. "TimeLimit=+30"). In order to
increment or decrement the current time limit, the JobId
specification must precede the TimeLimit specification. Only
the Slurm administrator or root can increase a job's TimeLimit.
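The accepted TimeLimit input formats and the one-minute resolution can be illustrated as follows. This Python sketch converts a time specification into whole minutes, rounding seconds up as described above. It is an approximation for illustration only: the name time_limit_minutes is hypothetical, and neither "UNLIMITED" nor the "+" and "-" increment forms are handled.

```python
def time_limit_minutes(spec):
    """Convert a TimeLimit input string into whole minutes.

    Accepted forms: minutes, minutes:seconds, hours:minutes:seconds,
    days-hours, days-hours:minutes, days-hours:minutes:seconds.
    """
    days = 0
    if "-" in spec:
        # With a day prefix the remaining fields are hours[:minutes[:seconds]].
        day_part, rest = spec.split("-", 1)
        days = int(day_part)
        fields = [int(value) for value in rest.split(":")]
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(value) for value in spec.split(":")]
        if len(fields) == 1:
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            hours, minutes, seconds = 0, fields[0], fields[1]
        elif len(fields) == 3:
            hours, minutes, seconds = fields
        else:
            raise ValueError(f"too many fields in {spec!r}")
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    # Round any partial minute up to the next whole minute.
    return -(-total_seconds // 60)


if __name__ == "__main__":
    for example in ("30", "1:30", "1:00:00", "1-12:30"):
        print(example, time_limit_minutes(example))
```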
UserID=<UID or name>
Used with the JobName option to identify jobs to be modified.
Either a user name or numeric ID (UID) may be specified.
WCKey=<key>
Set the job's workload characterization key to the specified
value.
NOTE: The "show" command, when used with the "job" or "job <jobid>"
entity, displays detailed information about a job or jobs. Much
of this information may be modified using the "update job"
command as described above. However, the following fields
displayed by the show job command are read-only and cannot be
modified:
AllocNode:Sid
Local node and system id making the resource allocation.
BatchFlag
Jobs submitted using the sbatch command have BatchFlag set to 1.
Jobs submitted using other commands have BatchFlag set to 0.
CoreSpec=<count>
Number of cores to reserve per node for system use. The job
will be charged for these cores, but be unable to use them.
Will be reported as "*" if not constrained.
EndTime
The time the job is expected to terminate based on the job's
time limit. When the job ends sooner, this field will be
updated with the actual end time.
ExitCode=<exit>:<sig>
Exit status reported for the job by the wait() function. The
first number is the exit code, typically as set by the exit()
function. The second number is the signal that caused the
process to terminate if it was terminated by a signal.
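A minimal Python sketch of reading an ExitCode value follows. The helper name describe_exit is hypothetical, and treating a signal number of 0 as "not terminated by a signal" is an assumption of this sketch rather than something stated above.

```python
def describe_exit(field):
    """Split an "<exit>:<sig>" value and summarize it (hypothetical helper)."""
    exit_text, signal_text = field.split(":")
    exit_code, signal_number = int(exit_text), int(signal_text)
    # Assumption: a signal number of 0 means no signal ended the process.
    if signal_number != 0:
        return f"terminated by signal {signal_number}"
    return f"exited with code {exit_code}"
```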
GroupId
The group under which the job was submitted.
JobState
The current state of the job.
NodeList
The list of nodes allocated to the job.
NodeListIndices
The NodeIndices expose the internal indices into the node table
associated with the node(s) allocated to the job.
NtasksPerN:B:S:C=
<tasks_per_node>:<tasks_per_baseboard>:<tasks_per_socket>:<tasks_per_core>
Specifies the number of tasks to be started per hardware
component (node, baseboard, socket and core). Unconstrained
values may be shown as "0" or "*".
PreemptTime
Time at which job was signaled that it was selected for
preemption. (Meaningful only for PreemptMode=CANCEL and the
partition or QOS with which the job is associated has a
GraceTime value designated.)
PreSusTime
Time the job ran prior to last suspend.
Reason The reason the job is not running (e.g., waiting for "Resources").
ReqB:S:C:T=
<baseboard_count>:<socket_per_baseboard_count>:<core_per_socket_count>:<thread_per_core_count>
Specifies the count of various hardware components requested by
the job. Unconstrained values may be shown as "0" or "*".
SecsPreSuspend=<seconds>
If the job is suspended, this is the run time accumulated by the
job (in seconds) prior to being suspended.
Socks/Node=<count>
Count of desired sockets per node
SubmitTime
The time and date stamp (in Universal Time Coordinated, UTC)
the job was submitted. The format of the output is identical to
that of the EndTime field.
NOTE: If a job is requeued, the submit time is reset. To obtain
the original submit time it is necessary to use the "sacct -j
<job_id>[.<step_id>]" command, also designating the -D or
--duplicate option to display all duplicate entries for a job.
SuspendTime
Time the job was last suspended or resumed.
UserId The user under which the job was submitted.
NOTE on information displayed for various job states:
When you submit a request for the "show job" function the
scontrol process makes an RPC call to slurmctld with a
REQUEST_JOB_INFO message type. If the state of the job is
PENDING, then it returns some detailed information such as:
min_nodes, min_procs, cpus_per_task, etc. If the state is other
than PENDING the code assumes that it is in a further state such
as RUNNING, COMPLETE, etc. In these cases the code explicitly
returns zero for these values. These values are meaningless once
the job resources have been allocated and the job has started.
SPECIFICATIONS FOR UPDATE COMMAND, STEPS
StepId=<job_id>[.<step_id>]
Identify the step to be updated. If the job_id is given, but no
step_id is specified then all steps of the identified job will
be modified. This specification is required.
CompFile=<completion file>
Update a step with information about a step's completion. Can be
useful if step statistics aren't directly available through a
jobacct_gather plugin. The file is space-delimited; the format
for Version 1 is as follows:
1 34461 0 2 0 3 1361906011 1361906015 1 1 3368 13357 /bin/sleep
A B     C D E F G          H          I J K    L     M
Field Descriptions:
A file version
B ALPS apid
C inblocks
D outblocks
E exit status
F number of allocated CPUs
G start time
H end time
I utime
J stime
K maxrss
L uid
M command name
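The Version 1 layout above can be read with a few lines of Python. This is a sketch only: the dictionary keys are names chosen here for fields A through M, and treating every field except the command name as an integer is an assumption based on the sample line.

```python
# Keys are illustrative names for fields A..M of the Version 1 format.
COMP_FIELDS = [
    "version", "apid", "inblocks", "outblocks", "exit_status",
    "alloc_cpus", "start_time", "end_time", "utime", "stime",
    "maxrss", "uid", "command",
]


def parse_comp_line(line):
    """Parse one space-delimited completion record (hypothetical helper)."""
    # Limit the split so a command name containing spaces stays intact.
    values = line.split(None, len(COMP_FIELDS) - 1)
    if len(values) != len(COMP_FIELDS):
        raise ValueError("expected %d fields" % len(COMP_FIELDS))
    record = dict(zip(COMP_FIELDS, values))
    for name in COMP_FIELDS[:-1]:
        record[name] = int(record[name])
    return record
```

Applied to the sample line, the record has start_time 1361906011 and end_time 1361906015, a difference of four seconds.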
TimeLimit=<time>
The step's time limit. Output format is
[days-]hours:minutes:seconds or "UNLIMITED". Input format (for
update command) is minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes or
days-hours:minutes:seconds. Time resolution is one minute and
second values are rounded up to the next minute. If changing
the time limit of a step, either specify a new time limit value
or precede the time with a "+" or "-" to increment or decrement
the current time limit (e.g. "TimeLimit=+30"). In order to
increment or decrement the current time limit, the StepId
specification must precede the TimeLimit specification.
SPECIFICATIONS FOR UPDATE COMMAND, NODES
NodeName=<name>
Identify the node(s) to be updated. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). This specification is required.
ActiveFeatures=<features>
Identify the feature(s) currently active on the specified node.
Any previously active feature specification will be overwritten
with the new value. Also see AvailableFeatures. Typically
ActiveFeatures will be identical to AvailableFeatures; however
ActiveFeatures may be configured as a subset of the
AvailableFeatures. For example, a node may be booted in multiple
configurations. In that case, all possible configurations may be
identified as AvailableFeatures, while ActiveFeatures would
identify the current node configuration.
AvailableFeatures=<features>
Identify the feature(s) available on the specified node. Any
previously defined available feature specification will be
overwritten with the new value. AvailableFeatures assigned via
scontrol will only persist across the restart of the slurmctld
daemon with the -R option and state files preserved or
slurmctld's receipt of a SIGHUP. Update slurm.conf with any
changes meant to be persistent across normal restarts of
slurmctld or the execution of scontrol reconfig. Also see
ActiveFeatures.
Gres=<gres>
Identify generic resources to be associated with the specified
node. Any previously defined generic resources will be
overwritten with the new value. Specifications for multiple
generic resources should be comma separated. Each resource
specification consists of a name followed by an optional colon
with a numeric value (default value is one) (e.g.
"Gres=bandwidth:10000,gpus"). Generic resources assigned via
scontrol will only persist across the restart of the slurmctld
daemon with the -R option and state files preserved or
slurmctld's receipt of a SIGHUP. Update slurm.conf with any
changes meant to be persistent across normal restarts of
slurmctld or the execution of scontrol reconfig.
Reason=<reason>
Identify the reason the node is in a "DOWN", "DRAINED",
"DRAINING", "FAILING" or "FAIL" state. Use quotes to enclose a
reason having more than one word.
State=<state>
Identify the state to be assigned to the node. Possible node
states are "NoResp", "ALLOC", "ALLOCATED", "COMPLETING", "DOWN",
"DRAIN", "ERROR", "FAIL", "FAILING", "FUTURE", "IDLE", "MAINT",
"MIXED", "PERFCTRS/NPC", "RESERVED", "POWER_DOWN", "POWER_UP",
"RESUME" or "UNDRAIN". Not all of those states can be set using
the scontrol command; only the following can: "NoResp", "DRAIN",
"FAIL", "FUTURE", "RESUME", "POWER_DOWN", "POWER_UP" and
"UNDRAIN". If a node is in a "MIXED" state it usually means the
node is in multiple states. For instance if only part of the
node is "ALLOCATED" and the rest of the node is "IDLE" the state
will be "MIXED". If you want to remove a node from service, you
typically want to set its state to "DRAIN". "FAILING" is
similar to "DRAIN" except that some applications will seek to
relinquish those nodes before the job completes. "PERFCTRS/NPC"
indicates that Network Performance Counters associated with this
node are in use, rendering this node as not usable for any other
jobs. "RESERVED" indicates the node is in an advanced
reservation and not generally available. "RESUME" is not an
actual node state, but will change a node state from "DRAINED",
"DRAINING", "DOWN" or "MAINT" to either "IDLE" or "ALLOCATED"
state as appropriate. "UNDRAIN" clears the node from being
drained (like "RESUME"), but will not change the node's base
state (e.g. "DOWN"). Setting a node "DOWN" will cause all
running and suspended jobs on that node to be terminated.
"POWER_DOWN" and "POWER_UP" will use the configured SuspendProg
and ResumeProg programs to explicitly place a node in or out of
a power saving mode. If a node is already in the process of
being powered up or down, the command will have no effect until
the configured ResumeTimeout or SuspendTimeout is reached. The
"NoResp" state will only set the "NoResp" flag for a node
without changing its underlying state. While all of the above
states are valid, some of them are not valid new node states
given their prior state. If the node state code printed is
followed by "~", this indicates the node is presently in a power
saving mode (typically running at reduced frequency). If the
node state code is followed by "#", this indicates the node is
presently being powered up or configured. If the node state
code is followed by "$", this indicates the node is currently in
a reservation with a flag value of "maintenance" or is scheduled
to be rebooted. Generally only "DRAIN", "FAIL" and "RESUME"
should be used. NOTE: The scontrol command should not be used
to change node state on Cray systems. Use Cray tools such as
xtprocadmin instead.
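The state-code suffixes described above ("~", "#" and "$") can be separated from the base state as in this Python sketch. The helper name split_state_code is hypothetical, and only the three suffixes named in this section are handled.

```python
# Suffix meanings as described for the printed node state code.
SUFFIX_MEANINGS = {
    "~": "power saving mode",
    "#": "being powered up or configured",
    "$": "in a maintenance reservation or scheduled for reboot",
}


def split_state_code(code):
    """Return (base_state, [suffix meanings]) for a printed node state."""
    flags = []
    # Strip recognized suffix characters from the right-hand end.
    while code and code[-1] in SUFFIX_MEANINGS:
        flags.append(SUFFIX_MEANINGS[code[-1]])
        code = code[:-1]
    flags.reverse()
    return code, flags
```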
Weight=<weight>
Identify weight to be associated with specified nodes. This
allows dynamic changes to weight associated with nodes, which
will be used for the subsequent node allocation decisions.
Weight assigned via scontrol will only persist across the
restart of the slurmctld daemon with the -R option and state
files preserved or slurmctld's receipt of a SIGHUP. Update
slurm.conf with any changes meant to be persistent across normal
restarts of slurmctld or the execution of scontrol reconfig.
SPECIFICATIONS FOR UPDATE COMMAND, FRONTEND
FrontendName=<name>
Identify the front end node to be updated. This specification is
required.
Reason=<reason>
Identify the reason the node is in a "DOWN" or "DRAIN" state.
Use quotes to enclose a reason having more than one word.
State=<state>
Identify the state to be assigned to the front end node.
Possible values are "DOWN", "DRAIN" or "RESUME". If you want to
remove a front end node from service, you typically want to set
its state to "DRAIN". "RESUME" is not an actual node state,
but will return a "DRAINED", "DRAINING", or "DOWN" front end
node to service, either "IDLE" or "ALLOCATED" state as
appropriate. Setting a front end node "DOWN" will cause all
running and suspended jobs on that node to be terminated.
SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, PARTITIONS
AllowGroups=<name>
Identify the user groups which may use this partition. Multiple
groups may be specified in a comma separated list. To permit
all groups to use the partition specify "AllowGroups=ALL".
AllocNodes=<name>
Comma separated list of nodes from which users can execute jobs
in the partition. Node names may be specified using the node
range expression syntax described above. The default value is
"ALL".
Alternate=<partition name>
Alternate partition to be used if the state of this partition is
"DRAIN" or "INACTIVE." The value "NONE" will clear a previously
set alternate partition.
Default=<yes|no>
Specify if this partition is to be used by jobs which do not
explicitly identify a partition to use. Possible output values
are "YES" and "NO". In order to change the default partition of
a running system, use the scontrol update command and set
Default=yes for the partition that you want to become the new
default.
DefaultTime=<time>
Run time limit used for jobs that don't specify a value. If not
set then MaxTime will be used. Format is the same as for
MaxTime.
DefMemPerCPU=<MB>
Set the default memory to be allocated per CPU for jobs in this
partition. The memory size is specified in megabytes.
DefMemPerNode=<MB>
Set the default memory to be allocated per node for jobs in this
partition. The memory size is specified in megabytes.
DisableRootJobs=<yes|no>
Specify if jobs can be executed as user root. Possible values
are "YES" and "NO".
GraceTime=<seconds>
Specifies, in units of seconds, the preemption grace time to be
extended to a job which has been selected for preemption. The
default value is zero, meaning no preemption grace time is allowed on
this partition or QOS. (Meaningful only for PreemptMode=CANCEL)
Hidden=<yes|no>
Specify if the partition and its jobs should be hidden from
view. Hidden partitions will by default not be reported by
Slurm APIs or commands. Possible values are "YES" and "NO".
MaxMemPerCPU=<MB>
Set the maximum memory to be allocated per CPU for jobs in this
partition. The memory size is specified in megabytes.
MaxMemPerNode=<MB>
Set the maximum memory to be allocated per node for jobs in this
partition. The memory size is specified in megabytes.
MaxNodes=<count>
Set the maximum number of nodes which will be allocated to any
single job in the partition. Specify a number, "INFINITE" or
"UNLIMITED". (On a Bluegene type system this represents a
c-node count.) Changing the MaxNodes of a partition has no
effect upon jobs that have already begun execution.
MaxTime=<time>
The maximum run time for jobs. Output format is
[days-]hours:minutes:seconds or "UNLIMITED". Input format (for
update command) is minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes or
days-hours:minutes:seconds. Time resolution is one minute and
second values are rounded up to the next minute. Changing the
MaxTime of a partition has no effect upon jobs that have already
begun execution.
MinNodes=<count>
Set the minimum number of nodes which will be allocated to any
single job in the partition. (On a Bluegene type system this
represents a c-node count.) Changing the MinNodes of a
partition has no effect upon jobs that have already begun
execution.
Nodes=<name>
Identify the node(s) to be associated with this partition.
Multiple node names may be specified using simple node range
expressions (e.g. "lx[10-20]"). Note that jobs may only be
associated with one partition at any time. Specify a blank data
value to remove all nodes from a partition: "Nodes=". Changing
the Nodes in a partition has no effect upon jobs that have
already begun execution.
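To make the simple node range expression concrete, the following Python sketch expands a single bracketed range such as "lx[10-20]" into individual names. It is an illustration only: the helper name is hypothetical, it handles just one "[low-high]" range at the end of the name, and preserving leading zeros is an assumption of this sketch.

```python
import re

# One prefix followed by a single [low-high] numeric range.
_RANGE = re.compile(r"^(?P<prefix>[^\[\]]*)\[(?P<low>\d+)-(?P<high>\d+)\]$")


def expand_node_range(expr):
    """Expand "prefix[low-high]" into a list of node names."""
    match = _RANGE.match(expr)
    if match is None:
        # Not a range expression; treat it as a single node name.
        return [expr]
    prefix = match.group("prefix")
    low, high = match.group("low"), match.group("high")
    if int(low) > int(high):
        raise ValueError("range start exceeds range end")
    width = len(low)  # assumption: keep the zero padding of the lower bound
    return [f"{prefix}{n:0{width}d}" for n in range(int(low), int(high) + 1)]
```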
OverSubscribe=<yes|no|exclusive|force>[:<job_count>]
Specify if compute resources (i.e. individual CPUs) in this
partition can be shared by multiple jobs. Possible values are
"YES", "NO", "EXCLUSIVE" and "FORCE". An optional job count
specifies how many jobs can be allocated to use each resource.
PartitionName=<name>
Identify the partition to be updated. This specification is
required.
PreemptMode=<mode>
Reset the mechanism used to preempt jobs in this partition if
PreemptType is configured to preempt/partition_prio. The default
preemption mechanism is specified by the cluster-wide
PreemptMode configuration parameter. Possible values are "OFF",
"CANCEL", "CHECKPOINT", "REQUEUE" and "SUSPEND".
Priority=<count>
Jobs submitted to a higher priority partition will be dispatched
before pending jobs in lower priority partitions and if possible
they will preempt running jobs from lower priority partitions.
Note that a partition's priority takes precedence over a job's
priority. The value may not exceed 65533.
QOS=<QOSname|blank to remove>
Set the partition QOS with a QOS name or to remove the Partition
QOS leave the option blank.
RootOnly=<yes|no>
Specify if only allocation requests initiated by user root will
be satisfied. This can be used to restrict control of the
partition to some meta-scheduler. Possible values are "YES" and
"NO".
ReqResv=<yes|no>
Specify if only allocation requests designating a reservation
will be satisfied. This is used to restrict partition usage to
be allowed only within a reservation. Possible values are "YES"
and "NO".
Shared=<yes|no|exclusive|force>[:<job_count>]
Renamed to OverSubscribe, see option descriptions above.
State=<up|down|drain|inactive>
Specify if jobs can be allocated nodes or queued in this
partition. Possible values are "UP", "DOWN", "DRAIN" and
"INACTIVE".
UP Designates that new jobs may be queued on the partition,
and that jobs may be allocated nodes and run from the
partition.
DOWN Designates that new jobs may be queued on the
partition, but queued jobs may not be allocated nodes
and run from the partition. Jobs already running on
the partition continue to run. The jobs must be
explicitly canceled to force their termination.
DRAIN Designates that no new jobs may be queued on the
partition (job submission requests will be denied with
an error message), but jobs already queued on the
partition may be allocated nodes and run. See also
the "Alternate" partition specification.
INACTIVE Designates that no new jobs may be queued on the
partition, and jobs already queued may not be
allocated nodes and run. See also the "Alternate"
partition specification.
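The four partition states differ along two axes: whether new jobs may be queued and whether queued jobs may be allocated nodes and run. The Python table below restates the descriptions above; the key names and the helper partition_allows are chosen here for illustration.

```python
# Restates the UP/DOWN/DRAIN/INACTIVE descriptions as a lookup table.
PARTITION_STATES = {
    "UP":       {"queue_new_jobs": True,  "start_queued_jobs": True},
    "DOWN":     {"queue_new_jobs": True,  "start_queued_jobs": False},
    "DRAIN":    {"queue_new_jobs": False, "start_queued_jobs": True},
    "INACTIVE": {"queue_new_jobs": False, "start_queued_jobs": False},
}


def partition_allows(state, action):
    """Return True if the partition state permits the given action."""
    return PARTITION_STATES[state.upper()][action]
```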
SPECIFICATIONS FOR UPDATE COMMAND, POWERCAP
PowerCap=<count>
Set the number of watts the cluster is limited to. Specify a
number, "INFINITE" to enable the power capping logic without
power restriction or "0" to disable the power capping logic.
Update slurm.conf with any changes meant to be persistent across
normal restarts of slurmctld or the execution of scontrol
reconfig.
SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, RESERVATIONS
Reservation=<name>
Identify the name of the reservation to be created,
updated, or deleted. This parameter is required for
update and is the only parameter for delete. For create,
if you do not want to give a reservation name, use
"scontrol create res ..." and a name will be created
automatically.
Accounts=<account list>
List of accounts permitted to use the reserved nodes, for
example "Accounts=physcode1,physcode2". A user in any of
the accounts may use the reserved nodes. A new
reservation must specify Users and/or Accounts. If both
Users and Accounts are specified, a job must match both
in order to use the reservation. Accounts can also be
denied access to reservations by preceding all of the
account names with '-'. Alternately precede the equal
sign with '-'. For example,
"Accounts=-physcode1,-physcode2" or
"Accounts-=physcode1,physcode2" will permit any account
except physcode1 and physcode2 to use the reservation.
You can add or remove individual accounts from an
existing reservation by using the update command and
adding a '+' or '-' sign before the '=' sign. If
accounts are denied access to a reservation (account name
preceded by a '-'), then all other accounts are
implicitly allowed to use the reservation and it is not
possible to also explicitly specify allowed accounts.
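The allow and deny semantics of an Accounts list can be sketched as follows. This is not Slurm code: the helper name account_allowed is hypothetical, it takes only the value to the right of "Accounts=", and it assumes exact, case-sensitive name matching.

```python
def account_allowed(spec, account):
    """Decide whether an account may use a reservation (illustrative only)."""
    names = [name for name in spec.split(",") if name]
    denied = [name[1:] for name in names if name.startswith("-")]
    allowed = [name for name in names if not name.startswith("-")]
    if denied and allowed:
        # Denied and explicitly allowed accounts cannot be combined.
        raise ValueError("cannot mix allowed and denied accounts")
    if denied:
        # Every account not listed is implicitly allowed.
        return account not in denied
    return account in allowed
```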
BurstBuffer=<buffer_spec>[,<buffer_spec>,...]
Specification of burst buffer resources which are to be
reserved. "buffer_spec" consists of four elements:
[plugin:][type:]#[units]. "plugin" is the burst buffer
plugin name, currently either "cray" or "generic". If no
plugin is specified, the reservation applies to all
configured burst buffer plugins. "type" specifies a Cray
generic burst buffer resource, for example "nodes". If
"type" is not specified, the number is a measure of
storage space. The "units" may be "N" (nodes), "GB"
(gigabytes), "TB" (terabytes), "PB" (petabytes), etc.,
with the default units being gigabytes for reservations of
storage space. For example "BurstBuffer=cray:2TB"
(reserve 2TB of storage from the Cray
plugin) or "BurstBuffer=100GB" (reserve 100 GB of storage
from all configured burst buffer plugins). Jobs using
this reservation are not restricted to these burst buffer
resources, but may use these reserved resources plus any
which are generally available.
CoreCnt=<num>
This option is only supported when
SelectType=select/cons_res. Identify number of cores to
be reserved. If NodeCnt is used, this is the total number
of cores to reserve where cores per node is
CoreCnt/NodeCnt. If a nodelist is used, this should be an
array of core numbers by node: Nodes=node[1-5]
CoreCnt=2,2,3,3,4
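The two ways of reading CoreCnt can be illustrated in Python. Both helper names are hypothetical, and rejecting a total that does not divide evenly is an assumption of this sketch, since the behavior for uneven totals is not stated above.

```python
def cores_per_node(core_cnt, node_cnt):
    """With NodeCnt, CoreCnt is a total: each node gets CoreCnt/NodeCnt."""
    if core_cnt % node_cnt:
        # Assumption of this sketch: uneven totals are rejected.
        raise ValueError("CoreCnt is not a multiple of NodeCnt")
    return core_cnt // node_cnt


def cores_by_node(node_names, core_cnt):
    """With a node list, CoreCnt is an array of per-node core counts."""
    counts = [int(c) for c in core_cnt.split(",")]
    if len(counts) != len(node_names):
        raise ValueError("one core count is needed per node")
    return dict(zip(node_names, counts))
```

With the example above, node1 through node5 paired with "2,2,3,3,4" reserves 14 cores in total.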
Licenses=<license>
Specification of licenses (or other resources available
on all nodes of the cluster) which are to be reserved.
License names can be followed by a colon and count (the
default count is one). Multiple license names should be
comma separated (e.g. "Licenses=foo:4,bar"). A new
reservation must specify one or more resources to be
included: NodeCnt, Nodes and/or Licenses. If a
reservation includes Licenses, but no NodeCnt or Nodes,
then the option Flags=LICENSE_ONLY must also be
specified. Jobs using this reservation are not
restricted to these licenses, but may use these reserved
licenses plus any which are generally available.
NodeCnt=<num>[,num,...]
Identify number of nodes to be reserved. The number can
include a suffix of "k" or "K", in which case the number
specified is multiplied by 1024. On BlueGene systems,
this number represents a c-node (compute node) count and
will be rounded up as needed to reserve whole nodes
(midplanes). In order to optimize the topology of the
resource allocation on a new reservation (not on an
updated reservation), specific sizes required for the
reservation may be specified. For example, if you want to
reserve 4096 c-nodes on a BlueGene system that can be
used to allocate two jobs each with 2048 c-nodes, specify
"NodeCnt=2k,2k". A new reservation must specify one or
more resources to be included: NodeCnt, Nodes and/or
Licenses.
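The "k" suffix and the comma-separated size list for NodeCnt can be sketched in Python as follows; the helper name parse_node_cnt is hypothetical.

```python
def parse_node_cnt(value):
    """Turn a NodeCnt value such as "2k,2k" into a list of integers."""
    counts = []
    for item in value.split(","):
        item = item.strip()
        if item[-1:] in ("k", "K"):
            # A "k" or "K" suffix multiplies the number by 1024.
            counts.append(int(item[:-1]) * 1024)
        else:
            counts.append(int(item))
    return counts
```

For "2k,2k" this yields two sizes of 2048 each, 4096 in total, matching the BlueGene example above.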
Nodes=<name>
Identify the node(s) to be reserved. Multiple node names
may be specified using simple node range expressions
(e.g. "Nodes=lx[10-20]"). Specify a blank data value to
remove all nodes from a reservation: "Nodes=". A new
reservation must specify one or more resources to be
included: NodeCnt, Nodes and/or Licenses. A specification
of "ALL" will reserve all nodes. Set Flags=PART_NODES and
PartitionName= in order for changes in the nodes
associated with a partition to also be reflected in the
nodes associated with a reservation.
StartTime=<time_spec>
The start time for the reservation. A new reservation
must specify a start time. It accepts times of the form
HH:MM:SS for a specific time of day (seconds are
optional). (If that time is already past, the next day
is assumed.) You may also specify midnight, noon, fika
(3 PM) or teatime (4 PM) and you can have a time-of-day
suffixed with AM or PM for running in the morning or the
evening. You can also say what day the job will be run,
by specifying a date of the form MMDDYY or MM/DD/YY or
MM.DD.YY, or a date and time as YYYY-MM-DD[THH:MM[:SS]].
You can also give times like now + count time-units,
where the time-units can be minutes, hours, days, or
weeks and you can tell Slurm to run the job today with
the keyword today and to run the job tomorrow with the
keyword tomorrow. You cannot update the StartTime of a
reservation in ACTIVE state.
EndTime=<time_spec>
The end time for the reservation. A new reservation must
specify an end time or a duration. Valid formats are the
same as for StartTime.
Duration=<time>
The length of a reservation. A new reservation must
specify an end time or a duration. Valid formats are
minutes, minutes:seconds, hours:minutes:seconds,
days-hours, days-hours:minutes,
days-hours:minutes:seconds, or UNLIMITED. Time
resolution is one minute and second values are rounded up
to the next minute. Output format is always
[days-]hours:minutes:seconds.
PartitionName=<name>
Identify the partition to be reserved.
Flags=<flags>
Flags associated with the reservation. You can add or
remove individual flags from an existing reservation by
adding a '+' or '-' sign before the '=' sign. For
example: Flags-=DAILY (NOTE: this shortcut is not
supported for all flags). Currently supported flags
include:
ANY_NODES This is a reservation for burst buffers
and/or licenses only and not compute nodes.
If this flag is set, a job using this
reservation may use the associated burst
buffers and/or licenses plus any compute
nodes. If this flag is not set, a job
using this reservation may use only the
nodes and licenses associated with the
reservation.
DAILY Repeat the reservation at the same time
every day
FIRST_CORES Use the lowest numbered cores on a node
only.
IGNORE_JOBS Ignore currently running jobs when creating
the reservation. This can be especially
useful when reserving all nodes in the
system for maintenance.
LICENSE_ONLY See ANY_NODES.
MAINT Maintenance mode, receives special
accounting treatment. This reservation is
permitted to use resources that are already
in another reservation.
OVERLAP This reservation can be allocated resources
that are already in another reservation.
PART_NODES This flag can be used to reserve all nodes
within the specified partition.
PartitionName and Nodes=ALL must be
specified or this option is ignored.
PURGE_COMP Purge the reservation once the last
associated job has completed. Once the
reservation has been created, it must be
populated within 5 minutes of its start
time or it will be purged before any jobs
have been run.
REPLACE Resources allocated to jobs are
automatically replenished using idle
resources. This option can be used to
maintain a constant number of idle
resources available for pending jobs
(subject to availability of idle
resources). This should be used with the
NodeCnt reservation option; do not identify
specific nodes to be included in the
reservation. This option is not supported
on IBM Bluegene systems.
SPEC_NODES Reservation is for specific nodes (output
only)
STATIC_ALLOC Make it so after the nodes are selected for
a reservation they don't change. Without
this option when nodes are selected for a
reservation and one goes down the
reservation will select a new node to fill
the spot.
TIME_FLOAT The reservation start time is relative to
the current time and moves forward through
time (e.g. a StartTime=now+10minutes will
always be 10 minutes in the future).
WEEKLY Repeat the reservation at the same time
every week
Features=<features>
Set the reservation's required node features. Multiple
values may be "&" separated if all features are required
(AND operation) or separated by "|" if any of the
specified features are required (OR operation). Value
may be cleared with blank data value, "Features=".
Users=<user list>
List of users permitted to use the reserved nodes, for
example "Users=jones1,smith2". A new reservation must
specify Users and/or Accounts. If both Users and
Accounts are specified, a job must match both in order to
use the reservation. Users can also be denied access to
reservations by preceding all of the user names with '-'.
Alternately precede the equal sign with '-'. For
example, "Users=-jones1,-smith2" or "Users-=jones1,smith2"
will permit any user except jones1 and smith2 to use the
reservation. You can add or remove individual users from
an existing reservation by using the update command and
adding a '+' or '-' sign before the '=' sign. If users
are denied access to a reservation (user name preceded by
a '-'), then all other users are implicitly allowed to
use the reservation and it is not possible to also
explicitly specify allowed users.
TRES=<tres_spec>
Comma-separated list of TRES required for the
reservation. Currently supported TRES types with
reservations are: CPU, Node, License and BB. CPU and Node
follow the same format as CoreCnt and NodeCnt parameters
respectively. License names can be followed by an equal
sign '=' and a count:
License/<name1>=<count1>[,License/<name2>=<count2>,...]
BurstBuffer can be specified in a similar way to the
BurstBuffer parameter. The only difference is that the colon
symbol ':' should be replaced by an equal sign '=' in order to
follow the TRES format.
Some examples of valid TRES specifications:
TRES=cpu=5,bb/cray=4,license/iop1=1,license/iop2=3
TRES=node=5k,license/iop1=2
As specified in CoreCnt, if a nodelist is specified, cpu
can be an array of core numbers by node:
nodes=compute[1-3]
TRES=cpu=2,2,1,bb/cray=4,license/iop1=2
Please note that CPU, Node, License and BB can override
CoreCnt, NodeCnt, Licenses and BurstBuffer parameters
respectively. Also, in a reservation CPU represents CoreCnt and
will be adjusted if you have threads per
core on your nodes.
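The TRES examples above can be split into per-type values with a short Python sketch. The helper name parse_tres is hypothetical; it takes the text to the right of "TRES=" and keeps counts as strings so that suffixed values such as "5k" pass through unchanged.

```python
def parse_tres(value):
    """Split a TRES value into {type: [counts]} (illustrative only)."""
    result = {}
    last_key = None
    for token in value.split(","):
        if "=" in token:
            key, count = token.split("=", 1)
            last_key = key
            result[key] = [count]
        else:
            # A bare number continues the previous entry, as in the
            # per-node array form "cpu=2,2,1".
            if last_key is None:
                raise ValueError("count without a TRES type")
            result[last_key].append(token)
    return result
```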
SPECIFICATIONS FOR UPDATE BLOCK/SUBMP
Bluegene systems only!
BlockName=<name>
Identify the bluegene block to be updated. This
specification is required.
State=<free|error|recreate|remove|resume>
This will update the state of a bluegene block. (i.e.
update BlockName=RMP0 STATE=ERROR) WARNING!!!! With the
exception of the RESUME state, all other state values
will cancel any running job on the block!
FREE Return the block to a free state.
ERROR Make it so jobs don't run on the block.
RECREATE Destroy the current block and create a new one
to take its place.
REMOVE Free and remove the block from the system. If
the block is smaller than a midplane every
block on that midplane will be removed. (only
available on dynamically laid out systems)
RESUME If a block is in ERROR state RESUME will return
the block to its previous usable state (FREE or
READY).
SubMPName=<name>
Identify the bluegene ionodes to be updated (i.e.
bg000[0-3]). This specification is required. NOTE: Even
on BGQ where node names are given in bg0000[00000] format
this option takes an ionode name bg0000[0].
SPECIFICATIONS FOR UPDATE COMMAND, LAYOUTS
Layout=<name>
Identify the layout to be updated. This specification is
required.
Entity=<entity list>
Identify the entities to be updated. This specification
is required.
Key=<value>
Keys/Values to update for the entities. The format must
respect the layout.d configuration files. Key=Type cannot
be updated. At least one Key/Value is required, several
can be set.
SPECIFICATIONS FOR SHOW COMMAND, LAYOUTS
Without options, lists all configured layouts. With a layout
specified, shows entities with following options:
Key=<value>
Keys/Values to update for the entities. The format must
respect the layout.d configuration files. Key=Type cannot
be updated. One Key/Value is required, several can be
set.
Entity=<value>
Entities to show, default is not used. Can be set to "*".
Type=<value>
Type of entities to show, default is not used.
nolayout
If not used, only entities defining the tree are
shown. With the option, only leaves are shown.
DESCRIPTION FOR SHOW COMMAND, NODES
The meaning of the energy information is as follows:
CurrentWatts
The instantaneous power consumption of the node at the
time of the last node energy accounting sample, in watts.
LowestJoules
The energy consumed by the node between the last time it
was powered on and the last time it was registered by
slurmd, in joules.
ConsumedJoules
The energy consumed by the node between the last time it
was registered by the slurmd daemon and the last node
energy accounting sample, in joules.
If the reported value is "n/s" (not supported), the node does
not support the configured AcctGatherEnergyType plugin. If the
reported value is zero, energy accounting for nodes is disabled.
The meaning of the external sensors information is as follows:
ExtSensorsJoules
The energy consumed by the node between the last time it
was powered on and the last external sensors plugin node
sample, in joules.
ExtSensorsWatts
The instantaneous power consumption of the node at the
time of the last external sensors plugin node sample, in
watts.
ExtSensorsTemp
The temperature of the node at the time of the last
external sensors plugin node sample, in Celsius.
If the reported value is "n/s" (not supported), the node does
not support the configured ExtSensorsType plugin.
The meaning of the resource specialization information is as
follows:
CPUSpecList
The list of Slurm abstract CPU IDs on this node reserved
for exclusive use by the Slurm compute node daemons
(slurmd, slurmstepd).
MemSpecLimit
The combined memory limit, in megabytes, on this node for
the Slurm compute node daemons (slurmd, slurmstepd).
The meaning of the memory information is as follows:
RealMemory
The total memory, in MB, on the node.
AllocMem
The total memory, in MB, currently allocated by jobs on
the node.
FreeMem
The total memory, in MB, currently free on the node as
reported by the OS.
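A minimal Python sketch of how the three memory fields relate, assuming the node record has already been captured as whitespace-separated Key=Value text. The sample values are made up for illustration, and parse_record is a hypothetical helper, not a Slurm function.

```python
def parse_record(text):
    """Split whitespace-separated Key=Value tokens into a dict."""
    record = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no '='
            record[key] = value
    return record


# Made-up sample values, for illustration only.
sample = "RealMemory=64000 AllocMem=16000 FreeMem=41250"
node = parse_record(sample)

real_mb = int(node["RealMemory"])
alloc_mb = int(node["AllocMem"])
free_mb = int(node["FreeMem"])

# Memory not yet allocated to jobs (the scheduler's view).
unallocated_mb = real_mb - alloc_mb
print("unallocated by Slurm:", unallocated_mb, "MB")
print("free according to OS:", free_mb, "MB")
```

The two printed numbers need not agree: AllocMem counts memory allocated to jobs, while FreeMem is whatever the OS reports as free at the time of the sample.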
Some scontrol options may be set via environment variables.
These environment variables, along with their corresponding
options, are listed below. (Note: Command-line options will
always override these settings.)
SCONTROL_ALL -a, --all
SLURM_BITSTR_LEN Specifies the string length to be used for
holding a job array's task ID expression.
The default value is 64 bytes. A value of 0
will print the full expression with any
length required. Larger values may
adversely impact the application
performance.
SLURM_CLUSTERS Same as --clusters
SLURM_CONF The location of the Slurm configuration
file.
SLURM_TIME_FORMAT Specify the format used to report time
stamps. A value of standard, the default
value, generates output in the form
"year-month-dateThour:minute:second". A
value of relative returns only
"hour:minute:second" for times on the
current day. For other dates in the current
year it prints the "hour:minute" preceded by
"Tomorr" (tomorrow), "Ystday" (yesterday),
the name of the day for the coming week
(e.g. "Mon", "Tue", etc.), or otherwise the
date (e.g. "25 Apr"). For other years it
returns a date, month and year without a time
(e.g. "6 Jun 2012"). All of the time stamps
use a 24 hour format.
A valid strftime() format can also be
specified. For example, a value of "%a %T"
will report the day of the week and a time
stamp (e.g. "Mon 12:34:56").
SLURM_TOPO_LEN Specify the maximum size of the line when
printing Topology. If not set, the default
value "512" will be used.
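As a sketch of how these variables might be set from a script, the Python below builds an environment for a child process and previews the "%a %T" format locally. It assumes a POSIX platform (where strftime() accepts %T) and the default C locale; the variable names child_env and preview are illustrative only.

```python
import os
from datetime import datetime

# Environment for a child process that would run scontrol. Copying
# os.environ leaves the calling process's own environment untouched.
child_env = dict(os.environ)
child_env["SLURM_TIME_FORMAT"] = "%a %T"
child_env["SLURM_BITSTR_LEN"] = "0"  # 0 = print the full task ID expression

# Preview the strftime() format locally. 2012-06-04 was a Monday.
stamp = datetime(2012, 6, 4, 12, 34, 56)
preview = stamp.strftime(child_env["SLURM_TIME_FORMAT"])
print(preview)
```

The dictionary could then be passed as the env argument of subprocess.run() when invoking scontrol. Any option given on the command line would still take precedence over these settings.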
When using the Slurm db, users who have AdminLevel's defined (Operator or Admin) and users who are account coordinators are given the authority to view and modify jobs, reservations, nodes, etc., as defined in the following table - regardless of whether a PrivateData restriction has been defined in the slurm.conf file.
scontrol show job(s):          Admin, Operator, Coordinator
scontrol update job:           Admin, Operator, Coordinator
scontrol requeue:              Admin, Operator, Coordinator
scontrol show step(s):         Admin, Operator, Coordinator
scontrol update step:          Admin, Operator, Coordinator
scontrol show block:           Admin, Operator
scontrol update block:         Admin
scontrol show node:            Admin, Operator
scontrol update node:          Admin
scontrol create partition:     Admin
scontrol show partition:       Admin, Operator
scontrol update partition:     Admin
scontrol delete partition:     Admin
scontrol create reservation:   Admin, Operator
scontrol show reservation:     Admin, Operator
scontrol update reservation:   Admin, Operator
scontrol delete reservation:   Admin, Operator
scontrol reconfig:             Admin
scontrol shutdown:             Admin
scontrol takeover:             Admin
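The authorization listing above can be encoded as a lookup table, for example in a site wrapper script that wants to explain a refusal before calling scontrol. The Python below is a sketch under that assumption. The names ALLOWED and is_permitted are hypothetical, and the real check is always made by Slurm itself.

```python
# Privilege levels copied from the listing above.
ADMIN = {"Admin"}
ADMIN_OP = {"Admin", "Operator"}
ALL_LEVELS = {"Admin", "Operator", "Coordinator"}

ALLOWED = {
    "show job": ALL_LEVELS,
    "update job": ALL_LEVELS,
    "requeue": ALL_LEVELS,
    "show step": ALL_LEVELS,
    "update step": ALL_LEVELS,
    "show block": ADMIN_OP,
    "update block": ADMIN,
    "show node": ADMIN_OP,
    "update node": ADMIN,
    "create partition": ADMIN,
    "show partition": ADMIN_OP,
    "update partition": ADMIN,
    "delete partition": ADMIN,
    "create reservation": ADMIN_OP,
    "show reservation": ADMIN_OP,
    "update reservation": ADMIN_OP,
    "delete reservation": ADMIN_OP,
    "reconfig": ADMIN,
    "shutdown": ADMIN,
    "takeover": ADMIN,
}


def is_permitted(command, level):
    """Return True if 'level' is listed for 'scontrol <command>'."""
    return level in ALLOWED.get(command, set())


print(is_permitted("update node", "Operator"))
print(is_permitted("create reservation", "Operator"))
```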
# scontrol
scontrol: show part debug
PartitionName=debug
AllocNodes=ALL AllowGroups=ALL Default=YES
DefaultTime=NONE DisableRootJobs=NO Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1
Nodes=snowflake[0-48]
Priority=1 RootOnly=NO OverSubscribe=YES:4
State=UP TotalCPUs=694 TotalNodes=49
scontrol: update PartitionName=debug MaxTime=60:00 MaxNodes=4
scontrol: show job 71701
JobId=71701 Name=hostname
UserId=da(1000) GroupId=da(1000)
Priority=66264 Account=none QOS=normal WCKey=*123
JobState=COMPLETED Reason=None Dependency=(null)
TimeLimit=UNLIMITED Requeue=1 Restarts=0 BatchFlag=0
ExitCode=0:0
SubmitTime=2010-01-05T10:58:40
EligibleTime=2010-01-05T10:58:40
StartTime=2010-01-05T10:58:40 EndTime=2010-01-05T10:58:40
SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=snowflake:4702
ReqNodeList=(null) ExcNodeList=(null)
NodeList=snowflake0
NumNodes=1 NumCPUs=10 CPUs/Task=2 ReqS:C:T=1:1:1
MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
scontrol: update JobId=71701 TimeLimit=30:00 Priority=500
scontrol: show hostnames tux[1-3]
tux1
tux2
tux3
scontrol: create res StartTime=2009-04-01T08:00:00
Duration=5:00:00 Users=dbremer NodeCnt=10
Reservation created: dbremer_1
scontrol: update Reservation=dbremer_1 Flags=Maint NodeCnt=20
scontrol: delete Reservation=dbremer_1
scontrol: quit
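The show hostnames step in the session above expands a bracketed host list into one name per line. The Python sketch below imitates that expansion for the simple case of a single bracket group. It is a simplified reimplementation written for this example, not Slurm's own hostlist code. The function name expand_hostlist is hypothetical, and preserving zero padding is an assumption of the sketch.

```python
import re


def expand_hostlist(expr):
    """Expand 'prefix[a-b,c]suffix' into a list of individual names.

    Simplified: handles exactly one bracket group. Anything else is
    returned unchanged as a single-element list.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        return [expr]
    prefix, body, suffix = match.groups()
    names = []
    for part in body.split(","):
        low, sep, high = part.partition("-")
        if not sep:
            high = low  # a single number such as "7"
        width = len(low)  # keep zero padding such as "001"
        for n in range(int(low), int(high) + 1):
            names.append("%s%0*d%s" % (prefix, width, n, suffix))
    return names


print("\n".join(expand_hostlist("tux[1-3]")))
```

Applied to the partition shown earlier in the session, expand_hostlist("snowflake[0-48]") yields 49 names, which agrees with TotalNodes=49.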
Copyright (C) 2002-2007 The Regents of the University of California. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2008-2010 Lawrence Livermore National Security.
Copyright (C) 2010-2016 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see <http://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
/etc/slurm.conf
scancel(1), sinfo(1), squeue(1), slurm_checkpoint(3), slurm_create_partition(3), slurm_delete_partition(3), slurm_load_ctl_conf(3), slurm_load_jobs(3), slurm_load_node(3), slurm_load_partitions(3), slurm_reconfigure(3), slurm_requeue(3), slurm_resume(3), slurm_shutdown(3), slurm_suspend(3), slurm_takeover(3), slurm_update_job(3), slurm_update_node(3), slurm_update_partition(3), slurm.conf(5), slurmctld(8)