slurm.conf(5)

NAME

   slurm.conf - Slurm configuration file

DESCRIPTION

   slurm.conf is an ASCII file which describes general Slurm configuration
   information, the nodes to be managed, information about how those nodes
   are   grouped   into  partitions,  and  various  scheduling  parameters
   associated with those partitions. This file should be consistent across
   all nodes in the cluster.

   The  file  location  can  be  modified  at  system build time using the
   DEFAULT_SLURM_CONF parameter  or  at  execution  time  by  setting  the
   SLURM_CONF  environment  variable.  The Slurm daemons also allow you to
   override both the built-in and environment-provided location using  the
   "-f" option on the command line.

   The  contents  of the file are case insensitive except for the names of
   nodes and partitions. Any text following a  "#"  in  the  configuration
   file  is treated as a comment through the end of that line.  Changes to
   the configuration file take  effect  upon  restart  of  Slurm  daemons,
   daemon  receipt  of  the  SIGHUP  signal,  or  execution of the command
   "scontrol reconfigure" unless otherwise noted.

   If a line begins with the word "Include"  followed  by  whitespace  and
   then  a  file  name, that file will be included inline with the current
   configuration  file.   For   large   or   complex   systems,   multiple
   configuration files may prove easier to manage and enable reuse of some
   files (See INCLUDE MODIFIERS for more details).
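
   For example, a minimal sketch (the included file name is
   illustrative):

          Include /etc/slurm/nodes.conf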

   Note on file permissions:

   The slurm.conf file must be readable by all users of Slurm, since it is
   used  by  many  of the Slurm commands.  Other files that are defined in
   the slurm.conf file, such as log files and job  accounting  files,  may
   need  to  be  created/owned  by the user "SlurmUser" to be successfully
   accessed.  Use the "chown" and "chmod" commands to  set  the  ownership
   and  permissions  appropriately.   See  the  section FILE AND DIRECTORY
   PERMISSIONS for information about the  various  files  and  directories
   used by Slurm.

PARAMETERS

   The overall configuration parameters available include:

   AccountingStorageBackupHost
          The  name  of  the backup machine hosting the accounting storage
          database.  If used with the accounting_storage/slurmdbd  plugin,
          this  is  where the backup slurmdbd would be running.  Only used
          for database type storage plugins, ignored otherwise.

   AccountingStorageEnforce
          This controls what level  of  association-based  enforcement  to
          impose on job submissions.  Valid options are any combination of
          associations, limits, nojobs, nosteps, qos, safe, and wckeys, or
           all for all things (except nojobs and nosteps, which must be
           requested explicitly).

          If  limits,  qos,  or  wckeys   are   set,   associations   will
          automatically be set.

          If wckeys is set, TrackWCKey will automatically be set.

          If  safe  is  set, limits and associations will automatically be
          set.

           If nojobs is set, nosteps will automatically be set.

           By enforcing associations, no new job is allowed to run
           unless a corresponding association exists in the system.  If
           limits are enforced, users can be limited by association to
           whatever job size or run time limits are defined.

           If nojobs is set, Slurm will not account for any jobs or
           steps on the system; likewise, if nosteps is set, Slurm will
           not account for any steps that have run.  Limits will still
           be enforced.

           If safe is enforced, a job will only be launched against an
           association or qos that has a GrpCPUMins limit set if the
           job will be able to run to completion.  Without this option
           set, jobs will be launched as long as their usage hasn't
           reached the cpu-minutes limit, which can lead to jobs being
           launched but then killed when the limit is reached.

           With qos and/or wckeys enforced, jobs will not be scheduled
          unless  a  valid  qos  and/or  workload  characterization key is
          specified.

          When AccountingStorageEnforce  is  changed,  a  restart  of  the
          slurmctld daemon is required (not just a "scontrol reconfig").
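
           For example, a representative setting (not a recommendation
           for every site) that requires valid associations and QOS
           values and enables safe limit enforcement:

                  AccountingStorageEnforce=associations,limits,qos,safe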

   AccountingStorageHost
          The name of the machine hosting the accounting storage database.
          Only used for database type storage plugins, ignored  otherwise.
          Also see DefaultStorageHost.

   AccountingStorageLoc
          The  fully  qualified  file  name  where  accounting records are
          written      when       the       AccountingStorageType       is
          "accounting_storage/filetxt"  or  else  the name of the database
          where    accounting    records    are    stored     when     the
          AccountingStorageType     is     a     database.     Also    see
          DefaultStorageLoc.

   AccountingStoragePass
          The password used to gain access to the database  to  store  the
          accounting  data.   Only used for database type storage plugins,
          ignored otherwise.  In the case of Slurm DBD  (Database  Daemon)
          with  MUNGE authentication this can be configured to use a MUNGE
          daemon specifically configured to provide authentication between
          clusters  while the default MUNGE daemon provides authentication
           within a cluster.  In that case, AccountingStoragePass should
           specify the named socket to be used for communications with
           the alternate MUNGE daemon (e.g.
           "/var/run/munge/global.socket.2").
          The default value is NULL.  Also see DefaultStoragePass.

   AccountingStoragePort
          The  listening  port  of the accounting storage database server.
          Only used for database type storage plugins, ignored  otherwise.
          Also see DefaultStoragePort.

   AccountingStorageTRES
          Comma  separated  list  of  resources  you  wish to track on the
          cluster.  These are the resources requested by  the  sbatch/srun
          job  when it is submitted.  Currently this consists of any GRES,
          BB (burst buffer) or license along with CPU, Memory,  Node,  and
          Energy.   By  default CPU, Energy, Memory, and Node are tracked.
          AccountingStorageTRES=gres/craynetwork,license/iop1  will  track
          cpu,  energy, memory, nodes along with a gres called craynetwork
          as well as a license called iop1.  Whenever these resources  are
          used   on   the   cluster   they  are  recorded.  The  TRES  are
          automatically set up  in  the  database  on  the  start  of  the
          slurmctld.

   AccountingStorageType
          The  accounting  storage  mechanism  type.  Acceptable values at
          present          include           "accounting_storage/filetxt",
          "accounting_storage/mysql",     "accounting_storage/none"    and
          "accounting_storage/slurmdbd".  The "accounting_storage/filetxt"
          value  indicates  that accounting records will be written to the
          file  specified  by  the  AccountingStorageLoc  parameter.   The
          "accounting_storage/mysql"   value   indicates  that  accounting
          records will be written to a MySQL or MariaDB database specified
          by       the      AccountingStorageLoc      parameter.       The
          "accounting_storage/slurmdbd" value  indicates  that  accounting
          records  will  be  written  to  the  Slurm DBD, which manages an
          underlying  MySQL  database.  See  "man   slurmdbd"   for   more
          information.  The default value is "accounting_storage/none" and
           indicates that accounting records are not maintained.  Note: The
          filetxt  plugin  records  only  a  limited  subset of accounting
          information and will prevent  some  sacct  options  from  proper
          operation.  Also see DefaultStorageType.
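
           For example, a minimal sketch using the SlurmDBD (the host
           name and port shown are illustrative):

                  AccountingStorageType=accounting_storage/slurmdbd
                  AccountingStorageHost=dbhost
                  AccountingStoragePort=6819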

   AccountingStorageUser
          The  user account for accessing the accounting storage database.
          Only used for database type storage plugins, ignored  otherwise.
          Also see DefaultStorageUser.

   AccountingStoreJobComment
          If  set to "YES" then include the job's comment field in the job
          complete message sent to the Accounting Storage  database.   The
          default is "YES".

   AcctGatherNodeFreq
           The AcctGather plugins' sampling interval for node accounting.
          For AcctGather plugin values of none, this parameter is ignored.
          For  all  other  values  this parameter is the number of seconds
          between node accounting samples. For the acct_gather_energy/rapl
          plugin,  set  a  value  less  than  300 because the counters may
           overflow beyond this rate.  The default value is zero, which
           disables accounting sampling for nodes.  Note: The
          accounting sampling interval for jobs is determined by the value
          of JobAcctGatherFrequency.

   AcctGatherEnergyType
          Identifies   the  plugin  to  be  used  for  energy  consumption
          accounting.  The jobacct_gather plugin and  slurmd  daemon  call
          this  plugin  to  collect  energy  consumption data for jobs and
          nodes. The collection of energy consumption data takes place  on
          node  level,  hence only in case of exclusive job allocation the
          energy consumption  measurements  will  reflect  the  jobs  real
          consumption.  In  case of node sharing between jobs the reported
          consumed energy per  job  (through  sstat  or  sacct)  will  not
          reflect the real energy consumed by the jobs.

          Configurable values at present are:

          acct_gather_energy/none
                              No energy consumption data is collected.

          acct_gather_energy/ipmi
                              Energy  consumption  data  is collected from
                              the Baseboard  Management  Controller  (BMC)
                              using  the  Intelligent  Platform Management
                              Interface (IPMI).

          acct_gather_energy/rapl
                              Energy consumption data  is  collected  from
                              hardware  sensors  using the Running Average
                              Power  Limit  (RAPL)  mechanism.  Note  that
                              enabling  RAPL  may require the execution of
                              the command "sudo modprobe msr".
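
           For example, a sketch pairing RAPL energy gathering with a
           node sampling interval below 300 seconds, as recommended
           under AcctGatherNodeFreq (the interval shown is
           illustrative):

                  AcctGatherEnergyType=acct_gather_energy/rapl
                  AcctGatherNodeFreq=30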

   AcctGatherInfinibandType
          Identifies the plugin to be used for infiniband network  traffic
           accounting.  The plugin is activated only when profiling to
           HDF5 files is enabled and the user requests network data
           collection for jobs through --profile=Network (or =All).  The
           collection of network traffic data takes place at the node
           level; hence, only in the case of exclusive job allocation
           will the collected values reflect the job's real traffic.
           All network traffic data are logged to HDF5 files per job on
           each node; nothing is stored in the Slurm database.

          Configurable values at present are:

          acct_gather_infiniband/none
                              No infiniband network data are collected.

          acct_gather_infiniband/ofed
                              Infiniband   network   traffic   data    are
                              collected   from   the  hardware  monitoring
                              counters of Infiniband devices  through  the
                              OFED library.

   AcctGatherFilesystemType
          Identifies   the  plugin  to  be  used  for  filesystem  traffic
           accounting.  The plugin is activated only when profiling to
           HDF5 files is enabled and the user requests filesystem data
           collection for jobs through --profile=Lustre (or =All).  The
           collection of filesystem traffic data takes place at the node
           level; hence, only in the case of exclusive job allocation
           will the collected values reflect the job's real traffic.
           All filesystem traffic data are logged to HDF5 files per job
           on each node; nothing is stored in the Slurm database.

          Configurable values at present are:

          acct_gather_filesystem/none
                              No filesystem data are collected.

          acct_gather_filesystem/lustre
                              Lustre filesystem traffic data are collected
                              from the counters found in /proc/fs/lustre/.

   AcctGatherProfileType
          Identifies the plugin to be used  for  detailed  job  profiling.
          The  jobacct_gather plugin and slurmd daemon call this plugin to
          collect detailed data such  as  I/O  counts,  memory  usage,  or
          energy  consumption  for jobs and nodes. There are interfaces in
           this plugin to collect data at step start and completion,
           task start and completion, and at the account gather
           frequency. The
          data collected at the node level is related to jobs only in case
          of exclusive job allocation.

          Configurable values at present are:

          acct_gather_profile/none
                              No profile data is collected.

          acct_gather_profile/hdf5
                              This  enables the HDF5 plugin. The directory
                              where the profile files are stored and which
                              values  are  collected are configured in the
                              acct_gather.conf file.

   AllowSpecResourcesUsage
           If set to 1, Slurm allows individual jobs to override a
           node's configured CoreSpecCount value. For a job to take
           advantage of
          this feature, a command  line  option  of  --core-spec  must  be
          specified.   The  default  value  for  this option is 1 for Cray
          systems and 0 for other system types.

   AuthInfo
          Additional  information  to  be  used  for   authentication   of
          communications  between the Slurm daemons (slurmctld and slurmd)
          and the Slurm clients.  The interpretation  of  this  option  is
          specific  to  the  configured AuthType.  Multiple options may be
          specified in a comma delimited  list.   If  not  specified,  the
          default authentication information will be used.

           cred_expire   Default job step credential lifetime, in
                         seconds (e.g. "cred_expire=1200").  It must be
                         long enough to load the user environment, run
                         the prolog, cope with the slurmd daemon being
                         paged out of memory, etc.  This also controls
                         how long a requeued job must wait before
                         starting again.  The default value is 120
                         seconds.

          socket        Path  name  to  a MUNGE daemon socket to use (e.g.
                        "socket=/var/run/munge/munge.socket.2").       The
                        default  value is "/var/run/munge/munge.socket.2".
                        Used by auth/munge and crypto/munge.

          ttl           Credential lifetime, in seconds (e.g.  "ttl=300").
                        The  default  value  is  dependent  upon the Munge
                        installation, but is typically 300 seconds.
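
           For example, combining options in a comma delimited list (the
           values shown are illustrative):

                  AuthInfo=socket=/var/run/munge/munge.socket.2,ttl=300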

   AuthType
          The  authentication  method  for  communications  between  Slurm
          components.   Acceptable  values  at present include "auth/none"
          and  "auth/munge".    The   default   value   is   "auth/munge".
          "auth/none"  includes  the  UID in each communication, but it is
          not verified.  This may be fine for testing purposes, but do not
          use  "auth/none"  if  you  desire  any  security.   "auth/munge"
          indicates that LLNL's MUNGE is to be  used  (this  is  the  best
          supported    authentication    mechanism    for    Slurm,    see
          "http://munge.googlecode.com/" for more information).  All Slurm
          daemons  and  commands  must be terminated prior to changing the
          value of  AuthType  and  later  restarted  (Slurm  jobs  can  be
          preserved).

   BackupAddr
           The name by which BackupController should be referred to when
           establishing a communications path. This name will be used as an
          argument to the gethostbyname() function for identification. For
          example, "elx0000" might  be  used  to  designate  the  Ethernet
          address  for  node  "lx0000".  By default the BackupAddr will be
          identical in value to BackupController.

   BackupController
          The short, or long, name of  the  machine  where  Slurm  control
          functions  are  to  be executed in the event that ControlMachine
          fails (i.e. the name returned by  the  command  "hostname  -s").
          This node may also be used as a compute server if so desired. It
          will come into service as a controller only upon the failure  of
          ControlMachine  and  will  revert  to  a "standby" mode when the
          ControlMachine becomes available once again.

          The  backup  controller  recovers  state  information  from  the
          StateSaveLocation directory, which must be readable and writable
          from  both  the  primary  and  backup  controllers.   While  not
          essential,   it   is  recommended  that  you  specify  a  backup
          controller.  See  the  RELOCATING  CONTROLLERS  section  if  you
          change this.

   BatchStartTimeout
           The maximum time (in seconds) that a batch job is permitted
           to take to launch before being considered missing, at which
           point the allocation is released. The default value is 10
           (seconds). Larger values may
          be required if more time is required to execute the Prolog, load
          user  environment  variables  (for Moab spawned jobs), or if the
          slurmd daemon gets paged from memory.
          Note: The test for a job being  successfully  launched  is  only
          performed  when  the  Slurm daemon on the compute node registers
          state with the slurmctld daemon on the head node, which  happens
          fairly   rarely.   Therefore  a  job  will  not  necessarily  be
          terminated if its start time  exceeds  BatchStartTimeout.   This
           configuration parameter is also applied to task launch, to
           avoid aborting srun commands due to long-running Prolog
           scripts.

   BurstBufferType
           The plugin used to manage burst buffers.  Acceptable values
           at present include "burst_buffer/none".

   CheckpointType
          The system-initiated checkpoint method to be used for user jobs.
          The  slurmctld  daemon  must  be  restarted  for  a  change   in
          CheckpointType   to  take  effect.  Supported  values  presently
          include:

          checkpoint/aix    for IBM AIX systems only

          checkpoint/blcr   Berkeley Lab Checkpoint Restart (BLCR).  NOTE:
                            If  a  file is found at sbin/scch (relative to
                            the Slurm installation location), it  will  be
                            executed  upon  completion  of the checkpoint.
                            This can be a script  used  for  managing  the
                            checkpoint  files.   NOTE:  Slurm's BLCR logic
                            only supports batch jobs.

          checkpoint/none   no checkpoint support (default)

          checkpoint/ompi   OpenMPI (version 1.3 or higher)

          checkpoint/poe    for  use  with  IBM  POE  (Parallel  Operating
                            Environment) only

   ChosLoc
           If configured, then any processes invoked on the user's
           behalf (namely the SPANK prolog/epilog scripts and the
           slurmstepd processes, which in turn spawn the user batch
           script and applications) are not directly executed by the
           slurmd daemon; instead, the ChosLoc program is executed.
           Both are spawned with the same user ID as the configured
           SlurmdUser (typically user root).  That program's arguments
           are the program and arguments that would otherwise be
           invoked directly by the slurmd daemon.  The intent of this
           feature is to be able to run a user application in some sort
           of container.  This option specifies the fully qualified
           pathname of the chos command (see
           https://github.com/scanon/chos for details).

   ClusterName
          The name by which this Slurm managed cluster  is  known  in  the
           accounting database.  This is needed to distinguish accounting
          records when multiple clusters  report  to  the  same  database.
          Because of limitations in some databases, any upper case letters
          in the name will be silently mapped to lower case. In  order  to
          avoid confusion, it is recommended that the name be lower case.

   CompleteWait
          The  time,  in  seconds, given for a job to remain in COMPLETING
          state before any additional jobs are scheduled.  If set to zero,
          pending  jobs  will  be  started  as  soon as possible.  Since a
          COMPLETING job's resources are released for use by other jobs as
          soon  as  the Epilog completes on each individual node, this can
          result in very fragmented resource allocations.  To provide jobs
          with  the  minimum response time, a value of zero is recommended
          (no waiting).  To minimize fragmentation of resources,  a  value
          equal  to  KillWait  plus  two  is  recommended.   In that case,
          setting KillWait to  a  small  value  may  be  beneficial.   The
          default  value  of  CompleteWait is zero seconds.  The value may
          not exceed 65533.
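
           For example, with the default KillWait of 30 seconds, the
           recommended anti-fragmentation setting would be KillWait plus
           two:

                  CompleteWait=32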

   ControlAddr
           The name by which ControlMachine should be referred to when
           establishing a communications path. This name will be used as
           an argument to
          the gethostbyname() function for  identification.  For  example,
          "elx0000"  might  be  used to designate the Ethernet address for
          node "lx0000".  By default the ControlAddr will be identical  in
          value to ControlMachine.

   ControlMachine
          The  short, or long, hostname of the machine where Slurm control
          functions are executed (i.e. the name returned  by  the  command
          "hostname  -s").   This  value  must  be specified.  In order to
          support some high availability architectures, multiple hostnames
          may  be listed with comma separators and one ControlAddr must be
           specified. The high availability system must ensure that the
          slurmctld  daemon  is  running  on  only one of these hosts at a
          time.  See the RELOCATING  CONTROLLERS  section  if  you  change
          this.
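
           For example, a sketch of a high availability configuration
           (the hostnames and address are illustrative):

                  ControlMachine=ctld1,ctld2
                  ControlAddr=slurmctl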

   CoreSpecPlugin
          Identifies  the  plugins  to  be  used  for  enforcement of core
          specialization.  The slurmd  daemon  must  be  restarted  for  a
          change  in  CoreSpecPlugin to take effect.  Acceptable values at
          present include:

          core_spec/cray      used only for Cray systems

          core_spec/none      used for all other system types

   CpuFreqDef
          Default CPU frequency governor to use when running a job step if
          it  has  not  been  explicitly  set  with the --cpu-freq option.
          Acceptable values at present include:

          Conservative  attempts to use the Conservative CPU governor

          OnDemand      attempts to use the OnDemand CPU governor

          Performance   attempts to use the Performance CPU governor

          PowerSave     attempts to use the PowerSave CPU governor
           There is no default value. If unset, no attempt to set the
           governor is made if the --cpu-freq option has not been set.

   CpuFreqGovernors
          List  of  CPU  frequency  governors  allowed  to be set with the
          salloc, sbatch, or srun option  --cpu-freq.   Acceptable  values
          at present include:

          Conservative  attempts to use the Conservative CPU governor

          OnDemand      attempts  to  use  the  OnDemand CPU governor (the
                        default value)

          Performance   attempts to use the Performance CPU governor  (the
                        default value)

          PowerSave     attempts to use the PowerSave CPU governor

          UserSpace     attempts to use the UserSpace CPU governor
           The default is OnDemand, Performance.

   CryptoType
          The  cryptographic  signature tool to be used in the creation of
          job step credentials.  The slurmctld daemon  must  be  restarted
          for a change in CryptoType to take effect.  Acceptable values at
          present  include  "crypto/munge"  and   "crypto/openssl".    The
          default value is "crypto/munge".

   DebugFlags
          Defines  specific  subsystems which should provide more detailed
          event logging.  Multiple subsystems can be specified with  comma
          separators.   Most DebugFlags will result in verbose logging for
          the identified subsystems and  could  impact  performance.   The
          below  DB_*  flags  are only useful when writing directly to the
           database.  If using the DBD, put these debug flags in
           slurmdbd.conf.  Valid subsystems available today (with more to
          come) include:

          Backfill         Backfill scheduler details

          BackfillMap      Backfill scheduler to log a very verbose map of
                           reserved  resources  through time. Combine with
                           Backfill for a verbose and complete view of the
                           backfill scheduler's work.

          BGBlockAlgo      BlueGene block selection details

          BGBlockAlgoDeep  BlueGene block selection, more details

          BGBlockPick      BlueGene block selection for jobs

          BGBlockWires     BlueGene block wiring (switch state details)

          BurstBuffer      Burst Buffer plugin

          CPU_Bind         CPU binding details for jobs and steps

           CpuFrequency     CPU frequency details for jobs and steps using
                           the --cpu-freq option.

          DB_ASSOC         SQL  statements/queries   when   dealing   with
                           associations in the database.

          DB_EVENT         SQL statements/queries when dealing with (node)
                           events in the database.

          DB_JOB           SQL statements/queries when dealing  with  jobs
                           in the database.

          DB_QOS           SQL statements/queries when dealing with QOS in
                           the database.

          DB_QUERY         SQL  statements/queries   when   dealing   with
                           transactions and such in the database.

          DB_RESERVATION   SQL   statements/queries   when   dealing  with
                           reservations in the database.

          DB_RESOURCE      SQL  statements/queries   when   dealing   with
                           resources like licenses in the database.

          DB_STEP          SQL  statements/queries when dealing with steps
                           in the database.

          DB_USAGE         SQL statements/queries when dealing with  usage
                           queries and inserts in the database.

          DB_WCKEY         SQL statements/queries when dealing with wckeys
                           in the database.

          Elasticsearch    Elasticsearch debug info

          Energy           AcctGatherEnergy debug info

          ExtSensors       External Sensors debug info

          FrontEnd         Front end node details

          Gres             Generic resource details

          Gang             Gang scheduling details

          JobContainer     Job container plugin details

          License          License management details

          NodeFeatures     Node Features plugin debug info

           NO_CONF_HASH     Do not log when the slurm.conf file differs
                            between Slurm daemons

          Power            Power management plugin

          Priority         Job prioritization

          Protocol         Communication protocol details

          Reservation      Advanced reservations

          SelectType       Resource selection plugin

          Steps            Slurmctld resource allocation for job steps

          Switch           Switch plugin

           TraceJobs        Trace jobs in slurmctld. It will print
                            detailed job information including state,
                            job IDs and allocated node count.

          Triggers         Slurmctld triggers

          Wiki             Sched/wiki and wiki2 communications
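
           For example, to obtain the verbose and complete view of the
           backfill scheduler's work described above:

                  DebugFlags=Backfill,BackfillMap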

   DefMemPerCPU
           Default real memory size available per allocated CPU in
           megabytes.  Used to avoid over-subscribing memory and causing
          paging.   DefMemPerCPU  would  generally  be  used if individual
          processors are allocated to  jobs  (SelectType=select/cons_res).
          The  default value is 0 (unlimited).  Also see DefMemPerNode and
          MaxMemPerCPU.   DefMemPerCPU  and  DefMemPerNode  are   mutually
          exclusive.

          NOTE:  Enforcement  of memory limits currently requires enabling
          of accounting, which samples memory  use  on  a  periodic  basis
          (data need not be stored, just collected).
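
           For example, a sketch for a cluster allocating individual
           processors (the 2048 MB value is illustrative):

                  SelectType=select/cons_res
                  DefMemPerCPU=2048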

   DefMemPerNode
           Default real memory size available per allocated node in
           megabytes.  Used to avoid over-subscribing memory and causing
          paging.   DefMemPerNode  would  generally be used if whole nodes
          are allocated to jobs (SelectType=select/linear)  and  resources
          are  over-subscribed (OverSubscribe=yes or OverSubscribe=force).
          The default value is 0 (unlimited).  Also see  DefMemPerCPU  and
          MaxMemPerNode.   DefMemPerCPU  and  DefMemPerNode  are  mutually
          exclusive.

          NOTE: Enforcement of memory limits currently  requires  enabling
          of  accounting,  which  samples  memory  use on a periodic basis
          (data need not be stored, just collected).

   DefaultStorageHost
          The default name of the machine hosting the  accounting  storage
          and  job  completion  databases.   Only  used  for database type
          storage  plugins  and   when   the   AccountingStorageHost   and
          JobCompHost have not been defined.

   DefaultStorageLoc
          The  fully  qualified  file name where accounting records and/or
          job completion records are written when  the  DefaultStorageType
          is  "filetxt"  or  the  name  of  the  database where accounting
          records and/or  job  completion  records  are  stored  when  the
          DefaultStorageType is a database.  Also see AccountingStorageLoc
          and JobCompLoc.

   DefaultStoragePass
          The password used to gain access to the database  to  store  the
          accounting and job completion data.  Only used for database type
          storage    plugins,     ignored     otherwise.      Also     see
          AccountingStoragePass and JobCompPass.

   DefaultStoragePort
          The   listening  port  of  the  accounting  storage  and/or  job
          completion database server.  Only used for database type storage
          plugins,  ignored otherwise.  Also see AccountingStoragePort and
          JobCompPort.

   DefaultStorageType
          The  accounting  and  job  completion  storage  mechanism  type.
          Acceptable  values  at  present  include  "filetxt", "mysql" and
          "none".  The value "filetxt"  indicates  that  records  will  be
          written  to a file.  The value "mysql" indicates that accounting
          records will be written to a MySQL  or  MariaDB  database.   The
          default  value  is  "none",  which  means  that  records are not
          maintained.  Also see AccountingStorageType and JobCompType.

   DefaultStorageUser
          The user account for accessing the accounting storage and/or job
          completion  database.   Only  used  for  database  type  storage
          plugins, ignored otherwise.  Also see AccountingStorageUser  and
          JobCompUser.

   DisableRootJobs
          If  set  to  "YES" then user root will be prevented from running
          any jobs.  The default value is "NO", meaning user root will  be
          able  to  execute  jobs.   DisableRootJobs  may  also  be set by
          partition.

   EioTimeout
          The number of seconds srun waits for  slurmstepd  to  close  the
          TCP/IP   connection   used   to  relay  data  between  the  user
          application and srun when the user application  terminates.  The
          default value is 60 seconds.  May not exceed 65533.

   EnforcePartLimits
          If set to "ALL" then jobs which exceed a partition's size and/or
           time limits will be rejected at submission time. If a job is
           submitted to multiple partitions, the job must satisfy the
           limits on all of the requested partitions. If set to "NO",
           the job will be accepted and remain queued until the
           partition limits are altered (time and node limits). If set
           to "ANY" or "YES", a job must satisfy the limits of at least
           one of the requested partitions to be submitted. The default
           value is "NO".  NOTE: If set, then a
          job's QOS can not be used to exceed partition limits.

   Epilog Fully  qualified pathname of a script to execute as user root on
          every   node    when    a    user's    job    completes    (e.g.
          "/usr/local/slurm/epilog").  A  glob  pattern (See glob (7)) may
          also  be  used  to  run  more  than  one  epilog  script   (e.g.
          "/etc/slurm/epilog.d/*").  The  Epilog  script or scripts may be
          used to purge files, disable user login, etc.  By default  there
          is   no   epilog.   See  Prolog  and  Epilog  Scripts  for  more
          information.

   EpilogMsgTime
          The number of microseconds that the slurmctld daemon requires to
          process  an  epilog  completion message from the slurmd daemons.
          This parameter  can  be  used  to  prevent  a  burst  of  epilog
          completion  messages  from  being  sent  at  the same time which
          should help prevent lost messages  and  improve  throughput  for
          large jobs.  The default value is 2000 microseconds.  For a 1000
          node job, this spreads the epilog completion messages  out  over
          two seconds.

   EpilogSlurmctld
          Fully  qualified  pathname  of  a  program  for the slurmctld to
          execute   upon   termination   of   a   job   allocation   (e.g.
          "/usr/local/slurm/epilog_controller").   The program executes as
          SlurmUser, which gives it permission to drain nodes and  requeue
          the job if a failure occurs (See scontrol(1)).  Exactly what the
          program does and how it accomplishes this is completely  at  the
          discretion  of  the system administrator.  Information about the
           job being initiated, its allocated nodes, etc. are passed to
          the  program using environment variables.  See Prolog and Epilog
          Scripts for more information.

   ExtSensorsFreq
          The   external   sensors   plugin   sampling    interval.     If
          ExtSensorsType=ext_sensors/none, this parameter is ignored.  For
          all other values of ExtSensorsType, this parameter is the number
          of   seconds  between  external  sensors  samples  for  hardware
           components (nodes, switches, etc.).  The default value is
           zero, which disables external sensors sampling.  Note: This
          parameter does not affect external sensors data  collection  for
          jobs/steps.

   ExtSensorsType
          Identifies  the  plugin  to  be  used  for external sensors data
          collection.  Slurmctld calls this  plugin  to  collect  external
          sensors  data for jobs/steps and hardware components. In case of
          node sharing between  jobs  the  reported  values  per  job/step
          (through  sstat  or  sacct)  may not be accurate.  See also "man
          ext_sensors.conf".

          Configurable values at present are:

          ext_sensors/none    No external sensors data is collected.

          ext_sensors/rrd     External sensors data is collected from  the
                              RRD database.

   FairShareDampeningFactor
          Dampen  the  effect of exceeding a user or group's fair share of
           allocated resources. Higher values provide a greater ability
          to differentiate between exceeding the fair share at high levels
          (e.g. a value of 1  results  in  almost  no  difference  between
          overconsumption  by  a  factor of 10 and 100, while a value of 5
          will result in  a  significant  difference  in  priority).   The
          default value is 1.

   FastSchedule
          Controls how a node's configuration specifications in slurm.conf
          are used.  If the number of node configuration  entries  in  the
          configuration  file  is  significantly  lower than the number of
          nodes,  setting  FastSchedule  to  1  will  permit  much  faster
          scheduling  decisions to be made.  (The scheduler can just check
          the values in a few configuration records  instead  of  possibly
          thousands   of   node  records.)   Note  that  on  systems  with
          hyper-threading, the processor count reported by the  node  will
          be  twice  the actual processor count.  Consider which value you
          want to be used for scheduling purposes.

          0    Base scheduling decisions upon the actual configuration  of
               each individual node except that the node's processor count
               in Slurm's configuration must  match  the  actual  hardware
               configuration      if      PreemptMode=suspend,gang      or
               SelectType=select/cons_res are configured  (both  of  those
               plugins  maintain  resource  allocation  information  using
               bitmaps for the cores in the system and must remain static,
               while  the  node's memory and disk space can be established
               later).

          1 (default)
               Consider  the  configuration  of  each  node  to  be   that
               specified in the slurm.conf configuration file and any node
               with less than the configured  resources  will  be  set  to
               DRAIN.

          2    Consider   the  configuration  of  each  node  to  be  that
               specified in the slurm.conf configuration file and any node
               with  less  than  the  configured resources will not be set
               DRAIN.  This option is generally only  useful  for  testing
               purposes.

   FirstJobId
          The job id to be used for the first submitted to Slurm without a
          specific  requested  value.  Job  id   values   generated   will
          incremented  by  1  for each subsequent job. This may be used to
          provide a meta-scheduler with a job id space which  is  disjoint
          from  the  interactive  jobs.  The default value is 1.  Also see
          MaxJobId
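
           For example, to leave job ids below 100000 free for a
           meta-scheduler to request explicitly (the value is
           illustrative):

                  FirstJobId=100000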

   GetEnvTimeout
           Used for Moab scheduled jobs only. Controls how long a job
           should wait, in seconds, for the user's environment to load
           before attempting to load it from a cache file. Applies when
           the srun
          or sbatch --get-user-env option is used. If set to 0 then always
          load the user's environment from the cache  file.   The  default
          value is 2 seconds.

   GresTypes
          A  comma  delimited  list  of  generic  resources to be managed.
          These generic resources may have an associated plugin  available
          to  provide  additional functionality.  No generic resources are
           managed by default.  Ensure this parameter is consistent across
          all  nodes  in  the cluster for proper operation.  The slurmctld
          daemon must be restarted for changes to this parameter to become
          effective.
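
           For example (the resource names are illustrative):

                  GresTypes=gpu,nic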

   GroupUpdateForce
          If  set  to a non-zero value, then information about which users
          are members of groups allowed to use a partition will be updated
          periodically,  even  when  there  have  been  no  changes to the
          /etc/group file.  Otherwise group  member  information  will  be
           updated periodically only after the /etc/group file is updated.
          The default value is 1.  Also see the GroupUpdateTime parameter.

   GroupUpdateTime
          Controls  how  frequently  information  about  which  users  are
          members  of  groups  allowed to use a partition will be updated,
          and how long user group membership lists will  be  cached.   The
          time  interval  is  given in seconds with a default value of 600
          seconds and a maximum value of 4095 seconds.  A  value  of  zero
          will  prevent periodic updating of group membership information.
          Also see the GroupUpdateForce parameter.

   HealthCheckInterval
          The    interval    in    seconds    between    executions     of
          HealthCheckProgram.   The  default value is zero, which disables
          execution.

   HealthCheckNodeState
          Identify what node states should execute the HealthCheckProgram.
          Multiple  state  values may be specified with a comma separator.
          The default value is ANY to execute on nodes in any state.

          ALLOC       Run  on  nodes  in  the  ALLOC   state   (all   CPUs
                      allocated).

          ANY         Run on nodes in any state.

          CYCLE       Rather  than running the health check program on all
                      nodes at the same time, cycle through running on all
                      compute    nodes   through   the   course   of   the
                      HealthCheckInterval.  May  be  combined   with   the
                      various node state options.

          IDLE        Run on nodes in the IDLE state.

          MIXED       Run  on nodes in the MIXED state (some CPUs idle and
                      other CPUs allocated).

   HealthCheckProgram
          Fully qualified pathname of a script to  execute  as  user  root
          periodically   on   all  compute  nodes  that  are  not  in  the
          NOT_RESPONDING state. This program may be  used  to  verify  the
          node  is fully operational and DRAIN the node or send email if a
          problem is detected.  Any action to be taken must be  explicitly
          performed   by   the  program  (e.g.  execute  "scontrol  update
          NodeName=foo State=drain Reason=tmp_file_system_full" to drain a
          node).    The   execution   interval  is  controlled  using  the
          HealthCheckInterval parameter.  Note that the HealthCheckProgram
          will  be  executed at the same time on all nodes to minimize its
           impact upon parallel programs.  This program will be killed
           if it does not terminate normally within 60 seconds.  By
          default, no program will be executed.
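
           For example, a sketch using a hypothetical site script and a
           five minute interval:

                  HealthCheckProgram=/usr/local/sbin/node_check.sh
                  HealthCheckInterval=300
                  HealthCheckNodeState=CYCLE,IDLE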

   InactiveLimit
          The interval, in  seconds,  after  which  a  non-responsive  job
          allocation  command (e.g. srun or salloc) will result in the job
          being terminated. If the node on which the command  is  executed
          fails  or the command abnormally terminates, this will terminate
          its job allocation.  This option has no effect upon batch  jobs.
          When  setting  a  value, take into consideration that a debugger
          using srun to launch an application may leave the  srun  command
          in  a stopped state for extended periods of time.  This limit is
          ignored for jobs running in partitions with  the  RootOnly  flag
          set  (the  scheduler running as root will be responsible for the
          job).  The default value is unlimited (zero) and may not  exceed
          65533 seconds.

   JobAcctGatherType
          The job accounting mechanism type.  Acceptable values at present
          include  "jobacct_gather/aix"  (for   AIX   operating   system),
          "jobacct_gather/linux"    (for    Linux    operating    system),
          "jobacct_gather/cgroup" and "jobacct_gather/none" (no accounting
          data  collected).   The  default value is "jobacct_gather/none".
          "jobacct_gather/cgroup" is a  plugin  for  the  Linux  operating
          system  that  uses cgroups to collect accounting statistics. The
          plugin collects the following statistics: From the cgroup memory
          subsystem:  memory.usage_in_bytes  (reported as 'pages') and rss
          from memory.stat (reported as 'rss'). From  the  cgroup  cpuacct
          subsystem:  user  cpu  time  and  system  cpu  time. No value is
          provided by cgroups for virtual memory size ('vsize').  In order
          to     use     the     sstat     tool,     "jobacct_gather/aix",
          "jobacct_gather/linux",  or  "jobacct_gather/cgroup"   must   be
          configured.
          NOTE: Changing this configuration parameter changes the contents
          of the messages between Slurm daemons.  Any  previously  running
          job  steps  are managed by a slurmstepd daemon that will persist
           through the lifetime of that job step and not change its
          communication protocol. Only change this configuration parameter
          when there are no running job steps.

   JobAcctGatherFrequency
          The  job  accounting  and  profiling  sampling  intervals.   The
           supported format is as follows:

          JobAcctGatherFrequency=<datatype>=<interval>
                      where   <datatype>=<interval>   specifies  the  task
                      sampling interval for the jobacct_gather plugin or a
                      sampling  interval  for  a  profiling  type  by  the
                      acct_gather_profile   plugin.    Multiple,    comma-
                      separated  <datatype>=<interval>  intervals  may  be
                      specified. Supported datatypes are as follows:

                      task=<interval>
                             where  <interval>  is   the   task   sampling
                             interval  in  seconds  for the jobacct_gather
                             plugins  and  for  task  profiling   by   the
                             acct_gather_profile plugin.

                      energy=<interval>
                             where  <interval> is the sampling interval in
                             seconds  for  energy  profiling   using   the
                             acct_gather_energy plugin

                      network=<interval>
                             where  <interval> is the sampling interval in
                             seconds for infiniband  profiling  using  the
                             acct_gather_infiniband plugin.

                      filesystem=<interval>
                             where  <interval> is the sampling interval in
                             seconds for filesystem  profiling  using  the
                             acct_gather_filesystem plugin.

           The default value for the task sampling interval is 30
           seconds.  The default value for all other intervals is 0.
          An interval of 0 disables sampling of the  specified  type.   If
          the  task  sampling  interval  is  0,  accounting information is
          collected only at job termination (reducing  Slurm  interference
          with the job).
          Smaller  (non-zero)  values  have  a  greater  impact  upon  job
          performance, but a value of 30  seconds  is  not  likely  to  be
          noticeable for applications having less than 10,000 tasks.
          Users  can  independently  override  each  interval on a per job
          basis using the --acctg-freq option when submitting the job.
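
           For example, to sample task and energy data every 30 seconds
           while leaving network and filesystem sampling disabled:

                  JobAcctGatherFrequency=task=30,energy=30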

   JobAcctGatherParams
           Arbitrary parameters for the job account gather plugin.
           Acceptable values at present include:

          NoShared            Exclude shared memory from accounting.

           UsePss              Use the PSS value instead of RSS to
                               calculate real memory usage.  The PSS
                               value will be saved as RSS.

           NoOverMemoryKill    Do not kill processes that use more than
                               the requested memory.  This parameter
                               should be used with caution, as a job
                               that exceeds its memory allocation may
                               affect other processes and/or machine
                               health.
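
           For example, a representative combination (not a
           recommendation for every site):

                  JobAcctGatherParams=UsePss,NoOverMemoryKill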

   JobCheckpointDir
          Specifies  the  default  directory  for  storing  or reading job
          checkpoint information. The data  stored  here  is  only  a  few
          thousand  bytes  per  job  and  includes  information  needed to
           resubmit the job request, not the job's memory image.  The directory
          must  be readable and writable by SlurmUser, but not writable by
          regular users. The job memory  images  may  be  in  a  different
           location, as specified by the --checkpoint-dir option at job
           submit time or scontrol's ImageDir option.

   JobCompHost
          The name of the machine hosting  the  job  completion  database.
          Only  used for database type storage plugins, ignored otherwise.
          Also see DefaultStorageHost.

   JobCompLoc
          The fully qualified file name where job completion  records  are
          written   when  the  JobCompType  is  "jobcomp/filetxt"  or  the
          database where  job  completion  records  are  stored  when  the
           JobCompType is a database, or a URL of the form
           http://yourelasticserver:port where job completion records
           are indexed when the JobCompType is
           "jobcomp/elasticsearch".  Also
          see DefaultStorageLoc.

   JobCompPass
          The password used to gain access to the database  to  store  the
          job  completion  data.   Only  used  for  database  type storage
          plugins, ignored otherwise.  Also see DefaultStoragePass.

   JobCompPort
          The listening port of the job completion database server.   Only
          used for database type storage plugins, ignored otherwise.  Also
          see DefaultStoragePort.

   JobCompType
          The job completion logging mechanism type.  Acceptable values at
          present    include    "jobcomp/none",   "jobcomp/elasticsearch",
          "jobcomp/filetxt", "jobcomp/mysql" and  "jobcomp/script"".   The
          default  value  is  "jobcomp/none",  which  means  that upon job
          completion the record of the job is purged from the system.   If
          using  the  accounting  infrastructure this plugin may not be of
          interest since the information here  is  redundant.   The  value
          "jobcomp/elasticsearch"  indicates  that  a  record  of  the job
          should be written to an Elasticsearch server  specified  by  the
          JobCompLoc  parameter.   The  value  "jobcomp/filetxt" indicates
          that a record of the job  should  be  written  to  a  text  file
          specified    by    the    JobCompLoc   parameter.    The   value
          "jobcomp/mysql" indicates that a record of  the  job  should  be
          written  to  a  MySQL  or  MariaDB  database  specified  by  the
          JobCompLoc parameter.  The value "jobcomp/script" indicates that
          a script specified by the JobCompLoc parameter is to be executed
          with environment variables indicating the job information.
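
           For example, a sketch of Elasticsearch logging using the URL
           form described under JobCompLoc (the server name and port are
           placeholders):

                  JobCompType=jobcomp/elasticsearch
                  JobCompLoc=http://yourelasticserver:port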

   JobCompUser
          The user account for  accessing  the  job  completion  database.
          Only  used for database type storage plugins, ignored otherwise.
          Also see DefaultStorageUser.

   JobContainerType
          Identifies the plugin to be used for job tracking.   The  slurmd
          daemon  must  be  restarted  for a change in JobContainerType to
          take effect.   NOTE:  The  JobContainerType  applies  to  a  job
          allocation,   while   ProctrackType   applies   to   job  steps.
          Acceptable values at present include:

          job_container/cncu  used only for Cray systems (CNCU  =  Compute
                              Node Clean Up)

          job_container/none  used for all other system types

   JobCredentialPrivateKey
          Fully qualified pathname of a file containing a private key used
          for authentication by Slurm daemons.  This parameter is  ignored
          if CryptoType=crypto/munge.

   JobCredentialPublicCertificate
          Fully  qualified pathname of a file containing a public key used
          for authentication by Slurm daemons.  This parameter is  ignored
          if CryptoType=crypto/munge.

   JobFileAppend
           This option controls what to do if a job's output or error
           file exists when the job is started.  If JobFileAppend is set
           to a
          value  of  1, then append to the existing file.  By default, any
          existing file is truncated.

   JobRequeue
          This option controls the default ability for batch  jobs  to  be
          requeued.    Jobs   may  be  requeued  explicitly  by  a  system
          administrator, after node  failure,  or  upon  preemption  by  a
           higher priority job.  If JobRequeue is set to a value of 1,
           then batch jobs may be requeued unless explicitly disabled by
           the user.  If JobRequeue is set to a value of 0, then batch
           jobs will not be requeued unless explicitly enabled by the
           user.  Use the
          sbatch  --no-requeue  or  --requeue option to change the default
          behavior for individual jobs.  The default value is 1.

   JobSubmitPlugins
          A comma delimited list of job submission  plugins  to  be  used.
          The  specified  plugins  will  be  executed in the order listed.
          These are intended to be site-specific plugins which can be used
          to  set  default  job  parameters and/or logging events.  Sample
          plugins available in the distribution include  "all_partitions",
          "defaults",  "logging", "lua", and "partition".  For examples of
          use,  see  the  Slurm  code  in   "src/plugins/job_submit"   and
          "contribs/lua/job_submit*.lua"  then  modify the code to satisfy
          your needs.  Slurm can be configured to use multiple  job_submit
           plugins if desired; however, the lua plugin will only execute one
          lua script named "job_submit.lua" located in the default  script
          directory  (typically the subdirectory "etc" of the installation
          directory).  No job submission plugins are used by default.
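
           For example, to enable only the lua plugin:

                  JobSubmitPlugins=lua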

   KeepAliveTime
           Specifies how long socket communications used between the
           srun command and its slurmstepd process are kept alive after
           disconnect.  Longer values can be used to improve reliability
           of communications in the event of network failures.  The
           default is to use the system default value.  The value may
           not exceed 65533.

   KillOnBadExit
           If set to 1, the job will be terminated immediately when one
           of the processes crashes or is aborted.  With the default
           value of 0, if one of the processes crashes or is aborted,
           the other processes will continue to run.  The user can
           override this configuration parameter by using srun's -K,
           --kill-on-bad-exit option.

   KillWait
          The interval, in seconds, given to a job's processes between the
          SIGTERM and SIGKILL signals upon reaching its  time  limit.   If
          the job fails to terminate gracefully in the interval specified,
          it will  be  forcibly  terminated.   The  default  value  is  30
          seconds.  The value may not exceed 65533.

   LaunchParameters
          Identifies options to the job launch plugin.  Acceptable  values
          include:

          test_exec   Validate the executable command's existence prior to
                      attempting launch on the compute nodes

   LaunchType
          Identifies the mechanism to be used to launch application tasks.
          Acceptable values include:

          launch/aprun   For  use  with  Cray  systems  with  ALPS and the
                         default value for those systems

          launch/poe     For use with IBM Parallel  Environment  (PE)  and
                         the  default  value  for systems with the IBM NRT
                         library installed.

          launch/runjob  For use  with  IBM  BlueGene/Q  systems  and  the
                         default value for those systems

          launch/slurm   For  all  other systems and the default value for
                         those systems

   Licenses
          Specification of licenses (or other resources available  on  all
          nodes  of  the cluster) which can be allocated to jobs.  License
          names can optionally be followed by a colon  and  count  with  a
          default  count  of  one.  Multiple license names should be comma
          separated  (e.g.   "Licenses=foo:4,bar").    Note   that   Slurm
          prevents  jobs  from  being  scheduled if their required license
          specification is not available.  Slurm  does  not  prevent  jobs
          from  using  licenses  that are not explicitly listed in the job
          submission specification.
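
          For example, the following would make four "foo" licenses and
          one "bar" license available cluster-wide (license names are
          illustrative):

              Licenses=foo:4,bar

          A job could then request two of them with an option such as
          "sbatch -L foo:2".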

   LogTimeFormat
          Format of the timestamp  in  slurmctld  and  slurmd  log  files.
          Accepted   values   are   "iso8601",   "iso8601_ms",  "rfc5424",
          "rfc5424_ms",  "clock",  "short"  and  "thread_id".  The  values
          ending  in "_ms" differ from the ones without in that fractional
          seconds with millisecond  precision  are  printed.  The  default
          value is "iso8601_ms". The "rfc5424" formats are the same as the
          "iso8601" formats except that the timezone value is also  shown.
          The  "clock"  format shows a timestamp in microseconds retrieved
          with the C standard clock() function. The "short"  format  is  a
          short  date  and  time  format. The "thread_id" format shows the
          timestamp in the C standard ctime() function  form  without  the
          year but including the microseconds, the daemon's process ID and
          the current thread name and ID.

   MailProg
          Fully qualified pathname to the program used to send  email  per
          user request.  The default value is "/usr/bin/mail".

   MaxArraySize
          The  maximum  job  array size.  The maximum job array task index
          value will be one less than MaxArraySize to allow for  an  index
          value  of zero.  Configure MaxArraySize to 0 in order to disable
          job array use.  The value may not exceed 4000001.  The value  of
          MaxJobCount  should  be  much  larger  than  MaxArraySize.   The
          default value is 1001.
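
          For example, with the setting below the largest usable task
          index would be 10000, one less than the configured value:

              MaxArraySize=10001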

   MaxJobCount
          The maximum number of jobs Slurm can have in its active database
          at  one  time.  Set  the  values of MaxJobCount and MinJobAge to
          ensure the slurmctld daemon does not exhaust its memory or other
          resources.  Once  this  limit  is  reached,  requests  to submit
          additional jobs will fail. The  default  value  is  10000  jobs.
          NOTE:  Each  task  of  a job array counts as one job even though
          they will not occupy separate  job  records  until  modified  or
          initiated.   Performance can suffer with more than a few hundred
          thousand jobs.  Setting MaxSubmitJobs per user is generally
          valuable to prevent a single user from filling the system with
          jobs.   This  is  accomplished  using   Slurm's   database   and
          configuring  enforcement of resource limits.  This value may not
          be reset via "scontrol reconfig".  It  only  takes  effect  upon
          restart of the slurmctld daemon.

   MaxJobId
          The  maximum  job  id  to  be  used  for jobs submitted to Slurm
          without a specific  requested  value  EXCEPT  for  jobs  visible
          between clusters.  Job id values generated will be incremented by 1
          for each subsequent job.  Once MaxJobId is reached, the next job
          will be assigned FirstJobId.  The default value is 2,147,418,112
          (0x7fff0000).  Jobs visible across clusters will always  have  a
          job ID of 2,147,483,648 or higher.  Also see FirstJobId.

   MaxMemPerCPU
          Maximum   real  memory  size  available  per  allocated  CPU  in
          MegaBytes.  Used to avoid over-subscribing  memory  and  causing
          paging.   MaxMemPerCPU  would  generally  be  used if individual
          processors are allocated to  jobs  (SelectType=select/cons_res).
          The  default  value is 0 (unlimited).  Also see DefMemPerCPU and
          MaxMemPerNode.   MaxMemPerCPU  and  MaxMemPerNode  are  mutually
          exclusive.

          NOTE:  Enforcement  of memory limits currently requires enabling
          of accounting, which samples memory  use  on  a  periodic  basis
          (data need not be stored, just collected).

          NOTE:  If  a  job  specifies a memory per CPU limit that exceeds
          this system limit, that  job's  count  of  CPUs  per  task  will
          automatically  be  increased. This may result in the job failing
          due to CPU count limits.
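
          As an illustration of the NOTE above, with the setting below a
          job requesting --mem-per-cpu=4096 would have its count of CPUs
          per task doubled so that the per-CPU limit is honored (values
          are illustrative):

              MaxMemPerCPU=2048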

   MaxMemPerNode
          Maximum  real  memory  size  available  per  allocated  node  in
          MegaBytes.   Used  to  avoid over-subscribing memory and causing
          paging.  MaxMemPerNode would generally be used  if  whole  nodes
          are  allocated  to jobs (SelectType=select/linear) and resources
          are over-subscribed (OverSubscribe=yes or  OverSubscribe=force).
          The  default value is 0 (unlimited).  Also see DefMemPerNode and
          MaxMemPerCPU.   MaxMemPerCPU  and  MaxMemPerNode  are   mutually
          exclusive.

          NOTE:  Enforcement  of memory limits currently requires enabling
          of accounting, which samples memory  use  on  a  periodic  basis
          (data need not be stored, just collected).

   MaxStepCount
          The  maximum  number  of  steps  that any job can initiate. This
          parameter is intended to limit the effect of bad batch  scripts.
          The default value is 40000 steps.

   MaxTasksPerNode
          Maximum  number of tasks Slurm will allow a job step to spawn on
          a single node. The default  MaxTasksPerNode  is  512.   May  not
          exceed 65533.

   MCSParameters
          MCS = Multi-Category Security.  MCS plugin parameters.  The
          supported parameters are specific to the MCSPlugin.  Changes  to
          this  value take effect when the Slurm daemons are reconfigured.
          More    information    about    MCS    is     available     here
          <http://slurm.schedmd.com/mcs_Plugins.html>.

   MCSPlugin
          MCS  =  Multi-Category  Security : associate a security label to
          jobs and ensure that nodes can only be shared among  jobs  using
          the same security label.  Acceptable values include:

          mcs/none    is  the default value.  No security label associated
                      with jobs, no particular security  restriction  when
                      sharing nodes among jobs.

          mcs/group   only users with the same group can share the nodes.

          mcs/user    a node cannot be shared with other users.

   MemLimitEnforce
          If set to "no" then Slurm will not terminate the job or the
          job step if it exceeds the value requested using the
          --mem-per-cpu option of salloc/sbatch/srun.  This is useful if
          jobs need to specify --mem-per-cpu for scheduling but should
          not be terminated if they exceed the estimated value.  The
          default value is 'yes', terminate the job/step if it exceeds
          the requested memory.

   MessageTimeout
          Time permitted for a round-trip  communication  to  complete  in
          seconds.  Default  value  is 10 seconds. For systems with shared
          nodes, the slurmd daemon could  be  paged  out  and  necessitate
          higher values.

   MinJobAge
          The  minimum  age of a completed job before its record is purged
          from Slurm's active database.  Set the values of MaxJobCount
          and MinJobAge to ensure the slurmctld daemon does not exhaust
          its memory or other resources.  The default value is 300
          seconds.  A value of
          zero prevents any job record purging.   In  order  to  eliminate
          some  possible  race  conditions, the minimum non-zero value for
          MinJobAge recommended is 2.

   MpiDefault
          Identifies the default  type  of  MPI  to  be  used.   Srun  may
          override  this  configuration  parameter in any case.  Currently
          supported versions include: lam, mpich1_p4, mpich1_shmem,
          mpichgm, mpichmx, mvapich, none (default, which works for many
          other versions of MPI), openmpi and pmi2.  More information
          about        MPI        use        is       available       here
          <http://slurm.schedmd.com/mpi_guide.html>.

   MpiParams
          MPI parameters.  Used to identify ports used by OpenMPI only and
          the  input  format is "ports=12000-12999" to identify a range of
          communication ports to be used.

   MsgAggregationParams
          Message  aggregation  parameters.  Message  aggregation  is   an
          optional feature that may improve system performance by reducing
          the number  of  separate  messages  passed  between  nodes.  The
          feature  works  by  routing messages through one or more message
          collector nodes between their source and destination  nodes.  At
          each collector node, messages with the same destination received
          during a defined message collection window are packaged  into  a
          single composite message. When the window expires, the composite
          message is sent to the next collector node on the route  to  its
          destination.  The route between each source and destination node
          is provided by the Route plugin. When  a  composite  message  is
          received  at  its  destination  node,  the original messages are
          extracted and processed as if they had been sent directly.
          Currently,  the  only  message  types   supported   by   message
          aggregation  are the node registration, batch script completion,
          step completion, and epilog complete messages.
          The format for this parameter is as follows:

          MsgAggregationParams=<option>=<value>
                      where <option>=<value> specify a particular  control
                      variable. Multiple, comma-separated <option>=<value>
                      pairs may be specified.  Supported  options  are  as
                      follows:

                      WindowMsgs=<number>
                             where  <number>  is  the  maximum  number  of
                             messages in each message collection window.

                      WindowTime=<time>
                             where <time> is the maximum elapsed  time  in
                             milliseconds   of   each  message  collection
                             window.

          A window expires when either WindowMsgs or WindowTime is
          reached.  By default, message aggregation is disabled.  To
          enable the feature, set WindowMsgs to a value greater than 1.
          The default value for WindowTime is 100 milliseconds.
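
          For example, to enable aggregation with a window of at most 10
          messages or 200 milliseconds, whichever is reached first
          (illustrative values):

              # Aggregation is enabled because WindowMsgs is greater than 1.
              MsgAggregationParams=WindowMsgs=10,WindowTime=200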

   NodeFeaturesPlugins
          Identifies the plugins to be used for support of node features
          which can change through time. For example, a node which might
          be booted with various BIOS settings. This is supported
          through the use of a node's active_features and
          available_features information.  Acceptable values at present
          include:

          node_features/knl_cray
                              used only for Intel Knights Landing
                              processors (KNL) on Cray systems

   OverTimeLimit
          Number  of  minutes  by  which  a  job can exceed its time limit
          before being canceled.  The configured job time limit is treated
          as  a  soft  limit.   Adding  OverTimeLimit  to  the  soft limit
          provides a hard limit, at which point the job is canceled.  This
          is particularly useful for backfill scheduling, which is based
          upon each job's soft time limit.  The default value is zero.
          May not exceed 65533 minutes.  A value of "UNLIMITED" is also
          supported.

   PluginDir
          Identifies the places in which to look for Slurm plugins.   This
          is   a  colon-separated  list  of  directories,  like  the  PATH
          environment     variable.      The     default     value      is
          "/usr/local/lib/slurm".

   PlugStackConfig
          Location of the config file for Slurm stackable plugins that use
          the  Stackable  Plugin  Architecture  for  Node  job  (K)control
          (SPANK).  This provides support for a highly configurable set of
          plugins to be called before and/or after execution of each  task
          spawned  as  part  of  a  user's  job step.  Default location is
          "plugstack.conf" in the same directory as the system slurm.conf.
          For more information on SPANK plugins, see the spank(8) manual.

   PowerParameters
          System  power  management  parameters.  The supported parameters
          are specific to the PowerPlugin.  Changes  to  this  value  take
          effect   when   the   Slurm   daemons  are  reconfigured.   More
          information about system  power  management  is  available  here
          <http://slurm.schedmd.com/power_mgmt.html>.  Options currently
          supported by any plugin are listed below.

          balance_interval=#
                 Specifies the time interval, in seconds, between attempts
                 to  rebalance  power  caps  across  the nodes.  This also
                 controls the frequency at which Slurm attempts to collect
                 current  power  consumption  data  (old  data may be used
                 until  new  data  is  available   from   the   underlying
                 infrastructure  and  values  below  10  seconds  are  not
                 recommended for Cray systems).  The default value  is  30
                 seconds.  Supported by the power/cray plugin.

          capmc_path=
                 Specifies  the  absolute  path of the capmc command.  The
                 default  value  is   "/opt/cray/capmc/default/bin/capmc".
                 Supported by the power/cray plugin.

          cap_watts=#
                 Specifies  the total power limit to be established across
                 all compute nodes managed by Slurm.  A value  of  0  sets
                 every compute node to have an unlimited cap.  The default
                 value is 0.  Supported by the power/cray plugin.

          decrease_rate=#
                 Specifies the maximum rate of change in the power cap for
                 a  node  where  the actual power usage is below the power
                 cap  by  an  amount  greater  than  lower_threshold  (see
                 below).   Value represents a percentage of the difference
                 between a node's minimum and maximum  power  consumption.
                 The  default  value  is  50  percent.   Supported  by the
                 power/cray plugin.

          get_timeout=#
                 Amount of time allowed to get power state information  in
                 milliseconds.  The default value is 5,000 milliseconds or
                 5  seconds.   Supported  by  the  power/cray  plugin  and
                 represents  the  time  allowed  for  the capmc command to
                 respond to various "get" options.

          increase_rate=#
                 Specifies the maximum rate of change in the power cap for
                 a   node   where   the   actual  power  usage  is  within
                 upper_threshold (see below)  of  the  power  cap.   Value
                 represents  a  percentage  of  the  difference  between a
                 node's  minimum  and  maximum  power  consumption.    The
                 default value is 20 percent.  Supported by the power/cray
                 plugin.

          job_level
                 All nodes associated with every job will  have  the  same
                 power   cap,  to  the  extent  possible.   Also  see  the
                 --power=level option on the job submission commands.

          job_no_level
                 Disable the user's ability to set every  node  associated
                  with a job to the same power cap.  Each node will have
                  its power cap set independently.  This disables the
                 --power=level option on the job submission commands.

          lower_threshold=#
                 Specify a lower power consumption threshold.  If a node's
                 current power consumption is below this percentage of its
                 current  cap,  then  its  power cap will be reduced.  The
                 default value is 90 percent.  Supported by the power/cray
                 plugin.

          recent_job=#
                 If  a job has started or resumed execution (from suspend)
                 on a compute node within this number of seconds from  the
                 current  time,  the node's power cap will be increased to
                 the  maximum.   The  default  value   is   300   seconds.
                 Supported by the power/cray plugin.

          set_timeout=#
                 Amount  of time allowed to set power state information in
                 milliseconds.  The default value is  30,000  milliseconds
                 or  30  seconds.   Supported by the power/cray plugin and
                 represents the time allowed  for  the  capmc  command  to
                 respond to various "set" options.

          set_watts=#
                 Specifies the power limit to be set on every compute
                 node managed by Slurm.  Every node gets this same power
                 cap  and  there  is  no variation through time based upon
                 actual  power  usage  on  the  node.   Supported  by  the
                 power/cray plugin.

          upper_threshold=#
                 Specify  an  upper  power  consumption  threshold.   If a
                 node's current power consumption is above this percentage
                 of  its current cap, then its power cap will be increased
                 to the extent possible.  The default value is 95 percent.
                 Supported by the power/cray plugin.

   PowerPlugin
          Identifies   the   plugin  used  for  system  power  management.
          Currently supported plugins include: cray and none.  Changes  to
          this  value  require  restarting  Slurm  daemons to take effect.
          More information about system power management is available here
          <http://slurm.schedmd.com/power_mgmt.html>.    By   default,  no
          power plugin is loaded.

   PreemptMode
          Enables gang scheduling and/or controls the  mechanism  used  to
          preempt  jobs.   When the PreemptType parameter is set to enable
          preemption, the PreemptMode selects the default  mechanism  used
          to  preempt the lower priority jobs for the cluster. PreemptMode
          may be specified on a  per  partition  basis  to  override  this
          default value if PreemptType=preempt/partition_prio, but a valid
          default PreemptMode value must be specified for the cluster as a
          whole  when  preemption  is enabled.  The GANG option is used to
          enable gang scheduling  independent  of  whether  preemption  is
          enabled  (the  PreemptType  setting).   The  GANG  option can be
          specified in addition to a  PreemptMode  setting  with  the  two
          options  comma separated.  The SUSPEND option requires that gang
          scheduling be enabled (i.e., "PreemptMode=SUSPEND,GANG").

          OFF         is the default value and disables job preemption and
                      gang scheduling.  This is the only option compatible
                      with           SchedulerType=sched/wiki           or
                      SchedulerType=sched/wiki2  (used  by  Maui  and Moab
                      respectively, which provide their own job preemption
                      functionality).

          CANCEL      always cancels the job.

          CHECKPOINT  preempts jobs by checkpointing them (if possible) or
                      canceling them.

          GANG        enables gang scheduling (time slicing)  of  jobs  in
                      the   same  partition.   NOTE:  Gang  scheduling  is
                      performed  independently  for  each  partition,   so
                      configuring  partitions  with  overlapping nodes and
                      gang scheduling is generally not recommended.

          REQUEUE     preempts jobs by requeuing  them  (if  possible)  or
                      canceling  them.   For jobs to be requeued they must
                      have the --requeue sbatch option set or the  cluster
                      wide  JobRequeue parameter in slurm.conf must be set
                      to one.

          SUSPEND     If PreemptType=preempt/partition_prio is  configured
                      then   suspend  and  automatically  resume  the  low
                      priority  jobs.    If   PreemptType=preempt/qos   is
                      configured,  then  the  jobs  sharing resources will
                      always time slice  rather  than  one  job  remaining
                      suspended.  The SUSPEND option may only be used
                      with the GANG option (the gang scheduler module
                      performs the
                      job resume operation).
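
          For example, partition-priority preemption by suspension, with
          the required gang scheduling enabled, could be configured as
          follows (one common pairing, shown as a sketch):

              PreemptType=preempt/partition_prio
              PreemptMode=SUSPEND,GANG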

   PreemptType
          This  specifies  the  plugin  used to identify which jobs can be
          preempted in order to start a pending job.

          preempt/none
                 Job preemption is disabled.  This is the default.

          preempt/partition_prio
                 Job preemption is based  upon  partition  priority  tier.
                 Jobs  in  higher priority partitions (queues) may preempt
                 jobs  from  lower  priority  partitions.   This  is   not
                 compatible with PreemptMode=OFF.

          preempt/qos
                 Job  preemption rules are specified by Quality Of Service
                 (QOS) specifications in the Slurm database.  This option
                 is not compatible with PreemptMode=OFF.  A configuration
                 of  PreemptMode=SUSPEND  is   only   supported   by   the
                 select/cons_res plugin.

   PriorityDecayHalfLife
          This  controls  how  long  prior  resource  use is considered in
          determining how over- or under-serviced an association is (user,
          bank  account  and  cluster)  in  determining job priority.  The
          record of usage will be decayed over  time,  with  half  of  the
          original  value cleared at age PriorityDecayHalfLife.  If set to
          0, no decay will be applied.  This is helpful if you want to
          enforce hard time limits per association.  If set to 0,
          PriorityUsageResetPeriod must be set to some interval.
          Applicable  only if PriorityType=priority/multifactor.  The unit
          is a  time  string  (i.e.  min,  hr:min:00,  days-hr:min:00,  or
          days-hr).  The default value is 7-0 (7 days).
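
          For example, with the default shown below, usage recorded 14
          days ago would contribute only one quarter of its original
          value, having been halved once per 7-day half-life:

              PriorityDecayHalfLife=7-0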

   PriorityCalcPeriod
          The  period of time in minutes in which the half-life decay will
          be       re-calculated.         Applicable        only        if
          PriorityType=priority/multifactor.    The  default  value  is  5
          (minutes).

   PriorityFavorSmall
          Specifies  that  small  jobs  should   be   given   preferential
          scheduling       priority.        Applicable       only       if
          PriorityType=priority/multifactor.  Supported values  are  "YES"
          and "NO".  The default value is "NO".

   PriorityFlags
          Flags to modify priority behavior.  Applicable only if
          PriorityType=priority/multifactor.  The keywords below  have  no
          associated                      value                      (e.g.
          "PriorityFlags=ACCRUE_ALWAYS,SMALL_RELATIVE_TO_TIME").

          ACCRUE_ALWAYS    If set, priority age factor will  be  increased
                           despite job dependencies or holds.

          CALCULATE_RUNNING
                           If  set,  priorities  will  be recalculated not
                           only for pending jobs,  but  also  running  and
                           suspended jobs.

          FAIR_TREE        If  set,  priority will be calculated in such a
                           way that if accounts A and B are siblings and A
                           has  a  higher  fairshare  factor  than  B, all
                           children  of  A  will  have  higher   fairshare
                           factors than all children of B.

          DEPTH_OBLIVIOUS  If set, priority will be calculated in a
                           manner similar to the normal multifactor
                           calculation, but the depth of the
                           associations in the tree does not adversely
                           affect their priority.

          SMALL_RELATIVE_TO_TIME
                           If set, the job's size component will be  based
                           upon not the job size alone, but the job's size
                           divided by its time limit.

   PriorityParameters
          Arbitrary string used by the PriorityType plugin.

   PriorityMaxAge
          Specifies the job age which will be given the maximum age factor
          in  computing priority. For example, a value of 30 minutes would
          result in all jobs over  30  minutes  old  would  get  the  same
          age-based        priority.        Applicable       only       if
          PriorityType=priority/multifactor.  The unit is  a  time  string
          (i.e.  min, hr:min:00, days-hr:min:00, or days-hr).  The default
          value is 7-0 (7 days).

   PriorityUsageResetPeriod
          At this interval the usage of associations will be reset  to  0.
          This  is  used  if you want to enforce hard limits of time usage
          per association.  If PriorityDecayHalfLife is set  to  be  0  no
          decay  will  happen  and this is the only way to reset the usage
          accumulated by running jobs.  By default this is turned off,
          and it is advised to use the PriorityDecayHalfLife option to
          avoid a situation in which nothing can run on your cluster;
          but if your scheme is set up to only allow certain amounts of
          time on your system, this is the way to do it.  Applicable
          only if
          PriorityType=priority/multifactor.

          NONE        Never clear historic usage. The default value.

          NOW         Clear  the  historic usage now.  Executed at startup
                      and reconfiguration time.

          DAILY       Cleared every day at midnight.

          WEEKLY      Cleared every week on Sunday at time 00:00.

          MONTHLY     Cleared on the first  day  of  each  month  at  time
                      00:00.

          QUARTERLY   Cleared  on  the  first  day of each quarter at time
                      00:00.

          YEARLY      Cleared on the first day of each year at time 00:00.

   PriorityType
          This specifies the plugin to be used  in  establishing  a  job's
          scheduling priority. Supported values are "priority/basic" (jobs
          are  prioritized  by  order  of  arrival,  also   suitable   for
          sched/wiki  and  sched/wiki2),  "priority/multifactor" (jobs are
          prioritized based upon  size,  age,  fair-share  of  allocation,
          etc).   Also  see  PriorityFlags for configuration options.  The
          default value is "priority/basic".

          When not FIFO scheduling, jobs are prioritized in the  following
          order:

          1. Jobs that can preempt

          2. Jobs with an advanced reservation

          3. Partition Priority Tier

          4. Job Priority

          5. Job Id

   PriorityWeightAge
          An  integer  value  that sets the degree to which the queue wait
          time component contributes to the  job's  priority.   Applicable
          only if PriorityType=priority/multifactor.  The default value is
          0.

   PriorityWeightFairshare
          An integer value that sets the degree to  which  the  fair-share
          component contributes to the job's priority.  Applicable only if
          PriorityType=priority/multifactor.  The default value is 0.

   PriorityWeightJobSize
          An integer value that sets the degree  to  which  the  job  size
          component contributes to the job's priority.  Applicable only if
          PriorityType=priority/multifactor.  The default value is 0.

   PriorityWeightPartition
          Partition  factor  used  by   priority/multifactor   plugin   in
          calculating     job     priority.      Applicable     only    if
          PriorityType=priority/multifactor.  The default value is 0.

   PriorityWeightQOS
          An integer value that sets the degree to which  the  Quality  Of
          Service component contributes to the job's priority.  Applicable
          only if PriorityType=priority/multifactor.  The default value is
          0.

   PriorityWeightTRES
          A comma separated list of TRES Types and weights that sets the
          degree to which each TRES Type contributes to the job's priority.

          e.g.
          PriorityWeightTRES=CPU=1000,Mem=2000,GRES/gpu=3000

          Applicable  only  if  PriorityType=priority/multifactor  and  if
          AccountingStorageTRES  is  configured  with each TRES Type.  The
          default values are 0.

   PrivateData
          This controls what type of information is  hidden  from  regular
          users.   By  default,  all  information is visible to all users.
          User  SlurmUser  and  root  can  always  view  all  information.
          Multiple  values  may  be  specified  with  a  comma  separator.
          Acceptable values include:

          accounts
                 (NON-SlurmDBD  ACCOUNTING  ONLY)  Prevents   users   from
                 viewing   any   account   definitions   unless  they  are
                 coordinators of them.

          cloud  Powered down nodes in the cloud are visible.

          jobs   Prevents users from viewing jobs or job  steps  belonging
                 to  other  users. (NON-SlurmDBD ACCOUNTING ONLY) Prevents
                 users from viewing job records belonging to  other  users
                 unless  they  are coordinators of the association running
                 the job when using sacct.

          nodes  Prevents users from viewing node state information.

          partitions
                 Prevents users from viewing partition state information.

          reservations
                 Prevents regular users from  viewing  reservations  which
                 they can not use.

          usage  Prevents users from viewing usage of any other user;
                 this applies to sshare.  (NON-SlurmDBD ACCOUNTING ONLY)
                 Prevents users from viewing usage of any other user;
                 this applies to sreport.

          users  (NON-SlurmDBD  ACCOUNTING  ONLY)  Prevents   users   from
                 viewing  information  of  any user other than themselves,
                 this also makes it so users  can  only  see  associations
                 they deal with.  Coordinators can see associations of all
                 users  they  are  coordinator  of,  but  can   only   see
                 themselves when listing users.
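
          For example, a site might hide other users' jobs, usage and
          user information while leaving node and partition state
          visible (one plausible policy among many):

              PrivateData=jobs,usage,users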

   ProctrackType
          Identifies  the  plugin to be used for process tracking on a job
          step basis.  The slurmd daemon uses this mechanism  to  identify
          all  processes  which  are children of processes it spawns for a
          user job step.  The slurmd daemon must be restarted for a change
          in  ProctrackType  to  take effect.  NOTE: "proctrack/linuxproc"
          and  "proctrack/pgid"  can  fail  to  identify   all   processes
          associated  with a job since processes can become a child of the
          init process (when the  parent  process  terminates)  or  change
          their  process  group.   To reliably track all processes, one of
          the  other  mechanisms   utilizing   kernel   modifications   is
          preferable.    NOTE:  The  JobContainerType  applies  to  a  job
          allocation,  while   ProctrackType   applies   to   job   steps.
          Acceptable values at present include:

          proctrack/aix       which  uses  an  AIX kernel extension and is
                              the default for AIX systems

          proctrack/cgroup    which uses linux cgroups to constrain and
                              track processes.  NOTE: see "man
                              cgroup.conf" for configuration details.
                              NOTE: This plugin writes to disk often and
                              can impact performance.  If you are
                              running lots of short running jobs (less
                              than a couple of seconds) this plugin
                              slows down performance dramatically.  It
                              should probably be avoided in an HTC
                              environment.

          proctrack/cray      which uses Cray proprietary process tracking

          proctrack/linuxproc which uses the linux process tree and
                              parent process IDs

          proctrack/lua       which  uses  a  site-specific  LUA script to
                              track processes

          proctrack/sgi_job   which uses SGI's Process  Aggregates  (PAGG)
                              kernel              module,              see
                              http://oss.sgi.com/projects/pagg/  for  more
                              information

          proctrack/pgid      which  uses  process  group  IDs  and is the
                              default for all other systems
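
          For example, on a Linux cluster the cgroup mechanism described
          above could be selected as follows (a sketch; a suitable
          cgroup.conf is also required):

              ProctrackType=proctrack/cgroup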

   Prolog Fully qualified pathname of a program for the slurmd to  execute
          whenever it is asked to run a job step from a new job allocation
          (e.g.  "/usr/local/slurm/prolog").  A glob pattern (See glob(7))
          may  also  be used to specify more than one program to run (e.g.
          "/etc/slurm/prolog.d/*"). The slurmd executes the prolog  before
          starting  the  first job step.  The prolog script or scripts may
          be used to purge files, enable  user  login,  etc.   By  default
          there  is  no  prolog.  Any  configured  script  is  expected to
          complete execution quickly (in less time  than  MessageTimeout).
          If  the  prolog  fails (returns a non-zero exit code), this will
          result in the node being set to a DRAIN state and the job  being
          requeued  in  a  held  state,  unless  nohold_on_prolog_fail  is
          configured  in  SchedulerParameters.   See  Prolog  and   Epilog
          Scripts for more information.

   PrologEpilogTimeout
          The interval in seconds Slurm waits for Prolog and Epilog
          before terminating them.  The default behavior is to wait
          indefinitely.  This interval applies to the Prolog and Epilog
          run by the slurmd daemon before and after the job, the
          PrologSlurmctld and EpilogSlurmctld run by the slurmctld
          daemon, and the SPANK plugins run by the slurmstepd daemon.

   PrologFlags
          Flags to control the Prolog behavior. By default  no  flags  are
          set.  Multiple flags may be specified in a comma-separated list.
          Currently supported options are:

          Alloc   If set, the  Prolog  script  will  be  executed  at  job
                  allocation.  By  default, Prolog is executed just before
                  the task is launched. Therefore, when salloc is started,
                  no  Prolog  is  executed.  Alloc is useful for preparing
                  things  before  a  user  starts  to  use  any  allocated
                  resources.  In particular, this flag is needed on a Cray
                  system when cluster compatibility mode is enabled.

                  NOTE: Use of the  Alloc  flag  will  increase  the  time
                  required to start jobs.

          Contain At  job  allocation  time,  use  the ProcTrack plugin to
                  create a job container on all allocated  compute  nodes.
                  This  container  may  be  used  for  user  processes not
                  launched under Slurm control; for example, the PAM
                  module may place processes launched through a direct
                  user login into this container.  Setting the Contain
                  flag implicitly sets the Alloc flag.

          NoHold  If  set,  the  Alloc flag should also be set.  This will
                  allow for salloc  to  not  block  until  the  prolog  is
                  finished  on  each  node.  The blocking will happen when
                  steps reach the slurmd  and  before  any  execution  has
                  happened in the step.  This is a much faster way to
                  work, and if you are using srun to launch your tasks
                  you should use this flag.
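
          For example, to run the Prolog at allocation time and create a
          job container on the allocated nodes (recall that Contain
          implicitly sets Alloc):

              PrologFlags=Alloc,Contain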

   PrologSlurmctld
          Fully  qualified  pathname of a program for the slurmctld daemon
          to  execute  before  granting  a  new   job   allocation   (e.g.
          "/usr/local/slurm/prolog_controller").   The program executes as
          SlurmUser on the same node where the slurmctld daemon  executes,
          giving  it  permission  to  drain nodes and requeue the job if a
          failure occurs or cancel the job if  appropriate.   The  program
          can  be  used  to  reboot nodes or perform other work to prepare
          resources for use.  Exactly what the program  does  and  how  it
          accomplishes  this is completely at the discretion of the system
          administrator.  Information about the job being initiated, its
          allocated nodes, etc. are passed to the program using
          environment variables.  While this program is running, the
          nodes associated with the job will have a POWER_UP/CONFIGURING
          flag set in their state, which can be readily viewed.  The
          slurmctld daemon will wait indefinitely for this program to
          complete.  Once the program completes with an exit code of
          zero, the nodes will be considered ready for use and the job
          will be started.  If some node can not be made available for
          use, the
          program  should  drain  the  node  (typically using the scontrol
          command) and terminate with a non-zero exit  code.   A  non-zero
          exit code will result in the job being requeued (where possible)
          or killed. Note that only  batch  jobs  can  be  requeued.   See
          Prolog and Epilog Scripts for more information.

   PropagatePrioProcess
          Controls  the  scheduling  priority (nice value) of user spawned
          tasks.

          0    The tasks will inherit the  scheduling  priority  from  the
               slurm daemon.  This is the default value.

          1    The  tasks  will  inherit  the  scheduling  priority of the
               command used to submit them (e.g. srun or sbatch).   Unless
               the  job  is  submitted by user root, the tasks will have a
               scheduling  priority  no  higher  than  the  slurm   daemon
               spawning them.

          2    The  tasks  will  inherit  the  scheduling  priority of the
               command used to submit them (e.g. srun or sbatch) with  the
               restriction that their nice value will always be one higher
               than the slurm daemon (i.e. the tasks' scheduling priority
               will be lower than the slurm daemon).

   PropagateResourceLimits
          A  list  of  comma  separated  resource limit names.  The slurmd
          daemon uses these names to obtain the  associated  (soft)  limit
          values from the user's process environment on the submit node.
          These limits are then propagated and applied to  the  jobs  that
          will  run  on  the  compute nodes.  This parameter can be useful
          when system limits vary among nodes.  Any resource  limits  that
          do not appear in the list are not propagated.  However, the user
          can  override  this  by  specifying  which  resource  limits  to
          propagate with the srun command's "--propagate" option.  If
          neither  of  the  'propagate  resource  limit'  parameters   are
          specified,  then  the default action is to propagate all limits.
          Only one of the parameters,  either  PropagateResourceLimits  or
          PropagateResourceLimitsExcept,   may  be  specified.   The  user
          limits can not exceed hard limits under which the slurmd  daemon
          operates. If the user limits are not propagated, the limits from
          the slurmd daemon will be propagated  to  the  user's  job.  The
          limits   used   for   the  Slurm  daemons  can  be  set  in  the
          /etc/sysconfig/slurm file.  For more information, see:
          http://slurm.schedmd.com/faq.html#memlock
          The following limit names are supported by Slurm (although
          some options may not be supported on some systems); an example
          appears after the list:

          ALL       All limits listed below

          NONE      No limits listed below

          AS        The maximum address space for a process

          CORE      The maximum size of core file

          CPU       The maximum amount of CPU time

          DATA      The maximum size of a process's data segment

          FSIZE     The  maximum  size  of files created. Note that if the
                    user sets FSIZE to less than the current size  of  the
                    slurmd.log,  job  launches will fail with a 'File size
                    limit exceeded' error.

          MEMLOCK   The maximum size that may be locked into memory

          NOFILE    The maximum number of open files

          NPROC     The maximum number of processes available

          RSS       The maximum resident set size

          STACK     The maximum stack size
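
          For example, to propagate only the locked-memory and open-file
          limits from the submit node (an illustrative choice):

              PropagateResourceLimits=MEMLOCK,NOFILE

          A user could still propagate additional limits per job with an
          option such as "srun --propagate=STACK".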

   PropagateResourceLimitsExcept
          A list of comma separated resource limit names.  By default, all
          resource  limits  will  be  propagated,  (as  described  by  the
          PropagateResourceLimits  parameter),  except  for   the   limits
          appearing  in  this  list.    The  user  can  override  this  by
          specifying which resource limits  to  propagate  with  the  srun
          command's "--propagate" option.  See PropagateResourceLimits
          above for a list of valid limit names.

   RebootProgram
          Program to be executed  on  each  compute  node  to  reboot  it.
          Invoked  on  each  node  once  it becomes idle after the command
          "scontrol reboot_nodes" is executed by an authorized user  or  a
          job  is  submitted  with  the  "--reboot"  option.   After being
          rebooted, the node is returned to normal use.  NOTE: This
          configuration option does not apply to IBM BlueGene systems.

   ReconfigFlags
          Flags  to  control  various  actions  that  may be taken when an
          "scontrol reconfig" command is  issued.  Currently  the  options
          are:

          KeepPartInfo     If  set,  an  "scontrol  reconfig" command will
                           maintain  the  in-memory  value  of   partition
                           "state" and other parameters that may have been
                           dynamically  updated  by   "scontrol   update".
                           Partition  information  in  the slurm.conf file
                           will be merged with in-memory data.  This  flag
                           supersedes the KeepPartState flag.

          KeepPartState    If  set,  an  "scontrol  reconfig" command will
                           preserve only  the  current  "state"  value  of
                           in-memory  partitions  and will reset all other
                           parameters of the partitions that may have been
                           dynamically updated by "scontrol update" to the
                           values from  the  slurm.conf  file.   Partition
                           information  in  the  slurm.conf  file  will be
                           merged with in-memory data.
          The default for the above flags is not set, and "scontrol
          reconfig" will rebuild the partition information using only
          the definitions in the slurm.conf file.

   RequeueExit
          Enables automatic job requeue  for  jobs  which  exit  with  the
          specified values.  Separate multiple exit codes with a comma
          and/or specify numeric ranges using a "-" separator (e.g.
          "RequeueExit=1-9,18").  Jobs will be put back into pending state
          and  later  scheduled  again.   Restarted  jobs  will  have  the
          environment  variable  SLURM_RESTART_COUNT  set to the number of
          times the job has been restarted.

   RequeueExitHold
          Enables automatic requeue of jobs into a held pending state,
          meaning their priority is zero.  Separate multiple exit codes
          with a comma and/or specify numeric ranges using a "-"
          separator (e.g. "RequeueExitHold=10-12,16").  These jobs are
          put in the
          JOB_SPECIAL_EXIT exit  state.   Restarted  jobs  will  have  the
          environment  variable  SLURM_RESTART_COUNT  set to the number of
          times the job has been restarted.

   ResumeProgram
          Slurm supports a mechanism to reduce power consumption on  nodes
          that  remain  idle  for  an  extended  period  of time.  This is
          typically accomplished by  reducing  voltage  and  frequency  or
          powering  the node down.  ResumeProgram is the program that will
          be executed when a node in power save mode is assigned  work  to
          perform.   For reasons of reliability, ResumeProgram may execute
          more than once for a node when the slurmctld daemon crashes  and
          is  restarted.   If ResumeProgram is unable to restore a node to
          service, it should requeue any job associated with the node  and
          set the node state to DRAIN.  The program executes as SlurmUser.
          The argument to the program will be the names  of  nodes  to  be
          removed   from   power  savings  mode  (using  Slurm's  hostlist
          expression format).  By default  no  program  is  run.   Related
          configuration   options   include   ResumeTimeout,   ResumeRate,
          SuspendRate,   SuspendTime,   SuspendTimeout,    SuspendProgram,
          SuspendExcNodes,   and  SuspendExcParts.   More  information  is
          available      at      the      Slurm      web      site       (
          http://slurm.schedmd.com/power_save.html ).
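
          For example, a minimal power-saving sketch might pair
          site-provided suspend and resume scripts (the paths are
          illustrative and the scripts must be supplied by the site):

              SuspendProgram=/usr/local/slurm/node_suspend.sh
              ResumeProgram=/usr/local/slurm/node_resume.sh
              # Idle time, in seconds, before a node is suspended.
              SuspendTime=1800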

   ResumeRate
          The  rate  at  which  nodes  in  power save mode are returned to
          normal operation by ResumeProgram.  The value is the number of nodes
          per minute and it can be used to prevent power surges if a large
          number of nodes in power save mode are assigned work at the same
          time  (e.g.  a large job starts).  A value of zero results in no
          limits being imposed.   The  default  value  is  300  nodes  per
          minute.   Related  configuration  options include ResumeTimeout,
          ResumeProgram,   SuspendRate,    SuspendTime,    SuspendTimeout,
          SuspendProgram, SuspendExcNodes, and SuspendExcParts.

   ResumeTimeout
          Maximum time permitted (in seconds) between when a node resume
          request is issued and when the node is  actually  available  for
          use.   Nodes  which  fail  to  respond in this time frame may be
          marked DOWN and the jobs scheduled on the  node  requeued.   The
          default  value  is  60  seconds.   Related configuration options
          include  ResumeProgram,  ResumeRate,  SuspendRate,  SuspendTime,
          SuspendTimeout,      SuspendProgram,     SuspendExcNodes     and
          SuspendExcParts.  More information is available at the Slurm web
          site ( http://slurm.schedmd.com/power_save.html ).

   ResvEpilog
          Fully  qualified  pathname  of  a  program  for the slurmctld to
          execute when a reservation ends. The  program  can  be  used  to
          cancel   jobs,   modify   partition   configuration,  etc.   The
          reservation name will be passed as an argument to the program.
          By default there is no epilog.

   ResvOverRun
          Describes how long a job already running in a reservation should
          be permitted to execute after the end time  of  the  reservation
          has  been  reached.  The time period is specified in minutes and
          the default value is 0 (kill the job  immediately).   The  value
          may not exceed 65533 minutes, although a value of "UNLIMITED" is
          supported  to  permit  a  job  to  run  indefinitely  after  its
          reservation is terminated.

   ResvProlog
          Fully  qualified  pathname  of  a  program  for the slurmctld to
          execute when a reservation begins. The program can  be  used  to
          cancel   jobs,   modify   partition   configuration,  etc.   The
          reservation name will be passed as an argument to the program.
          By default there is no prolog.

   ReturnToService
          Controls  when  a  DOWN  node  will be returned to service.  The
          default value is 0.  Supported values include

          0   A node  will  remain  in  the  DOWN  state  until  a  system
              administrator  explicitly  changes  its  state  (even if the
              slurmd daemon registers and resumes communications).

          1   A DOWN node will become available for use upon  registration
              with  a  valid  configuration only if it was set DOWN due to
              being non-responsive.  If the node  was  set  DOWN  for  any
              other  reason  (low  memory,  unexpected  reboot, etc.), its
              state will not automatically be changed.  A  node  registers
              with  a  valid configuration if its memory, GRES, CPU count,
              etc. are equal to or greater than the values  configured  in
              slurm.conf.

          2   A  DOWN node will become available for use upon registration
              with a valid configuration.  The node could  have  been  set
              DOWN  for  any  reason.   A  node  registers  with  a  valid
              configuration if its memory, GRES, CPU count, etc. are equal
              to  or  greater  than  the  values configured in slurm.conf.
              (Disabled on Cray ALPS systems.)

   RoutePlugin
          Identifies the plugin to be used for defining which  nodes  will
          be used for message forwarding and message aggregation.

          route/default
                 default, use TreeWidth.

          route/topology
                 use the switch hierarchy defined in a topology.conf file.
                 TopologyPlugin=topology/tree is required.

   SallocDefaultCommand
          Normally, salloc(1) will run the user's  default  shell  when  a
          command  to execute is not specified on the salloc command line.
          If SallocDefaultCommand is specified, salloc  will  instead  run
          the  configured  command. The command is passed to '/bin/sh -c',
          so shell metacharacters are allowed, and commands with  multiple
          arguments should be quoted. For instance:

              SallocDefaultCommand = "$SHELL"

          would run the shell named by the user's $SHELL environment
          variable, and

              SallocDefaultCommand = "srun -n1 -N1 --mem-per-cpu=0 --pty --preserve-env --mpi=none $SHELL"

          would spawn the user's default shell on the allocated
          resources, but not consume any of the CPU or memory resources,
          configure it as a pseudo-terminal, and preserve all of the
          job's environment variables (i.e. not over-write them with the
          job step's allocation information).

          For  systems  with  generic  resources   (GRES)   defined,   the
          SallocDefaultCommand  value  should  explicitly  specify  a zero
          count for the configured GRES.  Failure to do so will result  in
          the   launched   shell   consuming  those  GRES  and  preventing
          subsequent srun commands from using them.  For example, on  Cray
          systems add "--gres=craynetwork:0" as shown below:
              SallocDefaultCommand = "srun -n1 -N1 --mem-per-cpu=0 --gres=craynetwork:0 --pty --preserve-env --mpi=none $SHELL"

          For   systems   with   TaskPlugin   set,  adding  an  option  of
          "--cpu_bind=no" is recommended if the default shell should  have
          access  to  all  of  the CPUs allocated to the job on that node,
          otherwise the shell may be limited to a single cpu or core.

   SchedulerParameters
          The interpretation of this parameter  varies  by  SchedulerType.
          Multiple options may be comma separated.
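
          For example, a modest backfill tuning might combine several of
          the options described below (illustrative values; tune for the
          site's workload):

              SchedulerParameters=bf_continue,bf_interval=60,bf_max_job_test=500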

          assoc_limit_stop
                 If  set and a job cannot start due to association limits,
                 then do not attempt to initiate any lower  priority  jobs
                 in that partition.  Setting this can decrease system
                 throughput and utilization, but avoids potentially
                 starving larger jobs by preventing them from being
                 blocked from launching indefinitely.

          batch_sched_delay=#
                 How long, in seconds, the scheduling of batch jobs can be
                 delayed.    This  can  be  useful  in  a  high-throughput
                 environment in which batch jobs are submitted at  a  very
                 high  rate (i.e. using the sbatch command) and one wishes
                 to reduce the overhead of attempting to schedule each job
                 at submit time.  The default value is 3 seconds.

          bf_busy_nodes
                 When  selecting resources for pending jobs to reserve for
                 future  execution  (i.e.  the  job  can  not  be  started
                 immediately),  then  preferentially select nodes that are
                 in use.  This will tend to leave currently idle resources
                 available  for  backfilling  longer running jobs, but may
                 result in allocations having less  than  optimal  network
                 topology.  This option is currently only supported by the
                 select/cons_res    plugin    (or     select/cray     with
                 SelectTypeParameters   set   to  "OTHER_CONS_RES",  which
                 layers the select/cray plugin  over  the  select/cons_res
                 plugin).

          bf_continue
                 The  backfill  scheduler  periodically  releases locks in
                 order to permit other operations to proceed  rather  than
                 blocking  all  activity  for  what  could  be an extended
                 period of time.   Setting  this  option  will  cause  the
                 backfill  scheduler  to  continue processing pending jobs
                 from its original job list after releasing locks even  if
                 job  or  node  state  changes.   This can result in lower
                 priority jobs being backfill scheduled instead  of  newly
                 arrived higher priority jobs, but will permit more queued
                 jobs to be considered for backfill scheduling.

          bf_interval=#
                 The number of seconds between iterations.  Higher  values
                 result  in  less overhead and better responsiveness.  The
                 backfill scheduler will start over  after  reaching  this
                 time  limit  (including time spent sleeping), even if the
                 maximum job counts have not been  reached.   This  option
                 applies   only   to   SchedulerType=sched/backfill.   The
                 default value is 30 seconds.

          bf_max_job_array_resv=#
                 The maximum number of tasks from a job array for which to
                 reserve  resources  in  the future.  Since job arrays can
                 potentially have  millions  of  tasks,  the  overhead  in
                 reserving resources for all tasks can be prohibitive.  In
                 addition various limits may prevent  all  the  jobs  from
                 starting  at the expected times.  This has no impact upon
                 the number of tasks from a job array that can be  started
                 immediately,  only  those tasks expected to start at some
                 future time.  The default value is 20 tasks.

          bf_max_job_part=#
                 The maximum number  of  jobs  per  partition  to  attempt
                 starting   with  the  backfill  scheduler.  This  can  be
                 especially helpful for  systems  with  large  numbers  of
                 partitions and jobs.  The default value is 0, which means
                 no    limit.     This    option    applies    only     to
                 SchedulerType=sched/backfill.       Also      see     the
                 partition_job_depth  and  bf_max_job_test  options.   Set
                 bf_max_job_test    to    a   value   much   higher   than
                 bf_max_job_part.

          bf_max_job_start=#
                 The maximum number of jobs which can be  initiated  in  a
                 single  iteration of the backfill scheduler.  The default
                 value is 0, which means no limit.   This  option  applies
                 only to SchedulerType=sched/backfill.

          bf_max_job_test=#
                 The maximum number of jobs to attempt backfill scheduling
                 for (i.e. the queue depth).  Higher values result in more
                 overhead  and  less  responsiveness.  Until an attempt is
                 made to backfill schedule a job, its expected  initiation
                 time  value  will  not be set.  The default value is 100.
                 In the case of large clusters, configuring  a  relatively
                 small  value  may be desirable.  This option applies only
                 to SchedulerType=sched/backfill.

          bf_max_job_user=#
                 The maximum number of jobs per user to  attempt  starting
                 with  the  backfill  scheduler. One can set this limit to
                 prevent users from flooding the backfill queue with  jobs
                 that cannot start and that prevent jobs from other users
                 from starting.  This is similar to the MAXIJOB limit  in
                 Maui.
                 The  default  value  is  0,  which  means no limit.  This
                 option  applies  only  to   SchedulerType=sched/backfill.
                 Also see the bf_max_job_part and bf_max_job_test options.
                 Set  bf_max_job_test  to  a  value   much   higher   than
                 bf_max_job_user.

          bf_min_age_reserve=#
                 The  backfill  and main scheduling logic will not reserve
                 resources for pending jobs until they have  been  pending
                 and  runnable  for  at  least  the  specified  number  of
                 seconds.  In addition, jobs waiting  for  less  than  the
                 specified  number  of  seconds  will  not prevent a newly
                 submitted job from  starting  immediately,  even  if  the
                 newly  submitted  job  has a lower priority.  This can be
                 valuable if jobs lack time limits or all time limits have
                 the  same  value.   The default value is zero, which will
                 reserve  resources  for  any  pending   job   and   delay
                 initiation    of   lower   priority   jobs.    Also   see
                 bf_min_prio_reserve.

          bf_min_prio_reserve=#
                 The backfill and main scheduling logic will  not  reserve
                 resources  for  pending  jobs unless they have a priority
                 equal  to  or  higher  than  the  specified  value.    In
                 addition,  jobs  with a lower priority will not prevent a
                 newly submitted job from starting  immediately,  even  if
                 the  newly  submitted job has a lower priority.  This can
                 be valuable if one wishes to maximize system utilization
                 without   regard   for   job  priority  below  a  certain
                 threshold.  The default value is zero, which will reserve
                 resources  for  any  pending  job and delay initiation of
                 lower priority jobs.  Also see bf_min_age_reserve.

          bf_resolution=#
                 The  number  of  seconds  in  the  resolution   of   data
                 maintained  about when jobs begin and end.  Higher values
                 result in less overhead and better  responsiveness.   The
                 default value is 60 seconds.  This option applies only to
                 SchedulerType=sched/backfill.

          bf_window=#
                 The number of  minutes  into  the  future  to  look  when
                 considering  jobs  to  schedule.  Higher values result in
                 more overhead and less responsiveness.  The default value
                 is  1440  minutes (one day).  A value at least as long as
                 the highest allowed time limit is generally advisable  to
                 prevent  job starvation.  In order to limit the amount of
                 data managed by the backfill scheduler, if the  value  of
                 bf_window is increased, then it is generally advisable to
                 also increase bf_resolution.  This option applies only to
                 SchedulerType=sched/backfill.
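
                 As a rough illustration only (the values below are
                 assumptions, not recommendations), a site whose longest
                 allowed time limit is one week might scale the two
                 options together:

                     SchedulerParameters=bf_window=10080,bf_resolution=600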

          bf_yield_interval=#
                 The backfill scheduler will periodically relinquish locks
                 in order for other  pending  operations  to  take  place.
                 This specifies the interval, in microseconds, at which
                 the locks are relinquished.  The default value is 2,000,000
                 microseconds  (2 seconds).  Smaller values may be helpful
                 for high throughput computing when  used  in  conjunction
                 with the bf_continue option.  Also see the bf_yield_sleep
                 option.

          bf_yield_sleep=#
                 The backfill scheduler will periodically relinquish locks
                 in  order  for  other  pending  operations to take place.
                 This specifies the length of time, in microseconds, for
                 which the locks are relinquished.  The default value is 500,000
                 microseconds    (0.5    seconds).     Also    see     the
                 bf_yield_interval option.

          build_queue_timeout=#
                 Defines  the maximum time that can be devoted to building
                 a queue of jobs to be  tested  for  scheduling.   If  the
                 system  has a huge number of jobs with dependencies, just
                 building the job queue  can  take  so  much  time  as  to
                 adversely  impact  overall  system  performance  and this
                 parameter can be adjusted as needed.  The  default  value
                 is 2,000,000 microseconds (2 seconds).

          default_queue_depth=#
                 The  default  number  of jobs to attempt scheduling (i.e.
                 the queue depth) when a running job  completes  or  other
                 routine actions occur; however, the frequency with which
                 the scheduler is run may be limited by using the defer or
                 sched_min_interval  parameters described below.  The full
                 queue will be tested on a less frequent basis as  defined
                 by the sched_interval option described below. The default
                 value is 100.   See  the  partition_job_depth  option  to
                 limit depth by partition.

          defer  Setting  this  option  will  avoid attempting to schedule
                 each job individually at job submit time,  but  defer  it
                 until   a   later  time  when  scheduling  multiple  jobs
                 simultaneously may be possible.  This option may  improve
                 system  responsiveness  when  large numbers of jobs (many
                 hundreds) are submitted at the same  time,  but  it  will
                 delay  the  initiation  time of individual jobs. Also see
                 default_queue_depth above.

          disable_user_top
                 Disable  use  of  the  "scontrol  top"  command  by  non-
                 privileged users.

          Ignore_NUMA
                 Some  processors  (e.g.  AMD Opteron 6000 series) contain
                 multiple NUMA nodes per socket. This is  a  configuration
                 which  does not map into the hardware entities that Slurm
                 optimizes  resource  allocation  for  (PU/thread,   core,
                 socket,  baseboard, node and network switch). In order to
                 optimize resource allocations  on  such  hardware,  Slurm
                 will  consider  each  NUMA  node  within  the socket as a
                 separate socket by default. Use the Ignore_NUMA option to
                 report   the  correct  socket  count,  but  not  optimize
                 resource allocations on the NUMA nodes.

          inventory_interval=#
                 On a Cray system using Slurm on top of ALPS  this  limits
                 the  number  of  times  a  Basil  Inventory call is made.
                 Normally this call happens every scheduling consideration
                 to  attempt  to  close  a  node  state change window with
                 respect to what ALPS has.  This call is rather slow, so
                 making it less frequent improves performance
                 dramatically, but when a node changes state the window
                 is as large as this setting.  In an HTC environment this
                 setting is a must, and we advise a value of around 10
                 seconds.

          kill_invalid_depend
                 If a job has an invalid dependency and  therefore  can
                 never run, terminate it and set its state to JOB_CANCELLED.  By
                 default    the    job    stays    pending   with   reason
                 DependencyNeverSatisfied.

          max_depend_depth=#
                 Maximum number  of  jobs  to  test  for  a  circular  job
                 dependency.   Stop  testing  after  this  number  of  job
                 dependencies have been tested. The default  value  is  10
                 jobs.

          max_rpc_cnt=#
                 If  the  number of active threads in the slurmctld daemon
                 is equal to or larger than this value,  defer  scheduling
                 of  jobs.   This  can  improve Slurm's ability to process
                 requests  at  a  cost  of  initiating   new   jobs   less
                 frequently.   The  default  value is zero, which disables
                 this option.  If a value is set, then a value  of  10  or
                 higher is recommended.

          max_sched_time=#
                 How  long, in seconds, that the main scheduling loop will
                 execute for before exiting.  If a value is configured, be
                 aware  that  all  other Slurm operations will be deferred
                 during this time period.  Make certain the value is lower
                 than  MessageTimeout.   If  a  value  is  not  explicitly
                 configured, the default value is half  of  MessageTimeout
                 with  a  minimum  default value of 1 second and a maximum
                 default   value   of   2   seconds.    For   example   if
                 MessageTimeout=10, the time limit will be 2 seconds (i.e.
                 MIN(10/2, 2) = 2).

          max_script_size=#
                 Specify the maximum size of a  batch  script,  in  bytes.
                 The  default  value  is  4  megabytes.  Larger values may
                 adversely impact system performance.

          max_switch_wait=#
                 Maximum number of seconds that a job can delay  execution
                 waiting  for  the  specified  desired  switch  count. The
                 default value is 300 seconds.

          no_backup_scheduling
                 If used, the backup controller  will  not  schedule  jobs
                 when it takes over. The backup controller will allow jobs
                 to  be  submitted,  modified  and  cancelled  but   won't
                 schedule  new  jobs.  This is useful in Cray environments
                 when the backup controller resides on  an  external  Cray
                 node.   A  restart is required to alter this option. This
                 is explicitly set on a Cray/ALPS system.

          no_env_cache
                 If used, any job started on a node that fails to load
                 the environment will fail instead of using the cached
                 environment.  This also implicitly enables the
                 requeue_setup_env_fail option.

          pack_serial_at_end
                 If  used  with the select/cons_res plugin then put serial
                 jobs at the end of the available nodes rather than  using
                 a   best   fit   algorithm.   This  may  reduce  resource
                 fragmentation for some workloads.

          partition_job_depth=#
                 The default number of jobs to  attempt  scheduling  (i.e.
                 the  queue  depth)  from  each partition/queue in Slurm's
                 main scheduling logic.  The functionality is  similar  to
                 that  provided  by  the  bf_max_job_part  option  for the
                 backfill scheduling logic.  The default value  is  0  (no
                 limit).  Jobs excluded from attempted scheduling  based
                 upon  partition  will  not   be   counted   against   the
                 default_queue_depth  limit.  Also see the bf_max_job_part
                 option.

          preempt_reorder_count=#
                 Specify how many attempts should be made in  reordering
                 preemptable jobs to minimize the count of jobs preempted.
                 The default value is 1. High values may adversely  impact
                 performance.   The  logic  to support this option is only
                 available in the select/cons_res plugin.

          preempt_strict_order
                 If set, then execute extra logic in an attempt to preempt
                 only  the  lowest  priority jobs.  It may be desirable to
                 set this configuration parameter when there are  multiple
                 priorities  of  preemptable  jobs.   The logic to support
                 this option is  only  available  in  the  select/cons_res
                 plugin.

          nohold_on_prolog_fail
                 By  default if the Prolog exits with a non-zero value the
                 job  is  requeued  in  held  state.  By  specifying  this
                 parameter  the  job will be requeued but not held so that
                 the scheduler can dispatch it to another host.

          requeue_setup_env_fail
                 By default if a job environment setup fails the job keeps
                 running  with  a  limited environment. By specifying this
                 parameter the job will be requeued in held state and  the
                 execution node drained.

          sched_interval=#
                 How frequently, in seconds, the main scheduling loop will
                 execute and test all pending jobs.  The default value  is
                 60 seconds.

          sched_max_job_start=#
                 The maximum number of jobs that the main scheduling logic
                 will start in any single execution.  The default value is
                 zero, which imposes no limit.

          sched_min_interval=#
                 How frequently, in microseconds, the main scheduling loop
                 will execute and test any pending  jobs.   The  scheduler
                 runs  in  a  limited  fashion  every  time that any event
                 happens which could enable  a  job  to  start  (e.g.  job
                 submit,  job terminate, etc.).  If these events happen at
                 a high frequency, the scheduler can run  very  frequently
                 and  consume  significant  resources  if not throttled by
                 this option.  This  option  specifies  the  minimum  time
                 between the end of one scheduling cycle and the beginning
                 of the next scheduling  cycle.   A  value  of  zero  will
                 disable throttling of the scheduling logic interval.  The
                 default value  is  1,000,000  microseconds  on  Cray/ALPS
                 systems and zero microseconds (throttling is disabled) on
                 other systems.
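
          As an illustrative sketch only (every value below  is  a
          site-specific assumption to be tuned, not a recommendation), a
          high-throughput cluster might combine several of the options
          above:

              SchedulerParameters=defer,bf_continue,bf_interval=60,bf_max_job_test=1000,max_rpc_cnt=100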

   SchedulerPort
          The port number on which slurmctld should listen for  connection
          requests.   This  value  is only used by the Maui Scheduler (see
          SchedulerType).  The default value is 7321.

   SchedulerRootFilter
          Identifies whether or not RootOnly partitions should be filtered
          from  any  external  scheduling  activities.  If  set to 0, then
          RootOnly partitions are treated like any other partition. If set
          to  1,  then  RootOnly  partitions  are exempt from any external
          scheduling activities. The default value is  1.  Currently  only
          used by the built-in backfill scheduling module "sched/backfill"
          (see SchedulerType).

   SchedulerTimeSlice
          Number of seconds in each time slice  when  gang  scheduling  is
          enabled (PreemptMode=SUSPEND,GANG).  The value must be between 5
          seconds and 65533 seconds.  The default value is 30 seconds.
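
          For example, a hypothetical  configuration  enabling  gang
          scheduling with one-minute time slices might contain:

              PreemptMode=SUSPEND,GANG
              SchedulerTimeSlice=60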

   SchedulerType
          Identifies the type of scheduler to be used.  Note the slurmctld
          daemon  must  be  restarted  for  a  change in scheduler type to
          become effective (reconfiguring a running daemon has  no  effect
          for  this  parameter).   The  scontrol  command  can  be used to
          manually change job priorities if  desired.   Acceptable  values
          include:

          sched/backfill
                 For  a  backfill scheduling module to augment the default
                 FIFO  scheduling.   Backfill  scheduling  will   initiate
                 lower-priority  jobs  if  doing  so  does  not  delay the
                 expected initiation time  of  any  higher  priority  job.
                 Effectiveness  of  backfill  scheduling is dependent upon
                 users specifying job time limits, otherwise all jobs will
                 have  the  same time limit and backfilling is impossible.
                 Note documentation  for  the  SchedulerParameters  option
                 above.  This is the default configuration.

          sched/builtin
                 This  is  the  FIFO  scheduler  which  initiates  jobs in
                 priority order.  If any job in the partition can  not  be
                 scheduled,  no  lower priority job in that partition will
                 be scheduled.  An exception is made for jobs that can not
                 run due to partition constraints (e.g. the time limit) or
                 down/drained nodes.  In that case,  lower  priority  jobs
                 can be initiated and not impact the higher priority job.

          sched/hold
                 To  hold  all  newly  arriving  jobs  if   a   file
                 "/etc/slurm.hold" exists; otherwise use the built-in
                 FIFO scheduler.

          sched/wiki
                 For the Wiki interface to the Maui Scheduler

          sched/wiki2
                 For the Wiki interface to the Moab Cluster Suite

   SelectType
          Identifies  the type of resource selection algorithm to be used.
          Changing this value can only be done by restarting the slurmctld
          daemon  and  will  result  in  the  loss  of all job information
          (running and pending) since the job state save  format  used  by
          each plugin is different.  Acceptable values include

          select/bluegene
                 for  a  three-dimensional  BlueGene  system.  The default
                 value is "select/bluegene" for BlueGene systems.

          select/cons_res
                 The resources within a node are individually allocated as
                 consumable  resources.   Note  that  whole  nodes  can be
                 allocated to jobs for selected partitions  by  using  the
                 OverSubscribe=Exclusive   option.    See   the  partition
                 OverSubscribe parameter for more information.

          select/cray
                 for a Cray system.  The default  value  is  "select/cray"
                 for all Cray systems.

          select/linear
                 for allocation of entire nodes assuming a one-dimensional
                 array of nodes in which sequentially  ordered  nodes  are
                 preferable.   This  is the default value for non-BlueGene
                 systems.

          select/serial
                 for allocating resources to single CPU jobs only.  Highly
                 optimized    for   maximum   throughput.    NOTE:   SPANK
                 environment variables are NOT  propagated  to  the  job's
                 Epilog program.

   SelectTypeParameters
          The  permitted  values  of  SelectTypeParameters depend upon the
          configured  value  of  SelectType.    SelectType=select/bluegene
          supports  no  SelectTypeParameters.   The only supported options
          for  SelectType=select/linear   are   CR_ONE_TASK_PER_CORE   and
          CR_Memory,  which  treats  memory  as  a consumable resource and
          prevents memory over subscription with job  preemption  or  gang
          scheduling.  By default SelectType=select/linear allocates whole
          nodes to jobs without considering their memory consumption.   By
          default  SelectType=select/cons_res, SelectType=select/cray, and
          SelectType=select/serial use CR_CPU, which allocates CPU to jobs
          without considering their memory consumption.

          The following options are supported for SelectType=select/cray:

                 OTHER_CONS_RES
                        Layer   the   select/cons_res   plugin  under  the
                        select/cray plugin, the default  is  to  layer  on
                        select/linear.   This  also allows all the options
                        for SelectType=select/cons_res.

                 NHC_NO_STEPS
                        Do not run the node health check after each  step.
                        Default is to run after each step.

                 NHC_NO Do  not  run  the  node  health  check  after each
                        allocation.   Default  is  to   run   after   each
                        allocation.   This  also sets NHC_NO_STEPS, so the
                        NHC will never run.

          The      following      options      are      supported      for
          SelectType=select/cons_res:

                 CR_CPU CPUs  are  consumable  resources.   Configure  the
                        number of CPUs on each node, which may be equal to
                        the  count  of  cores or hyper-threads on the node
                        depending  upon  the  desired   minimum   resource
                        allocation.     The    node's   Boards,   Sockets,
                        CoresPerSocket and ThreadsPerCore  may  optionally
                        be  configured and result in job allocations which
                        have improved locality; however  doing  so  will
                        prevent more than one job from being allocated
                        on each core.

                 CR_CPU_Memory
                        CPUs  and   memory   are   consumable   resources.
                        Configure  the  number of CPUs on each node, which
                        may  be  equal  to   the   count   of   cores   or
                        hyper-threads  on  the  node  depending  upon  the
                        desired minimum resource allocation.   The  node's
                        Boards, Sockets, CoresPerSocket and ThreadsPerCore
                        may optionally be configured  and  result  in  job
                        allocations which have improved locality; however
                        doing so will prevent more than one job from
                        being allocated on each core.  Setting a value for
                        DefMemPerCPU is strongly recommended.

                 CR_Core
                        Cores are consumable  resources.   On  nodes  with
                        hyper-threads,  each thread is counted as a CPU to
                        satisfy a job's resource requirement, but multiple
                        jobs  are  not allocated threads on the same core.
                        The count of  CPUs  allocated  to  a  job  may  be
                        rounded   up  to  account  for  every  CPU  on  an
                        allocated core.

                 CR_Core_Memory
                        Cores and memory  are  consumable  resources.   On
                        nodes  with  hyper-threads, each thread is counted
                        as a CPU to satisfy a job's resource  requirement,
                        but multiple jobs are not allocated threads on the
                        same core.  The count of CPUs allocated to  a  job
                        may  be  rounded up to account for every CPU on an
                        allocated core.  Setting a value for  DefMemPerCPU
                        is strongly recommended.

                 CR_ONE_TASK_PER_CORE
                        Allocate  one  task  per core by default.  Without
                        this option, by default one task will be allocated
                        per   thread   on   nodes   with   more  than  one
                        ThreadsPerCore configured.

                 CR_CORE_DEFAULT_DIST_BLOCK
                        Allocate  cores  within   a   node   using   block
                        distribution    by    default.     This    is    a
                        pseudo-best-fit  algorithm  that   minimizes   the
                        number  of  boards  and  minimizes  the  number of
                        sockets  (within  minimum  boards)  used  for  the
                        allocation.   This   default   behavior  can  be
                        overridden by specifying a particular "-m"
                        parameter with srun/salloc/sbatch.  Without this
                        option, cores will be allocated cyclically across
                        the sockets.

                 CR_LLN Schedule  resources  to  jobs  on the least loaded
                        nodes (based upon the number of idle  CPUs).  This
                        is  generally  only recommended for an environment
                        with serial jobs as idle resources will tend to be
                        highly  fragmented,  resulting  in  parallel  jobs
                        being distributed across many nodes.  Also see
                        the partition configuration parameter LLN to use
                        the least loaded nodes in selected partitions.

                 CR_Pack_Nodes
                        If a job allocation contains more  resources  than
                        will  be  used  for launching tasks (e.g. if whole
                        nodes are allocated to a job),  then  rather  than
                        distributing  a  job's  tasks  evenly across its
                        allocated nodes, pack them as tightly as  possible
                        on  these  nodes.   For  example,  consider  a job
                        allocation containing two entire nodes with  eight
                        CPUs  each.   If  the  job starts ten tasks across
                        those two nodes without this option, it will start
                        five  tasks  on  each of the two nodes.  With this
                        option, eight tasks will be started on  the  first
                        node and two tasks on the second node.

                 CR_Socket
                        Sockets  are  consumable resources.  On nodes with
                        multiple cores, each core or thread is counted  as
                        a CPU to satisfy a job's resource requirement, but
                        multiple jobs are not allocated resources  on  the
                        same socket.

                 CR_Socket_Memory
                        Memory  and  sockets are consumable resources.  On
                        nodes with multiple cores, each core or thread  is
                        counted  as  a  CPU  to  satisfy  a job's resource
                        requirement, but multiple jobs are  not  allocated
                        resources on the same socket.  Setting a value for
                        DefMemPerCPU is strongly recommended.

                 CR_Memory
                        Memory  is  a  consumable  resource.   NOTE:  This
                        implies  OverSubscribe=YES  or OverSubscribe=FORCE
                        for  all  partitions.    Setting   a   value   for
                        DefMemPerCPU is strongly recommended.
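
          As an illustration only (the  DefMemPerCPU  value  is  a
          site-specific assumption), a cluster treating cores and memory
          as consumable resources might configure:

              SelectType=select/cons_res
              SelectTypeParameters=CR_Core_Memory
              DefMemPerCPU=2048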

   SlurmUser
          The name of the user that the slurmctld daemon executes as.  For
          security purposes, a user  other  than  "root"  is  recommended.
          This   user   must  exist  on  all  nodes  of  the  cluster  for
          authentication of communications between Slurm components.   The
          default value is "root".
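
          For example, many sites create a dedicated unprivileged account
          for this purpose (the account name below is illustrative):

              SlurmUser=slurm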

   SlurmdUser
          The  name  of the user that the slurmd daemon executes as.  This
          user must exist on all nodes of the cluster  for  authentication
          of  communications  between Slurm components.  The default value
          is "root".

   SlurmctldDebug
          The level of detail to provide in the  slurmctld  daemon's
          logs.  The default value is info.  If the slurmctld daemon is
          initiated with the -v or --verbose options, that debug level
          will be preserved or restored upon reconfiguration.

          quiet     Log nothing

          fatal     Log only fatal errors

          error     Log only errors

          info      Log errors and general informational messages

          verbose   Log errors and verbose informational messages

          debug     Log  errors  and  verbose  informational  messages and
                    debugging messages

          debug2    Log errors and verbose informational messages and more
                    debugging messages

          debug3    Log errors and verbose informational messages and even
                    more debugging messages

          debug4    Log errors and verbose informational messages and even
                    more debugging messages

          debug5    Log errors and verbose informational messages and even
                    more debugging messages

   SlurmctldLogFile
          Fully qualified pathname of a  file  into  which  the  slurmctld
          daemon's  logs are written.  The default value is none (performs
          logging via syslog).
          See the section LOGGING if a pathname is specified.

   SlurmctldPidFile
          Fully qualified pathname of a file  into  which  the   slurmctld
          daemon  may write its process id. This may be used for automated
          signal     processing.       The      default      value      is
          "/var/run/slurmctld.pid".

   SlurmctldPlugstack
          A comma delimited list of Slurm controller plugins to be started
          when the daemon begins and terminated when it  ends.   Only  the
          plugin's init and fini functions are called.

   SlurmctldPort
          The port number that the Slurm controller, slurmctld, listens to
          for work. The default value is SLURMCTLD_PORT as established  at
          system  build  time. If none is explicitly specified, it will be
          set to 6817.  SlurmctldPort may also be configured to support  a
          range  of  port  numbers  in  order  to  accept larger bursts of
          incoming messages by specifying two numbers separated by a  dash
          (e.g. SlurmctldPort=6817-6818).  NOTE: Either the slurmctld and
          slurmd daemons must not execute on the same nodes, or the values
          of SlurmctldPort and SlurmdPort must be different.

          Note:  On Cray systems, Realm-Specific IP Addressing (RSIP) will
          automatically try to interact  with  anything  opened  on  ports
          8192-60000.   Configure  SlurmctldPort  to use a port outside of
          the configured SrunPortRange and RSIP's port range.

   SlurmctldTimeout
          The interval, in seconds, that the backup controller  waits  for
          the  primary controller to respond before assuming control.  The
          default value is 120 seconds.  May not exceed 65533.

   SlurmdDebug
          The level of detail to provide in the slurmd daemon's logs.  The
          default value is info.

          quiet     Log nothing

          fatal     Log only fatal errors

          error     Log only errors

          info      Log errors and general informational messages

          verbose   Log errors and verbose informational messages

          debug     Log  errors  and  verbose  informational  messages and
                    debugging messages

          debug2    Log errors and verbose informational messages and more
                    debugging messages

          debug3    Log errors and verbose informational messages and even
                    more debugging messages

          debug4    Log errors and verbose informational messages and even
                    more debugging messages

          debug5    Log errors and verbose informational messages and even
                    more debugging messages

   SlurmdLogFile
          Fully qualified pathname  of  a  file  into  which  the   slurmd
          daemon's  logs are written.  The default value is none (performs
          logging via syslog).  Any "%h" within the name is replaced  with
          the  hostname  on  which the slurmd is running.  Any "%n" within
          the name is replaced with the  Slurm  node  name  on  which  the
          slurmd is running.
          See the section LOGGING if a pathname is specified.
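
          For example, to give each compute node its own log file named
          after its Slurm node name (the path is illustrative):

              SlurmdLogFile=/var/log/slurm/slurmd-%n.log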

   SlurmdPidFile
          Fully qualified pathname of a file into which the  slurmd daemon
          may write its process id. This may be used for automated  signal
          processing.   Any  "%h"  within  the  name  is replaced with the
          hostname on which the slurmd is running.  Any  "%n"  within  the
          name is replaced with the Slurm node name on which the slurmd is
          running.  The default value is "/var/run/slurmd.pid".

   SlurmdPlugstack
          A comma delimited list of  Slurm  compute  node  plugins  to  be
          started  when  the  daemon  begins  and terminated when it ends.
          Only the plugin's init and fini functions are called.

   SlurmdPort
          The port number that the  Slurm  compute  node  daemon,  slurmd,
          listens  to  for  work.  The  default  value  is  SLURMD_PORT as
          established  at  system  build  time.  If  none  is   explicitly
          specified, its value will be 6818.  NOTE: Either the  slurmctld
          and slurmd daemons must not execute on the same nodes, or the
          values of SlurmctldPort and SlurmdPort must be different.

          Note:  On Cray systems, Realm-Specific IP Addressing (RSIP) will
          automatically try to interact  with  anything  opened  on  ports
          8192-60000.   Configure  SlurmdPort to use a port outside of the
          configured SrunPortRange and RSIP's port range.

   SlurmdSpoolDir
          Fully qualified pathname of a directory into  which  the  slurmd
          daemon's  state information and batch job script information are
          written. This must be a  common  pathname  for  all  nodes,  but
          should  represent  a  directory  which  is  local  to  each node
          (reference  a  local  file  system).  The   default   value   is
          "/var/spool/slurmd".   Any "%h" within the name is replaced with
          the hostname on which the slurmd is running.   Any  "%n"  within
          the  name  is  replaced  with  the  Slurm node name on which the
          slurmd is running.

   SlurmdTimeout
          The interval, in seconds, that the Slurm  controller  waits  for
          slurmd  to respond before configuring that node's state to DOWN.
          A value of zero  indicates  the  node  will  not  be  tested  by
          slurmctld  to  confirm the state of slurmd, the node will not be
          automatically set to a DOWN state  indicating  a  non-responsive
          slurmd,  and  some  other  tool  will  take  responsibility  for
          monitoring the state of each compute node and its slurmd daemon.
          Slurm's hierarchical communication mechanism is used to ping the
          slurmd daemons in order to minimize system noise  and  overhead.
          The  default  value  is  300  seconds.  The value may not exceed
          65533 seconds.

   SlurmSchedLogFile
          Fully qualified pathname of the scheduling event  logging  file.
          The   syntax   of   this   parameter   is   the   same   as  for
          SlurmctldLogFile.  In order to configure scheduler logging,  set
          both the SlurmSchedLogFile and SlurmSchedLogLevel parameters.

   SlurmSchedLogLevel
          The  initial  level  of scheduling event logging, similar to the
          SlurmctldDebug parameter used to control the  initial  level  of
          slurmctld  logging.  Valid values for SlurmSchedLogLevel are "0"
          (scheduler  logging  disabled)  and   "1"   (scheduler   logging
          enabled).   If  this parameter is omitted, the value defaults to
          "0" (disabled).  In order to configure  scheduler  logging,  set
          both  the  SlurmSchedLogFile  and SlurmSchedLogLevel parameters.
          The scheduler logging level can  be  changed  dynamically  using
          scontrol.

   SrunEpilog
          Fully  qualified  pathname  of  an  executable to be run by srun
          following the completion  of  a  job  step.   The  command  line
          arguments  for  the executable will be the command and arguments
          of the job step.  This configuration parameter may be overridden
          by srun's --epilog parameter. Note that while the other "Epilog"
          executables (e.g., TaskEpilog) are run by slurmd on the  compute
          nodes  where  the tasks are executed, the SrunEpilog runs on the
          node where the "srun" is executing.

   SrunPortRange
          srun creates a set of listening ports  to  communicate  with
          the controller, the slurmstepd, and to handle the application
          I/O.  By default these ports are ephemeral, meaning the port
          numbers are selected by the kernel.  Using this parameter allows
          sites to configure a range of ports from which srun ports will
          be selected.  This is useful if sites want to allow only a
          certain port range on their network.

          Note: On Cray systems, Realm-Specific IP Addressing (RSIP)  will
          automatically  try  to  interact  with  anything opened on ports
          8192-60000.  Configure SrunPortRange to use  a  range  of  ports
          above  those  used  by  RSIP,  ideally  1000  or more ports, for
          example "SrunPortRange=60001-63000".

          Note: A sufficient number of ports must be configured  based  on
          the estimated number of concurrent srun commands on the
          submission nodes, considering that each srun opens 3 listening
          ports plus 2 more for every 48 hosts. Example:

          srun -N 48 will use 5 listening ports.

          srun -N 50 will use 7 listening ports.

          srun -N 200 will use 13 listening ports.

   SrunProlog
          Fully  qualified  pathname  of  an  executable to be run by srun
          prior to the launch of a job step.  The command  line  arguments
          for  the executable will be the command and arguments of the job
          step.  This configuration parameter may be overridden by  srun's
          --prolog   parameter.   Note   that  while  the  other  "Prolog"
          executables (e.g., TaskProlog) are run by slurmd on the  compute
          nodes  where  the tasks are executed, the SrunProlog runs on the
          node where the "srun" is executing.

   StateSaveLocation
          Fully qualified pathname of a directory  into  which  the  Slurm
          controller,     slurmctld,     saves     its     state     (e.g.
          "/usr/local/slurm/checkpoint").  Slurm state will saved here  to
          recover  from system failures.  SlurmUser must be able to create
          files  in  this  directory.   If  you  have  a  BackupController
          configured,  this  location  should  be readable and writable by
          both systems.  Since all running and pending job information  is
          stored  here,  the  use of a reliable file system (e.g. RAID) is
          recommended.  The default value is "/var/spool".  If  any  slurm
          daemons  terminate  abnormally,  their  core  files will also be
          written into this directory.

   SuspendExcNodes
          Specifies the nodes which are not to be  placed  in  power  save
          mode,  even  if  the node remains idle for an extended period of
          time.  Use Slurm's hostlist expression to  identify  nodes.   By
          default  no  nodes  are excluded.  Related configuration options
          include      ResumeTimeout,      ResumeProgram,      ResumeRate,
          SuspendProgram,  SuspendRate,  SuspendTime,  SuspendTimeout, and
          SuspendExcParts.

   SuspendExcParts
          Specifies the partitions whose nodes are not  to  be  placed  in
          power  save  mode, even if the node remains idle for an extended
          period of time.   Multiple  partitions  can  be  identified  and
          separated by commas.  By default no nodes are excluded.  Related
          configuration  options  include  ResumeTimeout,   ResumeProgram,
          ResumeRate,     SuspendProgram,     SuspendRate,    SuspendTime,
          SuspendTimeout, and SuspendExcNodes.

   SuspendProgram
          SuspendProgram is the program that will be executed when a  node
          remains  idle  for  an extended period of time.  This program is
          expected to place the node into some power save mode.  This  can
          be  used  to  reduce  the  frequency  and  voltage  of a node or
          completely  power  the  node  off.   The  program  executes   as
          SlurmUser.   The  argument  to  the program will be the names of
          nodes to be  placed  into  power  savings  mode  (using  Slurm's
          hostlist  expression  format).   By  default, no program is run.
          Related    configuration    options    include    ResumeTimeout,
          ResumeProgram,     ResumeRate,     SuspendRate,     SuspendTime,
          SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

   SuspendRate
          The rate at which nodes are placed  into  power  save  mode  by
          SuspendProgram.  The value is the number of nodes per minute and
          it can be used to prevent a large drop in power consumption (e.g.
          after  a  large  job  completes).  A value of zero results in no
          limits being imposed.  The default value is 60 nodes per minute.
          Related    configuration    options    include    ResumeTimeout,
          ResumeProgram,    ResumeRate,    SuspendProgram,    SuspendTime,
          SuspendTimeout, SuspendExcNodes, and SuspendExcParts.

   SuspendTime
          Nodes  which  remain  idle  for  this  number of seconds will be
          placed into power save mode by SuspendProgram.  A  value  of  -1
          disables   power   save   mode  and  is  the  default.   Related
          configuration  options  include  ResumeTimeout,   ResumeProgram,
          ResumeRate,    SuspendProgram,    SuspendRate,   SuspendTimeout,
          SuspendExcNodes, and SuspendExcParts.

   SuspendTimeout
          Maximum time permitted (in seconds) between when a node  suspend
          request is issued and when the node shuts down.  At that time the
          node must be ready for a resume request to be issued as needed
          for new work.  The default value  is  30  seconds.   Related
          configuration   options   include   ResumeProgram,   ResumeRate,
          ResumeTimeout,    SuspendRate,    SuspendTime,   SuspendProgram,
          SuspendExcNodes  and  SuspendExcParts.   More   information   is
          available       at      the      Slurm      web      site      (
          http://slurm.schedmd.com/power_save.html ).
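
          Tying the power saving options together, a minimal sketch (the
          program paths and values below are hypothetical and
          site-specific) might be:

              SuspendProgram=/usr/local/sbin/slurm_suspend
              ResumeProgram=/usr/local/sbin/slurm_resume
              SuspendTime=1800
              SuspendRate=20
              SuspendExcNodes=login[1-2]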

   SwitchType
          Identifies  the  type  of  switch  or  interconnect   used   for
          application    communications.     Acceptable   values   include
          "switch/none" for switches not requiring special processing  for
          job  launch  or  termination (Myrinet, Ethernet, and InfiniBand)
          and "switch/nrt" for IBM's  Network  Resource  Table  API.   The
          default value is "switch/none".  All Slurm daemons, commands and
          running jobs must be restarted for a  change  in  SwitchType  to
          take  effect.   If  running  jobs exist at the time slurmctld is
          restarted with a new value of SwitchType, records of all jobs in
          any state may be lost.

   TaskEpilog
          Fully qualified pathname of a program to be executed as the slurm
          job's owner after termination of each task.  See TaskProlog  for
          execution order details.

   TaskPlugin
          Identifies  the  type  of  task launch plugin, typically used to
          provide resource management within a node (e.g. pinning tasks to
          specific processors). More than one task plugin can be specified
          in a comma separated list. The prefix of  "task/"  is  optional.
          Acceptable values include:

          task/affinity  enables resource containment using CPUSETs.  This
                         enables the  --cpu_bind  and/or  --mem_bind  srun
                         options.    If   you   use   "task/affinity"  and
                         encounter problems, it may be due to the  variety
                         of  system  calls used to implement task affinity
                         on different operating systems.

          task/cgroup    enables resource containment using Linux  control
                         cgroups.   This  enables  the  --cpu_bind  and/or
                         --mem_bind  srun   options.    NOTE:   see   "man
                         cgroup.conf"  for  configuration  details.  NOTE:
                         This plugin  writes  to  disk  and  can  slightly
                         impact  performance.   If you are running lots of
                         short  running  jobs  (less  than  a  couple   of
                         seconds)   this  plugin  slows  down  performance
                         slightly.  It should probably be  avoided  in  an
                         HTC environment.

          task/none      for systems requiring no special handling of user
                         tasks.  Lacks support for the  --cpu_bind  and/or
                         --mem_bind  srun  options.   The default value is
                         "task/none".

   TaskPluginParam
          Optional parameters  for  the  task  plugin.   Multiple  options
          should  be  comma  separated.   If None, Boards, Sockets, Cores,
          Threads, and/or Verbose are specified, they  will  override  the
          --cpu_bind  option  specified  by  the user in the srun command.
          None, Boards, Sockets, Cores and Threads are mutually  exclusive
          and since they decrease scheduling flexibility are not generally
          recommended (select no more than  one  of  them).   Cpusets  and
          Sched  are  mutually  exclusive  (select only one of them).  All
          TaskPluginParam options are supported on FreeBSD except Cpusets.
          The  Sched  option  uses  cpuset_setaffinity()  on  FreeBSD, not
          sched_setaffinity().

          Boards    Bind tasks to boards by default.  Overrides  automatic
                    binding.

          Cores     Bind  tasks  to cores by default.  Overrides automatic
                    binding.

          Cpusets   Use cpusets to perform task  affinity  functions.   By
                    default, Sched task binding is performed.

          None      Perform   no   task  binding  by  default.   Overrides
                    automatic binding.

          Sched     Use sched_setaffinity (if available) to bind tasks  to
                    processors.

          Sockets   Bind  to  sockets  by  default.   Overrides  automatic
                    binding.

          Threads   Bind  to  threads  by  default.   Overrides  automatic
                    binding.

          Verbose   Verbosely  report binding before tasks run.  Overrides
                    user options.

          Autobind  Set a default binding in the event that "auto binding"
                    doesn't  find  a  match.   Set  to  Threads,  Cores or
                    Sockets (E.g. TaskPluginParam=autobind=threads).
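
          For example, to bind tasks to cores by default and  verbosely
          report the binding (an illustrative combination of the options
          above):

              TaskPluginParam=Cores,Verbose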

   TaskProlog
          Fully qualified pathname of a program to be executed as the slurm
          job's  owner  prior  to  initiation  of  each task.  Besides the
          normal environment variables, this has SLURM_TASK_PID  available
          to  identify the process ID of the task being started.  Standard
          output from this program can be used to control the  environment
          variables and output for the user program.

          export NAME=value   Will  set environment variables for the task
                              being spawned.  Everything after  the  equal
                              sign  to the end of the line will be used as
                              the  value  for  the  environment  variable.
                              Exporting  of  functions  is  not  currently
                              supported.

          print ...           Will cause that line  (without  the  leading
                              "print   ")  to  be  printed  to  the  job's
                              standard output.

          unset NAME          Will clear  environment  variables  for  the
                              task being spawned.
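
          A minimal TaskProlog sketch using these  directives  (the
          variable names and paths are hypothetical):

              #!/bin/sh
              # Lines written to standard output are interpreted by Slurm:
              echo "export SCRATCH_DIR=/scratch/$SLURM_JOB_ID"
              echo "print task prolog ran for task PID $SLURM_TASK_PID"
              echo "unset TMPDIR"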

          The order of task prolog/epilog execution is as follows:

          1. pre_launch_priv()
                              Function in TaskPlugin

          2. pre_launch()     Function in TaskPlugin

          3. TaskProlog       System-wide  per  task  program  defined  in
                              slurm.conf

          4. user prolog      Job step specific task program defined using
                              srun's      --task-prolog      option     or
                              SLURM_TASK_PROLOG environment variable

          5. Execute the job step's task

          6. user epilog      Job step specific task program defined using
                              srun's      --task-epilog      option     or
                              SLURM_TASK_EPILOG environment variable

          7. TaskEpilog       System-wide  per  task  program  defined  in
                              slurm.conf

          8. post_term()      Function in TaskPlugin

   TCPTimeout
          Time  permitted  for  TCP  connection to be established. Default
          value is 2 seconds.

   TmpFS  Fully qualified pathname of the file system  available  to  user
          jobs   for   temporary   storage.  This  parameter  is  used  in
          establishing a node's  TmpDisk  space.   The  default  value  is
          "/tmp".

   TopologyParam
          Comma separated options identifying network topology options.

          Dragonfly      Optimize allocation for Dragonfly network.  Valid
                         when TopologyPlugin=topology/tree.

          NoCtldInAddrAny
                         Used to directly bind to the address that  the
                         node running the slurmctld resolves to, instead
                         of binding messages to any address on the node,
                         which is the default.

          NoInAddrAny    Used  to directly bind to the address of what the
                         node resolves to instead of binding  messages  to
                         any  address  on  the  node which is the default.
                         This option is for all daemons/clients except for
                         the slurmctld.

          TopoOptional   Only  optimize allocation for network topology if
                         the  job  includes   a   switch   option.   Since
                         optimizing   resource   allocation  for  topology
                         involves much higher system overhead, this option
                         can  be used to impose the extra overhead only on
                         jobs which can take advantage of it. If most  job
                         allocations   are   not   optimized  for  network
                         topology, they may fragment  resources  to  the
                         point  that  topology optimization for other jobs
                         will be difficult to achieve.
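
                         As a sketch, a tree-topology cluster that only
                         optimizes placement when a job requests it
                         might combine:

                             TopologyPlugin=topology/tree
                             TopologyParam=TopoOptional

                         with jobs opting in via a switch request such
                         as "srun --switches=1 ...".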

   TopologyPlugin
          Identifies the plugin to be used  for  determining  the  network
          topology  and  optimizing  job  allocations  to minimize network
          contention.  See NETWORK TOPOLOGY below for details.  Additional
          plugins  may  be  provided  in  the future which gather topology
          information  directly  from  the  network.   Acceptable   values
          include:

          topology/3d_torus    best-fit   logic   over   three-dimensional
                               topology

          topology/node_rank   orders  nodes  based  upon  information   a
                               node_rank  field  in  the  node  record  as
                               generated  by  a   select   plugin.   Slurm
                               performs  a  best-fit  algorithm over those
                               ordered nodes

          topology/none        default for other systems,  best-fit  logic
                               over one-dimensional topology

          topology/tree        used   for   a   hierarchical   network  as
                               described in a topology.conf file
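
          For example, to optimize job placement on a hierarchical
          network whose switch connectivity is described in a
          topology.conf file:

          TopologyPlugin=topology/tree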

   TrackWCKey
          Boolean yes or no.  Used to enable display and tracking of the
          Workload Characterization Key.   Must  be  set to track wckey
          usage correctly.  NOTE: You  must  also set  TrackWCKey  in  your
          slurmdbd.conf file to create historical usage reports.

   TreeWidth
          Slurmd  daemons  use  a virtual tree network for communications.
          TreeWidth specifies the width of the tree (i.e. the fanout).  On
          architectures  with  a front end node running the slurmd daemon,
          the value must always be equal to or greater than the number  of
          front end nodes, which eliminates the need for message forwarding
          between the slurmd daemons.  On other architectures the  default
          value  is 50, meaning each slurmd daemon can communicate with up
          to 50 other slurmd daemons and over 2500 nodes can be  contacted
          with  two  message  hops.   The default value will work well for
          most clusters.  Optimal  system  performance  can  typically  be
          achieved if TreeWidth is set to the square root of the number of
          nodes in the cluster for systems having no more than 2500  nodes
          or  the  cube  root for larger systems. The value may not exceed
          65533.
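
          For example, on a hypothetical 900 node cluster with no front
          end nodes, the square root rule above suggests:

          TreeWidth=30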

   UnkillableStepProgram
          If the processes in a job step are determined to  be  unkillable
          for  a  period  of  time  specified by the UnkillableStepTimeout
          variable, the program specified by UnkillableStepProgram will be
          executed.   This  program can be used to take special actions to
          clean  up  the  unkillable  processes  and/or  notify   computer
          administrators.  The program will be run  as  SlurmdUser (usually
          "root") on the compute node.  By default no program is run.

   UnkillableStepTimeout
          The length of time, in seconds,  that  Slurm  will  wait  before
          deciding that processes in a job step are unkillable (after they
          have    been    signaled    with    SIGKILL)     and     execute
          UnkillableStepProgram  as  described above.  The default timeout
          value is 60 seconds.

   UsePAM If set to 1, PAM (Pluggable Authentication  Modules  for  Linux)
          will  be enabled.  PAM is used to establish the upper bounds for
          resource  limits.  With  PAM  support  enabled,   local   system
          administrators can dynamically configure system resource limits.
          Changing the upper bound of a resource limit will not alter  the
          limits  of  running  jobs,  only jobs started after a change has
          been made will pick up the new limits.  The default value  is  0
          (not to enable PAM support).  Remember that PAM also needs to be
          configured to support Slurm as a service.  For sites using PAM's
          directory based configuration option, a configuration file named
          slurm should be created.  The  module-type,  control-flags,  and
          module-path names that should be included in the file are:
          auth        required      pam_localuser.so
          auth        required      pam_shells.so
          account     required      pam_unix.so
          account     required      pam_access.so
          session     required      pam_unix.so
          For sites configuring PAM with a general configuration file, the
          appropriate lines (see above), where slurm is the  service-name,
          should be added.

   VSizeFactor
          Memory  specifications in job requests apply to real memory size
          (also known as resident set size). It  is  possible  to  enforce
          virtual  memory  limits  for both jobs and job steps by limiting
          their virtual memory to some percentage  of  their  real  memory
          allocation. The VSizeFactor parameter specifies the job's or job
          step's virtual memory limit as a percentage of its  real  memory
          limit.  For  example,  if a job's real memory limit is 500MB and
          VSizeFactor is set to 101 then the job will  be  killed  if  its
          real  memory  exceeds  500MB or its virtual memory exceeds 505MB
          (101 percent of the real memory limit).  The default value is 0,
          which  disables enforcement of virtual memory limits.  The value
          may not exceed 65533 percent.
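
          For example, to enforce the 101 percent virtual memory limit
          described above:

          VSizeFactor=101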

   WaitTime
          Specifies how many seconds the srun command  should  by  default
          wait  after  the  first  task  terminates before terminating all
          remaining tasks. The "--wait" option on the  srun  command  line
          overrides  this  value.   The default value is 0, which disables
          this feature.  May not exceed 65533 seconds.

   The configuration of nodes (or machines) to be managed by Slurm is also
   specified  in  /etc/slurm.conf.   Changes  in  node configuration (e.g.
   adding nodes, changing their processor count, etc.) require  restarting
   both  the  slurmctld daemon and the slurmd daemons.  All slurmd daemons
   must know each node in the system to forward  messages  in  support  of
   hierarchical communications.  Only the NodeName must be supplied in the
   configuration  file.   All  other  node  configuration  information  is
   optional.   It  is advisable to establish baseline node configurations,
   especially if the cluster is heterogeneous.  Nodes  which  register  to
   the  system  with  less  than the configured resources (e.g. too little
   memory) will be placed in the "DOWN" state to avoid scheduling jobs on
   them.   Establishing  baseline  configurations  will also speed Slurm's
   scheduling process by permitting it to compare job requirements against
   these  (relatively  few)  configuration  parameters  and possibly avoid
   having to  check  job  requirements  against  every  individual  node's
   configuration.   The  resources  checked at node registration time are:
   CPUs, RealMemory and TmpDisk.  While baseline values for each of  these
   can  be  established  in the configuration file, the actual values upon
   node registration are recorded and these actual values may be used  for
   scheduling  purposes  (depending  upon the value of FastSchedule in the
   configuration file).

   Default values can be specified with a  record  in  which  NodeName  is
   "DEFAULT".  The default entry values will apply only to lines following
   it in the configuration file  and  the  default  values  can  be  reset
   multiple  times  in  the configuration file with multiple entries where
   "NodeName=DEFAULT".  Each line where NodeName is "DEFAULT" will replace
   or  add  to  previous default values and not reinitialize the default
   values.  The "NodeName=" specification must be  placed  on  every  line
   describing  the  configuration  of  nodes.   A single node name can not
   appear as a NodeName value in more than one line (duplicate  node  name
   records  will  be  ignored).   In  fact,  it  is generally possible and
   desirable to define the configurations of  all  nodes  in  only  a  few
   lines.    This  convention  permits  significant  optimization  in  the
   scheduling of larger clusters.  In order to support the concept of jobs
   requiring  consecutive nodes on some architectures, node specifications
   should be placed in this file in consecutive order.  No single node name
   may   be  listed  more  than  once  in  the  configuration  file.   Use
   "DownNodes=" to record the state of nodes which are  temporarily  in  a
   DOWN,  DRAIN  or FAILING state without altering permanent configuration
   information.  A job step's tasks are allocated to nodes in the order the
   nodes   appear  in  the  configuration  file.  There  is  presently  no
   capability within Slurm to arbitrarily order a job step's tasks.
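
   For example, the following sketch (hypothetical node names and
   resource values) sets default values, defines a set of nodes, then
   updates one default value for the nodes which follow:

      NodeName=DEFAULT CPUs=16 RealMemory=32000 State=UNKNOWN
      NodeName=tux[0-15]
      NodeName=DEFAULT RealMemory=64000
      NodeName=tux[16-31]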

   Multiple node names may be comma  separated  (e.g.  "alpha,beta,gamma")
   and/or a simple node range expression may optionally be used to specify
   numeric ranges of nodes to avoid building  a  configuration  file  with
   large  numbers  of  entries.  The node range expression can contain one
   pair of square brackets with a  sequence  of  comma  separated  numbers
   and/or ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or
   "lx[15,18,32-33]").  Note that the numeric ranges can  include  one  or
   more  leading  zeros to indicate the numeric portion has a fixed number
   of digits (e.g. "linux[0000-1023]").  Up to two numeric ranges  can  be
   included  in the expression (e.g. "rack[0-63]_blade[0-41]").  If one or
   more numeric expressions are included, one of them must be at  the  end
   of the name (e.g. "unit[0-31]rack" is invalid), but arbitrary names can
   always be used in a comma separated list.
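
   For example, each of the following hypothetical lines is valid under
   the rules above:

      NodeName=alpha,beta,gamma
      NodeName=linux[0000-1023]
      NodeName=rack[0-63]_blade[0-41]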

   On BlueGene systems only, the square brackets should contain  pairs  of
   three  digit  numbers  separated  by a "x".  These numbers indicate the
   boundaries of a rectangular prism (e.g.  "bgl[000x144,400x544]").   See
   BlueGene  documentation  for  more  details.   The  node  configuration
   specifies the following information:

   NodeName
          Name that Slurm uses to refer to a node (or base  partition  for
          BlueGene  systems).   Typically  this  would  be the string that
          "/bin/hostname -s" returns.  It may also be the fully  qualified
          domain   name   as   returned   by   "/bin/hostname   -f"  (e.g.
          "foo1.bar.com"), or any valid domain name  associated  with  the
          host through the host database (/etc/hosts) or DNS, depending on
          the resolver settings.  Note that  if  the  short  form  of  the
          hostname is not used, it may prevent use of hostlist expressions
          (the numeric portion in brackets must  be  at  the  end  of  the
          string).   Only  short  hostname  forms  are compatible with the
          switch/nrt plugin at this time.  It may  also  be  an  arbitrary
          string  if  NodeHostname  is  specified.   If  the  NodeName  is
          "DEFAULT", the values specified with that record will  apply  to
          subsequent  node  specifications  unless explicitly set to other
          values in that node record or replaced with a different  set  of
          default  values.   Each  line  where  NodeName is "DEFAULT" will
          replace or add to previous default values and not reinitialize
          the  default  values.  For architectures in which the node order
          is significant, nodes will  be  considered  consecutive  in  the
          order   defined.    For   example,   if  the  configuration  for
          "NodeName=charlie" immediately  follows  the  configuration  for
          "NodeName=baker"   they  will  be  considered  adjacent  in  the
          computer.

   NodeHostname
          Typically this would  be  the  string  that  "/bin/hostname  -s"
          returns.   It  may  also  be  the fully qualified domain name as
          returned by "/bin/hostname -f"  (e.g.  "foo1.bar.com"),  or  any
          valid  domain  name  associated  with  the host through the host
          database  (/etc/hosts)  or  DNS,  depending  on   the   resolver
          settings.   Note  that  if the short form of the hostname is not
          used, it may prevent use of hostlist  expressions  (the  numeric
          portion  in  brackets  must  be at the end of the string).  Only
          short hostname forms are compatible with the  switch/nrt  plugin
          at  this time.  A node range expression can be used to specify a
          set of nodes.  If an expression is used,  the  number  of  nodes
          identified  by  NodeHostname on a line in the configuration file
          must be identical to the number of nodes identified by NodeName.
          By  default,  the  NodeHostname  will  be  identical in value to
          NodeName.

   NodeAddr
          Name that a  node  should  be  referred  to  in  establishing  a
          communications  path.   This name will be used as an argument to
          the gethostbyname() function  for  identification.   If  a  node
          range  expression is used to designate multiple nodes, they must
          exactly   match   the   entries   in    the    NodeName    (e.g.
          "NodeName=lx[0-7]   NodeAddr=elx[0-7]").    NodeAddr   may  also
          contain  IP  addresses.   By  default,  the  NodeAddr  will   be
          identical in value to NodeHostname.

   Boards Number of Baseboards in nodes with a baseboard controller.  Note
          that when Boards is specified, SocketsPerBoard,  CoresPerSocket,
          and  ThreadsPerCore  should  be  specified.  Boards and CPUs are
          mutually exclusive.  The default value is 1.

   CoreSpecCount
          Number of cores on which Slurm  compute  node  daemons  (slurmd,
          slurmstepd)  will be confined. These cores will not be available
          for allocation to user jobs.  Isolation  of  the  Slurm  daemons
          from  user  jobs  may  improve  performance.  If this option and
          CPUSpecList  are  both  designated  for  a  node,  an  error  is
          generated.   For  information  on the algorithm used by Slurm to
          select the cores refer to the core specialization  documentation
          (  http://slurm.schedmd.com/core_spec.html ). This option has no
          effect  unless  cgroup  job  confinement  is   also   configured
          (TaskPlugin=task/cgroup with ConstrainCores=yes in cgroup.conf).

   CoresPerSocket
          Number  of  cores  in  a  single physical processor socket (e.g.
          "2").  The CoresPerSocket value describes  physical  cores,  not
          the  logical number of processors per socket.  NOTE: If you have
          multi-core processors, you will  likely  need  to  specify  this
          parameter in order to optimize scheduling.  The default value is
          1.

   CPUs   Number of logical processors on the node (e.g. "2").   CPUs  and
          Boards are mutually exclusive. It can be set to the total number
          of sockets, cores or threads. This can be useful when  you  want
          to schedule only the cores on a hyper-threaded node.  If CPUs is
          omitted, it will  be  set  equal  to  the  product  of  Sockets,
          CoresPerSocket, and ThreadsPerCore.  The default value is 1.

   CPUSpecList
          A  comma delimited list of Slurm abstract CPU IDs on which Slurm
          compute node daemons (slurmd, slurmstepd) will be confined.  The
          list  will be expanded to include all other CPUs, if any, on the
          same cores. These cores will not be available for allocation  to
          user  jobs.  Isolation  of  the Slurm daemons from user jobs may
          improve performance.  If this option and CoreSpecCount are  both
          designated  for  a node, an error is generated.  This option has
          no effect unless  cgroup  job  confinement  is  also  configured
          (TaskPlugin=task/cgroup with ConstrainCores=yes in cgroup.conf).
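
          For example, a minimal sketch (hypothetical node name)  which
          confines the Slurm daemons to the CPUs with abstract IDs 0
          and 1, assuming TaskPlugin=task/cgroup and ConstrainCores=yes
          are already configured:

          NodeName=tux01 CPUs=32 CPUSpecList=0,1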

   Feature
          A  comma  delimited list of arbitrary strings indicative of some
          characteristic associated with the  node.   There  is  no  value
          associated  with  a  feature  at  this time, a node either has a
          feature or it does not.  If desired  a  feature  may  contain  a
          numeric  component indicating, for example, processor speed.  By
          default a node has no features.  Also see Gres.

   Gres   A comma delimited list of generic resources specifications for a
          node.                 The               format               is:
          "<name>[:<type>][:no_consume]:<number>[K|M|G]".  The first field
          is  the  resource name, which matches the GresType configuration
          parameter name.  The  optional  type  field  might  be  used  to
          identify  a  model of that generic resource.  A generic resource
          can also be specified as non-consumable (i.e. multiple jobs  can
          use   the   same  generic  resource)  with  the  optional  field
          ":no_consume".  The final field must specify a generic resources
          count.   A  suffix  of  "K", "M", "G", "T" or "P" may be used to
          multiply  the  number  by  1024,   1048576,   1073741824,   etc.
          respectively.
          (e.g."Gres=gpu:tesla:1,gpu:kepler:1,bandwidth:lustre:no_consume:4G").
          By default a node has no generic resources and its maximum count
          is that of an unsigned 64bit integer.  Also see Feature.

   MemSpecLimit
          Limit on  combined  real  memory  allocation  for  compute  node
          daemons  (slurmd,  slurmstepd), in megabytes. This memory is not
          available to job allocations. The daemons won't be  killed  when
          they  exhaust  the  memory  allocation  (ie.  the  OOM Killer is
          disabled for the daemon's memory cgroup).  This  option  has  no
          effect   unless   cgroup  job  confinement  is  also  configured
          (TaskPlugin=task/cgroup    with     ConstrainRAMSpace=yes     in
          cgroup.conf).

   Port   The  port  number  that  the  Slurm compute node daemon, slurmd,
          listens to for work on this particular node. By default there is
          a single port number for all slurmd daemons on all compute nodes
          as defined by the SlurmdPort  configuration  parameter.  Use  of
          this  option is not generally recommended except for development
          or testing purposes. If multiple slurmd  daemons  execute  on  a
          node this can specify a range of ports.

          Note:  On Cray systems, Realm-Specific IP Addressing (RSIP) will
          automatically try to interact  with  anything  opened  on  ports
          8192-60000.   Configure  Port  to  use  a  port  outside  of the
          configured SrunPortRange and RSIP's port range.

   Procs  See CPUs.

   RealMemory
          Size of real memory on the node in MegaBytes (e.g. "2048").  The
          default value is 1.

   Reason Identifies  the  reason  for  a  node  being  in  state  "DOWN",
          "DRAINED"  "DRAINING",  "FAIL"  or  "FAILING".   Use  quotes  to
          enclose a reason having more than one word.

   Sockets
          Number  of  physical  processor  sockets/chips on the node (e.g.
          "2").  If Sockets is omitted, it will  be  inferred  from  CPUs,
          CoresPerSocket,   and   ThreadsPerCore.    NOTE:   If  you  have
          multi-core processors, you will likely  need  to  specify  these
          parameters.  Sockets and SocketsPerBoard are mutually exclusive.
          If Sockets is specified when Boards is  also  used,  Sockets  is
          interpreted  as  SocketsPerBoard rather than total sockets.  The
          default value is 1.

   SocketsPerBoard
          Number of  physical  processor  sockets/chips  on  a  baseboard.
          Sockets and SocketsPerBoard are mutually exclusive.  The default
          value is 1.

   State  State of the node with respect to the initiation of  user  jobs.
          Acceptable   values   are   "CLOUD",  "DOWN",  "DRAIN",  "FAIL",
          "FAILING", "FUTURE" and "UNKNOWN".  Node states  of  "BUSY"  and
          "IDLE"  should  not  be specified in the node configuration, but
          set the node state to "UNKNOWN" instead.  Setting the node state
          to  "UNKNOWN" will result in the node state being set to "BUSY",
          "IDLE" or other appropriate state based  upon  recovered  system
          state  information.   The  default value is "UNKNOWN".  Also see
          the DownNodes parameter below.

          CLOUD     Indicates the node exists in the cloud.  Its initial
                    state will be treated as powered down.  The node will
                    be available for use after its  state  is  recovered
                    from Slurm's state save file  or  the  slurmd  daemon
                    starts on the compute node.

          DOWN      Indicates the node failed and  is  unavailable  to  be
                    allocated work.

          DRAIN     Indicates the node is  unavailable  to  be  allocated
                    work.

          FAIL      Indicates the node is expected to fail  soon,  has  no
                    jobs allocated to it, and will not be allocated to any
                    new jobs.

          FAILING   Indicates the node is expected to fail soon,  has  one
                    or  more  jobs  allocated  to  it,  but  will  not  be
                    allocated to any new jobs.

          FUTURE    Indicates the node is defined for future use and  need
                    not  exist  when  the Slurm daemons are started. These
                    nodes can be made available for use simply by updating
                    the  node state using the scontrol command rather than
                    restarting the slurmctld daemon. After these nodes are
                    made  available,  change their State in the slurm.conf
                    file. Until these nodes are made available, they  will
                    not be seen using any Slurm commands,  nor  will  any
                    attempt be made to contact them.

          UNKNOWN   Indicates the  node's  state  is  undefined  (BUSY  or
                    IDLE),  but will be established when the slurmd daemon
                    on  that  node  registers.   The  default   value   is
                    "UNKNOWN".

   ThreadsPerCore
          Number  of logical threads in a single physical core (e.g. "2").
          Note that Slurm can allocate resources to  jobs  down  to  the
          resolution  of  a  core.  If your system is configured with more
          than one thread per core, execution of a different job  on  each
          thread     is     not    supported    unless    you    configure
          SelectTypeParameters=CR_CPU plus CPUs; do not configure Sockets,
          CoresPerSocket  or ThreadsPerCore.  A job can execute one task
          per thread from within one job step or execute  a  distinct  job
          step  on each of the threads.  Note also if you are running with
          more than 1 thread per  core  and  running  the  select/cons_res
          plugin you will want to set the SelectTypeParameters variable to
          something other than CR_CPU to avoid  unexpected  results.   The
          default value is 1.
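
          For example, nodes with two sockets, eight cores per socket
          and two hardware threads per core might be described with the
          following sketch (hypothetical node names and memory size):

          NodeName=tux[01-04] Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=64000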

   TmpDisk
          Total size of temporary disk storage in TmpFS in MegaBytes (e.g.
          "16384"). TmpFS (for "Temporary  File  System")  identifies  the
          location which jobs should use for temporary storage.  Note this
          does not indicate the amount of free space available to the user
          on  the  node,  only  the  total  file  system  size. The system
          administrator should ensure this  file  system  is  purged  as
          needed so that user jobs have access to most of this space.  The
          Prolog and/or Epilog programs (specified  in  the  configuration
          file)  might  be  used  to ensure the file system is kept clean.
          The default value is 0.

   Weight The priority of the node for scheduling  purposes.   All  things
          being  equal,  jobs  will be allocated the nodes with the lowest
          weight which  satisfies  their  requirements.   For  example,  a
          heterogeneous  collection of nodes might be placed into a single
          partition for greater  system  utilization,  responsiveness  and
          capability.  It  would  be preferable to allocate smaller memory
          nodes rather than larger memory nodes if either will  satisfy  a
          job's  requirements.   The  units  of  weight are arbitrary, but
          larger weights should be assigned to nodes with more processors,
          memory, disk space, higher processor speed, etc.  Note that if a
          job allocation request can not be satisfied using the nodes with
          the  lowest weight, the set of nodes with the next lowest weight
          is added to the set of nodes under consideration for use (repeat
          as  needed  for higher weight values). If you absolutely want to
          minimize the number of higher weight nodes allocated  to  a  job
          (at  a  cost  of  higher  scheduling overhead), give each node a
          distinct Weight value and they will be  added  to  the  pool  of
          nodes being considered for scheduling individually.  The default
          value is 1.

   The "DownNodes=" configuration permits you to mark certain nodes as  in
   a  DOWN,  DRAIN,  FAIL, or FAILING state without altering the permanent
   configuration information listed under a "NodeName=" specification.

   DownNodes
          Any node name, or list  of  node  names,  from  the  "NodeName="
          specifications.

   Reason Identifies the reason for a node being in state "DOWN", "DRAIN",
          "FAIL" or "FAILING.  Use quotes to enclose a reason having  more
          than one word.

   State  State  of  the node with respect to the initiation of user jobs.
          Acceptable values are "DOWN",  "DRAIN",  "FAIL",  "FAILING"  and
          "UNKNOWN".   Node  states  of  "BUSY"  and  "IDLE" should not be
          specified in the node configuration, but set the node  state  to
          "UNKNOWN"  instead.   Setting  the  node state to "UNKNOWN" will
          result in the node state being set to "BUSY",  "IDLE"  or  other
          appropriate state based upon recovered system state information.
          The default value is "UNKNOWN".

          DOWN      Indicates the node failed and  is  unavailable  to  be
                    allocated work.

          DRAIN     Indicates the node is  unavailable  to  be  allocated
                    work.

          FAIL      Indicates the node is expected to fail  soon,  has  no
                    jobs allocated to it, and will not be allocated to any
                    new jobs.

          FAILING   Indicates the node is expected to fail soon,  has  one
                    or  more  jobs  allocated  to  it,  but  will  not  be
                    allocated to any new jobs.

          UNKNOWN   Indicates the  node's  state  is  undefined  (BUSY  or
                    IDLE),  but will be established when the slurmd daemon
                    on  that  node  registers.   The  default   value   is
                    "UNKNOWN".

   On  computers  where  frontend  nodes are used to execute batch scripts
   rather than compute nodes (BlueGene or Cray systems), one may configure
   one  or  more frontend nodes using the configuration parameters defined
   below. These options are very similar  to  those  used  in  configuring
   compute nodes. These options may only be used on systems configured and
   built    with    the    appropriate    parameters    (--have-front-end,
   --enable-bluegene-emulation)   or  a  system  determined  to  have  the
   appropriate architecture by the  configure  script  (BlueGene  or  Cray
   systems).    The   front  end  configuration  specifies  the  following
   information:

   AllowGroups
          Comma separated list of group names which may  execute  jobs  on
          this  front  end node. By default, all groups may use this front
          end node.  If at  least  one  group  associated  with  the  user
          attempting  to  execute  the  job  is in AllowGroups, he will be
          permitted to use this front end node.  May not be used with  the
          DenyGroups option.

   AllowUsers
          Comma  separated  list  of  user names which may execute jobs on
          this front end node. By default, all users may  use  this  front
          end node.  May not be used with the DenyUsers option.

   DenyGroups
          Comma  separated  list  of  group names which are prevented from
          executing jobs on this front end node.  May not be used with the
          AllowGroups option.

   DenyUsers
          Comma  separated  list  of  user  names which are prevented from
          executing jobs on this front end node.  May not be used with the
          AllowUsers option.

   FrontendName
          Name  that  Slurm  uses  to refer to a frontend node.  Typically
          this would be the string that "/bin/hostname  -s"  returns.   It
          may  also  be  the  fully  qualified  domain name as returned by
          "/bin/hostname -f" (e.g. "foo1.bar.com"), or  any  valid  domain
          name   associated  with  the  host  through  the  host  database
          (/etc/hosts) or DNS, depending on the resolver  settings.   Note
          that  if  the  short  form  of  the hostname is not used, it may
          prevent use of hostlist  expressions  (the  numeric  portion  in
          brackets must be at the end of the string).  If the FrontendName
          is "DEFAULT", the values specified with that record  will  apply
          to subsequent node specifications unless explicitly set to other
          values in that frontend node record or replaced with a different
          set   of  default  values.   Each  line  where  FrontendName  is
          "DEFAULT" will replace or add to previous default values and not
          a  reinitialize  the default values.  Note that since the naming
          of front end nodes  would  typically  not  follow  that  of  the
          compute  nodes (e.g. lacking X, Y and Z coordinates found in the
          compute node naming scheme), each front end node name should  be
          listed separately and without  a  hostlist  expression  (i.e.
          "frontend00,frontend01" rather than "frontend[00-01]").

   FrontendAddr
          Name that a frontend node should be referred to in  establishing
          a  communications path. This name will be used as an argument to
          the  gethostbyname()  function  for  identification.   As   with
          FrontendName,  list  the  individual  node addresses rather than
          using a hostlist expression.  The number of FrontendAddr records
          per  line must equal the number of FrontendName records per line
          (i.e. you can't map two node names to one address).  FrontendAddr
          may  also  contain  IP  addresses.  By default, the FrontendAddr
          will be identical in value to FrontendName.

   Port   The port number that the  Slurm  compute  node  daemon,  slurmd,
          listens to for work on this particular frontend node. By default
          there is a single port number for  all  slurmd  daemons  on  all
          frontend  nodes  as  defined  by  the  SlurmdPort  configuration
          parameter. Use of  this  option  is  not  generally  recommended
          except for development or testing purposes.

          Note:  On Cray systems, Realm-Specific IP Addressing (RSIP) will
          automatically try to interact  with  anything  opened  on  ports
          8192-60000.   Configure  Port  to  use  a  port  outside  of the
          configured SrunPortRange and RSIP's port range.

   Reason Identifies the reason for a frontend node being in state "DOWN",
          "DRAINED"  "DRAINING",  "FAIL"  or  "FAILING".   Use  quotes  to
          enclose a reason having more than one word.

   State  State of the frontend node with respect  to  the  initiation  of
          user  jobs.   Acceptable  values  are  "DOWN",  "DRAIN", "FAIL",
          "FAILING" and "UNKNOWN".  "DOWN" indicates the frontend node has
          failed  and  is  unavailable  to  be  allocated  work.   "DRAIN"
          indicates the frontend node is unavailable to be allocated work.
          "FAIL" indicates the frontend node is expected to fail soon, has
          no jobs allocated to it, and will not be allocated  to  any  new
          jobs.  "FAILING" indicates the frontend node is expected to fail
          soon, has one or more jobs allocated to  it,  but  will  not  be
          allocated  to  any  new  jobs.  "UNKNOWN" indicates the frontend
          node's  state  is  undefined  (BUSY  or  IDLE),  but   will   be
          established  when the slurmd daemon on that node registers.  The
          default value is "UNKNOWN".  Also see  the  DownNodes  parameter
          below.

          For            example:            "FrontendName=frontend[00-03]
          FrontendAddr=efrontend[00-03] State=UNKNOWN" is used  to  define
          four front end nodes for running slurmd daemons.

   The  partition  configuration  permits  you  to establish different job
   limits or access controls for various groups (or partitions) of  nodes.
   Nodes  may  be  in  more than one partition, making partitions serve as
   general purpose queues.  For example one may put the same set of  nodes
   into  two  different  partitions, each with different constraints (time
   limit, job sizes, groups allowed to use the partition, etc.).  Jobs are
   allocated  resources  within a single partition.  Default values can be
   specified with a record  in  which  PartitionName  is  "DEFAULT".   The
   default  entry  values  will  apply  only  to lines following it in the
   configuration file and the default values can be reset  multiple  times
   in    the    configuration    file    with   multiple   entries   where
   "PartitionName=DEFAULT".  The "PartitionName="  specification  must  be
   placed  on every line describing the configuration of partitions.  Each
   line where PartitionName is "DEFAULT" will replace or add  to  previous
   default  values  and  not reinitialize the default values.  A single
   partition name can not appear as a PartitionName value in more than one
   line  (duplicate  partition  name  records  will  be  ignored).   If  a
   partition that is in use is deleted from the configuration and slurm is
   restarted  or  reconfigured  (scontrol  reconfigure),  jobs  using  the
   partition are canceled.  NOTE: Put all parameters for each partition on
   a single line.  Each line of partition configuration information should
   represent a different  partition.   The  partition  configuration  file
   contains the following information:

   AllocNodes
          Comma  separated  list of nodes from which users can submit jobs
          in the partition.  Node names may be specified  using  the  node
          range  expression  syntax described above.  The default value is
          "ALL".

   AllowAccounts
          Comma separated list of accounts which may execute jobs  in  the
          partition.   The default value is "ALL".  NOTE: If AllowAccounts
          is used then DenyAccounts will not be enforced.  Also  refer  to
          DenyAccounts.

   AllowGroups
          Comma  separated  list  of group names which may execute jobs in
          the partition.  If at least one group associated with  the  user
          attempting  to  execute  the  job  is in AllowGroups, he will be
          permitted to use this partition.  Jobs executed as user root can
          use  any  partition  without regard to the value of AllowGroups.
          If user root attempts to execute a job  as  another  user  (e.g.
          using  srun's  --uid  option), this other user must be in one of
          groups identified by AllowGroups for  the  job  to  successfully
          execute.   The  default  value  is "ALL".  NOTE: For performance
          reasons, Slurm maintains a list of user IDs allowed to use  each
          partition and this is checked at job submission time.  This list
          of user IDs is updated when the slurmctld daemon  is  restarted,
          reconfigured  (e.g.  "scontrol  reconfig")  or  the  partition's
          AllowGroups value is reset, even if its value is unchanged  (e.g.
          "scontrol  update PartitionName=name AllowGroups=group").  For a
          user's access to a partition  to  change,  both  his  group
          membership must change and Slurm's internal user ID list must
          be updated using one of the methods described above.

   AllowQos
          Comma separated list of  Qos  which  may  execute  jobs  in  the
          partition.   Jobs  executed  as  user root can use any partition
          without regard to the value of AllowQos.  The default  value  is
          "ALL".   NOTE:  If  AllowQos  is  used  then DenyQos will not be
          enforced.  Also refer to DenyQos.

   Alternate
          Partition name of alternate partition to be used if the state of
          this partition is "DRAIN" or "INACTIVE".

   Default
          If  this  keyword  is  set,  jobs  submitted without a partition
          specification will utilize this partition.  Possible values  are
          "YES" and "NO".  The default value is "NO".

   DefMemPerCPU
          Default   real  memory  size  available  per  allocated  CPU  in
          MegaBytes.  Used to avoid over-subscribing  memory  and  causing
          paging.   DefMemPerCPU  would  generally  be  used if individual
          processors are allocated to  jobs  (SelectType=select/cons_res).
          If  not  set, the DefMemPerCPU value for the entire cluster will
          be used.  Also see DefMemPerNode and MaxMemPerCPU.  DefMemPerCPU
          and  DefMemPerNode are mutually exclusive.  NOTE: Enforcement of
          memory limits currently requires enabling of  accounting,  which
          samples memory use on a periodic basis (data need not be stored,
          just collected).

   DefMemPerNode
          Default  real  memory  size  available  per  allocated  node  in
          MegaBytes.   Used  to  avoid over-subscribing memory and causing
          paging.  DefMemPerNode would generally be used  if  whole  nodes
          are  allocated  to jobs (SelectType=select/linear) and resources
          are over-subscribed (OverSubscribe=yes or  OverSubscribe=force).
          If  not set, the DefMemPerNode value for the entire cluster will
          be used.  Also see DefMemPerCPU and MaxMemPerNode.  DefMemPerCPU
          and  DefMemPerNode are mutually exclusive.  NOTE: Enforcement of
          memory limits currently requires enabling of  accounting,  which
          samples memory use on a periodic basis (data need not be stored,
          just collected).

   DenyAccounts
          Comma separated list of accounts which may not execute  jobs  in
          the  partition.  By default, no accounts are denied access.  NOTE:
          If AllowAccounts is used then DenyAccounts will not be enforced.
          Also refer to AllowAccounts.

   DenyQos
          Comma  separated  list  of Qos which may not execute jobs in the
          partition.  By default, no QOS are  denied  access.   NOTE:  If
          AllowQos  is used then DenyQos will not be enforced.  Also refer
          to AllowQos.

   DefaultTime
          Run time limit used for jobs that don't specify a value. If  not
          set  then  MaxTime  will  be  used.   Format  is the same as for
          MaxTime.

   DisableRootJobs
          If set to "YES" then user root will be  prevented  from  running
          any jobs on this partition.  The default value will be the value
          of DisableRootJobs set  outside  of  a  partition  specification
          (which is "NO", allowing user root to execute jobs).

   ExclusiveUser
          If  set  to  "YES"  then  nodes will be exclusively allocated to
          users.  Multiple jobs may be run for the same user, but only one
          user can be active at a time.  This capability is also available
          on a per-job basis by using the --exclusive=user option.

   GraceTime
          Specifies, in units of seconds, the preemption grace time to  be
          extended  to  a job which has been selected for preemption.  The
          default value is zero, no preemption grace time  is  allowed  on
          this  partition.   Once  a job has been selected for preemption,
          its end time is set to the current time plus GraceTime. The job
          is  immediately  sent  SIGCONT  and  SIGTERM signals in order to
          provide notification  of  its  imminent  termination.   This  is
          followed  by  the  SIGCONT,  SIGTERM and SIGKILL signal sequence
          upon  reaching  its  new  end  time.    (Meaningful   only   for
          PreemptMode=CANCEL)

   Hidden Specifies  if  the  partition  and  its jobs are to be hidden by
          default.  Hidden partitions will by default not be  reported  by
          the Slurm APIs or commands.  Possible values are "YES" and "NO".
          The default value is "NO".  Note that  partitions  that  a  user
          lacks access to by virtue of the AllowGroups parameter will also
          be hidden by default.

   LLN    Schedule resources to jobs on the least loaded nodes (based upon
          the number of idle CPUs). This is generally only recommended for
          an environment with serial jobs as idle resources will  tend  to
          be   highly   fragmented,   resulting  in  parallel  jobs  being
          distributed across many nodes.  Also  see  the  SelectParameters
          configuration  parameter CR_LLN to use the least loaded nodes in
          every partition.

   MaxCPUsPerNode
          Maximum number of CPUs on any node available to  all  jobs  from
          this partition.  This can be especially useful to schedule GPUs.
          For example a node can be associated with two  Slurm  partitions
          (e.g.  "cpu"  and  "gpu") and the partition/queue "cpu" could be
          limited to only a subset of the node's CPUs, ensuring  that  one
          or   more   CPUs  would  be  available  to  jobs  in  the  "gpu"
          partition/queue.
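
          Following that example, a sketch with hypothetical node and
          partition names might reserve two of each node's 16 CPUs for
          jobs in the "gpu" partition:

          NodeName=tux[0-9] CPUs=16 Gres=gpu:2
          PartitionName=cpu Nodes=tux[0-9] MaxCPUsPerNode=14
          PartitionName=gpu Nodes=tux[0-9]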

   MaxMemPerCPU
          Maximum  real  memory  size  available  per  allocated  CPU   in
          MegaBytes.   Used  to  avoid over-subscribing memory and causing
          paging.  MaxMemPerCPU would  generally  be  used  if  individual
          processors  are  allocated to jobs (SelectType=select/cons_res).
          If not set, the MaxMemPerCPU value for the entire  cluster  will
          be used.  Also see DefMemPerCPU and MaxMemPerNode.  MaxMemPerCPU
          and MaxMemPerNode are mutually exclusive.  NOTE: Enforcement  of
          memory  limits  currently requires enabling of accounting, which
          samples memory use on a periodic basis (data need not be stored,
          just collected).

   MaxMemPerNode
          Maximum  real  memory  size  available  per  allocated  node  in
          MegaBytes.  Used to avoid over-subscribing  memory  and  causing
          paging.   MaxMemPerNode  would  generally be used if whole nodes
          are allocated to jobs (SelectType=select/linear)  and  resources
          are  over-subscribed (OverSubscribe=yes or OverSubscribe=force).
          If not set, the MaxMemPerNode value for the entire cluster  will
          be used.  Also see DefMemPerNode and MaxMemPerCPU.  MaxMemPerCPU
          and MaxMemPerNode are mutually exclusive.  NOTE: Enforcement  of
          memory  limits  currently requires enabling of accounting, which
          samples memory use on a periodic basis (data need not be stored,
          just collected).
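
          For example, when individual processors are allocated to jobs
          (SelectType=select/cons_res), default and maximum per-CPU
          memory limits might be combined as in the following sketch
          (hypothetical partition and node names):

          PartitionName=batch Nodes=tux[0-63] DefMemPerCPU=2048 MaxMemPerCPU=4096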

   MaxNodes
          Maximum count of nodes which may be allocated to any single job.
          For BlueGene systems this will be a c-node count  and  will  be
          converted  to  a  midplane count with a reduction in resolution.
          The  default  value  is  "UNLIMITED",   which   is   represented
          internally as -1.  This limit does not apply to jobs executed by
          SlurmUser or user root.

   MaxTime
          Maximum  run  time  limit  for   jobs.    Format   is   minutes,
          minutes:seconds,        hours:minutes:seconds,       days-hours,
          days-hours:minutes, days-hours:minutes:seconds  or  "UNLIMITED".
          Time  resolution  is one minute and second values are rounded up
          to the next minute.  This limit does not apply to jobs  executed
          by SlurmUser or user root.

   MinNodes
          Minimum count of nodes which may be allocated to any single job.
          For BlueGene systems this will be a c-node count  and  will  be
          converted  to  a  midplane count with a reduction in resolution.
          The default value is 1.  This  limit  does  not  apply  to  jobs
          executed by SlurmUser or user root.

   Nodes  Comma  separated  list of nodes (or base partitions for BlueGene
          systems) which are associated with this partition.   Node  names
          may   be  specified  using  the  node  range  expression  syntax
          described above. A blank list of nodes (i.e. "Nodes= ")  can  be
          used  if  one  wants a partition to exist, but have no resources
          (possibly on a temporary basis).  A value of "ALL" is mapped  to
          all nodes configured in the cluster.

   OverSubscribe
          Controls  the  ability of the partition to execute more than one
          job at a time on each resource (node, socket or  core  depending
          upon the value of SelectTypeParameters).  If resources are to be
          over-subscribed,  avoiding  memory  over-subscription  is   very
          important.   SelectTypeParameters  should be configured to treat
          memory as a consumable resource and the --mem option  should  be
          used  for  job  allocations.   Sharing of resources is typically
          useful      only      when      using      gang       scheduling
          (PreemptMode=suspend,gang).   Possible  values for OverSubscribe
          are "EXCLUSIVE", "FORCE", "YES", and "NO".  Note that a value of
          "YES"  or  "FORCE" can negatively impact performance for systems
          with many thousands of running jobs.  The default value is "NO".
          For more information see the following web pages:
          http://slurm.schedmd.com/cons_res.html,
          http://slurm.schedmd.com/cons_res_share.html,
          http://slurm.schedmd.com/gang_scheduling.html, and
          http://slurm.schedmd.com/preempt.html.

          EXCLUSIVE   Allocates   entire   nodes   to   jobs   even   with
                      select/cons_res  configured.   Jobs  that   run   in
                      partitions  with "OverSubscribe=EXCLUSIVE" will have
                      exclusive access to all allocated nodes.

          FORCE       Makes all resources in the partition  available  for
                      sharing  without  any means for users to disable it.
                      May be followed with a colon and maximum  number  of
                      jobs  in  running  or  suspended state.  For example
                      "OverSubscribe=FORCE:4" enables each node, socket or
                      core   to   execute   up   to  four  jobs  at  once.
                      Recommended only  for  BlueGene  systems  configured
                      with  small  blocks or for systems running with gang
                      scheduling    (PreemptMode=suspend,gang).      NOTE:
                      PreemptType=QOS will permit one additional job to be
                      run  on  the  partition  if  started  due   to   job
                      preemption.   For   example,   a   configuration  of
                      OverSubscribe=FORCE:1 will only permit one  job  per
                      resources  normally, but a second job can be started
                      if done so through preemption based upon  QOS.   The
                      use  of PreemptType=QOS and PreemptType=Suspend only
                      applies with SelectType=cons_res.

          YES         Makes all resources in the partition  available  for
                      sharing  upon  request  by  the job.  Resources will
                      only be over-subscribed when explicitly requested by
                      the   user   using   the  "--share"  option  on  job
                      submission.   May  be  followed  with  a  colon  and
                      maximum  number  of  jobs  in  running  or suspended
                      state.  For  example  "OverSubscribe=YES:4"  enables
                      each node, socket or core to execute up to four jobs
                      at once.  Recommended only for systems running  with
                      gang scheduling (PreemptMode=suspend,gang).

          NO          Selected resources are allocated to a single job. No
                      resource will be allocated to more than one job.
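
          For example, the following sketch (hypothetical partition and
          node names) allows up to four jobs per resource, assuming gang
          scheduling (PreemptMode=suspend,gang) is configured:

          PartitionName=shared Nodes=tux[0-31] OverSubscribe=FORCE:4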

   PartitionName
          Name  by  which  the   partition   may   be   referenced   (e.g.
          "Interactive").   This  name  can  be  specified  by  users when
          submitting jobs.  If the PartitionName is "DEFAULT", the  values
          specified  with  that  record will apply to subsequent partition
          specifications unless explicitly set to  other  values  in  that
          partition  record  or  replaced  with a different set of default
          values.  Each line where PartitionName is "DEFAULT" will replace
          or  add  to  previous  default values and not reinitialize the
          default values.
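
          For example, a minimal sketch (hypothetical partition and
          node names) using a "DEFAULT" record followed by two
          partition definitions:

          PartitionName=DEFAULT MaxTime=60 State=UP
          PartitionName=debug Nodes=tux[0-7] Default=YES
          PartitionName=batch Nodes=tux[0-63] MaxTime=UNLIMITED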

   PreemptMode
          Mechanism  used  to  preempt  jobs  from  this  partition   when
          PreemptType=preempt/partition_prio    is    configured.     This
          partition  specific  PreemptMode  configuration  parameter  will
          override  the  PreemptMode  configuration  parameter set for the
          cluster as a whole.  The cluster-level PreemptMode must  include
          the  GANG option if PreemptMode is configured to SUSPEND for any
          partition.  The cluster-level PreemptMode must  not  be  OFF  if
          PreemptMode  is enabled for any  partition.  See the description
          of the cluster-level PreemptMode configuration  parameter  above
          for further information.

   PriorityJobFactor
          Partition   factor   used   by  priority/multifactor  plugin  in
          calculating job priority.  The value may not exceed 65533.  Also
          see PriorityTier.

   PriorityTier
          Jobs  submitted to a partition with a higher priority tier value
          will be dispatched before pending jobs in partition  with  lower
          priority  tier value and, if possible, they will preempt running
          jobs from partitions with lower priority tier values.  Note that
          a  partition's  priority  tier  takes  precedence  over  a job's
          priority.   The  value  may  not   exceed   65533.    Also   see
          PriorityJobFactor.

   QOS    Used  to  extend  the  limits available to a QOS on a partition.
          Jobs will not  be  associated  to  this  QOS  outside  of  being
          associated  to  the partition.  They will still be associated to
          their requested QOS.  By default, no QOS is used.   NOTE:  If  a
          limit  is  set in both the Partition's QOS and the Job's QOS the
          Partition QOS will be honored  unless  the  Job's  QOS  has  the
          OverPartQOS flag set, in which case the Job's QOS will have priority.

   ReqResv
          Specifies  users  of  this partition are required to designate a
          reservation when submitting a job. This option can be useful  in
          restricting  usage  of a partition that may have higher priority
          or additional resources to be allowed only within a reservation.
          Possible values are "YES" and "NO".  The default value is "NO".

   RootOnly
          Specifies  if  only  user  ID zero (i.e. user root) may allocate
          resources in this partition. User root  may  allocate  resources
          for  any  other  user, but the request must be initiated by user
          root.  This option can be useful for a partition to  be  managed
          by  some  external  entity (e.g. a higher-level job manager) and
          prevents users from directly using  those  resources.   Possible
          values are "YES" and "NO".  The default value is "NO".

   SelectTypeParameters
          Partition-specific   resource   allocation  type.   This  option
          replaces  the  global  SelectTypeParameters  value.    Supported
          values    are    CR_Core,    CR_Core_Memory,    CR_Socket    and
          CR_Socket_Memory.      Use     requires     the      system-wide
          SelectTypeParameters value be set.

   Shared The  Shared  configuration  parameter  has  been replaced by the
          OverSubscribe parameter described above.

   State  State of partition or availability for use.  Possible values are
          "UP", "DOWN", "DRAIN" and "INACTIVE". The default value is "UP".
          See also the related "Alternate" keyword.

          UP        Designates that new jobs may be queued on the partition,
                    and  that jobs may be allocated nodes and run from the
                    partition.

          DOWN      Designates  that  new  jobs  may  be  queued  on   the
                    partition,  but queued jobs may not be allocated nodes
                    and run from the partition. Jobs  already  running  on
                    the  partition  continue  to  run.  The  jobs  must be
                    explicitly canceled to force their termination.

          DRAIN     Designates that no new  jobs  may  be  queued  on  the
                    partition (job submission requests will be denied with
                    an error message), but  jobs  already  queued  on  the
                    partition  may  be  allocated nodes and run.  See also
                    the "Alternate" partition specification.

          INACTIVE  Designates that no new  jobs  may  be  queued  on  the
                    partition,   and   jobs  already  queued  may  not  be
                    allocated nodes and run.   See  also  the  "Alternate"
                    partition specification.

   TRESBillingWeights
          TRESBillingWeights is used to define the billing weights of each
          TRES type that will be used in calculating the usage of a job.

          Billing weights are specified as a comma-separated list of <TRES
          Type>=<TRES Billing Weight> pairs.

          Any TRES Type is available for billing. Note that the base unit
          for memory and burst buffers is megabytes.

          By default the billing of TRES is calculated as the sum  of  all
          TRES types multiplied by their corresponding billing weight.

          The  weighted  amount  of a resource can be adjusted by adding a
          suffix of K,M,G,T or P after the billing weight. For example,  a
          memory weight of "mem=.25" on a job allocated 8GB will be billed
          2048 (8192MB *.25) units. A memory weight of "mem=.25G"  on  the
          same job will be billed 2 (8192MB * (.25/1024)) units.

          When  a job is allocated 1 CPU and 8 GB of memory on a partition
          configured                                                  with
          TRESBillingWeights="CPU=1.0,Mem=0.25G,GRES/gpu=2.0",         the
          billable TRES will be: (1*1.0) + (8*0.25) + (0*2.0) = 3.0.

          If PriorityFlags=MAX_TRES is configured, the  billable  TRES  is
          calculated  as the MAX of individual TRES' on a node (e.g. cpus,
          mem, gres) plus the sum of all  global  TRES'  (e.g.  licenses).
          Using   the  same  example  above  the  billable  TRES  will  be
          MAX(1*1.0, 8*0.25) + (0*2.0) = 2.0.

          If TRESBillingWeights is not defined  then  the  job  is  billed
          against the total number of allocated CPUs.

          NOTE: TRESBillingWeights is only used when calculating fairshare
          and doesn't affect job priority directly as it is currently  not
          used  for  the size of the job. If you want TRES' to play a role
          in the job's  priority  then  refer  to  the  PriorityWeightTRES
          option.

Prolog and Epilog Scripts

   There  are  a variety of prolog and epilog program options that execute
   with various permissions and at various times.  The four  options  most
   likely to be used are: Prolog and Epilog (executed once on each compute
   node for each job) plus PrologSlurmctld and  EpilogSlurmctld  (executed
   once on the ControlMachine for each job).
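
   For example, the following sketch (hypothetical script paths) would
   configure all four options; the scripts can read the environment
   variables described below:

      Prolog=/etc/slurm/prolog.sh
      Epilog=/etc/slurm/epilog.sh
      PrologSlurmctld=/etc/slurm/prolog_slurmctld.sh
      EpilogSlurmctld=/etc/slurm/epilog_slurmctld.sh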

   NOTE:   Standard  output and error messages are normally not preserved.
   Explicitly write output and error messages to an  appropriate  location
   if you wish to preserve that information.

   NOTE:   By default the Prolog script is ONLY run on any individual node
   when it first sees a job step from a new allocation; it  does  not  run
   the  Prolog immediately when an allocation is granted.  If no job steps
   from an allocation are run on a node, it will never run the Prolog  for
   that allocation.  This Prolog behavior can be changed by the
   PrologFlags parameter.  The Epilog, on the other hand, always  runs  on
   every node of an allocation when the allocation is released.

   If the Epilog fails (returns a non-zero exit code), this will result in
   the node being set to a DRAIN  state.   If  the  EpilogSlurmctld  fails
   (returns  a  non-zero  exit  code),  this  will only be logged.  If the
   Prolog fails (returns a non-zero exit code), this will  result  in  the
   node  being  set  to a DRAIN state and the job being requeued in a held
   state     unless     nohold_on_prolog_fail     is     configured     in
   SchedulerParameters.  If the PrologSlurmctld fails (returns a non-zero
   exit code), the job will be requeued to execute on another node if
   possible.  Only batch jobs can be requeued; interactive jobs (salloc
   and srun) will be cancelled if the PrologSlurmctld fails.
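
   For example, to keep jobs eligible for scheduling rather than requeued
   in a held state after a Prolog failure, slurm.conf might include:

   SchedulerParameters=nohold_on_prolog_fail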

   Information about the job is passed to the script using environment
   variables.  Unless otherwise specified, these environment variables are
   available to all of the programs; a minimal example script follows the
   list of variables below.

   BASIL_RESERVATION_ID
          Basil reservation ID.  Available on Cray systems with ALPS only.

   MPIRUN_PARTITION
          BlueGene partition name.  Available on BlueGene systems only.

   SLURM_ARRAY_JOB_ID
          If this job is part of a job array, this will be set to the  job
          ID.   Otherwise  it will not be set.  To reference this specific
          task  of  a   job   array,   combine   SLURM_ARRAY_JOB_ID   with
          SLURM_ARRAY_TASK_ID (e.g. "scontrol update
          ${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID} ...").  Available in
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_ARRAY_TASK_ID
          If this job is part of a job array, this will be set to the task
          ID.  Otherwise it will not be set.  To reference  this  specific
          task   of   a   job   array,   combine  SLURM_ARRAY_JOB_ID  with
          SLURM_ARRAY_TASK_ID (e.g. "scontrol update
          ${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID} ...").  Available in
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_ARRAY_TASK_MAX
          If this job is part of a job array, this  will  be  set  to  the
          maximum  task  ID.   Otherwise it will not be set.  Available in
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_ARRAY_TASK_MIN
          If this job is part of a job array, this  will  be  set  to  the
          minimum  task  ID.   Otherwise it will not be set.  Available in
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_ARRAY_TASK_STEP
          If this job is part of a job array, this will be set to the step
          size  of  task IDs.  Otherwise it will not be set.  Available in
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_CLUSTER_NAME
          Name of the cluster executing the job.

   SLURM_JOB_ACCOUNT
          Account name used for the job.  Available in PrologSlurmctld and
          EpilogSlurmctld only.

   SLURM_JOB_CONSTRAINTS
          Features   required  to  run  the  job.   Available  in  Prolog,
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_JOB_DERIVED_EC
          The highest exit code of all of the  job  steps.   Available  in
          EpilogSlurmctld only.

   SLURM_JOB_EXIT_CODE
          The  exit  code  of the job script (or salloc). The value is the
          status as returned by the wait() system call (see wait(2)).
          Available in EpilogSlurmctld only.

   SLURM_JOB_EXIT_CODE2
          The  exit  code of the job script (or salloc). The value has the
          format <exit>:<sig>.  The first number is the exit code,
          typically as set by the exit() function.  The second number is
          the signal that caused the process to terminate, if it was
          terminated by a signal.  Available in EpilogSlurmctld only.

   SLURM_JOB_GID
          Group  ID  of the job's owner.  Available in PrologSlurmctld and
          EpilogSlurmctld only.

   SLURM_JOB_GPUS
          GPU IDs allocated to the job (if any).  Available in the  Prolog
          only.

   SLURM_JOB_GROUP
          Group name of the job's owner.  Available in PrologSlurmctld and
          EpilogSlurmctld only.

   SLURM_JOB_ID
          Job ID.  CAUTION: If this job is the first task of a job  array,
          then  Slurm  commands using this job ID will refer to the entire
          job array rather than this specific task of the job array.

   SLURM_JOB_NAME
          Name   of   the   job.    Available   in   PrologSlurmctld   and
          EpilogSlurmctld only.

   SLURM_JOB_NODELIST
          Nodes  assigned  to job. A Slurm hostlist expression.  "scontrol
          show hostnames" can be  used  to  convert  this  to  a  list  of
          individual   host   names.   Available  in  PrologSlurmctld  and
          EpilogSlurmctld only.

   SLURM_JOB_PARTITION
          Partition   that   job   runs   in.    Available   in    Prolog,
          PrologSlurmctld and EpilogSlurmctld only.

   SLURM_JOB_UID
          User ID of the job's owner.

   SLURM_JOB_USER
          User name of the job's owner.
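
   As an illustration, a minimal Epilog script might append a record for
   each completing job, using only variables that are available in every
   context (the log file location is an arbitrary example).  Remember
   that a non-zero exit code will drain the node:

   #!/bin/sh
   # Minimal Epilog sketch; the log path below is hypothetical.
   echo "$(date '+%F %T') job=$SLURM_JOB_ID user=$SLURM_JOB_USER uid=$SLURM_JOB_UID" \
        >> /var/log/slurm/epilog.log
   exit 0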

NETWORK TOPOLOGY

   Slurm is able to optimize job allocations to minimize network
   contention.  Special Slurm logic is used to optimize allocations on
   systems with a three-dimensional interconnect (BlueGene, etc.);
   information about configuring those systems is available at
   <http://slurm.schedmd.com/>.  For a hierarchical network, Slurm needs
   to have detailed information about how nodes are configured on the
   network switches.

   Given  network  topology  information,  Slurm  allocates all of a job's
   resources onto a single leaf of  the  network  (if  possible)  using  a
   best-fit  algorithm.  Otherwise it will allocate a job's resources onto
   multiple leaf switches so  as  to  minimize  the  use  of  higher-level
   switches.   The  TopologyPlugin parameter controls which plugin is used
   to collect network topology information.   The  only  values  presently
   supported  are  "topology/3d_torus"  (default for IBM BlueGene and Cray
   XT/XE  systems,  performs   best-fit   logic   over   three-dimensional
   topology),  "topology/none"  (default for other systems, best-fit logic
   over one-dimensional topology), "topology/tree" (determine the  network
   topology  based upon information contained in a topology.conf file, see
   "man topology.conf" for more information).  Future plugins  may  gather
   topology   information   directly   from  the  network.   The  topology
   information is  optional.   If  not  provided,  Slurm  will  perform  a
   best-fit algorithm assuming the nodes are in a one-dimensional array as
   configured and the communications cost is related to the node  distance
   in this array.
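
   For "topology/tree", a minimal topology.conf sketch might look like
   the following (the switch names are hypothetical; see topology.conf(5)
   for the full syntax):

   SwitchName=leaf1 Nodes=dev[0-12]
   SwitchName=leaf2 Nodes=dev[13-25]
   SwitchName=spine Switches=leaf[1-2]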

RELOCATING CONTROLLERS

   If  the  cluster's  computers used for the primary or backup controller
   will be out of service for an  extended  period  of  time,  it  may  be
   desirable to relocate them.  In order to do so, follow this procedure:

   1. Stop the Slurm daemons
   2. Modify the slurm.conf file appropriately
   3. Distribute the updated slurm.conf file to all nodes
   4. Restart the Slurm daemons

   There should be no loss of any running or pending jobs.  Ensure that
   any nodes added to the cluster have the current slurm.conf file
   installed.
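
   As a sketch only (this assumes systemd service units and the pdsh/pdcp
   utilities; substitute your site's tooling):

   scontrol shutdown                         # 1. stop all Slurm daemons
   vi /etc/slurm.conf                        # 2. update ControlMachine, ControlAddr, etc.
   pdcp -a /etc/slurm.conf /etc/slurm.conf   # 3. distribute to all nodes
   systemctl start slurmctld                 # 4. restart the controller...
   pdsh -a systemctl start slurmd            #    ...and the compute node daemons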

   CAUTION:  If  two  nodes  are  simultaneously configured as the primary
   controller (two nodes on which ControlMachine specifies the local host
   and the slurmctld daemon is executing on each), system behavior will be
   destructive.  If a compute node  has  an  incorrect  ControlMachine  or
   BackupController  parameter, that node may be rendered unusable, but no
   other harm will result.

EXAMPLE

   #
   # Sample /etc/slurm.conf for dev[0-25].llnl.gov
   # Author: John Doe
   # Date: 11/06/2001
   #
   ControlMachine=dev0
   ControlAddr=edev0
   BackupController=dev1
   BackupAddr=edev1
   #
   AuthType=auth/munge
   Epilog=/usr/local/slurm/epilog
   Prolog=/usr/local/slurm/prolog
   FastSchedule=1
   FirstJobId=65536
   InactiveLimit=120
   JobCompType=jobcomp/filetxt
   JobCompLoc=/var/log/slurm/jobcomp
   KillWait=30
   MaxJobCount=10000
   MinJobAge=3600
   PluginDir=/usr/local/lib:/usr/local/slurm/lib
   ReturnToService=0
   SchedulerType=sched/backfill
   SlurmctldLogFile=/var/log/slurm/slurmctld.log
   SlurmdLogFile=/var/log/slurm/slurmd.log
   SlurmctldPort=7002
   SlurmdPort=7003
   SlurmdSpoolDir=/usr/local/slurm/slurmd.spool
   StateSaveLocation=/usr/local/slurm/slurm.state
   SwitchType=switch/none
   TmpFS=/tmp
   WaitTime=30
   JobCredentialPrivateKey=/usr/local/slurm/private.key
   JobCredentialPublicCertificate=/usr/local/slurm/public.cert
   #
   # Node Configurations
   #
   NodeName=DEFAULT CPUs=2 RealMemory=2000 TmpDisk=64000
   NodeName=DEFAULT State=UNKNOWN
   NodeName=dev[0-25] NodeAddr=edev[0-25] Weight=16
   # Update records for specific DOWN nodes
   DownNodes=dev20 State=DOWN Reason="power,ETA=Dec25"
   #
   # Partition Configurations
   #
   PartitionName=DEFAULT MaxTime=30 MaxNodes=10 State=UP
   PartitionName=debug Nodes=dev[0-8,18-25] Default=YES
   PartitionName=batch Nodes=dev[9-17]  MinNodes=4
   PartitionName=long Nodes=dev[9-17] MaxTime=120 AllowGroups=admin

INCLUDE MODIFIERS

   The "include" key word can be used with modifiers within the  specified
   pathname.  These modifiers would be replaced with cluster name or other
   information depending on which modifier is specified. If  the  included
   file  is  not  an  absolute  path  name  (i.e. it does not start with a
   slash), it will searched for in the same directory  as  the  slurm.conf
   file.

   %c     Cluster name specified in the slurm.conf will be used.

   EXAMPLE
   ClusterName=linux
   include /home/slurm/etc/%c_config
   # Above line interpreted as
   # "include /home/slurm/etc/linux_config"

FILE AND DIRECTORY PERMISSIONS

   There  are  three  classes  of  files:  Files used by slurmctld must be
   accessible by user SlurmUser and accessible by the primary  and  backup
   control machines.  Files used by slurmd must be accessible by user root
   and accessible from every  compute  node.   A  few  files  need  to  be
   accessible  by normal users on all login and compute nodes.  While many
   files and directories are listed below, most of them will not  be  used
   with most configurations.
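
   For example, ownership and permissions for a few common cases might be
   set as follows (the paths and the "slurm" account are examples; match
   your own configuration):

   chown slurm:slurm /var/spool/slurmctld   # e.g. StateSaveLocation: SlurmUser-writable
   chmod 700 /var/spool/slurmctld
   chmod 644 /etc/slurm.conf                # readable by all, writable only by root
   chmod 755 /usr/local/slurm/prolog        # Prolog: executable by root on compute nodes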

   AccountingStorageLoc
          If this specifies a file, it must be writable by user SlurmUser.
          The file must be accessible by the primary  and  backup  control
          machines.   It  is  recommended that the file be readable by all
          users from login and compute nodes.

   Epilog Must be executable by user root.  It  is  recommended  that  the
          file  be  readable  by  all users.  The file must exist on every
          compute node.

   EpilogSlurmctld
          Must be executable by user SlurmUser.  It  is  recommended  that
          the  file be readable by all users.  The file must be accessible
          by the primary and backup control machines.

   HealthCheckProgram
          Must be executable by user root.  It  is  recommended  that  the
          file  be  readable  by  all users.  The file must exist on every
          compute node.

   JobCheckpointDir
          Must be writable by user SlurmUser and no other users.  The file
          must be accessible by the primary and backup control machines.

   JobCompLoc
          If this specifies a file, it must be writable by user SlurmUser.
          The file must be accessible by the primary  and  backup  control
          machines.

   JobCredentialPrivateKey
          Must be readable only by user SlurmUser and writable by no other
          users.  The file must be accessible by the  primary  and  backup
          control machines.

   JobCredentialPublicCertificate
          Readable  to  all  users  on all nodes.  Must not be writable by
          regular users.

   MailProg
          Must be executable by user SlurmUser.  Must not be  writable  by
          regular  users.   The file must be accessible by the primary and
          backup control machines.

   Prolog Must be executable by user root.  It  is  recommended  that  the
          file  be  readable  by  all users.  The file must exist on every
          compute node.

   PrologSlurmctld
          Must be executable by user SlurmUser.  It  is  recommended  that
          the  file be readable by all users.  The file must be accessible
          by the primary and backup control machines.

   ResumeProgram
          Must  be  executable  by  user  SlurmUser.   The  file  must  be
          accessible by the primary and backup control machines.

   SallocDefaultCommand
          Must  be  executable by all users.  The file must exist on every
          login and compute node.

   slurm.conf
          Readable to all users on all nodes.  Must  not  be  writable  by
          regular users.

   SlurmctldLogFile
          Must be writable by user SlurmUser.  The file must be accessible
          by the primary and backup control machines.

   SlurmctldPidFile
          Must  be  writable  by  user  root.   Preferably  writable   and
          removable  by  SlurmUser.   The  file  must be accessible by the
          primary and backup control machines.

   SlurmdLogFile
          Must be writable by user root.  A distinct file  must  exist  on
          each compute node.

   SlurmdPidFile
          Must  be  writable  by user root.  A distinct file must exist on
          each compute node.

   SlurmdSpoolDir
          Must be writable by user root.  A distinct file  must  exist  on
          each compute node.

   SrunEpilog
          Must  be  executable by all users.  The file must exist on every
          login and compute node.

   SrunProlog
          Must be executable by all users.  The file must exist  on  every
          login and compute node.

   StateSaveLocation
          Must be writable by user SlurmUser.  The file must be accessible
          by the primary and backup control machines.

   SuspendProgram
          Must  be  executable  by  user  SlurmUser.   The  file  must  be
          accessible by the primary and backup control machines.

   TaskEpilog
          Must  be  executable by all users.  The file must exist on every
          compute node.

   TaskProlog
          Must be executable by all users.  The file must exist  on  every
          compute node.

   UnkillableStepProgram
          Must  be  executable  by  user  SlurmUser.   The  file  must  be
          accessible by the primary and backup control machines.

LOGGING

   Note that while Slurm daemons create  log  files  and  other  files  as
   needed,  it  treats  the  lack  of parent directories as a fatal error.
   This prevents the daemons from running if critical file systems are not
   mounted  and  will minimize the risk of cold-starting (starting without
   preserving jobs).
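
   For example, the parent directories can be created before the daemons
   start (the paths and ownership shown are examples; match your
   slurm.conf settings):

   install -d -o slurm -g slurm /var/log/slurm        # e.g. for SlurmctldLogFile
   install -d -o slurm -g slurm /var/spool/slurmctld  # e.g. for StateSaveLocation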

   Log files and job accounting files may need to be created/owned by the
   "SlurmUser"  uid  to  be  successfully  accessed.   Use the "chown" and
   "chmod" commands to set the ownership  and  permissions  appropriately.
   See  the  section  FILE AND DIRECTORY PERMISSIONS for information about
   the various files and directories used by Slurm.

   It is recommended that the logrotate utility be  used  to  ensure  that
   various  log  files do not become too large.  This also applies to text
   files used for accounting, process tracking, and the  slurmdbd  log  if
   they are used.

   Here  is  a  sample  logrotate  configuration.  Make  appropriate  site
   modifications and save as /etc/logrotate.d/slurm on all nodes.  See the
   logrotate man page for more details.

   ##
   # Slurm Logrotate Configuration
   ##
   /var/log/slurm/*log {
       compress
       missingok
       nocopytruncate
       nodelaycompress
       nomail
       notifempty
       noolddir
       rotate 5
       sharedscripts
       size=5M
       create 640 slurm root
       postrotate
        /etc/init.d/slurm reconfig
       endscript
   }

COPYING

   Copyright  (C)  2002-2007  The Regents of the University of California.
   Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
   Copyright (C) 2008-2010 Lawrence Livermore National Security.
   Copyright (C) 2010-2016 SchedMD LLC.

   This file is  part  of  Slurm,  a  resource  management  program.   For
   details, see <http://slurm.schedmd.com/>.

   Slurm  is free software; you can redistribute it and/or modify it under
   the terms of the GNU General Public License as published  by  the  Free
   Software  Foundation;  either  version  2  of  the License, or (at your
   option) any later version.

   Slurm is distributed in the hope that it will be  useful,  but  WITHOUT
   ANY  WARRANTY;  without even the implied warranty of MERCHANTABILITY or
   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General  Public  License
   for more details.

FILES

   /etc/slurm.conf

SEE ALSO

   bluegene.conf(5), cgroup.conf(5), gethostbyname(3), getrlimit(2),
   gres.conf(5), group(5), hostname(1), scontrol(1), slurmctld(8),
   slurmd(8), slurmdbd(8), slurmdbd.conf(5), srun(1), spank(8), syslog(2),
   topology.conf(5), wiki.conf(5)


