smap(1)

NAME

   smap  -  graphically view information about Slurm jobs, partitions, and
   set configurations parameters.

SYNOPSIS

   smap [OPTIONS...]

DESCRIPTION

   smap is used to graphically view job, partition  and  node  information
   for  a  system  running  Slurm.   Note that information about nodes and
   partitions to which you lack access will always be displayed  to  avoid
   obvious  gaps in the output.  This is equivalent to the --all option of
   the sinfo and squeue commands.

OPTIONS

   -c, --commandline
          Print output to the commandline, no curses.

   -D <option>, --display=<option>
          sets the display mode for  smap,  showing  relevant  information
          about  the  selected  view  and  displaying a corresponding node
          chart.  Note that unallocated nodes are indicated by a  '.'  and
          nodes  in  the  DOWN,  DRAINED or FAIL state by a '#'.  When the
          --iterate=<seconds> option is  also  selected,  you  can  switch
          displays  by  typing  a  different  letter  from  the list below
          (except 'c').

          b              Displays information about BlueGene partitions on
                         the system

          c              Displays  current BlueGene node states and allows
                         users to configure the system.   Type  'quit'  to
                         end  the  configure mode.  Type 'exit' to end the
                         configuration mode and exit smap.

          j              Displays  information  about  jobs   running   on
                         system.

          r              Display  information about advanced reservations.
                         While all current and future reservations will be
                         listed,  only  currently active reservations will
                         appear on the node map.

          s              Displays information about  slurm  partitions  on
                         the system

   -h, --noheader
          Do not print a header on the output.

   -H, --show_hidden
          Display hidden partitions and their jobs.

   --help,
          Print a message describing all smap options.

   -i <seconds> , --iterate=<seconds>
          Print  the  state  on a periodic basis.  Sleep for the indicated
          number of seconds between reports.  User can exit at anytime  by
          typing  'q'  or hitting the return key.  If user is in configure
          mode type 'exit' to exit program, 'quit' to exit configure mode.

   -I, --ionodes
          Only show objects with these ionodes this support  is  only  for
          bluegene systems. This should be used inconjuction with the '-n'
          option.  Only specify the ionode number range here.  Specify the
          node name with the '-n' option.

   -M, --clusters=<string>
          Clusters to issue commands to.

   -n, --nodes
          Only  show  objects with these nodes.  If querying to the ionode
          level use the option '-I' in conjunction with this option.

   -Q, --quiet
          Avoid printing error messages.

   -R <RACK_MIDPLANE_ID/XYZ>, --resolve=<RACK_MIDPLANE_ID/XYZ>
          Returns the XYZ coords for a Rack/Midplane id or vice-versa.

          To get the XYZ coord for a Rack/Midplane id input -R R101  where
          10 is the rack and 1 is the midplane.

          To  get the Rack/Midplane id from a XYZ coord input -R 101 where
          X=1 Y=1 Z=1 with no leading 'R'.

   --usage
          Print a brief message listing the smap options.

   -V , --version
          Print version information and exit.

INTERACTIVE OPTIONS

   When using smap in curses mode and when the --iterate=<seconds>  option
   is  also  selected,  you can scroll through the different windows using
   the arrow  keys.   The  up  and  down  arrow  keys  scroll  the  window
   containing  the  grid,  and  the  left  and right arrow keys scroll the
   window containing the text information.

   With the iterate option selected,  you  can  use  any  of  the  options
   available to the -D option listed above (except 'c') to change screens.
   You can also hide or make visible hidden partitions by pressing 'h'  at
   any moment.

OUTPUT FIELD DESCRIPTIONS

   ACCESS_CONTROL
          Identifies  the  users  or  bank  accounts  which  can  use this
          advanced reservation.  A  prefix  of  "A:"  indicates  that  the
          following  account  names may use this reservation.  A prefix of
          "U:" indicates that  the  following  user  names  may  use  this
          reservation.

   AVAIL  Partition state: up or down.

   BG_BLOCK
          BlueGene Block Name.

   CONN   Connection Type: TORUS or MESH or SMALL (for small blocks).

   END_TIME
          The time when an advanced reservation ended.

   ID     Key  to  identify  the  nodes associated with this entity in the
          node chart.

   MODE   Mode Type: COPROCESS or VIRTUAL.

   NAME   Name of the job or advanced reservation.

   NODELIST or BP_LIST
          Names  of  nodes  or  base  partitions  associated   with   this
          configuration, partition or reservation.

   NODES  Count   of   nodes  or  base  partitions  with  this  particular
          configuration.

   PARTITION
          Name of a partition.  Note that the suffix  "*"  identifies  the
          default partition.

   ST     State  of  a  job  in  compact form. Possible states include: PD
          (pending), R  (running),  S  (suspended),  CD   (completed),  CF
          (configuring), CG (completing), F (failed), TO (timeout), and NF
          (node failure). See JOB  STATE  CODES  section  below  for  more
          information.

   START_TIME
          The time when an advanced reservation started.

   STATE  State   of  the  nodes.   Possible  states  include:  allocated,
          completing, down, drained, draining, fail,  failing,  idle,  and
          unknown  plus their abbreviated forms: alloc, comp, donw, drain,
          drng, fail, failg, idle, and unk respectively.   Note  that  the
          suffix  "*"  identifies nodes that are presently not responding.
          See NODE STATE CODES section below for more information.

   TIMELIMIT
          Maximum    time    limit     for     any     user     job     in
          days-hours:minutes:seconds.   infinite  is used to identify jobs
          or partitions without a job time limit.

   TOPOGRAPHY INFORMATION

   The node chart is designed to indicate relative locations of the nodes.
   On  most  Linux clusters this will represent a one-dimensional array of
   nodes. Larger clusters will utilize multiple as needed with right  side
   of one line being logically followed by the left side of the next line.

   On BlueGene systems, the node chart will indicate the three
   dimensional topography of the system.
   The X dimension will increase from left to right on a given line.
   The Y dimension will increase in planes from bottom to top.
   The Z dimension will increase within a plane from the back
   line to the front line of a plane.
   Note the example below:

      a a a a b b d d
     a a a a b b d d
    a a a a b b c c
   a a a a b b c c

      a a a a b b d d
     a a a a b b d d
    a a a a b b c c
   a a a a b b c c

      a a a a . . d d
     a a a a . . d d
    a a a a . . e e              Y
   a a a a . . e e               |
                                 |
      a a a a . . d d            0----X
     a a a a . . d d            /
    a a a a . . . .            /
   a a a a . . . #            Z

   ID JOBID PARTITION BG_BLOCK USER   NAME ST  TIME NODES BP_LIST
   a  12345 batch     RMP0     joseph tst1 R  43:12   32k bgl[000x333]
   b  12346 debug     RMP1     chris  sim3 R  12:34    8k bgl[420x533]
   c  12350 debug     RMP2     danny  job3 R   0:12    4k bgl[622x733]
   d  12356 debug     RMP3     dan    colu R  18:05    8k bgl[600x731]
   e  12378 debug     RMP4     joseph asx4 R   0:34    2k bgl[612x713]

CONFIGURATION INSTRUCTIONS

   For  Admin  use.  From  this screen one can create a configuration file
   that is used to partition and wire the system into usable blocks.

   OUTPUT

          BG_BLOCK
                 BlueGene Block Name.

          CONN   Connection Type:  TORUS  or  MESH  or  SMALL  (for  small
                 blocks).

          ID     Key  to identify the nodes associated with this entity in
                 the node chart.

          MODE   Mode Type: COPROCESS or VIRTUAL.

   INPUT COMMANDS

          resolve <RACK_MIDPLANE_ID/XYZ>
                 Returns  the  XYZ  coords  for  a  Rack/Midplane  id   or
                 vice-versa.

                 To get the XYZ coord for a Rack/Midplane id input -R R101
                 where 10 is the rack and 1 is the midplane.

                 To get the Rack/Midplane id from a XYZ coord input -R 101
                 where X=1 Y=1 Z=1 with no leading 'R'.

          load <bluegene.conf file>
                 Load  an  already  existent bluegene.conf file. This will
                 verify and mapout a bluegene.conf file.  After loaded the
                 configuration may be edited and saved as a new file.

          create <size> <options>
                 Submit  request  for  partition creation. The size may be
                 specified  either  as  a  count  of  base  partitions  or
                 specific   dimensions  in  the  X,  Y  and  Z  directions
                 separated by "x",  for  example  "2x3x4".  A  variety  of
                 options may be specified. Valid options are listed below.
                 Note  that  the  option  and  their   values   are   case
                 insensitive (e.g. "MESH" and "mesh" are equivalent).

          Start = XxYxZ
                 Identify where to start the partition.  This is primarily
                 for testing purposes.  For convenience one can  only  put
                 the  X coord or XxY will also work.  The default value is
                 0x0x0.

          Connection = MESH | TORUS | SMALL
                 Identify how the nodes should be  connected  in  network.
                 The default value is TORUS.

                 Small  Equivalent  to  "Connection=Small".   If  a  small
                        connection is specified the base partition  chosen
                        will  create  smaller  partitions based on options
                        32CNBlocks  and  128CNBlocks  respectively  for  a
                        Bluegene  L  system.   16CNBlocks, 64CNBlocks, and
                        256CNBlocks are  also  available  for  Bluegene  P
                        systems.   Keep  in  mind  you  must  have  enough
                        ionodes to make all these configurations possible.
                          These number will be  altered  to  take  up  the
                        entire  base  partition.  Size does not need to be
                        specified with a small  request,  we  will  always
                        default to 1 base partition for allocation.

                 Mesh   Equivalent to "Connection=Mesh".

                 Torus  Equivalent to "Connection=Torus".

          Rotation = TRUE | FALSE
                 Specifies   that  the  geometry  specified  in  the  size
                 parameter may be rotated in  space  (e.g.  the  Y  and  Z
                 dimensions may be switched).  The default value is FALSE.

          Rotate Equivalent to "Rotation=true".

          Elongation = TRUE | FALSE
                 If  TRUE,  permit  the  geometry  specified  in  the size
                 parameter to  be  altered  as  needed  to  fit  available
                 resources.   For  example, an allocation of "4x2x1" might
                 be used to satisfy a size specification of "2x2x2".   The
                 default value is FALSE.

          Elongate
                 Equivalent to "Elongation=true".

          copy <id> <count>
                 Submit  request for partition to be copied.  You may copy
                 a specific partition by specifying its id, by default the
                 last  configured  partition  is  copied.   You  may  also
                 specify a number of copies to be made.  By  default,  one
                 copy is made.

          delete <id>
                 Delete the specified block.

          down <node_range>
                 Down  a  specific  node  or  range  of  nodes.  i.e. 000,
                 000-111 [000x111]

          up <node_range>
                 Bring a specific node or range of nodes  up.   i.e.  000,
                 000-111 [000x111]

          alldown
                 Set all nodes to down state.

          allup  Set all nodes to up state.

          save <file_name>
                 Save   the  current  configuration  to  a  file.   If  no
                 file_name is specified, the configuration is written to a
                 file   named   "bluegene.conf"  in  the  current  working
                 directory.

          clear  Clear all partitions created.

NODE STATE CODES

   Node state codes are shortened as required for the field  size.   These
   node  states  may  be followed by a special character to identify state
   flags associated with the  node.   The  following  node  sufficies  and
   states are used:

   *   The  node is presently not responding and will not be allocated any
       new work.  If the node remains non-responsive, it will be placed in
       the  DOWN  state  (except  in  the  case  of  COMPLETING,  DRAINED,
       DRAINING, FAIL, FAILING nodes).

   ~   The node is presently in a power saving mode (typically running  at
       reduced frequency).

   #   The node is presently being powered up or configured.

   $   The  node  is  currently  in  a  reservation  with  a flag value of
       "maintenance" or is scheduled to be rebooted.

   ALLOCATED   The node has been allocated to one or more jobs.

   ALLOCATED+  The node is allocated to one or more active jobs  plus  one
               or more jobs are in the process of COMPLETING.

   COMPLETING  All  jobs  associated  with this node are in the process of
               COMPLETING.  This node state will be removed  when  all  of
               the  job's  processes  have terminated and the Slurm epilog
               program (if any) has terminated. See the  Epilog  parameter
               description   in   the   slurm.conf   man   page  for  more
               information.

   DOWN        The node is unavailable for use.  Slurm  can  automatically
               place  nodes  in  this state if some failure occurs. System
               administrators may also  explicitly  place  nodes  in  this
               state.  If  a  node  resumes  normal  operation,  Slurm can
               automatically return it to service. See the ReturnToService
               and    SlurmdTimeout    parameter   descriptions   in   the
               slurm.conf(5) man page for more information.

   DRAINED     The node is unavailable for use  per  system  administrator
               request.   See  the  update node command in the scontrol(1)
               man  page  or  the  slurm.conf(5)   man   page   for   more
               information.

   DRAINING    The  node  is  currently  executing  a job, but will not be
               allocated to  additional  jobs.  The  node  state  will  be
               changed to state DRAINED when the last job on it completes.
               Nodes enter this state per  system  administrator  request.
               See  the update node command in the scontrol(1) man page or
               the slurm.conf(5) man page for more information.

   FAIL        The node is expected to fail soon and  is  unavailable  for
               use  per system administrator request.  See the update node
               command in the scontrol(1) man page  or  the  slurm.conf(5)
               man page for more information.

   FAILING     The  node  is currently executing a job, but is expected to
               fail  soon  and  is  unavailable   for   use   per   system
               administrator  request.  See the update node command in the
               scontrol(1) man page or the slurm.conf(5) man page for more
               information.

   IDLE        The  node is not allocated to any jobs and is available for
               use.

   MAINT       The node is currently in a reservation with a flag value of
               "maintainence".

   UNKNOWN     The  Slurm controller has just started and the node's state
               has not yet been determined.

JOB STATE CODES

   Jobs typically pass through several  states  in  the  course  of  their
   execution.    The  typical  states  are  PENDING,  RUNNING,  SUSPENDED,
   COMPLETING, and COMPLETED.  An explanation of each state follows.

   BF  BOOT_FAIL       Job terminated due to launch failure, typically due
                       to a hardware failure (e.g. unable to boot the node
                       or block and the job can not be requeued).

   CA  CANCELLED       Job was explicitly cancelled by the user or  system
                       administrator.   The  job  may or may not have been
                       initiated.

   CD  COMPLETED       Job has terminated all processes on all nodes  with
                       an exit code of zero.

   CG  COMPLETING      Job is in the process of completing. Some processes
                       on some nodes may still be active.

   CF  CONFIGURING     Job has been allocated resources, but  are  waiting
                       for them to become ready for use (e.g. booting).

   F   FAILED          Job  terminated  with  non-zero  exit code or other
                       failure condition.

   NF  NODE_FAIL       Job terminated  due  to  failure  of  one  or  more
                       allocated nodes.

   PD  PENDING         Job is awaiting resource allocation.

   PR  PREEMPTED       Job terminated due to preemption.

   R   RUNNING         Job currently has an allocation.

   SE  SPECIAL_EXIT    The job was requeued in a special state. This state
                       can be set by users, typically in  EpilogSlurmctld,
                       if  the  job  has terminated with a particular exit
                       value.

   ST  STOPPED         Job has  an  allocation,  but  execution  has  been
                       stopped   with  SIGSTOP  signal.   CPUS  have  been
                       retained by this job.

   S   SUSPENDED       Job has  an  allocation,  but  execution  has  been
                       suspended  and  CPUs  have  been released for other
                       jobs.

   TO  TIMEOUT         Job terminated upon reaching its time limit.

ENVIRONMENT VARIABLES

   The following environment variables can be used  to  override  settings
   compiled into smap.

   SLURM_CONF          The location of the Slurm configuration file.

COPYING

   Copyright  (C)  2004-2007  The Regents of the University of California.
   Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
   Copyright (C) 2008-2009 Lawrence Livermore National Security.
   Copyright (C) 2010-2013 SchedMD LLC.

   This file is  part  of  Slurm,  a  resource  management  program.   For
   details, see <http://slurm.schedmd.com/>.

   Slurm  is free software; you can redistribute it and/or modify it under
   the terms of the GNU General Public License as published  by  the  Free
   Software  Foundation;  either  version  2  of  the License, or (at your
   option) any later version.

   Slurm is distributed in the hope that it will be  useful,  but  WITHOUT
   ANY  WARRANTY;  without even the implied warranty of MERCHANTABILITY or
   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General  Public  License
   for more details.

SEE ALSO

   scontrol(1),     sinfo(1),    squeue(1),    slurm_load_ctl_conf    (3),
   slurm_load_jobs (3), slurm_load_node  (3),  slurm_load_partitions  (3),
   slurm_reconfigure   (3),   slurm_shutdown  (3),  slurm_update_job  (3),
   slurm_update_node (3), slurm_update_partition (3), slurm.conf(5)



Opportunity


Personal Opportunity - Free software gives you access to billions of dollars of software at no cost. Use this software for your business, personal use or to develop a profitable skill. Access to source code provides access to a level of capabilities/information that companies protect though copyrights. Open source is a core component of the Internet and it is available to you. Leverage the billions of dollars in resources and capabilities to build a career, establish a business or change the world. The potential is endless for those who understand the opportunity.

Business Opportunity - Goldman Sachs, IBM and countless large corporations are leveraging open source to reduce costs, develop products and increase their bottom lines. Learn what these companies know about open source and how open source can give you the advantage.


Free Software


Free Software provides computer programs and capabilities at no cost but more importantly, it provides the freedom to run, edit, contribute to, and share the software. The importance of free software is a matter of access, not price. Software at no cost is a benefit but ownership rights to the software and source code is far more significant.

Free Office Software - The Libre Office suite provides top desktop productivity tools for free. This includes, a word processor, spreadsheet, presentation engine, drawing and flowcharting, database and math applications. Libre Office is available for Linux or Windows.


Free Books


The Free Books Library is a collection of thousands of the most popular public domain books in an online readable format. The collection includes great classical literature and more recent works where the U.S. copyright has expired. These books are yours to read and use without restrictions.

Source Code - Want to change a program or know how it works? Open Source provides the source code for its programs so that anyone can use, modify or learn how to write those programs themselves. Visit the GNU source code repositories to download the source.


Education


Study at Harvard, Stanford or MIT - Open edX provides free online courses from Harvard, MIT, Columbia, UC Berkeley and other top Universities. Hundreds of courses for almost all major subjects and course levels. Open edx also offers some paid courses and selected certifications.

Linux Manual Pages - A man or manual page is a form of software documentation found on Linux/Unix operating systems. Topics covered include computer programs (including library and system calls), formal standards and conventions, and even abstract concepts.