fileserver(8)

NAME

   fileserver - Initializes the File Server component of the fs process

SYNOPSIS

   fileserver
       [-auditlog <path to log file>]
       [-audit-interface (file | sysvmq)]
       [-d <debug level>]
       [-p <number of processes>]
       [-spare <number of spare blocks>]
       [-pctspare <percentage spare>]
       [-b <buffers>]
       [-l <large vnodes>]
       [-s <small vnodes>]
       [-vc <volume cachesize>]
       [-w <call back wait interval>]
       [-cb <number of call backs>]
       [-banner]
       [-novbc]
       [-implicit <admin mode bits: rlidwka>]
       [-readonly]
       [-hr <number of hours between refreshing the host cps>]
       [-busyat <redirect clients when queue > n>]
       [-nobusy]
       [-rxpck <number of rx extra packets>]
       [-rxdbg]
       [-rxdbge]
       [-rxmaxmtu <bytes>]
       [-nojumbo]
       [-jumbo]
       [-rxbind]
       [-allow-dotted-principals]
       [-L]
       [-S]
       [-k <stack size>]
       [-realm <Kerberos realm name>]
       [-udpsize <size of socket buffer in bytes>]
       [-sendsize <size of send buffer in bytes>]
       [-abortthreshold <abort threshold>]
       [-enable_peer_stats]
       [-enable_process_stats]
       [-syslog [<loglevel>]]
       [-mrafslogs]
       [-saneacls]
       [-help]
       [-vhandle-setaside <fds reserved for non-cache io>]
       [-vhandle-max-cachesize <max open files>]
       [-vhandle-initial-cachesize <initial open file cache>]
       [-vattachpar <number of volume attach threads>]
       [-m <min percentage spare in partition>]
       [-lock]
       [-sync <sync behavior>]
       [-offline-timeout <timeout in seconds>]
       [-offline-shutdown-timeout <timeout in seconds>]

DESCRIPTION

   The fileserver command initializes the File Server component of the
   "fs" process. In the conventional configuration, its binary file is
   located in the /usr/lib/openafs directory on a file server machine.

   The fileserver command is not normally issued at the command shell
   prompt, but rather placed into a file server machine's
   /etc/openafs/BosConfig file with the bos create command. If it is ever
   issued at the command shell prompt, the issuer must be logged onto a
   file server machine as the local superuser "root".

   The File Server creates the /var/log/openafs/FileLog log file as it
   initializes, if the file does not already exist. It does not write a
   detailed trace by default, but the -d option may be used to increase
   the amount of detail. Use the bos getlog command to display the
   contents of the log file.

   The command's arguments enable the administrator to control many
   aspects of the File Server's performance, as detailed in OPTIONS.  By
   default the File Server sets values for many arguments that are
   suitable for a medium-sized file server machine. To set values suitable
   for a small or large file server machine, use the -S or -L flag
   respectively. The following list describes the parameters and
   corresponding argument for which the File Server sets default values,
   and the table below summarizes the setting for each of the three
   machine sizes.

   *   The maximum number of lightweight processes (LWPs) or pthreads the
       File Server uses to handle requests for data; corresponds to the -p
       argument. The File Server always uses a minimum of 32 KB of memory
       for these processes.

   *   The maximum number of directory blocks the File Server caches in
       memory; corresponds to the -b argument. Each cached directory block
       (buffer) consumes 2,092 bytes of memory.

   *   The maximum number of large vnodes the File Server caches in memory
       for tracking directory elements; corresponds to the -l argument.
       Each large vnode consumes 292 bytes of memory.

   *   The maximum number of small vnodes the File Server caches in memory
       for tracking file elements; corresponds to the -s argument.  Each
       small vnode consumes 100 bytes of memory.

   *   The maximum volume cache size, which determines how many volumes
       the File Server can cache in memory before having to retrieve data
       from disk; corresponds to the -vc argument.

   *   The maximum number of callback structures the File Server caches in
       memory; corresponds to the -cb argument. Each callback structure
       consumes 16 bytes of memory.

   *   The maximum number of Rx packets the File Server uses; corresponds
       to the -rxpck argument. Each packet consumes 1544 bytes of memory.

   The default values are:

     Parameter (Argument)               Small (-S)     Medium   Large (-L)
     ---------------------------------------------------------------------
     Number of LWPs (-p)                        6           9          128
     Number of cached dir blocks (-b)          70          90          120
     Number of cached large vnodes (-l)       200         400          600
     Number of cached small vnodes (-s)       200         400          600
     Maximum volume cache size (-vc)          200         400          600
     Number of callbacks (-cb)             20,000      60,000       64,000
     Number of Rx packets (-rxpck)            100         150          200

   To override any of the values, provide the indicated argument (which
   can be combined with the -S or -L flag).

   The amount of memory required for the File Server varies. The
   approximate default memory usage is 751 KB when the -S flag is used
   (small configuration), 1.1 MB when all defaults are used (medium
   configuration), and 1.4 MB when the -L flag is used (large
   configuration). If additional memory is available, increasing the value
   of the -cb and -vc arguments can improve File Server performance most
   directly.
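
   As a rough illustration of how the per-item sizes listed above add up,
   the following shell sketch multiplies each cache setting by its
   documented size. The function name est_cache_bytes is hypothetical,
   and the result covers only the five per-item costs named in this
   section (not thread memory or any other overhead), so it is not
   expected to reproduce the approximate totals quoted above.

```shell
# Hypothetical helper: estimate bytes used by the tunable caches.
# Arguments: <buffers -b> <large vnodes -l> <small vnodes -s>
#            <callbacks -cb> <rx packets -rxpck>
est_cache_bytes() {
    b=$1 l=$2 s=$3 cb=$4 rxpck=$5
    # Per-item sizes (in bytes) are the ones stated in this section.
    echo $(( b * 2092 + l * 292 + s * 100 + cb * 16 + rxpck * 1544 ))
}

# Small (-S) defaults taken from the table above
est_cache_bytes 70 200 200 20000 100   # prints 699240
```

   Because callbacks dominate the total at the default settings, raising
   -cb is the change whose memory cost is worth checking first.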

   By default, the File Server allows a volume to exceed its quota by 1 MB
   when an application is writing data to an existing file in a volume
   that is full. The File Server still does not allow users to create new
   files in a full volume. To change the default, use one of the following
   arguments:

   *   Set the -spare argument to the number of extra kilobytes that the
       File Server allows as overage. A value of 0 allows no overage.

   *   Set the -pctspare argument to the percentage of the volume's quota
       the File Server allows as overage.
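
   To make the difference between the two arguments concrete, the sketch
   below computes the overage each one would allow for a given quota. The
   function name allowed_overage_kb is hypothetical, and the use of
   integer division for -pctspare is an assumption; this page does not
   say how the File Server rounds.

```shell
# Hypothetical helper: kilobytes by which a volume may exceed its quota.
# Usage: allowed_overage_kb <quota in KB> spare <KB>
#        allowed_overage_kb <quota in KB> pctspare <percent>
allowed_overage_kb() {
    quota_kb=$1 mode=$2 value=$3
    case $mode in
        spare)    echo "$value" ;;                        # fixed overage in KB
        pctspare) echo $(( quota_kb * value / 100 )) ;;   # rounding is an assumption
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

allowed_overage_kb 5000 pctspare 10   # prints 500
allowed_overage_kb 5000 spare 1024    # prints 1024
```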

   By default, the File Server implicitly grants the "a" (administer) and
   "l" (lookup) permissions to system:administrators on the access control
   list (ACL) of every directory in the volumes stored on its file server
   machine. In other words, the group's members can exercise those two
   permissions even when an entry for the group does not appear on an ACL.
   To change the set of default permissions, use the -implicit argument.

   The File Server maintains a host current protection subgroup (host CPS)
   for each client machine from which it has received a data access
   request. Like the CPS for a user, a host CPS lists all of the
   Protection Database groups to which the machine belongs, and the File
   Server compares the host CPS to a directory's ACL to determine in what
   manner users on the machine are authorized to access the directory's
   contents. When the pts adduser or pts removeuser command is used to
   change the groups to which a machine belongs, the File Server must
   recompute the machine's host CPS in order to notice the change. By
   default, the File Server contacts the Protection Server every two hours
   to recompute host CPSs, implying that it can take that long for changed
   group memberships to become effective. To change this frequency, use
   the -hr argument.

   The File Server stores volumes in partitions. A partition is a
   filesystem or directory on the server machine that is named "/vicepX"
   or "/vicepXX", where X is "a" through "z" and XX is "aa" through "iv".
   Up to 255 partitions are allowed. The File Server expects that the
   /vicepXX directories are each on a dedicated filesystem. The File
   Server will only use a /vicepXX if it's a mountpoint for another
   filesystem, unless the file "/vicepXX/AlwaysAttach" exists.  A
   partition will not be mounted if the file "/vicepXX/NeverAttach"
   exists. If both "/vicepXX/AlwaysAttach" and "/vicepXX/NeverAttach" are
   present, then "/vicepXX/AlwaysAttach" wins.  The data in the partition
   is in a special format that can only be accessed using OpenAFS
   commands or an OpenAFS client.
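
   The attach rules in the preceding paragraph can be summarized as a
   small decision function. This is only a sketch of the rules as worded
   here, not the File Server's own code: the function name would_attach
   is hypothetical, and whether the directory is a mount point is passed
   in as an argument rather than detected.

```shell
# Hypothetical helper: would the File Server use this /vicep directory?
# Usage: would_attach <directory> <yes|no: is it a mount point?>
would_attach() {
    dir=$1 is_mountpoint=$2
    if [ -e "$dir/AlwaysAttach" ]; then
        echo attach          # AlwaysAttach wins, even over NeverAttach
    elif [ -e "$dir/NeverAttach" ]; then
        echo skip
    elif [ "$is_mountpoint" = yes ]; then
        echo attach
    else
        echo skip            # plain directory without AlwaysAttach
    fi
}

# Example call; the result depends on the marker files present.
would_attach /vicepa yes
```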

   The File Server generates the following message when a partition is
   nearly full:

      No space left on device

   This command does not use the syntax conventions of the AFS command
   suites. Provide the command name and all option names in full.

CAUTIONS

   There are two strategies the File Server can use for attaching AFS
   volumes at startup and handling volume salvages.  The traditional
   method assumes all volumes are salvaged before the File Server starts
   and attaches all volumes at start before serving files.  The newer
   demand-attach method attaches volumes only on demand, salvaging them at
   that time as needed, and detaches volumes that are not in use.  A
   demand-attach File Server can also save state to disk for faster
   restarts. The dafileserver implements the demand-attach method, while
   fileserver uses the traditional method.

   The choice of traditional or demand-attach File Server changes the
   required setup in BosConfig. When changing from a traditional File
   Server to demand-attach or vice versa, you will need to stop and remove
   the "fs" or "dafs" node in BosConfig and create a new node of the
   appropriate type. See bos_create(8) for more information.

   Do not use the -k and -w arguments, which are intended for use by the
   OpenAFS developers only. Changing them from their default values can
   result in unpredictable File Server behavior.  In any case, on many
   operating systems the File Server uses native threads rather than the
   LWP threads, so using the -k argument to set the LWP stack size has no
   effect.

   Do not specify both the -spare and -pctspare arguments. Doing so causes
   the File Server to exit, leaving an error message in the
   /var/log/openafs/FileLog file.

   Options that are available only on some system types, such as the -m
   and -lock options, appear in the output generated by the -help option
   only on the relevant system type.

   Currently, the maximum size of a volume quota is 2 terabytes (2^41
   bytes) and the maximum size of a /vicepX partition on a fileserver is
   2^64 kilobytes. The maximum partition size in releases 1.4.7 and
   earlier is 2 terabytes (2^41 bytes). The maximum partition size for
   1.5.x releases 1.5.34 and earlier is 2 terabytes as well.

   The maximum number of directory entries is 64,000 if all of the entries
   have names that are 15 octets or less in length. A name that is 15
   octets long requires the use of only one block in the directory.
   Additional sequential blocks are required to store entries with names
   that are longer than 15 octets. Each additional block provides an
   additional length of 32 octets for the name of the entry. Note that if
   file names use an encoding like UTF-8, a single character may be
   encoded into multiple octets.

   In real world use, the maximum number of objects in an AFS directory is
   usually between 16,000 and 25,000, depending on the average name
   length.
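
   The block arithmetic described above can be sketched as follows. Both
   function names are hypothetical, and the model simply reads the text
   literally (one block for the first 15 octets, one more per additional
   32 octets, 64,000 blocks in total); real directories carry other
   overhead, so treat the results as upper bounds.

```shell
# Hypothetical helper: directory blocks needed for a name of <octets> length.
blocks_for_name() {
    len=$1
    if [ "$len" -le 15 ]; then
        echo 1
    else
        # one block for the first 15 octets, plus one per further 32 octets
        echo $(( 1 + (len - 15 + 31) / 32 ))
    fi
}

# Hypothetical helper: upper bound on entries if every name has this length.
max_entries_for_len() {
    echo $(( 64000 / $(blocks_for_name "$1") ))
}

max_entries_for_len 15   # prints 64000
max_entries_for_len 40   # prints 32000
```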

OPTIONS

   -auditlog <log path>
       Turns on audit logging, and sets the path for the audit log.  The
       audit log records information about RPC calls, including the name
       of the RPC call, the host that submitted the call, the
       authenticated entity (user) that issued the call, the parameters
       for the call, and whether the call succeeded or failed.

   -audit-interface (file | sysvmq)
       Specifies what audit interface to use. The "file" interface writes
       audit messages to the file passed to -auditlog. The "sysvmq"
       interface writes audit messages to a SYSV message queue (see
       msgget(2) and msgrcv(2)). The message queue the "sysvmq" interface
       writes to has the key "ftok(path, 1)", where "path" is the path
       specified in the -auditlog option.

       Defaults to "file".

   -d <debug level>
       Sets the detail level for the debugging trace written to the
       /var/log/openafs/FileLog file. Provide one of the following values,
       each of which produces an increasingly detailed trace: 0, 1, 5, 25,
       and 125. The default value of 0 produces only a few messages.

   -p <number of processes>
       Sets the number of threads (or LWPs) to run. Provide a positive
       integer.  The File Server creates and uses five threads for special
       purposes, in addition to the number specified (but if this argument
       specifies the maximum possible number, the File Server
       automatically uses five of the threads for its own purposes).

       The maximum number of threads can differ in each release of
       OpenAFS.  Consult the OpenAFS Release Notes for the current
       release.

   -spare <number of spare blocks>
       Specifies the number of additional kilobytes an application can
       store in a volume after the quota is exceeded. Provide a positive
       integer; a value of 0 prevents the volume from ever exceeding its
       quota. Do not combine this argument with the -pctspare argument.

   -pctspare <percentage spare>
       Specifies the amount by which the File Server allows a volume to
       exceed its quota, as a percentage of the quota. Provide an integer
       between 0 and 99. A value of 0 prevents the volume from ever
       exceeding its quota. Do not combine this argument with the -spare
       argument.

   -b <buffers>
       Sets the number of directory buffers. Provide a positive integer.

   -l <large vnodes>
       Sets the number of large vnodes available in memory for caching
       directory elements. Provide a positive integer.

   -s <small vnodes>
       Sets the number of small vnodes available in memory for caching
       file elements. Provide a positive integer.

   -vc <volume cachesize>
       Sets the number of volumes the File Server can cache in memory.
       Provide a positive integer.

   -w <call back wait interval>
       Sets the interval at which the daemon spawned by the File Server
       performs its maintenance tasks. Do not use this argument; changing
       the default value can cause unpredictable behavior.

   -cb <number of callbacks>
       Sets the number of callbacks the File Server can track. Provide a
       positive integer.

   -banner
       Prints the following banner to /dev/console about every 10 minutes.

          File Server is running at <time>.

   -novbc
       Prevents the File Server from breaking the callbacks that Cache
       Managers hold on a volume that the File Server is reattaching after
       the volume was offline (as a result of the vos restore command, for
       example). Use of this flag is strongly discouraged.

   -implicit <admin mode bits>
       Defines the set of permissions granted by default to the
       system:administrators group on the ACL of every directory in a
       volume stored on the file server machine. Provide one or more of
       the standard permission letters ("rlidwka") and auxiliary
       permission letters ("ABCDEFGH"), or one of the shorthand notations
       for groups of permissions ("all", "none", "read", and "write"). To
       review the meaning of the permissions, see the fs setacl reference
       page.

   -readonly
       Don't allow writes to this fileserver.

   -hr <number of hours between refreshing the host cps>
       Specifies how often the File Server refreshes its knowledge of the
       machines that belong to protection groups (refreshes the host CPSs
       for machines). The File Server must update this information to
       enable users from machines recently added to protection groups to
       access data for which those machines now have the necessary ACL
       permissions.

   -busyat <redirect clients when queue > n>
       Defines the number of incoming RPCs that can be waiting for a
       response from the File Server before the File Server returns the
       error code "VBUSY" to the Cache Manager that sent the latest RPC.
       In response, the Cache Manager retransmits the RPC after a delay.
       This argument prevents the accumulation of so many waiting RPCs
       that the File Server can never process them all. Provide a positive
       integer.  The default value is 600.

   -rxpck <number of rx extra packets>
       Controls the number of Rx packets the File Server uses to store
       data for incoming RPCs that it is currently handling, that are
       waiting for a response, and for replies that are not yet complete.
       Provide a positive integer.

   -rxdbg
       Writes a trace of the File Server's operations on Rx packets to the
       file /var/log/openafs/rx_dbg.

   -rxdbge
       Writes a trace of the File Server's operations on Rx events (such
       as retransmissions) to the file /var/log/openafs/rx_dbg.

   -rxmaxmtu <bytes>
       Defines the maximum size of an MTU.  The value must be between the
       minimum and maximum packet data sizes for Rx.

   -jumbo
       Allows the server to send and receive jumbograms. A jumbogram is a
       large-size packet composed of 2 to 4 normal Rx data packets that
       share the same header. The fileserver does not use jumbograms by
       default, as some routers are not capable of properly breaking the
       jumbogram into smaller packets and reassembling them.

   -nojumbo
       Deprecated; jumbograms are disabled by default.

   -rxbind
       Force the fileserver to only bind to one IP address.

   -allow-dotted-principals
       By default, the RXKAD security layer will disallow access by
       Kerberos principals with a dot in the first component of their
       name. This is to avoid the confusion where principals user/admin
       and user.admin are both mapped to the user.admin PTS entry. Sites
       whose Kerberos realms don't have these collisions between principal
       names may disable this check by starting the server with this
       option.

   -L  Sets values for many arguments in a manner suitable for a large
       file server machine. Combine this flag with any option except the
       -S flag; omit both flags to set values suitable for a medium-sized
       file server machine.

   -S  Sets values for many arguments in a manner suitable for a small
       file server machine. Combine this flag with any option except the
       -L flag; omit both flags to set values suitable for a medium-sized
       file server machine.

   -k <stack size>
       Sets the LWP stack size in units of 1 kilobyte. Do not use this
       argument, and in particular do not specify a value less than the
       default of 24.

   -realm <Kerberos realm name>
       Defines the Kerberos realm name for the File Server to use. If this
       argument is not provided, it uses the realm name corresponding to
       the cell listed in the local /etc/openafs/server/ThisCell file.

   -udpsize <size of socket buffer in bytes>
       Sets the size of the UDP buffer, which is 64 KB by default. Provide
       a positive integer, preferably larger than the default.

   -sendsize <size of send buffer in bytes>
       Sets the size of the send buffer, which is 16384 bytes by default.

   -abortthreshold <abort threshold>
       Sets the abort threshold, which is triggered when an AFS client
       sends a number of FetchStatus requests in a row and all of them
       fail due to access control or some other error. When the abort
       threshold is reached, the file server starts to slow down the
       responses to the problem client in order to reduce the load on the
       file server.

       The throttling behavior can cause issues, especially for some
       versions of the Windows OpenAFS client. When using Windows Explorer
       to navigate the AFS directory tree, directories with only "l"
       (lookup) access for the current user may load more slowly because
       of the throttling. This is because the Windows OpenAFS client sends
       FetchStatus calls one at a time instead of in bulk like the Unix
       OpenAFS client.

       Setting the threshold to 0 disables the throttling behavior. This
       option is available in OpenAFS versions 1.4.1 and later.

   -enable_peer_stats
       Activates the collection of Rx statistics and allocates memory for
       their storage. For each connection with a specific UDP port on
       another machine, a separate record is kept for each type of RPC
       (FetchFile, GetStatus, and so on) sent or received. To display or
       otherwise access the records, use the Rx Monitoring API.

   -enable_process_stats
       Activates the collection of Rx statistics and allocates memory for
       their storage. A separate record is kept for each type of RPC
       (FetchFile, GetStatus, and so on) sent or received, aggregated over
       all connections to other machines. To display or otherwise access
       the records, use the Rx Monitoring API.

   -syslog [<loglevel>]
       Use syslog instead of the normal logging location for the
       fileserver process.  If provided, log messages are at <loglevel>
       instead of the default LOG_USER.

   -mrafslogs
       Use MR-AFS (Multi-Resident) style logging.  This option is
       deprecated.

   -saneacls
       Offer the SANEACLS capability for the fileserver.  This option is
       currently unimplemented.

   -help
       Prints the online help for this command. All other valid options
       are ignored.

   -vhandle-setaside <fds reserved for non-cache io>
       Number of file handles set aside for I/O not in the cache. Defaults
       to 128.

   -vhandle-max-cachesize <max open files>
       Maximum number of available file handles.

   -vhandle-initial-cachesize <initial open file cache>
       Number of file handles set aside for I/O in the cache. Defaults to
       128.

   -vattachpar <number of volume attach threads>
       The number of threads assigned to attach and detach volumes.  The
       default is 1.  Warning: many of the I/O parallelism features of
       Demand-Attach Fileserver are turned off when the number of volume
       attach threads is only 1.

       This option is only meaningful for a file server built with
       pthreads support.

   -m <min percentage spare in partition>
       Specifies the percentage of each AFS server partition that the AIX
       version of the File Server creates as a reserve. Specify an integer
       value between 0 and 30; the default is 8%. A value of 0 means that
       the partition can become completely full, which can have serious
       negative consequences.  This option is not supported on platforms
       other than AIX.

   -lock
       Prevents any portion of the fileserver binary from being paged
       (swapped) out of memory on a file server machine running the IRIX
       operating system.  This option is not supported on platforms other
       than IRIX.

   -sync <always | delayed | onclose | never>
       This option changes how hard the fileserver tries to ensure that
       data written to volumes actually hits the physical disk.

       Normally, when the fileserver writes to disk, the underlying
       filesystem or operating system may delay writes from actually going
       to disk, and may reorder which writes hit the disk first. So, during
       an unclean shutdown of the machine (if the power goes out, or the
       machine crashes, etc.), or if the physical disk backing store
       becomes unavailable, file data that the server previously told
       clients was successfully written may be lost.

       To try to mitigate this, the fileserver will try to "sync" file
       data to the physical disk at numerous points during various I/O.
       However, this can result in significantly reduced performance.
       Depending on the usage patterns, this may or may not be acceptable.
       This option dictates specifically what the fileserver does when it
       wants to perform a "sync".

       There are several options; pass one of these as the argument to
       -sync. The default is "onclose".

       always
           This causes a sync operation to always sync immediately and
           synchronously.  This is the slowest option that provides the
           greatest protection against data loss in the event of a crash
           or backing store unavailability.

           Note that this is still not a 100% guarantee that data will not
           be lost or corrupted during a crash. The underlying filesystem
           itself may cause data to be lost or corrupt in such a
           situation. And OpenAFS itself does not (yet) even guarantee
           that all data is consistent at any point in time; so even if
           the filesystem and OS do not buffer or reorder any writes, you
           are not guaranteed that all data will be okay after a crash.

           This option may be appropriate if you have reason to believe a
           server is prone to data loss failures, such as if the server
           encounters frequent power failures or connectivity issues with
           network attached storage. Or if the backend storage is
           temporarily degraded in some way (for example, a battery on a
           caching controller fails), it may make sense to temporarily use
           the "always" option until the situation is fixed. Some servers
           may also allow for sync operations to occur very quickly, such
           that the "always" option is not noticeably slower than any
           other option. In such a case, there is no downside to
           specifying "always".

           This was the only behavior allowed in OpenAFS releases prior to
           1.4.5.

       delayed
           This causes a sync to do nothing immediately, but the sync will
           happen sometime in the background, within approximately the
           next 10 seconds. This works by having a separate thread that
           goes through all open file handles every 10 seconds, and it
           syncs the ones that have been marked as needing a sync. File
           handles flagged for sync may also get synced on volume
           detachment, according to the same behavior as with the
           "onclose" option.

           This option is currently not recommended, since in the past the
           code implementing this option has caused rare data corruption
           during normal operation.

           This was the only behavior allowed in OpenAFS releases starting
           from 1.4.5 up to and including 1.6.2. It was the default
           starting from OpenAFS 1.6.3 up to and including OpenAFS 1.6.7.
           This option will be removed in a future version of OpenAFS.

       onclose
           This causes a sync to do nothing immediately, but causes the
           relevant file to be flagged as potentially needing a sync. When
           a volume is detached, flagged volume metadata files are synced,
           as well as data files that have been accessed recently. Events
           that cause a volume to detach include: performing certain
           volume operations (restore, salvage, offline, et al), detection
           of volume consistency errors, a clean shutdown of the
           fileserver, or during DAFS "soft detachment".

           Effectively this option is the same as "never" while a volume
           is attached and actively being used, but if a volume is
           detached, there is an additional guarantee for the data's
           consistency.

           This option is the default starting with OpenAFS 1.6.8.

       never
           This causes all syncs to never do anything. This is the fastest
           option, with the weakest guarantees for data consistency.

           Depending on the underlying filesystem and Operating System,
           there may be guarantees that any data written to disk will hit
           the physical media after a certain amount of time. For example,
           Linux's pdflush process usually makes this guarantee, and ext3
           can make various consistency guarantees according to the
           options given. ZFS on Solaris can also provide similar
           guarantees, as can various other platforms and filesystems.
           Consult the documentation for your platform if you are unsure.

       Which option you choose is not an easy decision to make. Various
       developers and experts sometimes disagree on which option is the
       most reasonable, and it may depend on the specific scenario and
       workload involved. Some argue that the "always" option does not
       provide significantly greater guarantees over any other option,
       whereas others argue that choosing anything besides the "always"
       option allows for an unacceptable risk of data loss. This may
       depend on your usage patterns, your hardware, your platform and
       filesystem, and who you talk to about this topic.

   -offline-timeout <timeout in seconds>
       Setting this option to N means that if any clients are reading from
       a volume when we want to offline that volume (for example, as part
       of releasing a volume), we will wait N seconds for the clients'
       requests to finish. If the clients' requests have not finished by
       then, we will interrupt the client requests and send an error to
       those clients, allowing the volume to go offline.

       If a client is interrupted, from the client's point of view, it
       will appear as if it had accessed the volume after the volume had
       gone offline. For RO volumes, this means the client should fail
       over to other valid RO sites for that volume. This option may speed
       up volume releases if volumes are being accessed by clients that
       have slow or unreliable network connections.

       Setting this option to 0 means to interrupt clients immediately if
       a volume is waiting to go offline. Setting this option to "-1"
       means to wait forever for client requests to finish. The default
       value is "-1".

       For the LWP fileserver, the only valid value for this option is
       "-1".

   -offline-shutdown-timeout <timeout in seconds>
       This option behaves similarly to -offline-timeout but applies to
       volumes that are going offline as part of the fileserver shutdown
       process. If the value specified is N, we will interrupt any clients
       reading from volumes after N seconds have passed since we first
       needed to wait for a volume to offline during the shutdown process.

       Setting this option to 0 means to interrupt all clients reading
       from volumes immediately during the shutdown process. Setting this
       option to "-1" means to wait forever for client requests to finish
       during the shutdown process.

       If -offline-timeout is specified, the default value of
       -offline-shutdown-timeout is the value specified for
       -offline-timeout. Otherwise, the default value is "-1".
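
       The defaulting rule in the preceding paragraph can be written out
       as a short sketch. The function name effective_shutdown_timeout is
       hypothetical; an empty string stands for "option not given".

```shell
# Hypothetical helper: effective -offline-shutdown-timeout value.
# Usage: effective_shutdown_timeout <offline-timeout or ""> \
#                                   <offline-shutdown-timeout or "">
effective_shutdown_timeout() {
    offline=$1 shutdown=$2
    if [ -n "$shutdown" ]; then
        printf '%s\n' "$shutdown"     # explicit value always wins
    elif [ -n "$offline" ]; then
        printf '%s\n' "$offline"      # inherit from -offline-timeout
    else
        printf '%s\n' "-1"            # neither given: wait forever
    fi
}

effective_shutdown_timeout 30 ""   # prints 30
```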

       For the LWP fileserver, the only valid value for this option is
       "-1".

EXAMPLES

   The following bos create command creates a traditional fs process on
   the file server machine "fs2.abc.com" that uses the large configuration
   size, and allows volumes to exceed their quota by 10%. Type the command
   on a single line:

      % bos create -server fs2.abc.com -instance fs -type fs \
                   -cmd "/usr/lib/openafs/fileserver -pctspare 10 -L" \
                   /usr/lib/openafs/volserver /usr/lib/openafs/salvager
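
   Before placing an option string like the one above into BosConfig, it
   can help to check it against the combinations this page warns about
   (-spare with -pctspare, and -S with -L). The following is a minimal
   sketch of such a check; the function name check_fileserver_opts is
   hypothetical and it is not part of OpenAFS.

```shell
# Hypothetical pre-flight check for a list of fileserver options.
# Returns non-zero if options are combined in a way this page warns against.
check_fileserver_opts() {
    has_spare=no has_pct=no has_S=no has_L=no
    for opt in "$@"; do
        case $opt in
            -spare)    has_spare=yes ;;
            -pctspare) has_pct=yes ;;
            -S)        has_S=yes ;;
            -L)        has_L=yes ;;
        esac
    done
    if [ "$has_spare" = yes ] && [ "$has_pct" = yes ]; then
        echo "do not combine -spare and -pctspare" >&2; return 1
    fi
    if [ "$has_S" = yes ] && [ "$has_L" = yes ]; then
        echo "do not combine -S and -L" >&2; return 1
    fi
    return 0
}

# The options used in the example above pass the check.
check_fileserver_opts -pctspare 10 -L && echo "options look consistent"
```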

TROUBLESHOOTING

   Sending process signals to the File Server Process can change its
   behavior in the following ways:

     Process          Signal       OS     Result
     ---------------------------------------------------------------------

     File Server      XCPU        Unix    Prints a list of client IP
                                          Addresses.

     File Server      USR2      Windows   Prints a list of client IP
                                          Addresses.

     File Server      POLL        HPUX    Prints a list of client IP
                                          Addresses.

     Any server       TSTP        Any     Increases Debug level by a power
                                          of 5 -- 1,5,25,125, etc.
                                          This has the same effect as the
                                          -d XXX command-line option.

     Any Server       HUP         Any     Resets Debug level to 0

     File Server      TERM        Any     Run minor instrumentation over
                                          the list of descriptors.

     Other Servers    TERM        Any     Causes the process to quit.

     File Server      QUIT        Any     Causes the File Server to Quit.
                                          Bos Server knows this.

   The basic metric of whether an AFS file server is doing well is the
   number of connections waiting for a thread, which can be found by
   running the following command:

      % rxdebug <server> | grep waiting_for | wc -l

   Each line returned by "rxdebug" that contains the text "waiting_for"
   represents a connection that's waiting for a file server thread.

   If the blocked connection count is ever above 0, the server is having
   problems replying to clients in a timely fashion.  If it gets above
   roughly 10, users will notice the slowness.  The total number of
   connections is a mostly irrelevant number that rises essentially
   monotonically for as long as the server has been running and then
   goes back down to zero when it's restarted.
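
   As a rough sketch, the count and the two thresholds mentioned above (0
   and about 10) can be wrapped in shell functions for use in a monitoring
   script.  The function names and the "ok", "degraded" and "slow" labels
   are hypothetical, chosen only for illustration.

```shell
# Count lines containing "waiting_for" in rxdebug output read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# pipeline status clean while still printing 0.
count_waiting() {
    grep -c waiting_for || true
}

# Classify a blocked-connection count using the rough thresholds from
# the text: 0 is healthy, above 0 is a problem, above 10 is noticeable.
classify_blocked() {
    count=$1
    if [ "$count" -eq 0 ]; then
        echo "ok"
    elif [ "$count" -le 10 ]; then
        echo "degraded"
    else
        echo "slow"
    fi
}

# Intended use (not run here):
#   classify_blocked "$(rxdebug <server> | count_waiting)"
```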

   The most common cause of blocked connections rising on a server is some
   process somewhere performing an abnormal number of accesses to that
   server and its volumes.  If multiple servers have a blocked connection
   count, the most likely explanation is that there is a volume replicated
   between those servers that is absorbing an abnormally high access rate.

   To get an access count on all the volumes on a server, run:

      % vos listvol <server> -long

   and save the output in a file.  The results resemble vos examine
   output repeated for each volume on the server.  Look for lines like:

      40065 accesses in the past day (i.e., vnode references)

   and look for volumes with an abnormally high number of accesses.
   Anything over 10,000 is fairly high, but some volumes like root.cell
   and other volumes close to the root of the cell will have that many
   hits routinely.  Anything over 100,000 is generally abnormally high.
   The count resets about once a day.
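
   Scanning the saved output by hand gets tedious on a large server.  The
   following sketch lists the volumes whose access count exceeds a
   threshold, busiest first.  It assumes each volume's block starts with a
   header line of the form "<name> <numeric ID> <RW|RO|BK> ...", as in vos
   examine output; check that assumption against your own output before
   relying on it.  The function name is hypothetical.

```shell
# List volumes whose daily access count exceeds a threshold.
# $1: file holding saved "vos listvol <server> -long" output
# $2: access-count threshold (defaults to 100000)
busy_volumes() {
    awk -v limit="${2:-100000}" '
        # Assumed header line: <name> <numeric id> <RW|RO|BK> ...
        $2 ~ /^[0-9]+$/ && $3 ~ /^(RW|RO|BK)$/ { name = $1 }
        # Access line: "<count> accesses in the past day ..."
        $2 == "accesses" && $1 + 0 > limit { print $1, name }
    ' "$1" | sort -rn
}

# Intended use (not run here):
#   vos listvol <server> -long > listvol.out
#   busy_volumes listvol.out 10000
```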

   Another approach that can be used to narrow the possibilities for a
   replicated volume, when multiple servers are having trouble, is to find
   all replicated volumes for that server.  Run:

      % vos listvldb -server <server>

   where <server> is one of the servers having problems, to refresh the
   VLDB cache, and then run:

      % vos listvldb -server <server> -part <partition>

   to get a list of all volumes on that server and partition, including
   every other server with replicas.

   Once the volume causing the problem has been identified, the best way
   to deal with the problem is to move that volume to another server with
   a low load or to stop any runaway programs that are accessing that
   volume unnecessarily.  Often the identity of the volume is enough to
   tell what's going on.

   If you still need additional information about who's hitting that
   server, sometimes you can guess at that information from the failed
   callbacks in the FileLog log in /var/log/afs on the server, or from the
   output of:

      % /usr/afsws/etc/rxdebug <server> -rxstats

   but the best way is to turn on debugging output from the file server.
   (Warning: This generates a lot of output into FileLog on the AFS
   server.)  To do this, log on to the AFS server, find the PID of the
   fileserver process, and do:

       kill -TSTP <pid>

   where <pid> is the PID of the file server process.  This will raise the
   debugging level so that you'll start seeing what people are actually
   doing on the server.  You can do this up to three more times to get
   even more output if needed.  To reset the debugging level to normal,
   use the following command, which will NOT terminate the file server:

       kill -HUP <pid>

   The debugging setting on the File Server should be reset back to normal
   when debugging is no longer needed.  Otherwise, the AFS server may well
   fill its disks with debugging output.
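
   To keep track of where the debugging level stands, the progression
   described in the signal table (1, 5, 25, 125) can be computed from the
   number of TSTP signals sent since the last HUP.  This is a minimal
   sketch of that arithmetic only; the function name is hypothetical.

```shell
# Print the debug level expected after N TSTP signals, following the
# 1, 5, 25, 125 progression from the signal table (0 means no debugging).
debug_level_after() {
    n=$1
    if [ "$n" -le 0 ]; then
        echo 0
        return
    fi
    level=1
    i=1
    while [ "$i" -lt "$n" ]; do
        level=$((level * 5))   # each further TSTP multiplies the level by 5
        i=$((i + 1))
    done
    echo "$level"
}
```

   For instance, debug_level_after 4 corresponds to the initial TSTP plus
   the three additional signals mentioned above.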

   The lines of the debugging output that are most useful for debugging
   load problems are:

       SAFS_FetchStatus,  Fid = 2003828163.77154.82248, Host 171.64.15.76
       SRXAFS_FetchData, Fid = 2003828163.77154.82248

   (The example above is partly truncated to highlight the interesting
   information.)  The Fid identifies the volume and the vnode within the
   volume; the volume ID is the first long number.  For the Fid shown
   above, examining that volume gives:

      % vos examine 2003828163
      pubsw.matlab61                   2003828163 RW    1040060 K  On-line
          afssvr5.Stanford.EDU /vicepa
          RWrite 2003828163 ROnly 2003828164 Backup 2003828165
          MaxQuota    3000000 K
          Creation    Mon Aug  6 16:40:55 2001
          Last Update Tue Jul 30 19:00:25 2002
          86181 accesses in the past day (i.e., vnode references)

          RWrite: 2003828163    ROnly: 2003828164    Backup: 2003828165
          number of sites -> 3
             server afssvr5.Stanford.EDU partition /vicepa RW Site
             server afssvr11.Stanford.EDU partition /vicepd RO Site
             server afssvr5.Stanford.EDU partition /vicepa RO Site

   and from the Host information one can tell what system is accessing
   that volume.
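
   When FileLog contains many such lines, a tally by volume and host shows
   quickly who is generating the load.  The sketch below reads debug lines
   on standard input and prints "<count> <volume ID> <host>", busiest
   first.  It assumes the line layout of the truncated examples above;
   real FileLog lines carry additional fields, so treat the field handling
   as an assumption to adjust.  The function name is hypothetical.

```shell
# Tally FileLog debug lines by volume ID and client host.
tally_fids() {
    awk '
        /Fid = / {
            vol = ""; host = "-"          # "-" marks lines with no Host field
            for (i = 1; i <= NF; i++) {
                if ($i == "Fid" && $(i + 1) == "=") {
                    # The volume ID is the first dotted component of the Fid.
                    split($(i + 2), parts, "[.]")
                    vol = parts[1]
                }
                if ($i == "Host") host = $(i + 1)
            }
            if (vol != "") seen[vol " " host]++
        }
        END { for (k in seen) print seen[k], k }
    ' | sort -rn
}

# Intended use (not run here; the log path is the one named in the text):
#   tally_fids < /var/log/afs/FileLog | head
```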

   Note that the output of vos_examine(1) also includes the access count,
   so once the problem has been identified, vos examine can be used to see
   if the access count is still increasing.  Also remember that you can
   run vos examine on the read-only replica (e.g.,
   pubsw.matlab61.readonly) to see the access counts on the read-only
   replica on all of the servers that it's located on.

PRIVILEGE REQUIRED

   The issuer must be logged in as the superuser "root" on a file server
   machine to issue the command at a command shell prompt.  It is
   conventional instead to create and start the process by issuing the bos
   create command.

SEE ALSO

   BosConfig(5), FileLog(5), bos_create(8), bos_getlog(8), fs_setacl(1),
   msgget(2), msgrcv(2), salvager(8), volserver(8), vos_examine(1)

COPYRIGHT

   IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved.

   This documentation is covered by the IBM Public License Version 1.0.
   It was converted from HTML to POD by software written by Chas Williams
   and Russ Allbery, based on work by Alf Wachsmann and Elizabeth Cassell.


