ovdb(5)


NAME

   ovdb - Overview storage method for INN

DESCRIPTION

   Ovdb is a storage method that uses the Berkeley DB library to store
   overview data.  It requires version 4.4 or later of the Berkeley DB
   library (4.7+ is recommended because older versions suffer from various
   issues).

   Ovdb makes use of the full transaction/logging/locking functionality of
   the Berkeley DB environment.  Berkeley DB may be downloaded from
   <http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>
   and is needed to build the ovdb backend.

UPGRADING

   This is version 2 of ovdb.  If you have a database created with a
   previous version of ovdb (such as the one shipped with INN 2.3.0) your
   database will need to be upgraded using ovdb_init(8).  See the man page
   ovdb_init(8) for upgrade instructions.

INSTALLATION

   To build ovdb support into INN, specify the option --with-bdb when
   running the configure script.  By default, configure will search for
   Berkeley DB in default search paths; there will be a message in the
   configure output indicating the pathname that will be used.

   You can override this pathname by adding a path to the option, for
   instance --with-bdb=/usr/BerkeleyDB.4.4.  This directory is expected to
   have subdirectories include and lib (lib32 and lib64 are also checked),
   containing respectively db.h, and the library itself.  In case non-
   standard paths to the Berkeley DB libraries are used, one or both of
   the options --with-bdb-include and --with-bdb-lib can be given to
   configure with a path.

   The ovdb database may take up more disk space for a given spool than
   the other overview methods.  Plan on needing at least 1.1 KB for every
   article in your spool (not counting crossposts).  So, if you have 5
   million articles, you'll need at least 5.5 GB of disk space for ovdb.
   With compression enabled, this estimate changes to 0.7 KB per article.
   See the COMPRESSION section below.  Plus, you'll need additional space
   for transaction logs: at least 100 MB.  By default the transaction logs
   go in the same directory as the database.  To improve performance, they
   can be placed on a different disk -- see the DB_CONFIG section.

CONFIGURATION

   To enable ovdb, set the ovmethod parameter in inn.conf to "ovdb".  The
   ovdb database is stored in the directory specified by the pathoverview
   parameter in inn.conf.  This is the "DB_HOME" directory.  To start out,
   this directory should be empty (other than an optional DB_CONFIG file;
   see DB_CONFIG for details) and innd (or makehistory) will create the
   files as necessary in that directory.  Make sure the directory is owned
   by the news user.

   Other parameters for configuring ovdb are in the ovdb.conf(5)
   configuration file.  See also the sample ovdb.conf.

   cachesize
       Size of the memory pool cache, in kilobytes.  The cache will have a
       backing store file in the DB directory which will be at least as
       big.  In general, the bigger the cache, the better.  Use "ovdb_stat
       -m" to see cache hit percentages.  To make a change of this
       parameter take effect, shut down and restart INN (be sure to kill
       all of the nnrpds when shutting down).  Default is 8000, which is
       adequate for small to medium sized servers.  Large servers will
       probably need at least 20000.

   compress
       If INN was compiled with zlib, and this compress parameter is true,
       OVDB will compress overview records that are longer than 600 bytes.
       See the COMPRESSION section below.

   numdbfiles
       Overview data is split between this many files.  Currently, innd
       will keep all of the files open, so don't set this too high or innd
       may run out of file descriptors.  nnrpd only opens one at a time,
       regardless.  May be set to one, or just a few, but only do that if
       your OS supports large (>2G) files.  Changing this parameter has no
       effect on an already-established database.  Default is 32.

   txn_nosync
       If txn_nosync is set to false, Berkeley DB flushes the log after
       every transaction.  This minimizes the number of transactions that
       may be lost in the event of a crash, but results in significantly
       degraded performance.  Default is true.

   useshm
       If useshm is set to true, Berkeley DB will use shared memory
       instead of mmap for its environment regions (cache, lock, etc).
       With some platforms, this may improve performance.  Default is
       false.

   shmkey
       Sets the shared memory key used by Berkeley DB when 'useshm' is
       true.  Berkeley DB will create several (usually 5) shared memory
       segments, using sequentially numbered keys starting with 'shmkey'.
       Choose a key that does not conflict with any existing shared memory
       segments on your system.  Default is 6400.

   pagesize
       Sets the page size for the DB files (in bytes).  Must be a power of
       2.  Best choices are 4096 or 8192.  The default is 8192.  Changing
       this parameter has no effect on an already-established database.

   minkey
       Sets the minimum number of keys per page.  See the Berkeley DB
       documentation for more info.  Default is based on page size and
       whether compression is enabled:

          default_minkey = MAX(2, pagesize / 2600) if compress is false
          default_minkey = MAX(2, pagesize / 1500) if compress is true

       The lowest allowed minkey is 2.  Setting minkey higher than the
       default is not recommended, as it will cause the databases to have
       a lot of overflow pages.  Changing this parameter has no effect on
       an already-established database.

   maxlocks
       Sets the Berkeley DB "lk_max" parameter, which is the maximum
       number of locks that can exist in the database at the same time.
       Default is 4000.

   nocompact
       The nocompact parameter affects expireover's behavior.  The
       expireover function in ovdb can do its job in one of two ways:  by
       simply deleting expired records from the database, or by re-writing
       the overview records into a different location leaving out the
       expired records.  The first method is faster, but it leaves 'holes'
       that result in space that can not immediately be reused.  The
       second method 'compacts' the records by rewriting them.

       If this parameter is set to 0, expireover will compact all
       newsgroups; if set to 1, expireover will not compact any
       newsgroups; and if set to a value greater than one, expireover will
       only compact groups that have less than that number of articles.

       Experience has shown that compacting has minimal effect (other than
       making expireover take longer) so the default is now 1.  This
       parameter will probably be removed in the future.

   readserver
       Normally, each nnrpd process directly accesses the Berkeley DB
       environment.  The process of attaching to the database (and
       detaching when finished) is fairly expensive, and can result in
       high loads in situations when there are lots of reader connections
       of relatively short duration.

       When the readserver parameter is true, the nnrpds will access
       overview via a helper server (ovdb_server -- which is started by
       ovdb_init).  This can also result in cleaner shutdowns for the
       database, improving stability and avoiding deadlocks and corrupted
       databases.  If you are experiencing any instability in ovdb, try
       setting this parameter to true.  Default is false.

   numrsprocs
       This parameter is only used when readserver is true.  It sets the
       number of ovdb_server processes.  As each ovdb_server can process
       only one transaction at a time, running more servers can improve
       reader response times.  Default is 5.

   maxrsconn
       This parameter is only used when readserver is true.  It sets a
       maximum number of readers that a given ovdb_server process will
       serve at one time.  This means the maximum number of readers for
       all of the ovdb_server processes is (numrsprocs * maxrsconn).  This
       does not limit the actual number of readers, since nnrpd will fall
       back to opening the database directly if it can't connect to a
       readserver.  Default is 0, which means an umlimited number of
       connections is allowed.

COMPRESSION

   New in this version of OVDB is the ability to compress overview data
   before it is stored into the database.  In addition to consuming less
   disk space, compression keeps the average size of the database keys
   smaller.  This in turn increases the average number of keys per page,
   which can significantly improve performance and also helps keep the
   database more compact.  This feature requires that INN be built with
   zlib. Only records larger than 600 bytes get compressed, because that
   is the point at which compression starts to become significant.

   If compression is not enabled (either from the "compress" option in
   ovdb.conf or INN was not built from zlib), the database will be
   backward compatible with older versions of OVDB.  However, if
   compression is enabled, the database is marked with a newer version
   that will prevent older versions of OVDB from opening the database.

   You can upgrade an existing database to use compression simply by
   setting compress to true in ovdb.conf.  Note that existing records in
   the database will remain uncompressed; only new records added after
   enabling compression will be compressed.

   If you disable compression on a database that previously had it
   enabled, new records will be stored uncompressed, but the database will
   still be incompatible with older versions of OVDB (and will also be
   incompatible with this version of OVDB if it was not built with zlib).
   So to downgrade to a completely uncompressed database you will have to
   rebuild the database using makehistory.

DB_CONFIG

   A file called DB_CONFIG may be placed in the database directory to
   customize where the various database files and transaction logs are
   written.  By default, all of the files are written in the "DB_HOME"
   directory.  One way to improve performance is to put the transaction
   logs on a different disk.  To do this, put:

       DB_LOG_DIR /path/to/logs

   in the DB_CONFIG file.  If the pathname you give starts with a /, it is
   treated as an absolute path; otherwise, it is relative to the "DB_HOME"
   directory.  Make sure that any directories you specify exist and have
   proper ownership/mode before starting INN, because they won't be
   created automatically.  Also, don't change the DB_CONFIG file while
   anything that uses ovdb is running.

   Another thing that you can do with this file is to split the overview
   database across multiple disks.  In the DB_CONFIG file, you can list
   directories that Berkeley DB will search when it goes to open a
   database.

   For example, let's say that you have pathoverview set to /mnt/overview
   and you have four additional file systems created on /mnt/ov?.  You
   would create a file "/mnt/overview/DB_CONFIG" containing the following
   lines:

       set_data_dir /mnt/overview
       set_data_dir /mnt/ov1
       set_data_dir /mnt/ov2
       set_data_dir /mnt/ov3
       set_data_dir /mnt/ov4

   Distribute your ovNNNNN files into the four filesystems.  (say, 8
   each).  When called upon to open a database file, the db library will
   look for it in each of the specified directories (in order).  If said
   file is not found, one will be created in the first of those
   directories.

   Whenever you change DB_CONFIG or move database files around, make sure
   all news processes that use the database are shut down first (including
   nnrpds).

   The DB_CONFIG functionality is part of Berkeley DB itself, rather than
   something provided by ovdb.  See the Berkeley DB documentation for
   complete details for the version of Berkeley DB that you're running.

RUNNING

   When starting the news system, rc.news will invoke ovdb_init.
   ovdb_init must be run before using the database.  It performs the
   following tasks:

   *   Creates the database environment, if necessary.

   *   If the database is idle, it performs a normal recovery.  The
       recovery will remove stale locks, recreate the memory pool cache,
       and repair any damage caused by a system crash or improper
       shutdown.

   *   Starts the DB housekeeping processes (ovdb_monitor) if they're not
       already running.

   And when stopping INN, rc.news kills the ovdb_monitor processes after
   the other INN processes have been shut down.

DIAGNOSTICS

   Problems relating to ovdb are logged to news.err with "OVDB" in the
   error message.

   INN programs that use overview will fail to start up if the
   ovdb_monitor processes aren't running.  Be sure to run ovdb_init before
   running anything that accesses overview.

   Also, INN programs that use overview will fail to start up if the user
   running them is not the "news" user.

   If a program accessing the database crashes, or otherwise exits
   uncleanly, it might leave a stale lock in the database.  This lock
   could cause other processes to deadlock on that stale lock.  To fix
   this, shut down all news processes (using "kill -9" if necessary) and
   then restart.  ovdb_init should perform a recovery operation which will
   remove the locks and repair damage caused by killing the deadlocked
   processes.

FILES

   inn.conf
       The ovmethod and pathoverview parameters are relevant to ovdb.

   ovdb.conf
       Optional configuration file for tuning.  See CONFIGURATION above.

   pathoverview
       Directory where the database goes.  Berkeley DB calls it the
       'DB_HOME' directory.

   pathoverview/DB_CONFIG
       Optional file to configure the layout of the database files.

   pathrun/ovdb.sem
       A file that gets locked by every process that is accessing the
       database.  This is used by ovdb_init to determine whether the
       database is active or quiescent.

   pathrun/ovdb_monitor.pid
       Contains the process ID of ovdb_monitor.

TO DO

   Implement a way to limit how many databases can be open at once (to
   reduce file descriptor usage); maybe using something similar to the
   cache code in ov3.c

HISTORY

   Written by Heath Kehoe <hakehoe@avalon.net> for InterNetNews

   $Id: ovdb.pod 9593 2013-12-27 21:16:09Z iulius $

SEE ALSO

   inn.conf(5), innd(8), nnrpd(8), ovdb_init(8), ovdb_monitor(8),
   ovdb_stat(8)

   Berkeley DB documentation:  in the docs directory of the Berkeley DB
   source distribution, or on the Oracle Berkeley DB web page
   (<http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>).





Opportunity


Personal Opportunity - Free software gives you access to billions of dollars of software at no cost. Use this software for your business, personal use or to develop a profitable skill. Access to source code provides access to a level of capabilities/information that companies protect though copyrights. Open source is a core component of the Internet and it is available to you. Leverage the billions of dollars in resources and capabilities to build a career, establish a business or change the world. The potential is endless for those who understand the opportunity.

Business Opportunity - Goldman Sachs, IBM and countless large corporations are leveraging open source to reduce costs, develop products and increase their bottom lines. Learn what these companies know about open source and how open source can give you the advantage.





Free Software


Free Software provides computer programs and capabilities at no cost but more importantly, it provides the freedom to run, edit, contribute to, and share the software. The importance of free software is a matter of access, not price. Software at no cost is a benefit but ownership rights to the software and source code is far more significant.


Free Office Software - The Libre Office suite provides top desktop productivity tools for free. This includes, a word processor, spreadsheet, presentation engine, drawing and flowcharting, database and math applications. Libre Office is available for Linux or Windows.





Free Books


The Free Books Library is a collection of thousands of the most popular public domain books in an online readable format. The collection includes great classical literature and more recent works where the U.S. copyright has expired. These books are yours to read and use without restrictions.


Source Code - Want to change a program or know how it works? Open Source provides the source code for its programs so that anyone can use, modify or learn how to write those programs themselves. Visit the GNU source code repositories to download the source.





Education


Study at Harvard, Stanford or MIT - Open edX provides free online courses from Harvard, MIT, Columbia, UC Berkeley and other top Universities. Hundreds of courses for almost all major subjects and course levels. Open edx also offers some paid courses and selected certifications.


Linux Manual Pages - A man or manual page is a form of software documentation found on Linux/Unix operating systems. Topics covered include computer programs (including library and system calls), formal standards and conventions, and even abstract concepts.