pcp-archive(5)

NAME

   pcp-archive - Archive Files for Performance Co-Pilot

SYNOPSIS

   $PCP_LOG_DIR/pmlogger/*/*.{meta,index,0}

   $PCP_LOG_DIR/pmmgr/*/*.{meta,index,0}

DESCRIPTION

   PCP  log  archives  store  volumes  of  historical  values of arbitrary
   Performance Co-Pilot metrics recorded from a single host.  Archives are
   self-contained  in  the  sense  that  they  contain  all  the important
   metadata that would be required for off-line or off-site analysis.  The
   format  is intended to be stable in order to allow long-term historical
   storage and processing by current tools.  (Compatibility in  the  other
   direction - new files, old tools - is not as fully assured.)

   Archives  may  be  read  by most PCP client tools, using the -a ARCHIVE
   option, or dumped raw by pmdumplog(1).   Archives  may  be  created  by
   pmlogger(1)  and  bulk-import tools.  Archives may be merged, analyzed,
   and  subsampled  using  specialized  tools  such  as   pmlogsummary(1),
   pmlogreduce(1), pmlogrewrite(1), and pmlogextract(1).  In addition, PCP
   archives may be examined in sets or grouped together into PCP "archive
   folios", which are managed by the pmafm(1) tool.

   PCP  archives  consist  of  several  physical files that share a common
   arbitrary prefix, e.g., myarchive.

   myarchive.0, myarchive.1, ...
          Metric values.  May grow rapidly.

   myarchive.meta
          Information for PMAPI  functions  such  as  pmLookupDesc(3)  and
          pmGetInDom(3).  May grow in fits and spurts, as logged instances
          and instance domains vary.

   myarchive.index
          A temporal index, mapping timestamps to  offsets  in  the  other
          files.  Grows slowly.

COMMON FEATURES

   All  three  types  of  files  have  a similar record-based structure, a
   convention of  network-byte-order  (big-endian)  encoding,  and  32-bit
   fields  for  tagging/padding  for those records.  Strings are stored as
   8-bit characters without assuming  a  specific  encoding,  so  normally
   ASCII.  See also the __pmLog* types in include/pcp/impl.h.

   RECORD FRAMING
   The volume and meta files are divided into self-identifying records.

   
   Offset  Length                         Name                         
   
     0       4     N, length of record, in bytes, including this field 
     4      N-8    record payload, usually starting with a 32-bit tag  
    N-4      4     N, length of record (again)                         
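
   The framing above can be walked with a few lines of Python.  This is a
   minimal sketch, not code taken from PCP: the names iter_records and
   frame are hypothetical, and the only layout assumed is the
   length/payload/length wrapper shown in the table.

```python
import io
import struct

def iter_records(stream):
    """Yield the payload of each framed record read from a binary stream."""
    while True:
        head = stream.read(4)
        if len(head) < 4:
            return  # end of file
        (length,) = struct.unpack(">I", head)  # big-endian, includes both length fields
        if length < 8:
            raise ValueError("record length too small: %d" % length)
        payload = stream.read(length - 8)
        tail = stream.read(4)
        if len(payload) != length - 8 or len(tail) != 4:
            raise ValueError("truncated record")
        if struct.unpack(">I", tail)[0] != length:
            raise ValueError("leading and trailing lengths differ")
        yield payload

def frame(payload):
    """Wrap a payload in the same framing (hypothetical helper, handy for tests)."""
    n = struct.pack(">I", len(payload) + 8)
    return n + payload + n
```

   Checking the trailing length against the leading one is a cheap way to
   notice a truncated or corrupted file early.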
   

   ARCHIVE LOG LABEL
   All  three  types  of  files  begin  with  a  "log label" header, which
   identifies the host name, the time interval covered, and a time zone.

   
   Offset  Length                         Name                        
   
     0       4     tag, PM_LOG_MAGIC | PM_LOG_VERS02=0x50052602       
     4       4     pid of pmlogger process that wrote file            
     8       4     log start time, seconds part (past UNIX epoch)     
     12      4     log start time, microseconds part                  
     16      4     current log volume number (or -1=.meta, -2=.index) 
     20      64    name of collection host                            
     80      40    time zone string ($TZ environment variable)        
   
   All fields, except for the current log volume number field,  match  for
   all archive-related files produced by a single run of the tool.
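
   As an illustration, the 124 bytes of the label table can be decoded
   with the Python struct module.  The function name parse_label and the
   dictionary keys are hypothetical; the sketch takes exactly the bytes
   laid out in the table and makes no assumption about what surrounds
   them in the file.  Strings are kept as bytes, since no encoding is
   specified.

```python
import struct

# tag, pid, seconds, microseconds, volume, host name (64), time zone (40)
LABEL = struct.Struct(">Iiiii64s40s")
LABEL_TAG = 0x50052602  # PM_LOG_MAGIC | PM_LOG_VERS02, value from the table

def parse_label(data):
    """Decode the log label fields from the first 124 bytes of data."""
    tag, pid, sec, usec, volume, host, tz = LABEL.unpack_from(data, 0)
    if tag != LABEL_TAG:
        raise ValueError("unexpected label tag 0x%08x" % tag)
    return {
        "pid": pid,
        "start": (sec, usec),
        "volume": volume,                   # -1 for .meta, -2 for .index
        "host": host.split(b"\0", 1)[0],    # drop NUL padding
        "timezone": tz.split(b"\0", 1)[0],
    }
```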

ARCHIVE VOLUME (.0, .1, ...) RECORDS

   pmResult
   After the archive log label record, an archive volume file contains a
   sequence of records.  Each record holds the metric values of the
   pmResult set returned by one pmFetch operation; the in-memory pmResult
   is almost identical to the form on disk.  The record size varies with
   the number of PMIDs fetched and the number of instances in their
   domains.  File size is limited to 2GB, due to storage of 32-bit offsets
   within the .index file.

       
       Offset   Length                    Name                    
       
          0       4     timestamp, seconds part (past UNIX epoch) 
          4       4     timestamp, microseconds part              
          8       4     number of PMIDs with data following       
         12       M     pmValueSet #0                             
        12+M      N     pmValueSet #1                             
       12+M+N    ...    ...                                       
         NOP      X     pmValueBlock #0                           
        NOP+X     Y     pmValueBlock #1                           
       NOP+X+Y   ...    ...                                       
       
   Records with a number-of-PMIDs equal to zero  are  "markers",  and  may
   represent  interruptions,  missing  data,  or  time  discontinuities in
   logging.
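
   A reader can pick out markers by looking only at the first twelve
   bytes of the record payload.  The following Python sketch assumes the
   payload has already been extracted from its framing; result_header and
   is_marker are hypothetical names.

```python
import struct

RESULT_HEAD = struct.Struct(">iii")  # seconds, microseconds, number of PMIDs

def result_header(payload):
    """Return ((seconds, microseconds), number_of_pmids) for a pmResult payload."""
    sec, usec, numpmid = RESULT_HEAD.unpack_from(payload, 0)
    return (sec, usec), numpmid

def is_marker(payload):
    """True when the record carries no PMIDs, i.e. it is a marker record."""
    return result_header(payload)[1] == 0
```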

   pmValueSet
   This subrecord represents the measurements for one metric.

     
     Offset  Length                       Name                      
     
       0       4     PMID                                           
       4       4     number of values                               
       8       4     storage mode, PM_VAL_INSITU=0 or PM_VAL_DPTR=1 
       12      M     pmValue #0                                     
      12+M     N     pmValue #1                                     
     12+M+N   ...    ...                                            
     

   The metric-description metadata for PMIDs is found in the .meta  files.
   These  entries  are  not  timestamped, so the metadata is assumed to be
   unchanging throughout the archiving session.

   pmValue
   This subrecord represents one  measurement  for  one  instance  of  the
   metric.  It is a variant type, depending on the parent pmValueSet's
   storage mode field.  This allows small numbers to be encoded compactly,
   while retaining the flexibility for larger or variable-length data to
   be stored later in the pmResult record.

      
      Offset  Length                      Name                      
      
        0       4     number in instance-domain (or PM_IN_NULL=-1)  
        4       4     value (INSITU) or                             
                      offset in pmResult to our pmValueBlock (DPTR) 
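
   Taken together, the pmResult, pmValueSet and pmValue tables describe a
   nested walk, sketched below in Python.  The name parse_value_sets is
   hypothetical.  The sketch returns the second pmValue field as a raw
   unsigned 32-bit number, because its meaning depends on the storage
   mode and the metric type, and the units of the DPTR offset are not
   stated here.  The tables do not show the layout for a value count
   below one, so the sketch stops with an error in that case.

```python
import struct

PM_VAL_INSITU = 0
PM_VAL_DPTR = 1

def parse_value_sets(payload):
    """Return [(pmid, mode, [(instance, raw), ...]), ...] for one pmResult payload."""
    (numpmid,) = struct.unpack_from(">i", payload, 8)
    pos = 12  # first pmValueSet follows the timestamp and the PMID count
    sets = []
    for _ in range(numpmid):
        pmid, numval, mode = struct.unpack_from(">Iii", payload, pos)
        if numval < 1:
            # Assumption of this sketch: only positive counts are handled.
            raise ValueError("unhandled value count %d" % numval)
        pos += 12
        values = []
        for _ in range(numval):
            inst, raw = struct.unpack_from(">iI", payload, pos)
            values.append((inst, raw))  # raw is a value (INSITU) or an offset (DPTR)
            pos += 8
        sets.append((pmid, mode, values))
    return sets
```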
      

   The instance-domain metadata for PMIDs is found in the .meta files.
   Since the numeric mappings may change during the lifetime of the
   logging session, it is important to match up the timestamp of the
   measurement record with the corresponding instance-domain record.  That
   is, the instance-domain record that applies to a measurement at time T
   is the one with the largest timestamp T' <= T.
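
   The T' <= T rule is a plain "last entry not after T" search.  A minimal
   Python sketch follows, assuming the instance-domain records for one
   domain have already been collected into a list sorted by timestamp;
   indom_at and the (timestamp, mapping) tuple shape are hypothetical.

```python
import bisect

def indom_at(records, t):
    """Return the mapping in force at time t, or None if none precedes t.

    records: list of ((seconds, microseconds), mapping), sorted by timestamp.
    t:       (seconds, microseconds) of the measurement.
    """
    stamps = [stamp for stamp, _ in records]
    i = bisect.bisect_right(stamps, t)  # index of first record with stamp > t
    return records[i - 1][1] if i else None
```

   Timestamps are compared as (seconds, microseconds) tuples, which avoids
   any floating-point rounding.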

   pmValueBlock
   Instances of this subrecord are placed at the end of the pmResult,
   after all the pmValueSet subrecords.  If needed, they are padded at the
   end to the next-higher 32-bit boundary.

     
     Offset  Length                       Name                      
     
       0       1     value type (same as pmDesc.type)               
       1       3     4 + N, the length of the subrecord             
       4       N     bytes that make up the raw value               
      4+N     0-3    padding (not included in the 4+N length field) 
     
   Note  that  for  PM_TYPE_STRING,  the  length  includes an explicit NUL
   terminator byte.  For PM_TYPE_EVENT, the value  bytestring  is  further
   structured.
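
   The one-byte type and three-byte length can be pulled apart as shown in
   this Python sketch.  The name parse_value_block is hypothetical, and
   the offset argument is a plain byte offset into whatever buffer the
   caller supplies.

```python
def parse_value_block(buf, offset=0):
    """Return (value_type, raw_bytes, offset_after_padding) for one pmValueBlock."""
    vtype = buf[offset]
    length = int.from_bytes(buf[offset + 1:offset + 4], "big")  # 4 + N
    if length < 4:
        raise ValueError("pmValueBlock length too small: %d" % length)
    raw = bytes(buf[offset + 4:offset + length])
    if len(raw) != length - 4:
        raise ValueError("truncated pmValueBlock")
    padded = (length + 3) & ~3  # round up to the next 32-bit boundary
    return vtype, raw, offset + padded
```

   For a string value the returned bytes still include the trailing NUL,
   matching the length rule noted above.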

   pmEventArray
   (TBD)

METADATA FILE (.meta) RECORDS

   After  the  archive  log  label  record,  the  metadata  file  contains
   interleaved   metric-description   and   timestamped    instance-domain
   descriptors.   File  size  is  limited to 2GB, due to storage of 32-bit
   offsets within the .index file.   Unlike  the  archive  volumes,  these
   records    are    not   forced   to   32-bit   alignment!    See   also
   src/libpcp/src/logmeta.c.

   pmDesc
   Instances of this record represent the  metric  description,  giving  a
   name, type, instance-domain identifier, and a set of names to each PMID
   used in the archive volume.

    
    Offset  Length                        Name                       
    
      0       4     tag, TYPE_DESC=1                                 
      4       4     pmid                                             
      8       4     type (PM_TYPE_*)                                 
      12      4     instance domain number                           
      16      4     semantics of value (PM_SEM_*)                    
      20      4     units: bit-packed pmUnits                        
      24      4     number of alternative names for this PMID        
      28      4     N: number of bytes in this name                  
      32      N     bytes of the name, no NUL terminator nor padding 
     32+N     4     N2: number of bytes in next name                 
     36+N     N2    bytes of the name, no NUL terminator nor padding 
     ...     ...    ...                                              
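
   A pmDesc record is a fixed 28-byte header followed by length-prefixed
   names.  The Python sketch below reads the name count at offset 24,
   immediately after the units field, and assumes the payload has already
   been extracted from its framing.  The name parse_desc and the
   dictionary keys are hypothetical.

```python
import struct

TYPE_DESC = 1
# tag, pmid, type, instance domain, semantics, units, number of names
DESC_HEAD = struct.Struct(">IIiIiIi")

def parse_desc(payload):
    """Decode one pmDesc record payload into a dictionary."""
    tag, pmid, mtype, indom, sem, units, numnames = DESC_HEAD.unpack_from(payload, 0)
    if tag != TYPE_DESC:
        raise ValueError("not a pmDesc record (tag %d)" % tag)
    pos = DESC_HEAD.size  # 28
    names = []
    for _ in range(numnames):
        (n,) = struct.unpack_from(">i", payload, pos)
        pos += 4
        names.append(payload[pos:pos + n])  # no NUL terminator, no padding
        pos += n
    return {"pmid": pmid, "type": mtype, "indom": indom,
            "sem": sem, "units": units, "names": names}
```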
    

   pmLogIndom
   Instances of this record represent the number-string mapping  table  of
   an  instance domain.  The instance domain number will have already been
   mentioned in a prior pmDesc record.  Since  new  instances  may  appear
   over  a  long archiving run, these records are timestamped, and must be
   searched when decoding pmResult records from the main archive  volumes.
   Instance  names  may  be reused between instance numbers, so an offset-
   based string table is used that could permit sharing.

     
      Offset   Length                      Name                      
     
        0        4     tag, TYPE_INDOM=2                             
        4        4     timestamp, seconds part (past UNIX epoch)     
        8        4     timestamp, microseconds part                  
        12       4     instance domain number                        
        16       4     N: number of instances in domain, normally >0 
        20       4     first instance number                         
        24       4     second instance number (if appropriate)       
       ...      ...    ...                                           
      20+4*N     4     first offset into string table (see below)    
     20+4*N+4    4     second offset into string table (etc.)        
       ...      ...    ...                                           
      20+8*N     M     base of string table, containing              
                       packed, NUL-terminated instance names         
     
   Records of  this  form  replace  the  existing  instance-domain:  prior
   records are not searched for resolving instance numbers in measurements
   after this timestamp.
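
   The two parallel arrays and the string table can be combined into a
   number-to-name mapping as in this Python sketch.  The name parse_indom
   is hypothetical, the payload is assumed to be already extracted from
   its framing, and a count below one is simply treated as an empty
   domain.

```python
import struct

TYPE_INDOM = 2
INDOM_HEAD = struct.Struct(">IiiIi")  # tag, seconds, microseconds, indom, N

def parse_indom(payload):
    """Return ((seconds, microseconds), indom, {instance_number: name_bytes})."""
    tag, sec, usec, indom, n = INDOM_HEAD.unpack_from(payload, 0)
    if tag != TYPE_INDOM:
        raise ValueError("not a pmLogIndom record (tag %d)" % tag)
    n = max(n, 0)  # assumption of this sketch: non-positive means empty
    numbers = struct.unpack_from(">%di" % n, payload, 20)
    offsets = struct.unpack_from(">%di" % n, payload, 20 + 4 * n)
    base = 20 + 8 * n  # start of the string table
    names = {}
    for number, off in zip(numbers, offsets):
        start = base + off
        end = payload.index(b"\0", start)  # names are NUL-terminated
        names[number] = payload[start:end]
    return (sec, usec), indom, names
```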

INDEX FILE (.index) RECORDS

   After the archive log label record, the temporal index file contains  a
   plainly concatenated, unframed group of tuples, which relate timestamps
   to 32-bit seek offsets in the volume  and  meta  files.   (This  limits
   those  files  to  2GB  in  size.)  These records are fixed-size, fixed-
   format, and are not  enclosed  in  the  standard  length/payload/length
   wrapper:  they  just  take  up the entire remainder of the .index file.
   See also src/libpcp/src/logutil.c.

    
    Offset  Length                        Name                        
    
      0       4     event time, seconds part (past UNIX epoch)        
      4       4     event time, microseconds part                     
      8       4     archive volume number (0...N)                     
      12      4     byte offset in .meta file of pmDesc or pmLogIndom 
      16      4     byte offset in archive volume file of pmResult    
    
   Since temporal indexes are optional, and exist only to speed  up  time-
   wise  random  access  of  metrics and their metadata, index records are
   emitted only intermittently.  An  archive  reader  program  should  not
   presume  any  particular  rate  of  data flow into the index.  However,
   common events that may trigger a new temporal-index record include
   changes in instance-domains, switching over to a new archive volume,
   and starting or stopping logging.  One reliable invariant however is
   that,  for  each index entry, there are to be no meta or archive-volume
   records with a timestamp after that in the index, but physically before
   the byte-offset in the index.
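
   Because the tuples are unframed and fixed-size, the index can be read
   in 20-byte steps.  The Python sketch below assumes the caller passes
   only the bytes that follow the log label; parse_index is a hypothetical
   name, and a trailing partial tuple is silently ignored.

```python
import struct

# seconds, microseconds, volume number, .meta offset, volume offset
INDEX_ENTRY = struct.Struct(">iiiii")

def parse_index(body):
    """Return a list of (sec, usec, volume, meta_offset, volume_offset) tuples."""
    usable = len(body) - len(body) % INDEX_ENTRY.size
    return [INDEX_ENTRY.unpack_from(body, off)
            for off in range(0, usable, INDEX_ENTRY.size)]
```

   A reader would typically pick the last tuple whose time is not after
   the target time and scan forward from the two offsets it gives.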

SEE ALSO

   PCPIntro(1),    PMAPI(3),    pmlogger(1),    pmdumplog(1),    pmafm(1),
   pcp.conf(5), and pcp.env(5).


