states - awk alike text processing tool
states [-hvV] [-D var=val] [-f file] [-o outputfile] [-p path] [-s startstate] [-W level] [filename ...]
States is an awk-alike text processing tool with some state machine extensions. It is designed for program source code highlighting and for similar tasks where state information helps input processing.
At any point in time, states is in exactly one state. Each state is quite similar to awk's work environment: it has regular expressions which are matched against the input and actions which are executed when a match is found. From the action blocks, states can perform state transitions, moving to another state from which the processing is continued. State transitions are recorded, so states can return to the calling state once the current state has finished.
The biggest difference between states and awk, besides the state machine extensions, is that states is not line-oriented. It matches regular expression tokens from the input and, once a match is processed, it continues processing from the current position, not from the beginning of the next input line.
-D var=val, --define=var=val
Define variable var to have string value val. Command line
definitions override variable definitions found in the
config file.
-f file, --file=file
Read state definitions from file file. By default, states
tries to read state definitions from file states.st in the
current working directory.
-h, --help
Print short help message and exit.
-o file, --output=file
Save output to file file instead of printing it to stdout.
-p path, --path=path
Set the load path to path. The load path defaults to the
directory from which the state definitions file is loaded.
-s state, --state=state
Start execution from state state. This definition overrides
the start state resolved from the start block.
-v, --verbose
Increase the program verbosity.
-V, --version
Print states version and exit.
-W level, --warning=level
Set the warning level to level. Possible values for level are:
    light   light warnings (default)
    all     all warnings
States program files can contain one start block, startrules and
namerules blocks to specify the initial state, state definitions, and
expressions.
The start block is the main() of the states program. It is executed on
script startup for each input file, and it can perform any
initialization the script needs. It normally also calls the
check_startrules() and check_namerules() primitives, which resolve the
initial state from the input file name or from the data found at the
beginning of the input file. Here is a sample start block which
initializes two variables and does the standard start state resolving:
    start
    {
      a = 1;
      msg = "Hello, world!";
      check_startrules ();
      check_namerules ();
    }
Once the start block is processed, the input processing is continued
from the initial state.
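Both resolver primitives are documented as returning 1 on success and 0 otherwise, so a start block can react when neither rule set matches. The following is a minimal sketch, assuming that states accepts C-style if statements, the ! operator and /* */ comments, none of which are described on this page:

```
start
{
  /* Try the file contents first, then the file name. */
  if (!check_startrules ())
    if (!check_namerules ())
      /* Hypothetical policy: stop instead of copying the input. */
      panic ("cannot resolve start state for ", filename);
}
```

Calling panic here is only one possible design choice. Without it, an unresolved start state simply causes the input to be copied to the output unmodified.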
The initial state is resolved from the information found in the
startrules and namerules blocks. Both blocks contain regular
expression - symbol pairs. When the regular expression matches the
name of the input file or the beginning of the input file, the initial
state is named by the corresponding symbol. For example, the following
start and name rules can distinguish C and Fortran files:
    namerules
    {
      /\.(c|h)$/    c;
      /\.[fF]$/     fortran;
    }
    startrules
    {
      /-\*- [cC] -\*-/      c;
      /-\*- fortran -\*-/   fortran;
    }
If these rules are used with the previously shown start block, states
first checks the beginning of the input file. If it has the string
-*- c -*- (or -*- C -*-), the file is assumed to contain C code and the
processing is started from the state called c. If the beginning of the
input file has the string -*- fortran -*-, the initial state is
fortran. If none of the start rules match, the name of the input file
is matched against the namerules. If the name ends in the suffix .c or
.h, we go to state c. If the suffix is .f or .F, the initial state is
fortran.
If both the start rules and the name rules fail to resolve the start
state, states just copies its input to output unmodified.
The start state can also be specified from the command line with option
-s, --state.
State definitions have the following syntax:
    state name { expr {statements} ... }
where name is the name of the state, expr is a regular expression, a
special expression or a symbol, and statements is a list of statements.
When the expression expr is matched from the input, the statement block
is executed. The statement block can call states' primitives,
user-defined subroutines, other states, etc. Once the block is
executed, the input processing is continued from the current input
position (which might have been changed if the statement block called
other states).
Special expressions BEGIN and END can be used in the place of expr.
Expression BEGIN matches the beginning of the state; its block is
executed when the state is entered. Expression END matches the end of
the state; its block is executed when states leaves the state.
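The rule forms described so far can be combined as in the following sketch, which wraps bracketed text in markers. The state names text and bracket are hypothetical, and the sketch assumes that /* */ comments are accepted in program files:

```
state text
{
  /\[/ {
    /* Enter the nested state; processing resumes here on return. */
    call (bracket);
  }
}

state bracket
{
  BEGIN {
    /* Executed when the state is entered. */
    print ("<b>[");
  }
  /\]/ {
    /* Leave this state and go back to the calling state. */
    return;
  }
  END {
    /* Executed when states leaves the state. */
    print ("]</b>");
  }
}
```

Running states with -s text would start the processing from the text state.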
If expr is a symbol, its value is looked up in the global environment.
If the value is a regular expression, it is matched against the input;
otherwise that rule is ignored.
The states program file can also have top-level expressions. They are
evaluated after the program file is parsed but before any input files
are processed or the start block is evaluated.
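A top-level expression is a natural place to set up a global symbol that a state rule then uses as its expression. The sketch below is illustrative only: the names keyword_re and keywords are hypothetical, and it assumes that a top-level assignment is accepted as an expression and that $0 holds the whole matched text (register 0 of the $n variables).

```
/* Top-level expression: evaluated before any input file is processed. */
keyword_re = regexp ("if|else|while");

state keywords
{
  /* keyword_re is a symbol; its regular expression value is matched. */
  keyword_re {
    print ("<b>", $0, "</b>");
  }
}
```

Because states matches tokens rather than lines, this pattern also matches if inside a longer word such as gift. A real highlighter would need a stricter expression.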
call (symbol)
Move to state symbol and continue input file processing from
that state. Function returns whatever the symbol state's
terminating return statement returned.
calln (name)
Like call, but the argument name is evaluated and its value must
be a string. For example, this function can be used to call a
state whose name is stored in a variable.
check_namerules ()
Try to resolve start state from namerules rules. Function
returns 1 if start state was resolved or 0 otherwise.
check_startrules ()
Try to resolve start state from startrules rules. Function
returns 1 if start state was resolved or 0 otherwise.
concat (str, ...)
Concatenate argument strings and return the result as a new string.
float (any)
Convert argument to a floating point number.
getenv (str)
Get the value of environment variable str. Returns an empty string
if variable str is undefined.
int (any)
Convert argument to an integer number.
length (item, ...)
Count the length of argument strings or lists.
list (any, ...)
Create a new list which contains items any, ...
panic (any, ...)
Report a non-recoverable error and exit with status 1.
Function never returns.
print (any, ...)
Convert arguments to strings and print them to the output.
range (source, start, end)
Return a sub-range of source starting from position start
(inclusively) to end (exclusively). Argument source can be
string or list.
regexp (string)
Convert string string to a new regular expression.
regexp_syntax (char, syntax)
Modify regular expression character syntaxes by assigning new
syntax syntax for character char. Possible values for syntax
are:
    'w'   character is a word constituent
    ' '   character isn't a word constituent
regmatch (string, regexp)
Check if string string matches regular expression regexp.
Function returns a boolean success status and sets
sub-expression registers $n.
regsub (string, regexp, subst)
Search for regular expression regexp in string string and replace
the matching substring with string subst. Returns the
resulting string. The substitution string subst can contain $n
references to the n:th parenthesized sub-expression.
regsuball (string, regexp, subst)
Like regsub but replace all matches of regular expression
regexp from string string with string subst.
require_state (symbol)
Check that the state symbol is defined. If the required state
is undefined, the function tries to autoload it. If the
loading fails, the program will terminate with an error
message.
split (regexp, string)
Split string string into a list, treating matches of regular
expression regexp as item separators.
sprintf (fmt, ...)
Format arguments according to fmt and return result as a
string.
strcmp (str1, str2)
Perform a case-sensitive comparison of strings str1 and str2.
Function returns a value that is:
    -1   string str1 is less than str2
     0   strings are equal
     1   string str1 is greater than str2
string (any)
Convert argument to string.
strncmp (str1, str2, num)
Perform a case-sensitive comparison of strings str1 and str2,
comparing at most num characters.
substring (str, start, end)
Return a substring of string str starting from position start
(inclusively) to end (exclusively).
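Several of the primitives above can be exercised together from a start block. This is a minimal sketch rather than a reference: it assumes C-style if statements and /* */ comments, backslash escapes such as "\n" in string literals, and positions that count from 0, none of which are stated on this page. The variable name parts is hypothetical.

```
start
{
  /* split takes the separator expression first, then the string. */
  parts = split (regexp (","), "red,green,blue");
  print (length (parts), "\n");

  /* The end position is exclusive. */
  print (substring ("highlight", 0, 4), "\n");

  /* regsub replaces one match, regsuball replaces all matches. */
  print (regsub ("a-b-c", regexp ("-"), "+"), "\n");
  print (regsuball ("a-b-c", regexp ("-"), "+"), "\n");

  /* A successful regmatch sets the sub-expression registers. */
  if (regmatch ("main.c", regexp ("\\.([a-z]+)$")))
    print ($1, "\n");
}
```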
$.        current input line number
$n        the n:th parenthesized regular expression sub-expression from
          the latest state regular expression or from the regmatch
          primitive
$`        everything before the matched regular expression. This is
          usable when used with the regmatch primitive; the contents of
          this variable are undefined when it is used in action blocks
          to refer to the data before the block's regular expression.
$B        an alias for $`
argv      list of input file names
filename  name of the current input file
program   name of the program (usually states)
version   program version string
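The builtin variables can be used like any other value in an action or start block. In the sketch below the state name todo is hypothetical, and the use of /* */ comments and of "\n" inside string literals is assumed:

```
start
{
  /* Identify the tool and the file being processed. */
  print (program, " ", version, ": ", filename, "\n");
}

state todo
{
  /TODO/ {
    /* $. is the current input line number. */
    print ("[line ", $., "] TODO");
  }
}
```

Since this start block does not call the resolver primitives, the state would have to be selected with -s todo.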
/usr/share/enscript/hl/*.st enscript's states definitions
awk(1), enscript(1)
Markku Rossi <http://www.iki.fi/~mtr/>
GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>