CFGXML::Parser Module
The CFGXML::Parser is the base module
(aka class) for all other parsers. Even parsers you write from
scratch will benefit from extending this module.
The following options are available and should be set in your
parser's init() function if you want to use something other
than the default value.
Example 1. Setting a parser option

    sub init
    {
        my $self = shift; # required for Perl's object orientation
        # call the parent class's init first so our settings aren't overridden
        $self->SUPER::init();
        # set the root XML tag to <apache-config>
        $self->{ROOTTAG} = 'apache-config';
    }
Note: If you add your own option to your parser, be sure to use
the name listed below if your option is very similar or
identical to an existing one.
BOOLEANRULES - A hash containing mappings from true to false for boolean type values. The keys of the hash are the allowed true values, while the values in the hash are the corresponding false value for each true value. By default, it recognizes: true/false, yes/no, 1/0, True/False, Yes/No, enabled/disabled, Enabled/Disabled.

CLOSECOMMENTALLOWED - For Common::Apache, whether close comments (those on the same line as the closing tag of a section) are permitted. Default is 1.

COMMENTCHAR - The single character used to detect the start of a comment in CFGXML::Parser's isComment() and tokenize() functions. Default is # (pound symbol).

COMMENTCHAR2 - The secondary single character used to detect the start of a comment in CFGXML::Parser's isComment() and tokenize() functions, in addition to the COMMENTCHAR value. Default is disabled (empty string).

DEBUG - A flag used to output extra debugging information from various functions/parsers. Normally this is only used during parser testing, as the extra output is often not valid XML or a valid native config file. Default is false.

DELIMITEDEXTRAFIELDS - For Common::Delimited, whether extra fields are allowed. If 1 (true), any text remaining after all of the fields listed in DELIMITEDFIELDTAGS have been read is added in an 'extra' tag, and delimiters in the extra field are ignored. If 0 (false), an error will occur if any additional fields are encountered which are not found in DELIMITEDFIELDTAGS. Default is 1 (true).

DELIMITEDFIELDTAGS - For Common::Delimited, the list of XML tags to use for each major field in the delimited file. If empty, PROPERTYRULESDEFAULT is used instead. If you specify these tags, then during saving, the XML need not be in a particular order for the config file to be constructed properly.
DELIMITER - For Common::Delimited, the main delimiter used to split a section (line) into its properties. Default is the space character.

PROPERTYINVERTRULES - A hash containing arrays listing boolean antonyms. The keys of the hash are the names of the tags used to store the antonyms, and the elements in each array are the antonyms to be converted. The name and value of the property are preserved in most cases, but are displayed in the UIs as their opposite. Note that all antonyms must be listed for this feature to work, and that this feature is not fully implemented or working yet. Note also that the property's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

PROPERTYRULES - A hash containing arrays, where the keys of the hash are the target XML tag names and each array contains the criteria for that tag; a property whose name matches is converted to the corresponding tag. The original name of the configuration directive is preserved when the config file is saved. Note that the property's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

PROPERTYRULESDEFAULT - The default tag to use for properties, if no tag is specified and one cannot be determined automatically. Default is property.

RIGHTCOMMENTALLOWED - For some parsers, whether right comments (those to the right of configuration/section data as opposed to on a line by themselves) are allowed. Default is 1.

ROOTTAG - The XML tag to use for the root of the XML generated by the parser. You most likely want to set this to something unique, such as samba-config, ini-style, etc. You then need a matching file in data/classes/ROOTTAG.xml. Generally, parsers for the same thing but on different distros or for different versions of things should have the same ROOTTAG (since they should use the same XML when possible). Default is generic-config.

RUNLEVELDIR - This option has two possible contexts, but is always the path to the directory tree containing symlinks for services enabled on each runlevel, and the default is /etc/runlevels. For Common::SysVRunlevel the default is /etc/rc.d, and for Distro::Gentoo::Runlevel the default is /etc/runlevels.

RUNLEVELS - For Common::SysVRunlevel, an array containing the list of possible runlevels. Default is 0, 1, 2, 3, 4, 5, 6.

SECONDARYDELIMITER - For Common::Delimited, the secondary delimiter to use to split properties into sub-values. No quoting of the SECONDARYDELIMITER is supported. Default is , (comma). Note that this is ignored if SECONDARYDELIMITERLIST is anything other than an empty array.
SECONDARYDELIMITERLIST - For Common::Delimited, the secondary delimiters to use to split each field into sub-values. To use, create an array with one entry (in order) for each field in your delimited file. Each item in the array is the delimiter used in the split to create the sub-values of the corresponding field. By default, this array is empty and is ignored, and SECONDARYDELIMITER is used for all fields instead. Note that if you put anything in this list, SECONDARYDELIMITER is ignored for all fields.

SECTIONRULES - A hash containing arrays, where the keys of the hash are the target XML tag names and each array contains the criteria for that tag; a section whose name matches is converted to the corresponding tag. The original name of the configuration section is preserved when the config file is saved. Note that the section's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

SECTIONRULESDEFAULT - The default tag to use for sections, if no tag is specified and one cannot be determined automatically. Default is section.

SERVICESDIR - This option has two possible contexts, but is always the directory containing all available services. For Common::SysVRunlevel the default is /etc/rc.d/init.d. For Distro::Gentoo::Runlevel the default is /etc/init.d.

VALUETAGS - A hash containing arrays, where the keys of the hash are the XML tag name of the parent, and the array contains the custom value tags to be used for that parent's value tags. See addValueArray for more details on exact behavior. Note that you should be sure to make any value tags extend the type "value" so that they are recognized properly. Note also that only addValueArray uses this option; addValue ignores it. You should always use addValueArray if you have or could have more than one value under a particular property.
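To make the interaction between DELIMITER, DELIMITEDFIELDTAGS, DELIMITEDEXTRAFIELDS, SECONDARYDELIMITER, and SECONDARYDELIMITERLIST more concrete, here is a minimal standalone Perl sketch. It does not use Common::Delimited or CFGXML::Parser at all: the function name split_line, the %opt hash, and the sample field tags are hypothetical, and the real module may differ in edge cases. In particular, it is an assumption here that the 'extra' text is kept whole and not split into sub-values.

```perl
use strict;
use warnings;

# Hypothetical option values; the keys are the option names documented above.
my %opt = (
    DELIMITER              => ':',
    DELIMITEDFIELDTAGS     => [ 'user', 'password', 'uid', 'groups' ],
    DELIMITEDEXTRAFIELDS   => 1,
    SECONDARYDELIMITER     => ',',
    SECONDARYDELIMITERLIST => [],    # empty, so SECONDARYDELIMITER applies
);

# Split one line into [tag, [sub-values]] pairs (illustrative helper only).
sub split_line {
    my ($line, $o) = @_;
    my @tags  = @{ $o->{DELIMITEDFIELDTAGS} };
    my $delim = quotemeta $o->{DELIMITER};

    # Limit the split so everything after the last known field stays in
    # one piece, delimiters included.
    my @fields = split /$delim/, $line, scalar(@tags) + 1;

    my $extra;
    if (@fields > @tags) {
        die "unexpected extra fields\n" unless $o->{DELIMITEDEXTRAFIELDS};
        $extra = pop @fields;
    }

    my @list = @{ $o->{SECONDARYDELIMITERLIST} };
    my @result;
    for my $i (0 .. $#fields) {
        # A non-empty SECONDARYDELIMITERLIST overrides SECONDARYDELIMITER.
        my $sub    = @list ? $list[$i] : $o->{SECONDARYDELIMITER};
        my @values = split /\Q$sub\E/, $fields[$i];
        push @result, [ $tags[$i], \@values ];
    }
    push @result, [ 'extra', [$extra] ] if defined $extra;
    return @result;
}

my @parsed = split_line('alice:x:1000:wheel,audio:left:over', \%opt);
print join(' | ', map { "$_->[0]=" . join('+', @{ $_->[1] }) } @parsed), "\n";
```

Setting DELIMITEDEXTRAFIELDS to 0 in %opt makes the same input die, which mirrors the documented error case.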
The following runtime variables are defined and used while the
parsers execute. The list below indicates their purpose and
context, and when it is safe to refer to them directly in your
parser.
_ACTIONSET - This is used internally by CFGXML::Parser to determine whether the action (load or save) has been set. You should not access this directly, nor should you need to worry about this variable.

_BOOLEANRULES - An automatically generated hash containing one key for each boolean value, with the values pointed to being either "true" or "false" depending on whether the key represents a true or false value.

_BOOLEANREVERSERULES - An automatically generated hash which is the opposite of BOOLEANRULES, where each key in the hash is a false value, and the value pointed to is the corresponding true version.

_CONTENTS - If you are loading a config file, this is an array containing the lines in the config file, which you can iterate through using a foreach loop. In most cases, you should read this array directly to get the contents of the file (and not open files yourself!). There's no need to modify this variable. This variable is only used during loading of config files.

_CURRENTSECTION - This is an internal record of the section the parser is currently in. You should not access it directly; instead use getCurSection, enterSection, and leaveSection, which handle error checking and management of the current section for you.

_DIRECTIVERUNCOLS - This is used internally by makeColumnValueString to keep track of the largest value in each column when the same directive occurs multiple times in a row, such as in Common::Apache. You should not modify this array directly; it is set by makeColumnValueString as needed.

_DIRECTIVERUN - If set, this value contains the name of the current property/directive which has occurred more than once in a row. You should not modify this value directly; it is automatically set if applicable when you call makeColumnValueString. Instead you should use resetDirectiveRun when you encounter a non-property or other situation where a "run" must end.

_LOADING - A boolean flag indicating whether the parser is loading or saving. You should specify this value as an argument to run to set it during testing (this is done for you by CFG when it runs the parser), and there is no need to access this value directly. The CFGXML::Parser module will call your load or save methods in the appropriate situations.

_OUTPUT - This is a string containing the output of a parser saving a configuration file. You should normally append text to this string instead of directly writing to the files. This variable is only used during saving of config files.

_PROPERTYINVERTLIST - This is an internal representation of the PROPERTYINVERTRULES option. You should not access it directly.

_PROPERTYRULESLIST - This is an internal representation of the PROPERTYRULES option. You should not access it directly.

_PROPERTYRULESTAGS - This is an internal list of XML tags which are property elements. You should not access it directly; instead use the getProperties, isSection, or isProperty methods of CFGXML::Parser.

_RECURSIVE - A boolean flag indicating whether the current instance of a parser is one which was called recursively from another parser.

_SECTIONRULESLIST - This is an internal representation of the SECTIONRULES option. You should not access it directly.

_SECTIONRULESTAGS - This is an internal list of XML tags which are section elements. You should not access it directly; instead use the getSections, isSection, or isProperty methods of CFGXML::Parser.

_SECTPROPRULESTAGS - This is an internal list of XML tags which are either property or section elements. You should not access it directly; instead use the getPropertiesAndSections method of CFGXML::Parser.

_XMLROOT - This is the root config4gnu::CfgObject node of the XML tree, and can be accessed when you need to refer to the root node of the XML document during loading or saving. You should not have any need to modify this element except through its functions.
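The relationship between the BOOLEANRULES option and the generated _BOOLEANRULES and _BOOLEANREVERSERULES hashes can be illustrated with a small standalone sketch. This is not the module's own code: build_boolean_rules is a hypothetical helper that only follows the descriptions above, using the documented default pairs as input.

```perl
use strict;
use warnings;

# The documented default BOOLEANRULES: true value => false value.
my %BOOLEANRULES = (
    'true'    => 'false',
    'yes'     => 'no',
    '1'       => '0',
    'True'    => 'False',
    'Yes'     => 'No',
    'enabled' => 'disabled',
    'Enabled' => 'Disabled',
);

# Derive the two lookup hashes described above (illustrative helper only).
sub build_boolean_rules {
    my ($rules) = @_;
    my (%is_true, %reverse);
    for my $true (keys %$rules) {
        my $false = $rules->{$true};
        $is_true{$true}  = 'true';     # every true spelling maps to "true"
        $is_true{$false} = 'false';    # every false spelling maps to "false"
        $reverse{$false} = $true;      # false spelling => matching true spelling
    }
    return (\%is_true, \%reverse);
}

my ($lookup, $reverse) = build_boolean_rules(\%BOOLEANRULES);
print "yes is $lookup->{yes}; the true form of disabled is $reverse->{disabled}\n";
```

With both hashes available, a value can be classified in one lookup and flipped to its opposite while keeping the spelling style of the original config file.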
The following public functions are defined in
CFGXML::Parser and can be used by parsers
which extend it.
Note: Since the parsers are class/module-based, you must use the
standard way of calling an object's own functions. This can
be done in a parser by doing
$self->functionName($arg1, $arg2)
These functions are applicable during both loading and saving.
CFGXML::Parser new( ) ;

This is the constructor for CFGXML::Parser and its sub-classes. You do not normally need to call this unless you are manually testing your parser.

run( $action ) ;

string $action - The action the parser should perform, either load or save.

This function performs the load or save process by calling various other functions as needed, including the init function.

void init( ) ;

This function is called by run before loading/saving begins to set any needed options. Sub-classes of CFGXML::Parser should override this method as described in the Options section.

boolean isSection( $element ) ;

config4gnu::CfgObject $element

Returns 1 (true) if $element is defined as a section tag in the current parser, 0 (false) otherwise. Note that this is not the exact opposite of isProperty. To define an element as a section, specify in the class definition file that it extends the section type.

boolean isProperty( $element ) ;

config4gnu::CfgObject $element

Returns 1 (true) if $element is defined as a property tag in the current parser, 0 (false) otherwise. Note that this is not the exact opposite of isSection. To define an element as a property, specify in the class definition file that it extends the property type.

boolean isComment( $candidate ) ;

string $candidate - The string to be examined to see if it is a comment.

Returns true if $candidate begins with a COMMENTCHAR character, possibly preceded by whitespace, and false otherwise. If a quoted or escaped COMMENTCHAR appears first or immediately after initial whitespace, false will be returned.
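The comment test described for isComment can be approximated with a single anchored pattern. The following standalone sketch is an assumption-based illustration, not the module's implementation: is_comment is a hypothetical function that takes the COMMENTCHAR and COMMENTCHAR2 values as explicit arguments instead of reading them from the parser object.

```perl
use strict;
use warnings;

# Return 1 if $candidate starts with a comment character, optionally
# preceded by whitespace, and 0 otherwise (illustrative helper only).
sub is_comment {
    my ($candidate, $char1, $char2) = @_;

    # Collect the enabled comment characters; an empty string means disabled.
    my $chars = join '', grep { defined($_) && length($_) } ($char1, $char2);
    return 0 unless length $chars;

    # A quoted ("#") or escaped (\#) comment character does not match,
    # because the first non-blank character is then a quote or a backslash.
    return $candidate =~ /^\s*[\Q$chars\E]/ ? 1 : 0;
}

for my $line ('   # a comment', '"#" not a comment', 'Listen 80 # right comment') {
    printf "%-28s => %d\n", $line, is_comment($line, '#', '');
}
```

Note that a right comment after other data is not detected here; according to the description of tokenize, such comments are split off as the last token first and then tested.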
These functions are applicable during loading.
string[] tokenize( $input ) ;

string $input - The string to split into tokens.

Returns an array of non-empty strings containing each token. The string is tokenized by space or tab characters, and the split characters are not included in the tokens. All characters inside single or double quotes are treated as single tokens. If RIGHTCOMMENTALLOWED is true and a COMMENTCHAR is encountered at the beginning of a non-quoted token, everything including and after that character is returned as the last token, which can be tested with isComment and removed from the end of the array to be stored as a right comment.

void addValue( $node, $data, $multiple, $tag ) ;

config4gnu::CfgObject $node - The node to add the value to.
string $data - The value to add to $node.
boolean $multiple - Whether the property you're adding/setting values of can have multiple values (default = false).
string $tag - The tag to use for the value tag.

Adds $data to $node using the appropriate XML tags. If $tag is specified, it is used for the XML tag, otherwise value is used (you should not need to specify $tag except in very strange situations). Type conversion is done automatically. Any trailing newline is trimmed, and if $data is empty after trimming, it is not added. If $multiple is false, any existing value will be replaced with $data, otherwise $data will be added as another value, preserving existing values.

void addValueArray( $node, $data ) ;

config4gnu::CfgObject $node - The node to add the values to.
string[] $data - The array of values to add to $node.

Adds each item in $data as a value of $node using the appropriate XML tags. If VALUETAGS has an entry for the tag name of $node, its contents are used in order for each value tag. If more entries exist in VALUETAGS than in $data, the extra tags are NOT created. If more entries exist in $data than in VALUETAGS, the last entry in VALUETAGS is used for any extra value items. This means, for example, that if a directive has a set of parameters with different meanings followed by a variable number of parameters with identical meaning, you can put in VALUETAGS an array containing one entry for each different parameter and then an extra entry to specify the tag to be used for all of the trailing identical parameters. Type conversion is done, any trailing newline is trimmed, and empty values are ignored, just as in addValue.

config4gnu::CfgObject newSection( $parent, $childName, $tagName ) ;

config4gnu::CfgObject $parent - The element to add the new section to.
string $childName - The name of the child to add (optional).
string $tagName - The XML tag to use for the child (optional).

Appends a new section element to $parent and returns a pointer to the new element. If no optional parameters are given, the XML tag will be section. If $tagName is given, its value will be used for the XML tag and $childName will be included in the XML but not used to determine the XML tag. If $childName is given and $tagName is not, $childName will be used to automatically determine the appropriate XML tag. Normally, you should only specify $childName and customize the XML tag using parser options.

config4gnu::CfgObject newProperty( $parent, $childName, $tagName ) ;

config4gnu::CfgObject $parent - The element to add the new property to.
string $childName - The name of the child to add (optional).
string $tagName - The XML tag to use for the child (optional).

Appends a new property element to $parent and returns a pointer to the new element. If no optional parameters are given, the XML tag will be property. If $tagName is given, its value will be used for the XML tag and $childName will be included in the XML but not used to determine the XML tag. If $childName is given and $tagName is not, $childName will be used to automatically determine the appropriate XML tag. Normally, you should only specify $childName and customize the XML tag using parser options.

config4gnu::CfgObject enterSection( $name ) ;

string $name - The name of the section being entered.

Calls newSection to add a section named $name as a child of the current section and returns a pointer to it. This is similar to calling newSection directly, except that the current section is tracked and basic sanity checking is performed for you. You should use this function even if your sections aren't nested, to make your parser easier to understand; just remember to leave a section before you enter the new one if they're all at the same depth level (such as with INI-style config files).

void leaveSection( $name ) ;

string $name - The name of the section being left.

Checks that $name matches the name of the current section (to help detect syntax errors) and makes the current section's parent section the new current section. If $name is not specified, no checking is performed, which is useful for config files which don't use nested sections (e.g., INI). Note that calling leaveSection when the current section is _XMLROOT results in no change in the current section, even if called from a recursive instance of a parser whose _XMLROOT is not the true root node.

config4gnu::CfgObject getCurSection( ) ;

Returns the current section if enterSection and leaveSection have been used in the parser.

void endFile( $endcomment ) ;

string $endcomment - The ending comment text.

Appends a fileend tag to the current section, signifying the end of the file. Adds $endcomment as the comment of the fileend tag, which is automatically appended to the file during saving. All parsers should call this function when they finish, whether or not the file has an end comment.

int locateIncludeStart( $node ) ;

config4gnu::CfgObject $node - The current section.

Returns the index of the entry after the include tag so that a subparser starts at the correct node. Must be called by a recursively called parser to find its starting point. See Common::Ini for an example.

int locateIncludeEnd( $node, $curpos ) ;

config4gnu::CfgObject $node - The current section.
int $curpos - The current position in the children list of $node.

Locates the fileend node which matches the filename of the include tag encountered at $curpos. Must be called at the end of a loop where a recursive parser could have been started, so that children belonging to the include file are skipped by the outer parser.
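The tag selection rule described for addValueArray (use the VALUETAGS entries in order, create no surplus tags, and reuse the last entry for any extra values) can be sketched without the XML layer. The helper below is hypothetical and standalone: it returns [tag, value] pairs instead of creating nodes, and it assumes that skipped empty values do not consume a tag position, which the description above does not state.

```perl
use strict;
use warnings;

# Pair each value with the tag it would receive (illustrative helper only).
# $tags is the VALUETAGS array for the parent tag, or undef if there is none.
sub value_tags_for {
    my ($tags, $data) = @_;
    my @pairs;
    my $position = 0;
    for my $item (@$data) {
        (my $value = $item) =~ s/\n\z//;    # trim one trailing newline
        next if $value eq '';               # empty values are ignored

        my $tag = 'value';                  # plain value tag when no rule exists
        if ($tags && @$tags) {
            # Past the end of the list, keep reusing the last entry.
            $tag = $position < @$tags ? $tags->[$position] : $tags->[-1];
        }
        push @pairs, [ $tag, $value ];
        $position++;
    }
    return @pairs;
}

# Hypothetical tags: two fixed parameters, then any number of aliases.
my @pairs = value_tags_for([ 'address', 'port', 'alias' ],
                           [ '10.0.0.1', '80', 'www', "intranet\n", '' ]);
print join(', ', map { "$_->[0]=$_->[1]" } @pairs), "\n";
```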
These functions are applicable during saving.
config4gnu::CfgObjectVector getValues( $node ) ;

config4gnu::CfgObject $node - The node to get the values of.

Returns a list of all immediate value children of $node, doing any necessary type conversions. Note that value children are those which extend the type value; membership is not based on the tag name.

string getValueString( $node, $delimit ) ;

config4gnu::CfgObject $node - The node to get the value string of.
boolean $delimit - Whether or not to delimit the values with spaces (optional, default = 1).

Returns a string representation of all immediate value children of $node. Type conversion is done automatically. A space character is inserted at the beginning of the string and between each value unless 0 is passed as $delimit. If no values are found, an empty string is returned.

config4gnu::CfgObjectVector getSections( $parent ) ;

config4gnu::CfgObject $parent - The element to be searched.

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as section tags in the current parser. If the parser is a recursive instance and was given _XMLROOT as its parameter, it instead returns a CfgObjectVector containing only the root node, to make handling recursion easier for individual parsers.

config4gnu::CfgObjectVector getProperties( $parent ) ;

config4gnu::CfgObject $parent - The element to be searched.

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as property tags in the current parser.

config4gnu::CfgObjectVector getPropertiesAndSections( $parent ) ;

config4gnu::CfgObject $parent - The element to be searched.

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as either property or section tags in the current parser. The only tags not returned are data/attribute tags such as value, comment, etc. The result is the same as combining getSections and getProperties, except that the original order of the children is preserved. It is also faster than calling both functions separately and combining their output, so it should normally be used instead.

void resetDirectiveRun( ) ;

Resets the current directive run, if any. You should call this when a situation is encountered where directive runs should be terminated, otherwise the run may be incorrectly assumed to continue. For example, it should be called when you enter or leave a section.

string makeColumnValueString( $items, $current ) ;

config4gnu::CfgObjectVector $items
int $current

Returns a columnized representation of the values of the element at index $current in $items. The $items variable will be automatically tested as needed to detect a directive run, and the columns will be formatted to fit the largest values in each column.
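The column formatting idea behind makeColumnValueString (find the widest value in each column across a directive run, then pad every row to those widths) can be shown on plain arrays. This standalone sketch is only an approximation under stated assumptions: column_widths and padded_value_string are hypothetical helpers, the rows are plain arrays of strings in place of CfgObject nodes, and the single-space column separator is a guess at the output style.

```perl
use strict;
use warnings;

# Find the widest value in each column across a run of rows.
sub column_widths {
    my @rows = @_;
    my @widths;
    for my $row (@rows) {
        for my $i (0 .. $#$row) {
            my $len = length $row->[$i];
            $widths[$i] = $len if !defined $widths[$i] || $len > $widths[$i];
        }
    }
    return @widths;
}

# Left-justify each value to its column width and join with one space.
sub padded_value_string {
    my ($row, $widths) = @_;
    my @cells = map { sprintf '%-*s', $widths->[$_], $row->[$_] } 0 .. $#$row;
    my $line = join ' ', @cells;
    $line =~ s/\s+\z//;    # no trailing padding after the last column
    return $line;
}

# A hypothetical run of the same directive occurring three times in a row.
my @run = ( [ 'text/html', '.html', '.htm' ], [ 'gz', '.gz' ], [ 'image/png', '.png' ] );
my @widths = column_widths(@run);
print 'AddType ', padded_value_string($_, \@widths), "\n" for @run;
```

Computing the widths over the whole run first is what makes the rows line up; this matches the role the text gives to _DIRECTIVERUNCOLS.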
The following private functions are defined in
CFGXML::Parser and should normally be
used only by other functions defined in
CFGXML::Parser. You could also use them
in custom functions which you make, or rewrites of existing
functions needed for your parser. However, in both cases you
should generalize your function so that it can add to or replace
existing public CFGXML::Parser functions,
unless the change is very specific to only your parser. For
example, it may be necessary to extend the
_strTo and _strFrom functions to do what you need them to do.
void _setAction( $action ) ;

string $action - Either load or save.

Sets _LOADING to 1 if $action is load, 0 otherwise. This is called automatically by run using the same parameter it was called with.

void _start( ) ;

If loading, reads STDIN into the _CONTENTS array and initializes the XML tree in _XMLROOT. If saving, reads STDIN into an XML tree. Called automatically by run.

void _finish( ) ;

If loading, prints _XMLROOT as a string to STDOUT unless the current instance was called recursively from another parser. If saving, prints the _OUTPUT string to STDOUT. Called automatically by run.

config4gnu::CfgObject $node - The fileend node which signified the end of the file.

Called automatically during saving; prints out the end comment stored in $node's comment property.

string _makePaddedValueString( $item ) ;

config4gnu::CfgObject $item - The node to get the values of.

Returns a string containing the columnized representation of $item's value tags based on _DIRECTIVERUNCOLS, which is built automatically by makeColumnValueString. makeColumnValueString is the only function which should call _makePaddedValueString.

string _strFromBoolean( $origvalue, $booleanval ) ;

string $origvalue
string $booleanval

Converts $booleanval into a string based on how $origvalue represented the value in the original config file. If $origvalue's true/falseness is the same as what is specified by $booleanval, it returns $origvalue since no conversion is necessary. Otherwise, it returns the opposite version of $origvalue. You can set BOOLEANRULES to modify its behavior, so you probably don't need to override it.

array _strToBoolean( $value ) ;

string $value

Returns an array containing either "true" or "false" to indicate the true/falseness of $value, and a cleaned up version of $value for storage in the XML. You can set BOOLEANRULES to modify its behavior, so you probably don't need to override it.

void _buildBooleanRules( ) ;

Builds the _BOOLEANRULES and _BOOLEANREVERSERULES hashes based on the BOOLEANRULES option.

string _strFromString( $value, $quotetype ) ;

string $value
int $quotetype

Converts $value into a string and returns it. If $quotetype is 1 or 2, single or double quotes, respectively, are put around $value, and any internal quotes of the same type are escaped.

array _strToString( $value ) ;

string $value

Returns an array containing the new string value and an integer representing how/if $value was quoted. The new value is trimmed on both ends, then any enclosing quotes are removed and escaped quotes of the same type are unescaped. For the second value, 0 = no quoting, 1 = enclosed by single quotes, 2 = enclosed by double quotes.

array _readFile( $filename ) ;

string $filename

Returns an array containing each line of the contents of $filename. Note that the CFG library may rewrite the filename.

void _writeFile( $contents, $filename ) ;

string $contents
string $filename

Writes $contents to $filename. Note that the CFG library may rewrite the filename.
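The round trip described for _strToString and _strFromString (strip and remember the quoting on load, restore it on save) can be sketched as a pair of standalone functions. These are hypothetical re-implementations written only from the descriptions above, not the module's code, and they assume that the escape character is a backslash, which the text does not specify.

```perl
use strict;
use warnings;

# Trim, strip enclosing quotes, unescape; return (value, quote type)
# where 0 = unquoted, 1 = single quotes, 2 = double quotes.
sub str_to_string {
    my ($value) = @_;
    $value =~ s/^\s+|\s+$//g;
    my $quotetype = 0;
    if ($value =~ /^'(.*)'\z/s) {
        ($value, $quotetype) = ($1, 1);
        $value =~ s/\\'/'/g;    # assumed escape style: backslash
    }
    elsif ($value =~ /^"(.*)"\z/s) {
        ($value, $quotetype) = ($1, 2);
        $value =~ s/\\"/"/g;
    }
    return ($value, $quotetype);
}

# Re-apply the remembered quoting, escaping quotes of the same type.
sub str_from_string {
    my ($value, $quotetype) = @_;
    my $quote = $quotetype == 1 ? "'" : $quotetype == 2 ? '"' : '';
    return $value if $quote eq '';
    (my $escaped = $value) =~ s/\Q$quote\E/\\$quote/g;
    return $quote . $escaped . $quote;
}

my $original = q{"say \"hi\""};
my ($clean, $type) = str_to_string("  $original  ");
print "stored: $clean (quote type $type)\n";
print "saved:  ", str_from_string($clean, $type), "\n";
```

Keeping the quote type next to the cleaned value is what lets a save reproduce the quoting style of the original file.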