Configuration 4 GNU (CFG)


CFGXML::Parser Module


The CFGXML::Parser is the base module (aka class) for all other parsers. Even parsers you write from scratch will benefit from extending this module.

Options

The following options are available and should be set in your parser's init() function if you want to use something other than the default value.

Example 1. Setting a parser option
sub init
{
  my $self = shift;       #req'd for perl's object-orientedness
  #call parent class's init first so our settings aren't over-ridden
  $self->SUPER::init();
  #set root XML tag to <apache-config>
  $self->{ROOTTAG} = 'apache-config';
}

Note

If you add your own option to your parser, be sure to use the name listed below if your option is very similar or identical.

  • BOOLEANRULES - A hash mapping true values to false values for boolean types. The keys of the hash are the allowed true values, and the value stored under each key is the corresponding false value. By default, it recognizes: true/false, yes/no, 1/0, True/False, Yes/No, enabled/disabled, Enabled/Disabled.

  • CLOSECOMMENTALLOWED - For Common::Apache, whether close comments (those on the same line as the closing tag of a section) are permitted. Default is 1.

  • COMMENTCHAR - The single character used to detect the start of a comment in CFGXML::Parser's isComment() and tokenize() functions. Default is # (pound symbol).

  • COMMENTCHAR2 - The secondary single character used to detect the start of a comment in CFGXML::Parser's isComment() and tokenize() functions in addition to the COMMENTCHAR value. Default is disabled (empty string).

  • DEBUG - A flag used to output extra debugging information from various functions/parsers. Normally this is only used during parser testing, as often the extra output is not valid XML or native config file. Default is false.

  • DELIMITEDEXTRAFIELDS - For Common::Delimited, whether extra fields are allowed. If 1 (true), any text remaining after all of the fields listed in DELIMITEDFIELDTAGS have been read is added in an 'extra' tag, and delimiters in the extra field are ignored. If 0 (false), an error occurs if any fields are encountered beyond those listed in DELIMITEDFIELDTAGS. Default is 1 (true).

  • DELIMITEDFIELDTAGS - For Common::Delimited, the list of XML tags to use for each major field in the delimited file. If empty, PROPERTYRULESDEFAULT is used instead. If you specify these tags, then during saving, the XML need not be in a particular order for the config file to be constructed properly.

  • DELIMITER - For Common::Delimited, the main delimiter used to split a section (line) into its properties. Default is the space character.

  • PROPERTYINVERTRULES - A hash containing arrays listing boolean antonyms. The keys of the hash are the names of the tags used to store the antonyms, and the elements in each array are the antonyms to be converted. The name and value of the property are preserved in most cases, but are displayed in the UIs as their opposite. Note that all antonyms must be listed for this feature to work, and that this feature is not yet fully implemented or working. Note also that the property's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

  • PROPERTYRULES - A hash containing arrays, where the keys of the hash are the target XML tag names, and each array contains the criteria used to decide whether a property's name should be converted to that tag. The original name of the configuration directive is preserved when the config file is saved. Note that the property's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

  • PROPERTYRULESDEFAULT - The default tag to use for properties, if no tag is specified and one cannot be determined automatically. Default is property.

  • RIGHTCOMMENTALLOWED - For some parsers, whether right comments (those to the right of configuration/section data as opposed to on a line by themselves) are allowed. Default is 1.

  • ROOTTAG - The XML tag to use for the root of the XML generated by the parser. You most likely want to set this to something unique, such as samba-config, ini-style, etc. You then need a matching file in data/classes/ROOTTAG.xml. Generally, parsers for the same thing but on different distros or for different versions of things should have the same ROOTTAG (since they should use the same XML when possible). Default is generic-config.

  • RUNLEVELDIR - This option has two possible contexts, but is always the path to the directory tree containing symlinks for services enabled on each runlevel. For Common::SysVRunlevel the default is /etc/rc.d, and for Distro::Gentoo::Runlevel the default is /etc/runlevels.

  • RUNLEVELS - For Common::SysVRunlevel, an array containing the list of possible runlevels. Default is 0, 1, 2, 3, 4, 5, 6.

  • SECONDARYDELIMITER - For Common::Delimited, the secondary delimiter used to split properties into sub-values. No quoting of the SECONDARYDELIMITER is supported. Default is , (comma). Note that this is ignored if SECONDARYDELIMITERLIST is anything other than an empty array.

  • SECONDARYDELIMITERLIST - For Common::Delimited, the secondary delimiters used to split each field into sub-values. To use it, create an array with one entry (in order) for each field in your delimited file. Each item in the array is the delimiter used in the split to create the sub-values for the corresponding field. By default, this array is empty and is ignored, and SECONDARYDELIMITER is used for all fields instead. Note that if you put anything in this list, SECONDARYDELIMITER is ignored for all fields.

  • SECTIONRULES - A hash containing arrays, where the keys of the hash are the target XML tag names, and each array contains the criteria used to decide whether a section's name should be converted to that tag. The original name of the configuration section is preserved when the config file is saved. Note that the section's name is converted to lowercase and spaces are replaced with underscores before comparison takes place. Default is empty.

  • SECTIONRULESDEFAULT - The default tag to use for sections, if no tag is specified and one cannot be determined automatically. Default is section.

  • SERVICESDIR - This option has two possible contexts, but is always the directory containing all available services. For Common::SysVRunlevel the default is /etc/rc.d/init.d. For Distro::Gentoo::Runlevel the default is /etc/init.d.

  • VALUETAGS - A hash containing arrays, where the keys of the hash are the XML tag name of the parent, and the array contains the custom value tags to be used for that parent's value tags. See addValueArray for more details on exact behavior. Note that you should make any value tags extend the type "value" so that they are recognized properly. Note also that only addValueArray uses this option; addValue ignores it. You should always use addValueArray if you have or could have more than one value under a particular property.
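The PROPERTYRULES and SECTIONRULES descriptions above can be illustrated with a small standalone sketch. It is not the module's code: the function names normalize_name and tag_for_name are hypothetical, the rule data is made up, and it assumes the criteria in each array are compared as exact strings, which the text above does not specify.

```perl
use strict;
use warnings;

# Lowercase the name and replace spaces with underscores, as the option
# descriptions say happens before comparison.
sub normalize_name {
    my ($name) = @_;
    $name = lc $name;
    $name =~ tr/ /_/;
    return $name;
}

# Hypothetical lookup: return the first tag (in sorted key order) whose
# criteria list contains the normalized name, otherwise the default tag.
# ASSUMPTION: criteria are plain strings compared exactly.
sub tag_for_name {
    my ($name, $rules, $default) = @_;
    my $key = normalize_name($name);
    for my $tag (sort keys %$rules) {
        return $tag if grep { $_ eq $key } @{ $rules->{$tag} };
    }
    return $default;
}

# Illustrative rule data only; not taken from any real parser.
my %property_rules = (
    hostname => ['server_name', 'servername'],
    port     => ['listen'],
);

print tag_for_name('Server Name',  \%property_rules, 'property'), "\n";
print tag_for_name('DocumentRoot', \%property_rules, 'property'), "\n";
```

The second call falls through to the default tag, which plays the role PROPERTYRULESDEFAULT has in the list above.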

Runtime Variables

The following runtime variables are defined and used while the parsers execute. The following list indicates their purpose and context, and when it is safe to refer directly to them in your parser.

  • _ACTIONSET - This is used internally by CFGXML::Parser to determine whether the action (load or save) has been set. You should not access this directly, nor should you need to worry about this variable.

  • _BOOLEANRULES - An automatically generated hash containing one key for each boolean value, with the values pointed to being either "true" or "false" depending on whether the key represents a true or false value.

  • _BOOLEANREVERSERULES - An automatically generated hash which is the opposite of BOOLEANRULES, where each key in the hash is a false value, and the value pointed to is the corresponding true version.

  • _CONTENTS - If you are loading a config file, this is an array containing the lines in the config file, which you can iterate through using a foreach loop. In most cases, you should read this array directly to get the contents of the file (and not open files yourself!). There's no need to modify this variable. This variable is only used during loading of config files.

  • _CURRENTSECTION - This is the internal record of the section the parser is currently in. You should not access it directly; instead use getCurSection, enterSection, and leaveSection, which handle error checking and management of the current section for you.

  • _DIRECTIVERUNCOLS - This is used internally by makeColumnValueString to keep track of the largest value in each column when the same directive occurs multiple times in a row, such as in Common::Apache. You should not modify this array directly; it is set by makeColumnValueString as needed.

  • _DIRECTIVERUN - If set, this value contains the name of the current property/directive which has occurred more than once in a row. You should not modify this value directly; it is set automatically, if applicable, when you call makeColumnValueString. Instead, use resetDirectiveRun when you encounter a non-property or other situation where a "run" must end.

  • _LOADING - A boolean flag indicating whether the parser is loading or saving. To set it during testing, pass the action as an argument to run (CFG does this for you when it runs the parser); there is no need to access this value directly. The CFGXML::Parser module will call your load or save methods in the appropriate situations.

  • _OUTPUT - This is a string containing the output of a parser saving a configuration file. You should normally append text to this string instead of directly writing to the files. This variable is only used during saving of config files.

  • _PROPERTYINVERTLIST - This is an internal representation of the PROPERTYINVERTRULES option. You should not access it directly.

  • _PROPERTYRULESLIST - This is an internal representation of the PROPERTYRULES option. You should not access it directly.

  • _PROPERTYRULESTAGS - This is an internal list of XML tags which are property elements. You should not access it directly, instead use the getProperties, isSection, or isProperty methods of CFGXML::Parser.

  • _RECURSIVE - A boolean flag indicating whether the current instance of a parser was called recursively from another parser.

  • _SECTIONRULESLIST - This is an internal representation of the SECTIONRULES option. You should not access it directly.

  • _SECTIONRULESTAGS - This is an internal list of XML tags which are section elements. You should not access it directly, instead use the getSections, isSection, or isProperty methods of CFGXML::Parser.

  • _SECTPROPRULESTAGS - This is an internal list of XML tags which are either property or section elements. You should not access it directly; instead use the getPropertiesAndSections method of CFGXML::Parser.

  • _XMLROOT - This is the root config4gnu::CfgObject node of the XML tree, and can be accessed when you need to refer to the root node of the XML document during loading or saving. You should not have any need to modify this element except through its functions.

Public Functions

The following public functions are defined in CFGXML::Parser and can be used by parsers which extend it.

Note

Since the parsers are class/module-based, you must use the standard way of calling an object's own functions. In a parser, this is done with $self->functionName($arg1, $arg2).

Common

These functions are applicable during both loading and saving.

CFGXML::Parser new();

This is the constructor for CFGXML::Parser and its sub-classes. You do not normally need to call this unless you are manually testing your parser.

void run($action); 
string  $action - The action the parser should perform, either load or save;

This function performs the load or save process by calling various other functions as needed, including the init function.

void init();

This function is called by run before loading/saving begins to set any needed options. Sub-classes of CFGXML::Parser should override this method as described in the Options section.

boolean isSection($element); 
config4gnu::CfgObject  $element;

Returns 1 (true) if $element is defined as a section tag in the current parser, 0 (false) otherwise. Note that this is not the exact opposite of isProperty. To define an element as a section, specify in the class definition file that it extends the section type.

boolean isProperty($element); 
config4gnu::CfgObject  $element;

Returns 1 (true) if $element is defined as a property tag in the current parser, 0 (false) otherwise. Note that this is not the exact opposite of isSection. To define an element as a property, specify in the class definition file that it extends the property type.

boolean isComment($candidate); 
string  $candidate - The string to be examined to see if it is a comment;

Returns true if $candidate begins with a COMMENTCHAR character possibly preceded by whitespace, false otherwise. If a quoted or escaped COMMENTCHAR appears first or immediately after initial whitespace, false will be returned.
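As a rough illustration of the rule just described, the following standalone sketch tests a string against one or two comment characters. The name is_comment_sketch and its argument order are hypothetical, and this is not the module's implementation. It relies on the observation that a quoted or escaped comment character is preceded by a quote or backslash, so it cannot be the first non-whitespace character.

```perl
use strict;
use warnings;

# Hypothetical stand-in for isComment(); the real method reads COMMENTCHAR
# and COMMENTCHAR2 from the parser object instead of taking arguments.
sub is_comment_sketch {
    my ($candidate, $commentchar, $commentchar2) = @_;
    $commentchar  = '#' unless defined $commentchar;    # documented default
    $commentchar2 = ''  unless defined $commentchar2;   # disabled by default
    my $chars = quotemeta($commentchar . $commentchar2);
    return 0 if $chars eq '';                           # nothing to look for
    # True only when a comment character is the first non-whitespace character.
    return $candidate =~ /^\s*[${chars}]/ ? 1 : 0;
}

print is_comment_sketch('   # a comment'),        "\n";   # 1
print is_comment_sketch('"#" is quoted'),         "\n";   # 0
print is_comment_sketch('\# is escaped'),         "\n";   # 0
print is_comment_sketch('; ini style', '#', ';'), "\n";   # 1
```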

Loading

These functions are applicable during loading.

string[] tokenize($input); 
string  $input - The string to split into tokens;

Returns an array of non-empty strings containing each token. The string is split on space or tab characters, and the split characters are not included in the tokens. Text inside single or double quotes is treated as a single token. If RIGHTCOMMENTALLOWED is true and a COMMENTCHAR is encountered at the beginning of a non-quoted token, everything from that character onward is returned as the last token, which can be tested with isComment and shifted off the array to be stored as a right comment.
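The tokenizing behavior can be approximated with a short scanner. This is a sketch under stated assumptions, not the module's code: tokenize_sketch and its parameters are hypothetical, the input is assumed to have no trailing newline, and quoted tokens are returned with their quotes intact (the text above does not say whether the real function strips them).

```perl
use strict;
use warnings;

# Hypothetical stand-in for tokenize(); options are passed as arguments here
# instead of being read from the parser object.
sub tokenize_sketch {
    my ($input, $commentchar, $right_comments) = @_;
    $commentchar    = '#' unless defined $commentchar;
    $right_comments = 1   unless defined $right_comments;
    my $cc = quotemeta $commentchar;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length($input)) {
        if ($input =~ /\G[ \t]+/gc) {
            next;                            # separators are dropped
        }
        elsif ($right_comments && length($commentchar)
               && $input =~ /\G(${cc}.*)\z/gcs) {
            push @tokens, $1;                # rest of the line is one token
            last;
        }
        elsif ($input =~ /\G("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')/gc) {
            push @tokens, $1;                # quoted text stays together
        }
        elsif ($input =~ /\G([^ \t]+)/gc) {
            push @tokens, $1;                # ordinary bare token
        }
    }
    return @tokens;
}

print join('|', tokenize_sketch('Alias "/my docs" /srv/docs # right comment')), "\n";
# prints: Alias|"/my docs"|/srv/docs|# right comment
```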

void addValue($node,  
 $data,  
 $multiple,  
 $tag); 
config4gnu::CfgObject  $node - The node to add the value to;
string  $data - The value to add to $node;
boolean  $multiple - Whether the property you're adding/setting values of can have multiple values. (default = false);
string  $tag - The tag to use for the value tag;

Adds $data to $node using the appropriate XML tags. If $tag is specified, it is used for the XML tag, otherwise value is used (you should not need to specify $tag except in very strange situations). Type conversion is done automatically. Any trailing newline is trimmed, and if $data is empty after trimming, it is not added. If $multiple is false, any existing value will be replaced with $data, otherwise $data will be added as another value, preserving existing values.
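The replace-versus-append behavior controlled by $multiple can be sketched with a plain hash standing in for the node. Everything here is hypothetical (add_value_sketch, the values key); the real function works on config4gnu::CfgObject nodes and also performs type conversion, which this sketch leaves out.

```perl
use strict;
use warnings;

# Hypothetical model: a node is a hash with a 'values' array reference.
sub add_value_sketch {
    my ($node, $data, $multiple) = @_;
    $data =~ s/\n\z//;                       # trim one trailing newline
    return if $data eq '';                   # empty values are not added
    if ($multiple) {
        push @{ $node->{values} }, $data;    # keep existing values
    }
    else {
        $node->{values} = [$data];           # replace any existing value
    }
    return;
}

my $node = { values => [] };
add_value_sketch($node, "80\n");
add_value_sketch($node, "8080");             # replaces 80
add_value_sketch($node, "8443", 1);          # appended after 8080
print join(',', @{ $node->{values} }), "\n"; # prints: 8080,8443
```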

void addValueArray($node,  
 $data); 
config4gnu::CfgObject  $node - The node to add the values to;
string[]  $data - The array of values to add to $node;

Adds each item in $data as a value of $node using the appropriate XML tags. If VALUETAGS has an entry for the tagname of $node, its contents are used in order for each value tag. If more entries exist in VALUETAGS than in $data, the extra tags are NOT created. If more entries exist in $data than in VALUETAGS, the last entry in VALUETAGS is used for any extra value items. This means, for example, that if a directive has a set of parameters with different meaning followed by a variable number of parameters with identical meaning, you can put in VALUETAGS an array containing one entry for each different parameter and then an extra entry to specify the tag to be used for all of the trailing identical parameters. Type conversion is done and any trailing newline is trimmed and empty values are ignored, just as in addValue.
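A minimal sketch of how tags might be paired with values under these rules follows. The function assign_value_tags is hypothetical and returns tag/value pairs instead of building XML. It also assumes that ignored empty values do not consume a position in the tag list, which the description above leaves open.

```perl
use strict;
use warnings;

# Hypothetical pairing of VALUETAGS entries with data items.
sub assign_value_tags {
    my ($tags, $data) = @_;
    my @tags = @{ $tags || [] };
    my @pairs;
    my $i = 0;
    for my $item (@$data) {
        (my $value = $item) =~ s/\n\z//;     # trim one trailing newline
        next if $value eq '';                # ASSUMPTION: skipped, no tag used
        my $tag = !@tags              ? 'value'      # no VALUETAGS entry
                : $i < scalar(@tags)  ? $tags[$i]    # positional tag
                :                       $tags[-1];   # reuse the last tag
        push @pairs, [$tag, $value];
        $i++;
    }
    return @pairs;
}

# Two distinct leading parameters, then any number of identical ones.
my @pairs = assign_value_tags(['source', 'target', 'option'],
                              ['/a', '/b', 'ro', 'noexec']);
print join(' ', map { "$_->[0]=$_->[1]" } @pairs), "\n";
# prints: source=/a target=/b option=ro option=noexec
```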

config4gnu::CfgObject newSection($parent,  
 $childName,  
 $tagName); 
config4gnu::CfgObject  $parent - The element to add the new section to;
string  $childName - The name of the child to add (optional);
string  $tagName - The XML tag to use for the child (optional);

Appends a new section element to $parent and returns a pointer to the new element. If no optional parameters are given, the XML tag will be section. If $tagName is given, its value will be used for the XML tag and $childName will be included in the XML but not used to determine the XML tag. If $childName is given and $tagName is not, $childName will be used to automatically determine the appropriate XML tag. Normally, you should only specify $childName and customize the XML tag using parser options.

config4gnu::CfgObject newProperty($parent,  
 $childName,  
 $tagName); 
config4gnu::CfgObject  $parent - The element to add the new property to;
string  $childName - The name of the child to add (optional);
string  $tagName - The XML tag to use for the child (optional);

Appends a new property element to $parent and returns a pointer to the new element. If no optional parameters are given, the XML tag will be property. If $tagName is given, its value will be used for the XML tag and $childName will be included in the XML but not used to determine the XML tag. If $childName is given and $tagName is not, $childName will be used to automatically determine the appropriate XML tag. Normally, you should only specify $childName and customize the XML tag using parser options.

config4gnu::CfgObject enterSection($name); 
string  $name - The name of the section being entered;

Calls newSection to add a section named $name as a child of the current section and returns a pointer to it. This is similar to calling newSection directly, except that the current section is tracked and basic sanity checking is performed for you. You should use this function even if your sections aren't nested, to make your parser easier to understand; just remember to leave a section before you enter the next one if they're all at the same depth level (such as with INI-style config files).

void leaveSection($name); 
string  $name - The name of the section being left;

Checks that $name matches the name of the current section (to help detect syntax errors) and makes the current section's parent section the new current section. If $name is not specified, no checking is performed, which is useful for config files which don't use nested sections (e.g., INI). Note that calling leaveSection when the current section is _XMLROOT results in no change in the current section, even if called from a recursive instance of a parser whose _XMLROOT is not the true root node.
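The bookkeeping that enterSection and leaveSection perform can be modeled with plain hashes. This is only a sketch: new_tracker, enter_section_sketch and leave_section_sketch are hypothetical names, nodes are simple hashes in place of CfgObject elements, and raising an error with die on a name mismatch is an assumption (the text above only says the name is checked).

```perl
use strict;
use warnings;

# Hypothetical tracker: the root node stands in for _XMLROOT.
sub new_tracker {
    my $root = { name => 'root', parent => undef, children => [] };
    return { root => $root, current => $root };
}

sub enter_section_sketch {
    my ($tracker, $name) = @_;
    my $section = { name => $name, parent => $tracker->{current}, children => [] };
    push @{ $tracker->{current}{children} }, $section;
    $tracker->{current} = $section;
    return $section;
}

sub leave_section_sketch {
    my ($tracker, $name) = @_;
    my $current = $tracker->{current};
    return unless defined $current->{parent};    # at the root: no change
    # ASSUMPTION: a mismatched name is reported as a fatal error.
    die "expected end of '$current->{name}', got '$name'\n"
        if defined $name && $name ne $current->{name};
    $tracker->{current} = $current->{parent};
    return;
}

# INI-style use: leave the previous section before entering the next one.
my $tracker = new_tracker();
for my $name ('global', 'homes', 'printers') {
    leave_section_sketch($tracker);              # harmless at the root
    enter_section_sketch($tracker, $name);
}
print scalar @{ $tracker->{root}{children} }, "\n";   # prints: 3
```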

config4gnu::CfgObject getCurSection();

Returns the current section if enterSection and leaveSection have been used in the parser.

void endFile($endcomment); 
string  $endcomment - the ending comment text;

Appends a fileend tag to the current section, signifying the end of the file. Adds $endcomment as the comment of the fileend tag, which is automatically appended to the file during saving. All parsers should call this function when they finish, whether or not the file has an endcomment.

int locateIncludeStart($node); 
config4gnu::CfgObject  $node - current section;

Returns the index of the entry after the include tag so that a subparser starts at the correct node. Must be called by a recursively called parser to find its starting point. See Common::Ini for an example.

int locateIncludeEnd($node,  
 $curpos); 
config4gnu::CfgObject  $node - current section;
int  $curpos - current position in children list of $node;

Locates the fileend node which matches the filename of the include tag encountered at $curpos. Must be called at end of loop where a recursive parser could have been started so children belonging to include file are skipped by outer parser.

Saving

These functions are applicable during saving.

config4gnu::CfgObjectVector getValues($node); 
config4gnu::CfgObject  $node - The node to get the values of;

Returns a list of all immediate value children of $node, doing any necessary type conversions. Note that value children are those which extend the type value; this is not based on the tag name.

string getValueString($node,  
 $delimit); 
config4gnu::CfgObject  $node - The node to get the value string of;
boolean  $delimit - whether or not to delimit the values with spaces (optional, default = 1);

Returns a string representation of all immediate value children of $node. Type conversion is done automatically. A space character is inserted at the beginning of the string and between each value unless 0 is passed as a parameter. If no values are found, an empty string is returned.
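The spacing rule can be shown with a few lines operating on a plain list of strings. The function value_string_sketch is hypothetical, it skips type conversion, and it assumes that passing 0 joins the values with nothing between them.

```perl
use strict;
use warnings;

# Hypothetical stand-in for getValueString() working on an array reference.
sub value_string_sketch {
    my ($values, $delimit) = @_;
    $delimit = 1 unless defined $delimit;        # documented default
    return '' unless @$values;                   # no values: empty string
    my $lead = $delimit ? ' ' : '';
    return join '', map { $lead . $_ } @$values; # space before every value
}

print 'Listen' . value_string_sketch(['80', 'http']), "\n";   # prints: Listen 80 http
```

Because the string starts with a space when delimiting is on, it can be appended directly after a directive name.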

config4gnu::CfgObjectVector getSections($parent); 
config4gnu::CfgObject  $parent - The element to be searched.;

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as section tags in the current parser. If called in a recursive instance with _XMLROOT as its parameter, it instead returns a CfgObjectVector containing only the root node, to make handling recursion easier for individual parsers.

config4gnu::CfgObjectVector getProperties($parent); 
config4gnu::CfgObject  $parent - The element to be searched.;

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as property tags in the current parser.

config4gnu::CfgObjectVector getPropertiesAndSections($parent); 
config4gnu::CfgObject  $parent - The element to be searched.;

Returns a config4gnu::CfgObjectVector containing pointers to all immediate children which are defined as either property or section tags in the current parser. The only tags not returned are data/attribute tags such as value, comment, etc. This function is equivalent to combining getSections and getProperties, except that the original order of the children is preserved. It is also faster than calling both functions separately and merging their output, so it should normally be used instead.

void resetDirectiveRun();

Resets the current directive run, if any. You should call this when a situation is encountered where directive runs should be terminated, otherwise the run may be incorrectly assumed to continue. For example, it should be called when you enter or leave a section.

string makeColumnValueString($items,  
 $current); 
config4gnu::CfgObjectVector  $items;
int  $current;

Returns a columnized representation of the values of the element at index $current in $items. The $items variable will be automatically tested as needed to detect a directive run, and the columns will be formatted to fit the largest values in each column.
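The column formatting idea can be sketched independently of the parser classes. The helpers column_widths and padded_row are hypothetical and work on plain arrays of strings; the real function detects the directive run itself and keeps the widths in _DIRECTIVERUNCOLS.

```perl
use strict;
use warnings;

# Widest value seen in each column across all rows of a run.
sub column_widths {
    my @rows = @_;
    my @widths;
    for my $row (@rows) {
        for my $i (0 .. $#$row) {
            my $len = length $row->[$i];
            $widths[$i] = $len if !defined $widths[$i] || $len > $widths[$i];
        }
    }
    return @widths;
}

# Left-justify each value to its column width, then drop trailing padding.
sub padded_row {
    my ($row, @widths) = @_;
    my $line = join ' ', map { sprintf '%-*s', $widths[$_], $row->[$_] } 0 .. $#$row;
    $line =~ s/\s+\z//;
    return $line;
}

# Values of two consecutive occurrences of the same directive (made-up data).
my @run = (
    ['text/html',               '.html'],
    ['application/x-httpd-php', '.php'],
);
my @widths = column_widths(@run);
print 'AddType ', padded_row($_, @widths), "\n" for @run;
```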

Private Functions

The following private functions are defined in CFGXML::Parser and should normally be used only by other functions defined in CFGXML::Parser. You could also use them in custom functions which you make, or rewrites of existing functions needed for your parser. However, in both cases you should generalize your function so that it can add to or replace existing public CFGXML::Parser functions, unless the change is very specific to only your parser. For example, it may be necessary to extend the _strTo and _strFrom functions to do what you need them to do.

void _setAction($action); 
string  $action - Either load or save;

Sets _LOADING to 1 if $action is load, 0 otherwise. This is called automatically by run using the same parameter it was called with.

void _start();

If loading, reads STDIN into the _CONTENTS array and initializes the XML tree in _XMLROOT. If saving, reads STDIN into an XML tree. Called automatically by run.

void _finish();

If loading, prints _XMLROOT as a string to STDOUT unless the current instance was called recursively from another parser. If saving, prints the _OUTPUT string to STDOUT. Called automatically by run.

void _fileend($node); 
config4gnu::CfgObject  $node - The fileend node which signified the end of the file;

Called automatically during saving, prints out endcomment stored in $node's comment property.

string _makePaddedValueString($item); 
config4gnu::CfgObject  $item - Node to get the values of;

Returns a string containing the columnized representation of $item's value tags based on _DIRECTIVERUNCOLS, which is made automatically by makeColumnValueString, which is the only function which should call _makePaddedValueString.

string _strFromBoolean($origvalue,  
 $booleanval); 
string  $origvalue;
string  $booleanval;

Converts $booleanval into a string based on how $origvalue represented the value in the original config file. If $origvalue's true/falseness is the same as what is specified by $booleanval, it returns $origvalue since no conversion is necessary. Otherwise, it returns the opposite version of $origvalue. You can set BOOLEANRULES to modify its behavior, so you probably don't need to override it.

array _strToBoolean($value);
string $value;

Returns an array containing either "true" or "false" to indicate the true/falseness of $value, and a cleaned up version of $value for storage in the XML. You can set BOOLEANRULES to modify its behavior, so you probably don't need to override it.

void _buildBooleanRules();

Builds the _BOOLEANRULES and _BOOLEANREVERSERULES based on the BOOLEANRULES option.
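A standalone sketch of how the two lookup tables could be derived from a BOOLEANRULES-style hash, and how they support the conversion described for _strFromBoolean, is shown here. The names build_boolean_tables and from_boolean_sketch are hypothetical, and falling back to the plain "true"/"false" string for an unrecognized original value is an assumption.

```perl
use strict;
use warnings;

# A subset of the documented default true => false pairs.
my %boolean_rules = (
    true    => 'false',
    yes     => 'no',
    1       => 0,
    Yes     => 'No',
    enabled => 'disabled',
);

# Build a truth table (spelling => "true"/"false") and a reverse table
# (false spelling => matching true spelling).
sub build_boolean_tables {
    my ($rules) = @_;
    my (%truth, %reverse);
    for my $true (keys %$rules) {
        my $false = $rules->{$true};
        $truth{$true}    = 'true';
        $truth{$false}   = 'false';
        $reverse{$false} = $true;
    }
    return (\%truth, \%reverse);
}

# Keep the original spelling when it already has the wanted truth value,
# otherwise return its counterpart from the rules.
sub from_boolean_sketch {
    my ($orig, $boolval, $rules, $truth, $reverse) = @_;
    my $orig_truth = $truth->{$orig};
    return $boolval unless defined $orig_truth;   # ASSUMPTION: unknown spelling
    return $orig if $orig_truth eq $boolval;
    return $boolval eq 'true' ? $reverse->{$orig} : $rules->{$orig};
}

my ($truth, $reverse) = build_boolean_tables(\%boolean_rules);
print from_boolean_sketch('Yes', 'false', \%boolean_rules, $truth, $reverse), "\n";   # No
print from_boolean_sketch('no',  'true',  \%boolean_rules, $truth, $reverse), "\n";   # yes
```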

string _strFromString($value,  
 $quotetype); 
string  $value;
int  $quotetype;

Converts $value into a string and returns it. If $quotetype is 1 or 2, single or double quotes, respectively, are put around $value, and any internal quotes of the same type are escaped.

array _strToString($value);
string $value;

Returns an array containing the new string value and an integer representing how/if $value was quoted. The new value is trimmed on both ends, then any enclosing quotes are removed and escaped quotes of the same type are unescaped. For the second value, 0 = no quoting, 1 = enclosed by single quotes, 2 = enclosed by double quotes.
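The quoting round trip described for _strToString and _strFromString can be sketched as a pair of standalone functions. The names str_to_string_sketch and str_from_string_sketch are hypothetical, and the sketch assumes that a backslash is the escape character for embedded quotes, which the text above implies but does not state.

```perl
use strict;
use warnings;

# Returns (unquoted value, quote type): 0 = none, 1 = single, 2 = double.
sub str_to_string_sketch {
    my ($value) = @_;
    $value =~ s/^\s+|\s+$//g;                     # trim both ends
    if ($value =~ /^'(.*)'$/s) {
        (my $inner = $1) =~ s/\\'/'/g;            # unescape single quotes
        return ($inner, 1);
    }
    if ($value =~ /^"(.*)"$/s) {
        (my $inner = $1) =~ s/\\"/"/g;            # unescape double quotes
        return ($inner, 2);
    }
    return ($value, 0);
}

# Re-applies the recorded quote type, escaping embedded quotes of that type.
sub str_from_string_sketch {
    my ($value, $quotetype) = @_;
    $quotetype ||= 0;
    my $quote = $quotetype == 1 ? "'" : $quotetype == 2 ? '"' : '';
    return $value if $quote eq '';
    (my $escaped = $value) =~ s/\Q$quote\E/\\$quote/g;
    return $quote . $escaped . $quote;
}

my ($value, $type) = str_to_string_sketch(q{  "say \"hi\""  });
print "$value ($type)\n";                         # prints: say "hi" (2)
print str_from_string_sketch($value, $type), "\n";
```

Storing the quote type alongside the value is what lets a saved file reproduce the original quoting style.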

array _readFile($filename);
string $filename;

Returns an array containing each line of the contents of $filename. Note that the CFG library may rewrite the filename.

void _writeFile($contents,  
 $filename); 
string  $contents;
string  $filename;

Writes $contents to $filename. Note that the CFG library may rewrite the filename.
