Rules Persistence System

From PCGen Wiki
Revision as of 01:11, 5 April 2009 by Tom Parker

Background

This document is primarily intended to communicate the design of the PCGen Rules Persistence System. See the Overall System Figure (block diagram of proposed CDOM structure) to get an understanding of the place of the Rules Persistence System within the entire PCGen code base and architecture.

This document provides a detailed overview of the architecture of a specific portion of PCGen. The overall architecture and further details of other subsystems and processes are provided in separate documents available on the Architecture page.

Overview

The Rules Persistence System is one of the major components of PCGen. It is responsible for loading game system and component data from the persistence data file format and saving it back into that data file format. It is aware of the internal storage of information within PCGen only to the point it is required to store that information for use by the core of PCGen. The Rules Persistence System is not capable of interpreting much in the way of meaning of the values it is storing.

This document describes the Rules Persistence System, and provides guidance on how to interact with the interface/API of the Rules Persistence System.

Key Design Decisions

Token/File Loader System

File Loaders are key components of the Rules Persistence System. File Loader instances are specific to a given file type. When processing a file, the File Loader splits the file into separate lines, splits the lines (if necessary) into separate tags, and then submits the tags to the Tokens.

Underlying Requirement(s): Information Hiding, Data Encapsulation, Increased Flexibility

Basis: This abstracts specific individual components of the data persistence format from the internal data structure (and each other).

Implementation: The interactions of Tokens, File Loaders and other elements of the Rules Persistence System are shown in the "Flow of Data Load" figure.

File Loaders are created by the Rules Persistence System for each file type. While these may share a single Class (for code reuse), each instance is specialized to a specific data persistence file type. The list of available tags is also specific to a given data persistence file type. This allows features to be limited to certain objects to avoid nonsensical situations (e.g. you can't assign material components to a Race). A collection of Global tags that can be used in nearly all data persistence files is also available.

Each Token Class is stored in a separate file, independent of the core of PCGen, to allow each token to be independently updated, removed, or otherwise manipulated without altering or impacting other Tokens. Individual Token files are in the pcgen.plugin.lsttokens package. These persistence Tokens are non-abstract Classes that implement the LstToken interface. When PCGen is launched, all plugins of PCGen are evaluated, and Tokens are specifically placed into the TokenLibrary. The Tokens each have a method (getTokenName()) that identifies the tag the Token processes.

Keeping each Token in an individual class keeps the Token Classes very simple, which makes them easy to test, modify, and understand (as they are effectively atomic to the processing of a specific token). One goal of the PCGen Rules Persistence System is to ensure that all of the parsing of LST files is done within the Tokens and not in the core of PCGen. This makes adding new tags to the LST files reasonably painless (though changes to the core or export system may also be needed to provide the required functionality). At a minimum, it facilitates the long-term goal of altering the behavior of PCGen without forcing a recompile of core PCGen code.
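
The load flow described in this section can be sketched roughly as follows. This is a minimal illustration, assuming tags on a line are tab-separated and that the tag name ends at the first colon. Every type here (SketchToken, SketchTokenLibrary, SketchFileLoader, HandsSketchToken) is a hypothetical stand-in rather than the actual LstToken, TokenLibrary or File Loader code; only the getTokenName() method name is taken from the text above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a persistence token; not the real LstToken interface. */
interface SketchToken {
    String getTokenName();

    boolean parse(Map<String, String> target, String value);
}

/** Hypothetical registry playing the role the text gives to the TokenLibrary. */
class SketchTokenLibrary {
    private final Map<String, SketchToken> tokens = new HashMap<>();

    void register(SketchToken token) {
        // The token itself reports which tag it processes.
        tokens.put(token.getTokenName(), token);
    }

    SketchToken get(String tagName) {
        return tokens.get(tagName);
    }
}

/** Hypothetical loader: splits a line into tags and hands each value to its token. */
class SketchFileLoader {
    private final SketchTokenLibrary library;

    SketchFileLoader(SketchTokenLibrary library) {
        this.library = library;
    }

    /** Returns the number of tags that were rejected (unknown tag or failed parse). */
    int loadLine(String line, Map<String, String> target) {
        int rejected = 0;
        // Assumption: tab-separated tags, tag name ends at the first colon.
        for (String tag : line.split("\t")) {
            int colon = tag.indexOf(':');
            if (colon < 0) {
                rejected++;
                continue;
            }
            SketchToken token = library.get(tag.substring(0, colon));
            if (token == null || !token.parse(target, tag.substring(colon + 1))) {
                rejected++;
            }
        }
        return rejected;
    }
}

/** Example token for a HANDS tag; the integer-only rule is an assumption for illustration. */
class HandsSketchToken implements SketchToken {
    public String getTokenName() {
        return "HANDS";
    }

    public boolean parse(Map<String, String> target, String value) {
        if (!value.matches("\\d+")) {
            return false;
        }
        target.put("HANDS", value);
        return true;
    }
}
```

Because the loader only knows tag names and the registry, a new tag can be supported by registering one more token class without touching the loader.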

On Transition from PCGen 5.14: PCGen 5.14 used a slightly different storage system for tokens. It stored tokens in a TokenStore, which was effectively a Map of Maps. The key of the first Map was an interface identifying the type of Token, and the second Map was used to identify the token from the tag name. As the conversion to the new Token system is being done gradually, some tokens in PCGen may remain in the PCGen 5.14 style. New tokens will always extend CDOMToken (perhaps indirectly through another Interface). Many GameMode tokens remain in the TokenStore, rather than in the new TokenLibrary.

Data Modification during Data Load

The Rules Persistence System supports modifying, copying or forgetting objects defined in the data persistence files.

Underlying Requirement(s): Information Hiding, Data Encapsulation, Increased Flexibility

Basis: This allows users to modify base data to easily produce new Races, Abilities, or other items without risk of copy/paste error.

Implementation: The data persistence file format supports three special functions that can be performed on data persistence entries.

.COPY: allows a data file to copy an existing object. This .COPY entry need not worry about file load order (see Data Persistence File Load Order Independence). The value preceding the .COPY string identifies the object to be copied. This identifier is the KEY (or KEY and CATEGORY) of the object to be copied. The identifier for the copied object is placed after an equals sign that follows the .COPY string, e.g.: Dodge.COPY=MyDodge. If more than one .COPY entry produces an object with the same identifier, then a duplicate object error will be generated.
.MOD: allows a data file to modify an existing object. This .MOD entry need not worry about file load order (see Data Persistence File Load Order Independence). All .MOD entries will be processed after all .COPY entries, regardless of the source file. The value preceding the .MOD string identifies the object to be modified. This identifier is the KEY (or KEY and CATEGORY) of the object to be modified.
.FORGET: allows a data file to remove an existing object from the Rules Data Store. This .FORGET entry need not worry about file load order (see Data Persistence File Load Order Independence). All .FORGET entries will be processed after all .COPY and .MOD entries, regardless of the source file. The value preceding the .FORGET string identifies the object to be removed from the Rules Data Store.
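
The ordering rule for these three functions can be sketched as a queue-then-apply scheme. This is a rough illustration under stated assumptions, not PCGen's actual implementation: the ModificationSketch class, its methods, and the reduction of the Rules Data Store to a key-to-text map are all hypothetical, and the "TYPE:Custom" value in the usage note is made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of deferred .COPY/.MOD/.FORGET handling; not PCGen's real classes. */
class ModificationSketch {
    // Rules data reduced to key -> description, purely for illustration.
    final Map<String, String> store = new LinkedHashMap<>();
    final List<String> errors = new ArrayList<>();
    private final List<String[]> copies = new ArrayList<>();
    private final List<String[]> mods = new ArrayList<>();
    private final List<String> forgets = new ArrayList<>();

    /** Queue an entry such as "Dodge.COPY=MyDodge"; nothing is applied yet. */
    void queue(String entryName, String restOfLine) {
        int copy = entryName.indexOf(".COPY=");
        if (copy >= 0) {
            copies.add(new String[] {entryName.substring(0, copy), entryName.substring(copy + 6)});
        } else if (entryName.endsWith(".MOD")) {
            mods.add(new String[] {entryName.substring(0, entryName.length() - 4), restOfLine});
        } else if (entryName.endsWith(".FORGET")) {
            forgets.add(entryName.substring(0, entryName.length() - 7));
        }
    }

    /** Apply every queued entry in the fixed order: copies, then mods, then forgets. */
    void applyAll() {
        for (String[] c : copies) {
            if (!store.containsKey(c[0])) {
                errors.add("missing source: " + c[0]);
            } else if (store.containsKey(c[1])) {
                errors.add("duplicate object: " + c[1]);
            } else {
                store.put(c[1], store.get(c[0]));
            }
        }
        for (String[] m : mods) {
            if (store.containsKey(m[0])) {
                store.put(m[0], store.get(m[0]) + "|" + m[1]);
            } else {
                errors.add("missing target: " + m[0]);
            }
        }
        for (String f : forgets) {
            store.remove(f);
        }
    }
}
```

With this shape, a file may queue "MyDodge.MOD" and "Dodge.FORGET" before "Dodge.COPY=MyDodge" is seen; after applyAll(), MyDodge exists with the modification applied and Dodge has been removed, which is the load order independence the entries above rely on.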

Subtokens

Some tags have complex behavior that significantly differs based on the first argument in the value of the tag. In order to simplify tag parsing and Token code, these Tokens implement a Sub-token structure, which delegates parsing of the tag value to a Token specialized to the first argument in the value of the tag.

Underlying Requirement(s): Data Encapsulation, Information Hiding, Increased Flexibility

Basis: This design is primarily intended to separate out code for different subtokens. This provides increased ability to add new subtokens without altering existing code. This provides increased flexibility for developers, and ensures that unexpected side effects from code changes don't impact other features of PCGen.

Implementation: The flow of events during Data Load when Subtokens are present is shown as an optional series of events in the "Flow of Data Load" figure.

The LoadContext is capable of processing subtokens for a given Token. Any token which delegates to subtokens can call processSubToken(T, String, String, String) from LoadContext in order to delegate to subtokens. This delegation will return a boolean value to indicate success (true) or failure (false) of the delegation. The exact cause of the failure is reported to the Logging utility.

Note that it is legal for a subtoken to only be valid in a single object type (such as a Race), even if the "primary" token is accepted universally. This greatly simplifies the restriction of subtokens to individual file types without producing burden on the primary token to establish legal values. Resolution of those restrictions is handled entirely within the LoadContext and its supporting classes.
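
A minimal sketch of this delegation follows. It assumes that the three String arguments of processSubToken are the token name, the subtoken key and the remaining value, and that the first argument of a tag value is separated from the rest by a pipe character; neither detail is stated above. SubTokenSketch, SubTokenContextSketch and ParentTokenSketch are hypothetical names, and the real LoadContext lookup is not shown in this document.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical subtoken: handles one first-argument form of a parent tag. */
interface SubTokenSketch {
    boolean parse(Map<String, String> target, String value);
}

/** Hypothetical stand-in for the delegation role the text assigns to LoadContext. */
class SubTokenContextSketch {
    // Lookup key is "PARENT/SUBKEY"; the real lookup structure is an unknown here.
    private final Map<String, SubTokenSketch> subTokens = new HashMap<>();

    void register(String parent, String key, SubTokenSketch sub) {
        subTokens.put(parent + "/" + key, sub);
    }

    /** Assumed argument meaning: target object, token name, subtoken key, remaining value. */
    boolean processSubToken(Map<String, String> target, String tokenName, String key, String value) {
        SubTokenSketch sub = subTokens.get(tokenName + "/" + key);
        if (sub == null) {
            // Stands in for reporting the cause of the failure to the Logging utility.
            System.err.println("Unknown subtoken " + key + " for " + tokenName);
            return false;
        }
        return sub.parse(target, value);
    }
}

/** Parent token: splits off the first argument and delegates the remainder. */
class ParentTokenSketch {
    static boolean parse(SubTokenContextSketch context, Map<String, String> target,
            String name, String value) {
        // Assumption: a pipe separates the first argument from the rest of the value.
        int pipe = value.indexOf('|');
        if (pipe < 0) {
            return false;
        }
        return context.processSubToken(target, name,
                value.substring(0, pipe), value.substring(pipe + 1));
    }
}
```

The parent token never enumerates its legal first arguments; adding a subtoken is a registration, which is how a subtoken can be limited to one object type without burdening the primary token.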

Rules Persistence System I/O

The input and output of data persistence information should be an integral part of the Rules Persistence System. In versions up to and including PCGen 5.14, Tokens and the Rules Persistence System were only responsible for input from the data persistence file format. Starting with PCGen 5.16, the Rules Persistence System is responsible for both input and output of Tokens.

Underlying Requirement(s): Data Encapsulation, Information Hiding

Basis: Adding output to the persistence system provides the ability to reuse the Rules Persistence System in a data file editor, as well as the runtime system. This sharing of code helps to guarantee the integrity of the data file editor. Such a structure also facilitates unit testing, as the Rules Persistence System can be tested independently of the core code.

Implementation: Each token has the ability to both "parse" and "unparse" information for the Rules Persistence System. Parsing is the act of reading a token value from a data persistence file and placing it into the internal rules data structure. Unparsing is the act of reading the internal data structure and writing out the appropriate syntax into a data persistence file.

In addition to other benefits, this parse/unparse structure allows Tokens to be tested without major dependence on other components of PCGen. These tests are found in plugin.lsttokens package of the code/src/utest source directory.

As explained in the Token/File Loader System section, the File Loaders separate out the tags in an input file and call the parse method on the appropriate Tokens. In order to unparse a loaded object back to the data persistence syntax, all Tokens that could be used in the given object type must be called (this makes unparse a bit more CPU intensive than parse).

Unparsing a particular object is managed by the unparse(T) method of LoadContext. This process includes delegation of the unparse to all subtokens (see the Subtokens section), as depicted in the "Flow of Data Unload" figure.

Because all tokens are called when unparsing an object, it is important that tokens properly represent when they are not used. This is done by returning null from the unparse method of the Token.

Some tokens can be used more than once in a given object (e.g. BONUS), and thus must be capable of indicating each of the values for the multiple tag instances. Since Tokens do not maintain state, the unparse method must only be called a single time to get all of the values; thus, the unparse method returns an array of String objects to indicate the list of values for each instance of the tag being unparsed.

The context is responsible for including the name of the tag in the unparsed result. Just as the token is not responsible for removing/ignoring the name of the tag in the value passed into the parse method, it does not prepend the name of the tag to the value(s) returned from the unparse method. (This also happens to simplify the conversion and compatibility systems.)
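
The unparse contract described above (null when unused, one array entry per tag instance, tag name added by the context) might look roughly like this. The interface and class names are hypothetical, the object is reduced to a map of tag name to stored values, and the real CDOMToken and LoadContext signatures are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical token shape with only the unparse half; not the real CDOMToken interface. */
interface UnparseTokenSketch {
    String getTokenName();

    /** Returns one value per tag instance, or null when the token is unused on the object. */
    String[] unparse(Map<String, List<String>> obj);
}

/** Hypothetical stand-in for the unparse role the text assigns to the context. */
class UnparseContextSketch {
    static List<String> unparse(Map<String, List<String>> obj, List<UnparseTokenSketch> allTokens) {
        List<String> tags = new ArrayList<>();
        // Every token that could apply is asked, which is why unparse costs more than parse.
        for (UnparseTokenSketch token : allTokens) {
            String[] values = token.unparse(obj);
            if (values == null) {
                continue; // token not used on this object
            }
            for (String value : values) {
                // The context, not the token, prepends the tag name.
                tags.add(token.getTokenName() + ":" + value);
            }
        }
        return tags;
    }
}

/** Generic token that reads back whatever values are stored under its own name. */
class StoredValuesTokenSketch implements UnparseTokenSketch {
    private final String name;

    StoredValuesTokenSketch(String name) {
        this.name = name;
    }

    public String getTokenName() {
        return name;
    }

    public String[] unparse(Map<String, List<String>> obj) {
        List<String> values = obj.get(name);
        return (values == null || values.isEmpty()) ? null : values.toArray(new String[0]);
    }
}
```

Returning an array from a single call is what lets a stateless token report a repeatable tag such as BONUS without being called once per instance.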

Independent Data Persistence Interface

The Data Persistence format must be independent of internal data structure. (The subsystems of PCGen other than the Rules Persistence System should not have detailed knowledge of the data persistence file format).

Underlying Requirement(s): Information Hiding, Catch Errors Early

Basis: This abstracts the data persistence format from the internal data structure. It forces the entire persistence contents to be parsed on data load. This ensures any errors in data files are caught in the Rules Persistence System at data load, rather than at runtime.

Implementation: During the load of data from the data persistence format, each Token is required to fully parse the information and validate the information as much as possible. This ensures that errors in the data files are caught as they are loaded, and not at runtime. The Rules Persistence System is responsible for ensuring data integrity of the rules data to the rest of the PCGen system, and the Tokens are the "front lines" of fulfilling that responsibility.

Beyond the tokens, a load subsystem translates between the data persistence file format parsed by the Tokens and the internal data structure. This system arguably fits the Data Mapper design pattern, although it is not strictly using relational databases. This system is currently known as a LoadContext. The details of translation take various forms, and those structures are explained in later sections.

Only valid Tokens may impact the Rules Data Store

There is a risk that a partially-parsed Token from an invalid data persistence entry could lead to an unknown state within the Rules Data Store. Therefore, a Token should only impact the state of the Rules Data Store if the token parse completes successfully. The Token should not be responsible for tracking successful completion; rather the load subsystem implements a 'unit of work' design pattern to ensure only valid tokens impact the Rules Data Store.

Underlying Requirement(s): Information Hiding

Basis: This greatly simplifies the implementation of Tokens, as they are not required to analyze or defer method calls to the LoadContext until after the data persistence syntax is established to be valid.

Implementation: During the load of data from the data persistence format, each Token may fully parse the provided value and make any necessary calls to the LoadContext. This can be done even if subsequent information indicates to the Token that there is an error in the Token value. Specifically, individual Tokens are free to take any action on the LoadContext, and are not responsible for the consequences of those method calls unless the Token indicates that the value from the data persistence format is valid. The Token indicates validity by returning true from its parse method.

If a Token returns true, indicating the token was valid, then the File Loader that called the Token is responsible for indicating to the LoadContext to commit() the changes defined by the Token. This process is shown in the "Transaction Success Response" section of the "Flow of Data Load" figure.

If the Token returns false, then the File Loader is responsible for calling the rollback() method of LoadContext to indicate that no changes should be made to the Rules Data Store and that the tentative changes proposed by the Token should be discarded. This process is shown in the "Transaction Failure Response" section of the "Flow of Data Load" figure.
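
The commit and rollback behavior can be sketched as a small unit of work. This is an illustration only: commit() and rollback() are the method names given above, while TransactionalContextSketch, its put() method, RangeTokenSketch and LoaderSketch are hypothetical, and the "min,max" value format is invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical unit-of-work context; only commit() and rollback() are named in the text. */
class TransactionalContextSketch {
    final Map<String, String> ruleStore = new HashMap<>();
    private final List<String[]> pending = new ArrayList<>();

    /** Tokens call this freely; nothing reaches the store until commit(). */
    void put(String key, String value) {
        pending.add(new String[] {key, value});
    }

    void commit() {
        for (String[] change : pending) {
            ruleStore.put(change[0], change[1]);
        }
        pending.clear();
    }

    void rollback() {
        pending.clear();
    }
}

/** Hypothetical token that writes before it has finished validating. */
class RangeTokenSketch {
    /** Parses "min,max"; writes min immediately, then may still reject the value. */
    static boolean parse(TransactionalContextSketch context, String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            return false;
        }
        context.put("min", parts[0]);
        if (!parts[1].matches("\\d+")) {
            return false; // "min" is already pending; the loader will discard it
        }
        context.put("max", parts[1]);
        return true;
    }
}

/** The loader decides between commit and rollback from the token's return value. */
class LoaderSketch {
    static boolean process(TransactionalContextSketch context, String value) {
        boolean valid = RangeTokenSketch.parse(context, value);
        if (valid) {
            context.commit();
        } else {
            context.rollback();
        }
        return valid;
    }
}
```

The token never has to undo its own partial work, which is the simplification the Basis paragraph above describes.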

The Load Commit Subsystem provides additional detail.

Data Persistence File Load Order Independence

Items in the rules structure may refer to each other, by granting certain features, possessing certain prerequisites, or by other means. For example, an Ability A may grant Ability B, but we cannot reasonably require that Ability A appears before Ability B. More specifically, due to known interactions, it is impossible to choose a load order for files and entries that guarantees objects will be constructed before references to those objects are encountered. Order independence of persistent data is therefore an architectural requirement.

Underlying Requirement(s): Information Hiding, Data Encapsulation, Catch Errors Early

Basis: Allowing references to objects before those objects are constructed lets the data persistence file syntax be fully parsed during load. This improves error catching capability at load time and should improve runtime performance.

Implementation: A two-pass load system is required in order to ensure separation of the data persistence format and the internal data structure. In PCGen 5.16, any Token may request a reference to an object, regardless of whether that object has been constructed in the LoadContext. This is done through a ReferenceContext. For more information on the design of this system, as well as the types of CDOM References that exist and that the ReferenceContext is capable of returning, see the CDOM References Concept Document.

The references requested by the tokens can then be placed into objects (Abilities, Skills, etc.) and the underlying object(s) to which the reference refers can be established at runtime.

There are two issues introduced with a system that is capable of referencing objects before they are constructed.

The first issue is that references might be made to objects that don't exist. This problem cannot be detected until the entire load operation is complete. The Rules Persistence System makes a call to the validate() method of ReferenceContext to test whether any references were made where the appropriate referred-to object was not constructed during data persistence file load. In order to provide for minimal functionality without truly understanding the reference, PCGen 5.16 constructs a dummy (empty) object with the given identifier.

Second, the References constructed during data persistence file load must be resolved before they are used during "runtime". Therefore, the Rules Persistence System is responsible for resolving any references after a collection of Campaigns are loaded. This resolution is driven through the resolveReferences() method of LoadContext. Due to the construction of dummy objects during the validate() step, resolveReferences() must be called after validate().
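
A rough sketch of the two-pass idea follows. It folds validate() and resolveReferences() into one class for brevity, whereas the text places them on ReferenceContext and LoadContext respectively. RefSketch, ReferenceContextSketch, construct() and getReference() are hypothetical names, and objects are represented as plain Java objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reference that can be handed out before its target exists. */
class RefSketch {
    final String key;
    private Object target;

    RefSketch(String key) {
        this.key = key;
    }

    void resolveTo(Object obj) {
        target = obj;
    }

    Object get() {
        if (target == null) {
            throw new IllegalStateException("Unresolved reference: " + key);
        }
        return target;
    }
}

/** Hypothetical stand-in for the reference bookkeeping; validate/resolve names follow the text. */
class ReferenceContextSketch {
    private final Map<String, Object> constructed = new HashMap<>();
    private final Map<String, RefSketch> references = new HashMap<>();

    void construct(String key, Object obj) {
        constructed.put(key, obj);
    }

    /** Safe to call before or after construct(key, ...). */
    RefSketch getReference(String key) {
        return references.computeIfAbsent(key, RefSketch::new);
    }

    /** Reports dangling keys and creates a dummy object for each, as described for 5.16. */
    List<String> validate() {
        List<String> missing = new ArrayList<>();
        for (String key : references.keySet()) {
            if (!constructed.containsKey(key)) {
                missing.add(key);
                constructed.put(key, "dummy:" + key);
            }
        }
        return missing;
    }

    /** Must run after validate() so that every reference has something to point at. */
    void resolveReferences() {
        for (RefSketch ref : references.values()) {
            ref.resolveTo(constructed.get(ref.key));
        }
    }
}
```

In this sketch, calling resolveReferences() before validate() would leave any dangling reference unresolved, which mirrors the ordering constraint stated above.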

Shared Persistence System with Editor

The data persistence system should be usable for both a data file editor and the runtime character generation program.

Underlying Requirement(s): Code Reuse (general design characteristic), Catch Errors Early

Basis: The significant investment made in ensuring that persistent data is read without errors should be reused across both a data file editor and the runtime system. Consolidation reduces the risk of error and ensures that the editor will always be up to date (a problem in PCGen 5.14). Furthermore, editing capabilities that are not available in PCGen 5.14 (e.g. edit data in place) can be added once a full-capability editor is available.

Implementation: The Rules Persistence System is responsible for tracking detailed changes made by the Tokens during Data Load (see Only valid Tokens may impact the Rules Data Store). As a result, this information allows the load system to serve as a runtime load system and a file editor load system.

As noted in Rules Persistence System I/O, tags may overwrite previous values or add to the set of values for that tag. In the case of an editor, it is critically important not to lose information that would later be overwritten in a runtime environment. A simple example would be the use of a .MOD to alter the number of HANDS on a Race. This alteration should be maintained in the file that contained the .MOD, and the value (or unspecified default) in the original Race should not be lost. This is done by tracking the exact changes that occur during data load. This is fully explained in the Load Commit Subsystem.

Token Compatibility

      • PLACEHOLDER: Describe Compatibility system and impact on TokenLibrary

Identify Appropriate Token

      • PLACEHOLDER: Need a separate document(?) to describe how the appropriate token is selected, and how that works with compatibility tokens

Characteristics/Weaknesses of the existing system

Prerequisite Tags

Currently the Prerequisite tags are an exception to the parsing system. The Prerequisite tags have a prefix of "PRE" and are followed by the Prerequisite name, e.g. PREFEAT. This means that the Prerequisite tags do not follow the traditional method of having a unique name before the colon. Also, Prerequisite tags can have a leading ! to negate the Prerequisite.

In order to address this situation of a different token definition system, the PreCompatibilityToken provides a wrapper into the new PCGen 5.16 token syntax.
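
The naming rules for Prerequisite tags (a "PRE" prefix, the Prerequisite name, and an optional leading "!") can be sketched as a small parser. PreTagSketch is a hypothetical class written for this illustration; it is not the PreCompatibilityToken and does not reflect how PCGen stores prerequisites. The "1,Dodge" value in the usage note is made up.

```java
/** Hypothetical parsed form of a prerequisite tag; not a PCGen class. */
class PreTagSketch {
    final boolean negated;
    final String kind;
    final String value;

    private PreTagSketch(boolean negated, String kind, String value) {
        this.negated = negated;
        this.kind = kind;
        this.value = value;
    }

    /** Returns null when the tag is not a prerequisite tag. */
    static PreTagSketch parse(String tag) {
        // A leading "!" negates the prerequisite and is not part of the tag name.
        boolean negated = tag.startsWith("!");
        String body = negated ? tag.substring(1) : tag;
        int colon = body.indexOf(':');
        // Require the "PRE" prefix plus at least one character of prerequisite name.
        if (!body.startsWith("PRE") || colon <= 3) {
            return null;
        }
        return new PreTagSketch(negated, body.substring(3, colon), body.substring(colon + 1));
    }
}
```

For example, parsing "!PREFEAT:1,Dodge" yields a negated prerequisite of kind "FEAT", while a tag such as "HANDS:2" yields null because it lacks the prefix. This prefix matching is why these tags cannot be looked up by a single unique name before the colon.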

FREQ 1782186 exists to convert the Prerequisite tags into two separate buckets, PRE: and REQ: (Prerequisites and Requirements) based on their current behavior.

Class Wrapped Token

A ClassWrappedToken provides compatibility for previously allowed bad behavior in data files.

Many Class tokens in PCGen versions up to 5.14 ignored the class level, so they are technically Class tags and not CLASSLEVEL tags. Yet, PCGen 5.14 allows those tags to appear on class level lines. This is a bit deceptive to users in that the effect will always be on the class, and not appear on the specified level.

Unfortunately, one cannot simply remove support for using CLASS tokens on CLASSLEVEL lines, because if they are used at level 1, then they are equivalent to appearing on a CLASS line. Certainly, the data monkeys use it that way. For example, Blackguard in RSRD advanced uses EXCHANGELEVEL on the first level line.

Therefore, the entire ClassWrappedToken system is a workaround for data monkeys using CLASS tokens on CLASSLEVEL lines. It should only work on level one; at any other level, users' expectations for when the token will take effect would not be met.
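
The level-one restriction could be expressed as a thin wrapper around a CLASS token. This is a sketch of the idea only: ClassTokenSketch, ClassWrappedSketch and parseOnLevelLine are hypothetical names, the class is reduced to a simple map, and the actual ClassWrappedToken implementation is not shown in this document.

```java
import java.util.Map;

/** Hypothetical CLASS-level token shape; not the real PCGen interface. */
interface ClassTokenSketch {
    boolean parse(Map<String, String> pcClass, String value);
}

/** Hypothetical wrapper that lets a CLASS token appear on a class level line, level 1 only. */
class ClassWrappedSketch {
    private final ClassTokenSketch wrapped;

    ClassWrappedSketch(ClassTokenSketch wrapped) {
        this.wrapped = wrapped;
    }

    boolean parseOnLevelLine(Map<String, String> pcClass, int level, String value) {
        if (level != 1) {
            // At any other level the tag would silently act on the whole class, so reject it.
            return false;
        }
        // At level 1 the effect is equivalent to the tag appearing on the CLASS line.
        return wrapped.parse(pcClass, value);
    }
}
```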