Load Commit Subsystem

Background

This subsystem is part of the Rules Persistence System.

It effectively executes a Unit of Work design, treating each Token as a Unit of Work.

Commit Strategy objects

The Commit Strategy objects are in the pcgen.rules.context package.

Tracking*CommitStrategy

Tracks changes based on the URI[note 1] and includes information about all changes (including .CLEAR, etc.). Suitable for tracking the detailed changes driven by tokens so that the information can be rewritten back out to LST files in the original format (e.g. the output will include .CLEAR).

Consolidated*CommitStrategy

Commits changes into the Rules Data Store (mainly the CDOM Objects) and flattens out changes (e.g. actually removes the contents of a List after a .CLEAR). Suitable for use at runtime. Changes are written back out to LST files as the consolidated result of the tokens (e.g. the output will include the impact of the .CLEAR rather than .CLEAR itself).
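
For example (using the same generic TOKEN syntax as the thought exercise later on this page), if the following lines are loaded in order:

  MyObj <tab> TOKEN:Foo
  MyObj.MOD <tab> TOKEN:.CLEAR <tab> TOKEN:Bar

...then a Tracking*CommitStrategy can write the .MOD line back out with the .CLEAR intact, while a Consolidated*CommitStrategy writes the object out with only TOKEN:Bar, because the .CLEAR has already been flattened into the stored list.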

The Commit Process

(Figure: Events During Data Load)

Each token has the ability to define changes to the Rules data by calling specific methods in the LoadContext. These actions by the token are tracked in a set of Tracking*CommitStrategy objects while the token is processing. If the token completes successfully (returns true), then the changes are committed into another set of CommitStrategy objects; otherwise the changes are rolled back. Each token can be thought of as a 'unit of work'.
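
A minimal sketch of this unit-of-work pattern, assuming a helper that dispatches a single token; the interfaces below are illustrative stand-ins, not the real PCGen classes in pcgen.rules.context:

  // Illustrative sketch only; the real PCGen interfaces differ in detail.
  interface Context {                 // stand-in for LoadContext
      void commit();                  // push tracked changes into the next CommitStrategy
      void rollback();                // discard everything the token attempted
  }

  interface Token {                   // stand-in for an LST token
      boolean parse(Context context, Object target, String value);
  }

  class TokenDispatchSketch {
      // Treat a single token as a unit of work: commit on success, roll back on failure.
      static boolean processToken(Context context, Token token, Object target, String value) {
          boolean success = token.parse(context, target, value);
          if (success) {
              context.commit();
          } else {
              context.rollback();
          }
          return success;
      }
  }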

How Tracking Works

Tokens have the ability to add or remove information from the Rules Data Store. In order to track the detailed changes, the TrackingObjectCommitStrategy has separate objects in which it can track added or removed information. All changes are tracked by the URI from which the changes originated. This allows changes to be appropriately written back to an LST file, rather than (accidentally) consolidated into the source in which an object first appeared.

In order to maintain simplicity in the Tokens, they are kept URI-ignorant. File Loaders are responsible for calling the setSourceURI(URI) method on LoadContext to identify the source of data being processed in the parse() method of a Token. File Loaders are also responsible for calling the setExtractURI(URI) method on LoadContext to identify any restriction on data that should be written out during calls to the unparse() method of a Token. The LoadContext is responsible for restricting responses to the extract URI in return values from any methods used by unparse() to extract information from the LoadContext.
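
A sketch of this division of responsibility; setSourceURI and setExtractURI are the LoadContext methods named above, while everything else (the loader skeleton, the line loop) is illustrative:

  import java.net.URI;
  import java.util.List;

  // Illustrative file-loader skeleton; only the two URI methods are taken from this page.
  interface UriAwareContext {
      void setSourceURI(URI source);    // identifies where parsed data came from
      void setExtractURI(URI extract);  // restricts what unparse() will report
  }

  class LoaderSketch {
      // Called before parse(): every change a token makes is tracked against this URI.
      void load(UriAwareContext context, URI fileUri, List<String> lines) {
          context.setSourceURI(fileUri);
          for (String line : lines) {
              // ... dispatch each token on the line to its parse() method ...
          }
      }

      // Called before unparse(): only changes that originated from this URI are written out.
      void write(UriAwareContext context, URI fileUri) {
          context.setExtractURI(fileUri);
          // ... call unparse() on each token and write the results to the LST file ...
      }
  }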

To add information, a map (positiveMap) is maintained. This maps from the object being loaded (and thus containing the Token) to the temporary object used to track the changes. Items are applied into the temporary object immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed. A rollback will result in the clearing of the map and disposal of the potential changes.
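
A self-contained sketch of the positiveMap idea; the class and method names below are not the real PCGen ones, and the temporary object is simplified to a list of added items:

  import java.util.ArrayList;
  import java.util.IdentityHashMap;
  import java.util.List;
  import java.util.Map;

  // Sketch only: identity-keyed tracking of additions made by a token.
  class PositiveTrackingSketch {
      // the object being loaded -> temporary list of items added by the current token
      private final Map<Object, List<Object>> positiveMap = new IdentityHashMap<>();

      // called immediately as the token executes
      void trackAddition(Object loadedObject, Object addedItem) {
          positiveMap.computeIfAbsent(loadedObject, k -> new ArrayList<>()).add(addedItem);
      }

      // called only when the token returns true
      void commit(java.util.function.BiConsumer<Object, Object> applyToDataStore) {
          positiveMap.forEach((obj, items) -> items.forEach(item -> applyToDataStore.accept(obj, item)));
          positiveMap.clear();
      }

      // called when the token returns false: the potential changes are simply discarded
      void rollback() {
          positiveMap.clear();
      }
  }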

To remove information, a map (negativeMap) is maintained. This maps from the object being loaded (and thus containing the Token) to the temporary object used to track the negative changes. Items are applied into the temporary object as if they were added (rather than removed) immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed (the presence of an object in the negative change object results in removal during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.

For pattern-based removal, a map (patternClearSet) is maintained. This maps from the object being loaded (and thus containing the Token) to the ListKey and pattern to be removed. Items are added into the map immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed (the contents of the ListKey in the CDOM Object will be searched and any items matching the pattern will be removed from the CDOM Object during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.

A token may not support both item-based removal of information and pattern-based removal.

To clear information, a map (globalClearSet) is maintained. This maps from the object being loaded (and thus containing the Token) to the ListKey object that was cleared. ListKeys are added into the map immediately as they are identified as being cleared in the Token. Only if the token completes successfully will the clear be committed (the contents of the ListKey in the CDOM Object will be removed during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.
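
A condensed sketch covering the three removal-oriented maps above (negativeMap, patternClearSet, globalClearSet); the Store interface and the order in which the maps are applied during commit() are assumptions for illustration, not the exact PCGen behaviour:

  import java.util.ArrayList;
  import java.util.IdentityHashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.regex.Pattern;

  // Self-contained sketch; not the real TrackingObjectCommitStrategy.
  class RemovalTrackingSketch {
      private final Map<Object, List<Object>> negativeMap = new IdentityHashMap<>();
      private final Map<Object, List<Pattern>> patternClearSet = new IdentityHashMap<>();
      private final Map<Object, List<String>> globalClearSet = new IdentityHashMap<>();

      void trackRemoval(Object loaded, Object removed) {
          negativeMap.computeIfAbsent(loaded, k -> new ArrayList<>()).add(removed);
      }

      void trackPatternClear(Object loaded, Pattern pattern) {
          patternClearSet.computeIfAbsent(loaded, k -> new ArrayList<>()).add(pattern);
      }

      void trackGlobalClear(Object loaded, String listKey) {
          globalClearSet.computeIfAbsent(loaded, k -> new ArrayList<>()).add(listKey);
      }

      void commit(Store store) {
          // presence in globalClearSet empties the whole list in the data store
          globalClearSet.forEach((obj, keys) -> keys.forEach(k -> store.clearList(obj, k)));
          // matching items are searched out and removed from the data store
          patternClearSet.forEach((obj, pats) -> pats.forEach(p -> store.removeMatching(obj, p)));
          // presence in the negative map results in removal of that exact item
          negativeMap.forEach((obj, items) -> items.forEach(i -> store.remove(obj, i)));
          rollback();   // clears all three maps once applied
      }

      void rollback() {
          negativeMap.clear();
          patternClearSet.clear();
          globalClearSet.clear();
      }

      interface Store {
          void clearList(Object loaded, String listKey);
          void removeMatching(Object loaded, Pattern pattern);
          void remove(Object loaded, Object item);
      }
  }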

A special map is required to track the clearing of PRE tokens (PRE:.CLEAR); it is similar to the other maps in the TrackingObjectCommitStrategy.

Benefits

  1. Allows the rollback of changes in case of token failure (return false), without forcing the LstToken to manage those changes and only commit them if the transaction will be successful. This lightens the programming burden on writers of the LstTokens.
  2. Ensures the use of the Tracking*CommitStrategy methods in the "core" runtime processing code. Since the tracking classes are used to hold the "unit of work" while the token is processing, it ensures that the key components of an "editor" framework are continually tested. This ensures that the editor system stays in better sync with the core PCGen code (a problem during the PCGen 5.x cycles).
  3. Ensures the tokens are applied into the Rules Data Store one at a time, and in the correct (left to right in the LST file) order.

Design Notes

Identity

It is critically important that the maps used to identify the temporary objects or other information that track changes are identity (==) maps. Attempting to use equality (.equals) based lookups can be demonstrated to result in errors due to hash changes and equality problems between objects (this is especially a problem in an Editor context vs. a Runtime context).
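
A short, self-contained demonstration of why this matters: if a loaded object's equals/hashCode change while it is being edited (common in an editor context), an equality-based lookup can silently fail, while an identity-based lookup does not. The Thing class here is hypothetical:

  import java.util.HashMap;
  import java.util.IdentityHashMap;
  import java.util.Map;

  // Hypothetical mutable object whose equals/hashCode depend on its name.
  class Thing {
      String name;
      Thing(String name) { this.name = name; }
      @Override public boolean equals(Object o) {
          return o instanceof Thing && ((Thing) o).name.equals(name);
      }
      @Override public int hashCode() { return name.hashCode(); }
  }

  class IdentityDemo {
      public static void main(String[] args) {
          Thing t = new Thing("Foo");
          Map<Thing, String> equality = new HashMap<>();
          Map<Thing, String> identity = new IdentityHashMap<>();
          equality.put(t, "tracked");
          identity.put(t, "tracked");

          t.name = "Bar";   // the object is edited after being used as a key

          System.out.println(equality.get(t));  // likely null: the hash bucket no longer matches
          System.out.println(identity.get(t));  // "tracked": lookup is by reference (==)
      }
  }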

Removal vs. Clearing

Note that removals encountered during tracking are .CLEAR.x calls, not .CLEAR calls. .CLEAR calls must be treated as clearing information and stored in a tracking system separately from .CLEAR.x calls.

The reason is that the set of objects impacted by a .CLEAR call is never really known until the list is actually cleared. Thus, it is unsafe to store a .CLEAR into a negative changes object. This can be demonstrated through a simple thought exercise:
Source A: MyObj <tab> TOKEN:Foo
Source B: MyObj.MOD <tab> TOKEN:Bar
Source C: MyObj.MOD <tab> TOKEN:.CLEAR

Note if Source A and Source C are loaded, the .CLEAR removes only Foo. If Sources A, B, and C are loaded, then the .CLEAR removes Foo and Bar (assuming the load priority causes the load order to be A, B, C). As a result, the .CLEAR must be stored as a .CLEAR, since the Editor context must not assume it knows what the .CLEAR will remove. (The .CLEAR must not be converted into .CLEAR.Foo or the behavior will change when Source B is also loaded)

Characteristics/Weaknesses of the existing system

  1. The existing system doesn't necessarily have tokens processed at runtime in the order they appear in the LST file. This WILL occur for a given token name (e.g. ADD), but will not occur across token names (e.g. REMOVE and ADD). I'm not actually sure this is a problem per se; it should just be noted for those who believe the LST files are "strictly" processed left-to-right. They are strictly loaded left to right (to preserve the order of .CLEAR, etc.), but that order is destroyed during consolidation. The processing order of the tokens is defined in the core, and is independent of the order in which different tokens appear in the LST file.
  2. Cross-token validity is not evaluated. In particular, structures like this:

MyObject <tab> TOKEN:Foo <tab> TOKEN:.CLEAR
...are not currently detected by PCGen. This is a weakness of the existing Token/Loader system. A reasonably simple solution to this problem exists. Currently, the Load system has the tokens feed a single Tracking strategy, which commits to a final Tracking (Editor) or Consolidated (Runtime) strategy. The solution is to have a Token-Tracking strategy commit to a Line-Tracking strategy, which in turn commits to the final Tracking (Editor) or Consolidated (Runtime) strategy. The only additional code required is a set of detection methods in the (Line-)Tracking strategy to identify when nonsensical situations (a .CLEAR after an addition) occur; a minimal sketch of this idea appears after this list.

  3. MasterListsEqual should be present solely in EditorContext in order to remove a useless (and potentially confusing) interface from the Consolidated* objects.
  4. cloneConstructedCDOMObject needs to be evaluated to determine why it is in ObjectContext.
  5. Currently, the StringKey clearing system is not handled well. It is possible (likely?) that the globalClearSet map should be expanded to hold ANY of the *Key objects used in CDOMObject, including IntegerKey, StringKey, etc., rather than just ListKey.

History

This subsystem was added in PCGen 5.16.

Notes

  1. URI is used rather than URL due to the design of URL in Java: calls to equals and hashCode on URL cause domain name resolution. This is both a performance problem and a challenge when dealing with a machine that is not network connected.
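
A small illustration of the difference (the URL shown is just an example; the point is that URI comparisons are pure string operations while URL comparisons may resolve the host):

  import java.net.URI;
  import java.net.URL;
  import java.util.HashMap;
  import java.util.Map;

  // Illustrative: why the tracking maps are keyed by URI rather than URL.
  class UriVsUrl {
      public static void main(String[] args) throws Exception {
          Map<URI, String> byUri = new HashMap<>();
          // URI equals/hashCode compare the text of the identifier: fast and offline-safe.
          byUri.put(new URI("http://pcgen.org/data/example.lst"), "loaded");
          System.out.println(byUri.get(new URI("http://pcgen.org/data/example.lst")));

          Map<URL, String> byUrl = new HashMap<>();
          // URL equals/hashCode may resolve the host name, which is slow and
          // can fail or block on a machine without network access.
          byUrl.put(new URL("http://pcgen.org/data/example.lst"), "loaded");
          System.out.println(byUrl.get(new URL("http://pcgen.org/data/example.lst")));
      }
  }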