Load Commit Subsystem
Latest revision as of 22:25, 25 February 2018
Loading the Rules Data Store
Beyond the tokens, a load subsystem translates between the data persistence file format parsed by the Tokens and the internal data structure. This system arguably fits the Data Mapper design pattern, although it does not strictly use relational databases. This system is currently known as a LoadContext. The details of translation take various forms, and those structures are explained in later sections.
Implementation
This effectively executes a Unit of Work design, treating each Token as a Unit of Work.
Only valid Tokens may impact the Rules Data Store
There is a risk that a partially-parsed Token from an invalid data persistence entry could lead to an unknown state within the Rules Data Store. Therefore, a Token should only impact the state of the Rules Data Store if the token parse completes successfully. The Token should not be responsible for tracking successful completion; rather the load subsystem implements a 'unit of work' design pattern to ensure only valid tokens impact the Rules Data Store.
This greatly simplifies the implementation of Tokens, as they are not required to analyze or defer method calls to the LoadContext until after the data persistence syntax is established to be valid.
During the load of data from the data persistence format, each Token may fully parse the provided value and make any necessary calls to the LoadContext. This can be done even if subsequent information indicates to the Token that there is an error in the Token value. Specifically, individual Tokens should be free to take any action on the LoadContext, and are not responsible for the consequences of those method calls unless the Token indicates that the value from the data persistence format was valid. The Token indicates validity by returning ParseResult.PASS from its parse method.
If a Token passes, then the File Loader that called the Token is responsible for indicating to the LoadContext to commit() the changes defined by the Token.
If the Token returns a failure, then the File Loader is responsible for calling the rollback() method of LoadContext to indicate that no changes should be made to the Rules Data Store and that the tentative changes proposed by the Token should be discarded.
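The pass/fail contract above can be sketched as a minimal unit of work. Everything here other than the commit() and rollback() names is an illustrative assumption for this sketch, not PCGen's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the unit-of-work contract described above.
// Names other than commit()/rollback() are illustrative, not PCGen's API.
public class UnitOfWorkSketch {

    /** Tentative changes proposed by a token, applied only on commit(). */
    static class LoadContext {
        final List<String> committed = new ArrayList<>();
        final List<String> pending = new ArrayList<>();

        void propose(String change) { pending.add(change); }

        void commit() {            // token parse succeeded
            committed.addAll(pending);
            pending.clear();
        }

        void rollback() {          // token parse failed
            pending.clear();       // discard tentative changes
        }
    }

    /** A trivial stand-in for a Token: valid only if the value is non-empty. */
    static boolean parseToken(LoadContext ctx, String value) {
        ctx.propose("TOKEN:" + value);   // may act freely before validity is known
        return !value.isEmpty();         // analogue of returning ParseResult.PASS
    }

    /** The File Loader, not the Token, drives commit/rollback. */
    static void load(LoadContext ctx, String value) {
        if (parseToken(ctx, value)) {
            ctx.commit();
        } else {
            ctx.rollback();
        }
    }

    public static void main(String[] args) {
        LoadContext ctx = new LoadContext();
        load(ctx, "Foo");   // valid, committed
        load(ctx, "");      // invalid, rolled back
        System.out.println(ctx.committed);
    }
}
```

Note that the Token makes its calls before it knows whether the value is valid; only the Loader's commit/rollback decision determines whether those calls take effect.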
Future Work
There are a few changes that should take place here.
- Currently the parse method returns a ParseResult (effectively a specialized Tuple of a Boolean and an error string). This should be converted to returning void and throwing an exception if there is a load error.
- We need more information about WHERE an error occurred in the value. To properly implement an Editor, at some point we may encounter situations where folks are typing in their own values. We would then need to parse those and indicate where the errors are, so we may consider things like counting the number of characters processed in order to pinpoint exactly where the error occurred in the value String.
(Note: there are ways of doing this without doing it explicitly - e.g. a much smarter derivative of ParsingSeparator to track what was last requested)
Commit Strategy objects
The Commit Strategy objects are in the pcgen.rules.context package.
Tracking*CommitStrategy
Tracks changes based on the URI[note 1] and includes information about all changes (including CLEAR, etc.). Suitable for use in tracking detailed changes driven by tokens and being able to rewrite that information back out to LST files in the original format (e.g. will include .CLEAR in the output)
Consolidated*CommitStrategy
Commits changes into Rules Data Store (mainly the CDOM Objects) and flattens out changes (e.g. actually removes the contents of a List after a .CLEAR). Suitable for use at runtime. Changes written to LST files are written back out to LST files as a consolidated result of the tokens (e.g. will include the impact of the .CLEAR rather than placing .CLEAR into the output)
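The split between the two strategies can be illustrated with a minimal sketch. The interface and class shapes below are assumptions for illustration; the real classes in pcgen.rules.context are considerably richer:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the real strategies live in
// pcgen.rules.context and have much richer interfaces.
public class StrategySketch {

    interface CommitStrategy {
        void apply(String token);     // a token such as "Foo" or ".CLEAR"
        List<String> output();        // what would be written back to LST
    }

    /** Keeps the token stream verbatim, including .CLEAR (Editor use). */
    static class TrackingStrategy implements CommitStrategy {
        private final List<String> tokens = new ArrayList<>();
        public void apply(String token) { tokens.add(token); }
        public List<String> output() { return tokens; }
    }

    /** Flattens the stream: .CLEAR actually empties the list (Runtime use). */
    static class ConsolidatedStrategy implements CommitStrategy {
        private final List<String> contents = new ArrayList<>();
        public void apply(String token) {
            if (".CLEAR".equals(token)) {
                contents.clear();
            } else {
                contents.add(token);
            }
        }
        public List<String> output() { return contents; }
    }

    public static void main(String[] args) {
        CommitStrategy tracking = new TrackingStrategy();
        CommitStrategy consolidated = new ConsolidatedStrategy();
        for (String t : new String[] {"Foo", ".CLEAR", "Bar"}) {
            tracking.apply(t);
            consolidated.apply(t);
        }
        System.out.println(tracking.output());      // .CLEAR preserved
        System.out.println(consolidated.output());  // .CLEAR applied
    }
}
```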
The Commit Process
Each token has the ability to define changes to the Rules data by calling specific methods in the LoadContext. These actions by the token are tracked in a set of Tracking*CommitStrategy objects while the token is processing. If the token completes successfully (returns true), then the changes are committed into another set of CommitStrategy objects; otherwise the changes are rolled back. Each token can be thought of as a 'unit of work'.
How Tracking Works
Tokens have the ability to add or remove information from the Rules Data Store. In order to track the detailed changes, the TrackingObjectCommitStrategy has separate objects in which it can track added or removed information. All changes are tracked by the URI from which the changes originated. This allows changes to be appropriately written back to an LST file, rather than (accidentally) consolidated into the source in which an object first appeared.
In order to maintain simplicity in the Tokens, they are kept URI-ignorant. File Loaders are responsible for calling the setSourceURI(URI) method on LoadContext to identify the source of data being processed in the parse() method of a Token. File Loaders are also responsible for calling the setExtractURI(URI) method on LoadContext to identify any restriction on data that should be written out during calls to the unparse() method of a Token. The LoadContext is responsible for restricting responses to the extract URI in return values from any methods used by unparse() to extract information from the LoadContext.
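A minimal sketch of that URI-keyed tracking follows. Only the setSourceURI/setExtractURI names come from the text above; the internals are assumptions, not the actual PCGen implementation:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of URI-keyed change tracking; setSourceURI/setExtractURI mirror
// the description above, but everything else is an assumption.
public class UriTrackingSketch {
    private final Map<URI, List<String>> changesBySource = new HashMap<>();
    private URI sourceURI;
    private URI extractURI;

    public void setSourceURI(URI uri) { sourceURI = uri; }
    public void setExtractURI(URI uri) { extractURI = uri; }

    /** Called (indirectly) from a Token's parse(): record against the source. */
    public void addChange(String change) {
        changesBySource.computeIfAbsent(sourceURI, k -> new ArrayList<>())
                       .add(change);
    }

    /** Called from unparse(): only changes from the extract URI are returned. */
    public List<String> getChanges() {
        return changesBySource.getOrDefault(extractURI, List.of());
    }

    public static void main(String[] args) {
        UriTrackingSketch ctx = new UriTrackingSketch();
        ctx.setSourceURI(URI.create("file:/data/a.lst"));
        ctx.addChange("TOKEN:Foo");
        ctx.setSourceURI(URI.create("file:/data/b.lst"));
        ctx.addChange("TOKEN:Bar");

        ctx.setExtractURI(URI.create("file:/data/a.lst"));
        System.out.println(ctx.getChanges());  // only a.lst's changes
    }
}
```

The Tokens themselves never touch either URI; the Loader sets them around the parse/unparse calls, which is what keeps changes from being consolidated into the wrong source.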
To add information, a map (positiveMap) is maintained. This maps from the object being loaded (and thus containing the Token) to the temporary object used to track the changes. Items are applied into the temporary object immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed. A rollback will result in the clearing of the map and disposal of the potential changes.
To remove information, a map (negativeMap) is maintained. This maps from the object being loaded (and thus containing the Token) to the temporary object used to track the negative changes. Items are applied into the temporary object as if they are added (vs. removed) immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed (the presence of an object in the negative change object results in removal during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.
For pattern-based removal, a map (patternClearSet) is maintained. This maps from the object being loaded (and thus containing the Token) to the ListKey and pattern to be removed. Items are added into the map immediately as they are executed in the Token. Only if the token completes successfully will the changes be committed (the contents of the ListKey in the CDOM Object will be searched and any items matching the pattern will be removed from the CDOM Object during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.
A token may not support both removals of information and pattern-based removals.
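The deferred pattern-based removal can be sketched as follows; the field and method names here are illustrative assumptions, not PCGen's actual fields:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of the patternClearSet behavior: the pattern is only recorded
// while the token runs; matching items are removed at commit() time.
// Names are illustrative assumptions, not PCGen's actual fields.
public class PatternClearSketch {
    private final List<String> listContents = new ArrayList<>();
    private final List<Pattern> pendingPatterns = new ArrayList<>();

    public void add(String item) { listContents.add(item); }

    /** Token requested a pattern-based removal: record it, do not apply yet. */
    public void removePattern(String regex) {
        pendingPatterns.add(Pattern.compile(regex));
    }

    public void commit() {
        for (Pattern p : pendingPatterns) {
            listContents.removeIf(s -> p.matcher(s).matches());
        }
        pendingPatterns.clear();
    }

    public void rollback() { pendingPatterns.clear(); }

    public List<String> contents() { return listContents; }

    public static void main(String[] args) {
        PatternClearSketch ctx = new PatternClearSketch();
        ctx.add("FooA");
        ctx.add("FooB");
        ctx.add("Bar");
        ctx.removePattern("Foo.*");
        System.out.println(ctx.contents());  // removal not yet applied
        ctx.commit();
        System.out.println(ctx.contents());  // matching items removed
    }
}
```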
To clear information, a map (globalClearSet) is maintained. This maps from the object being loaded (and thus containing the Token) to the ListKey object that was cleared. ListKeys are added into the map immediately as they are identified as being cleared in the Token. Only if the token completes successfully will the clear be committed (the contents of the ListKey in the CDOM Object will be removed during the commit() method). A rollback will result in the clearing of the map and disposal of the potential changes.
A special map is required to maintain clearing of PRE tokens (PRE:.CLEAR), and is similar to the other maps in the TrackingObjectCommitStrategy.
Benefits
- Allows the rollback of changes in case of token failure, without forcing the LstToken to manage those changes and only commit them if the transaction will be successful. This lightens the programming burden on writers of the LstTokens.
- Ensures the use of the Tracking*CommitStrategy methods in the "core" runtime processing code. Since the tracking classes are used to hold the "unit of work" while the token is processing, it ensures that the key components of an "editor" framework are continually tested. This ensures that the editor system will stay in better sync with the core PCGen code.
- Ensures the tokens are applied into the Rules Data Store one at a time, and in the correct (left to right in the LST file) order.
Lifecycle
A GameMode owns a LoadContext specifically for information loaded within the GameMode. This may include stats and checks, if present in the GameMode, as well as other items from the system/ folder for that GameMode.
This GameMode LoadContext need only be built once - at PCGen load.
Whenever a new set of data is loaded, a new RuntimeLoadContext is built which inherits information from that GameMode LoadContext. (So if a reference was made to Ability "Foo" we can ensure that ability was eventually created in the data.)
Future Work
In general, it would be nice to have destruction of the RuntimeLoadContext do a full cleanup between data loads. Unfortunately, we have a few items that are not stored in the RuntimeLoadContext - they are stored in other places. Globals.clearLists() must be called between data loads, and it would be a better design to be able to eliminate that clear, so that normal object lifecycles can be maintained.
Design Notes
Identity
It is critically important that the maps used to identify the temporary objects or other information that track changes are identity (==) maps. Attempting to use equality (.equals) based lookups can be demonstrated to result in errors due to hash changes and equality problems between objects (this is especially a problem in an Editor context vs. a Runtime context).
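The failure mode can be demonstrated directly: once an object's hash-relevant state mutates after insertion, an equality-based HashMap can lose the entry while an IdentityHashMap does not. The CDOMObjectStub class is a stand-in invented for this demonstration:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Demonstrates why the tracking maps must be identity (==) based:
// if an object's hash-relevant state changes after insertion, an
// equality-based HashMap can no longer find it, while an
// IdentityHashMap still can. CDOMObjectStub is invented for this demo.
public class IdentityMapDemo {

    static class CDOMObjectStub {
        String name;
        CDOMObjectStub(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof CDOMObjectStub
                && name.equals(((CDOMObjectStub) o).name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    public static void main(String[] args) {
        Map<CDOMObjectStub, String> equalityMap = new HashMap<>();
        Map<CDOMObjectStub, String> identityMap = new IdentityHashMap<>();

        CDOMObjectStub obj = new CDOMObjectStub("Foo");
        equalityMap.put(obj, "changes");
        identityMap.put(obj, "changes");

        obj.name = "Bar";  // hash-relevant state mutated mid-load

        System.out.println(equalityMap.get(obj)); // entry lost
        System.out.println(identityMap.get(obj)); // still found
    }
}
```

In a Runtime context objects rarely mutate this way during a single token, but in an Editor context they are edited continuously, which is why the text above calls out the Editor case specifically.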
Removal vs. Clearing
Note that removals encountered during tracking are .CLEAR.x calls, not .CLEAR calls. .CLEAR calls must be treated as clearing information and stored in a tracking system separately from .CLEAR.x calls.
The reason is that the set of objects impacted by a .CLEAR call is never really known until the list is actually cleared. Thus, it is unsafe to store a .CLEAR into a negative changes object. This can be demonstrated through a simple thought exercise:
Source A: MyObj <tab> TOKEN:Foo
Source B: MyObj.MOD <tab> TOKEN:Bar
Source C: MyObj.MOD <tab> TOKEN:.CLEAR
Note if Source A and Source C are loaded, the .CLEAR removes only Foo. If Sources A, B, and C are loaded, then the .CLEAR removes Foo and Bar (assuming the load priority causes the load order to be A, B, C). As a result, the .CLEAR must be stored as a .CLEAR, since the Editor context must not assume it knows what the .CLEAR will remove. (The .CLEAR must not be converted into .CLEAR.Foo or the behavior will change when Source B is also loaded)
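The thought exercise above can be run as code. The apply method below is a toy interpreter written for this demonstration, not PCGen code; it shows that a verbatim .CLEAR and a converted .CLEAR.Foo diverge once Source B is loaded:

```java
import java.util.ArrayList;
import java.util.List;

// Toy interpreter for the thought exercise: converting .CLEAR into
// .CLEAR.Foo changes behavior once Source B is also loaded.
public class ClearSemanticsDemo {

    static List<String> apply(List<String> tokens) {
        List<String> contents = new ArrayList<>();
        for (String t : tokens) {
            if (".CLEAR".equals(t)) {
                contents.clear();            // clears whatever is present
            } else if (t.startsWith(".CLEAR.")) {
                contents.remove(t.substring(".CLEAR.".length()));
            } else {
                contents.add(t);
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        // Sources A + C only: both forms leave the list empty
        System.out.println(apply(List.of("Foo", ".CLEAR")));
        System.out.println(apply(List.of("Foo", ".CLEAR.Foo")));

        // Sources A + B + C: verbatim .CLEAR also removes Bar,
        // but the converted .CLEAR.Foo leaves Bar behind
        System.out.println(apply(List.of("Foo", "Bar", ".CLEAR")));
        System.out.println(apply(List.of("Foo", "Bar", ".CLEAR.Foo")));
    }
}
```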
Characteristics/Weaknesses of the existing system
- The existing system doesn't necessarily have tokens processed at runtime in the order they appear in the LST file. This WILL occur for a given token name (e.g. ADD), but will not occur across token names (e.g. REMOVE and ADD). I'm not actually sure this is a problem, per se, just should be noted to those who believe the LST files are "strictly" processed left-to-right. They are strictly loaded left to right (to preserve order of .CLEAR, etc.), but that order is destroyed during consolidation. Processing order of the tokens is defined in the core, and is independent of the order in which different tokens appear in the LST file.
- Cross-token validity is not evaluated. In particular, structures like this:
MyObject <tab> TOKEN:Foo <tab> TOKEN:.CLEAR
...are not currently detected by PCGen. This is a weakness of the existing Token/Loader system. A reasonably simple solution to this problem exists. Currently the Load system has the tokens feed a single Tracking strategy which commits to a final Tracking (Editor) or Consolidated (Runtime) strategy. The solution to this problem is to have a Token-Tracking Strategy commit to a Line-Tracking Strategy, which commits to a final Tracking (Editor) or Consolidated (Runtime) strategy. The only additional code required is a set of detection methods for the (Line-)Tracking strategy to identify when nonsensical situations (.CLEAR after addition) occur.
- MasterListsEqual should be present solely in EditorContext in order to remove a useless (and potentially confusing) interface from the Consolidated* objects
- cloneConstructedCDOMObject needs to be evaluated to determine why it is in ObjectContext
- Currently, the StringKey clearing system is not handled well. It is possible (likely?) that the globalClearSet map should be expanded to hold ANY of the *Key objects used in CDOMObject, including IntegerKey, StringKey, etc. rather than just ListKey.
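The detection method proposed in item 2 above could look something like the following, under the assumption of a simple per-line token stream (none of these names exist in PCGen):

```java
import java.util.List;

// Sketch of the proposed Line-Tracking check: flag a .CLEAR that follows
// an addition within the same line, since that addition is pointless.
// Names and structure are assumptions, not existing PCGen code.
public class LineTrackingCheck {

    /** Returns true if a non-.CLEAR token is later wiped by a .CLEAR. */
    static boolean hasPointlessAddition(List<String> lineTokens) {
        boolean sawAddition = false;
        for (String t : lineTokens) {
            if (".CLEAR".equals(t)) {
                if (sawAddition) {
                    return true;   // e.g. TOKEN:Foo followed by TOKEN:.CLEAR
                }
            } else {
                sawAddition = true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasPointlessAddition(List.of("Foo", ".CLEAR")));
        System.out.println(hasPointlessAddition(List.of(".CLEAR", "Foo")));
    }
}
```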
Further Reading
For more information about the Rules Data Store, see Rules Data Store.
Notes
- ↑ URI is used rather than URL, due to the design of URL in Java. URL calls to equals and hashCode cause domain name resolution. This is both a performance problem and a challenge when dealing with a machine that is not network connected.