CDOM References Concept Document
Last edited by Tom Parker (section: Characteristics/Weaknesses of the defined solution)
Latest revision as of 21:50, 25 February 2018
We use CDOMReference objects to satisfy the requirement for Data Persistence File Load Order Independence, meaning that data files can be loaded in any order.
Requirement
Objects (such as Races, Languages, etc.) which are loaded from LST files need a unique identifier so that relationships between objects can be created without ambiguity. We call that identifier the object's KEY, and it is how the data must refer to any object.
Objects can reference each other (e.g. a Race sets its available starting Languages). This is a challenge because circular references are likely. More specifically, due to known interactions, it is impossible to choose a load order for files and entries that guarantees objects will be constructed before references to those objects are encountered. Order independence of persistent data is therefore an architectural requirement. (As an example, consider a restriction on a Language that limits it to the particular Race that allows it as a starting Language.)
As an additional concern: Objects can use the KEY: token anywhere on a given line. This creates an (intra-line) indexing challenge that exacerbates the circular reference problem and increases the difficulty of resolving that issue.
Since the purpose of the transition to the CDOM design is to support an integrated LST Editor that shares a load framework with the runtime environment, two further requirements apply:
- CDOM References must support a round trip to LST files. This means that group references (e.g. TYPE=Foo) must know the reference name and not universally expand to their contained components (yet must be able to return the contained components at runtime).
- Sorting of CDOMReferences must be deterministic so that tests can compare the returned text without unreasonable analysis (PCGen solves this by alphabetizing the lists of CDOMReferences in the tokens).
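The deterministic-ordering requirement can be illustrated with a minimal sketch. SortedRefWriter and its join method are hypothetical names invented for this example, and the "|" separator is an assumption; this is not the PCGen implementation. The point is only that the same set of references always produces the same text, and that a group reference such as TYPE=Foo is written by name rather than expanded.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not a PCGen class: joins the persistent text of
// several references in a stable order so that written output is repeatable.
final class SortedRefWriter {
    private SortedRefWriter() {
    }

    // lstFormats holds the persistent text of each reference, e.g. "TYPE=Foo".
    // The "|" separator is an assumption made for this sketch.
    static String join(Collection<String> lstFormats) {
        List<String> sorted = new ArrayList<>(lstFormats);
        // Natural String ordering: the result no longer depends on input order.
        Collections.sort(sorted);
        return String.join("|", sorted);
    }
}
```

Because the output does not depend on the order in which references were collected, a test can compare the written text against a fixed expected string.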
Solution Overview
We must gracefully handle situations where an object is referenced before it is constructed. We do this through the concept of a "reference" (specifically a CDOMReference). There are various forms of references (described in more detail later), but for now, suffice it to say that a token of "TEMPLATE:Undead", which would grant the "Undead" template to a character, doesn't actually look up "Undead". It requests a reference to a Template called "Undead". This reference request is captured, and after all of the LST files are loaded, the references are all "resolved". In effect, the references are Provider objects, although that interface/concept didn't exist at the time CDOMReferences were developed.
Because references can be requested before objects are constructed, the data persistence file syntax can be parsed in full during load. This improves error catching capability at load time and should improve runtime performance. It also allows us to keep to one pass on the file text, which is critical since we read each file once (and we aren't parsing it into a tree). The load as a whole is still multi-pass (the single pass over the text is followed by validation and resolution of references), which ensures separation of the data persistence format and the internal data structure. Any Token may request a reference to an object, regardless of whether that object has been constructed.
The references requested by the tokens can then be placed into objects (Abilities, Skills, etc.) and the underlying object(s) to which the reference refers can be established at runtime.
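The request-then-resolve idea described above can be sketched as follows. DeferredRef and DeferredRefRegistry are hypothetical names for this illustration and do not mirror the signatures of PCGen's CDOMReference or LoadContext; the sketch also keys references by a plain String, whereas the document describes identity by Class and Key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a single reference; not PCGen's CDOMReference.
final class DeferredRef {
    private final String key;
    private Object target; // stays null until the reference is resolved

    DeferredRef(String key) {
        this.key = key;
    }

    String getKey() {
        return key;
    }

    void resolve(Object obj) {
        target = obj;
    }

    Object get() {
        if (target == null) {
            throw new IllegalStateException("Unresolved reference: " + key);
        }
        return target;
    }
}

// Hypothetical registry that hands out exactly one reference per key.
final class DeferredRefRegistry {
    private final Map<String, DeferredRef> refs = new HashMap<>();
    private final Map<String, Object> constructed = new HashMap<>();

    // Called by a token such as TEMPLATE:Undead; no object lookup happens here.
    DeferredRef request(String key) {
        return refs.computeIfAbsent(key, DeferredRef::new);
    }

    // Called whenever the object itself is parsed, before or after requests.
    void construct(String key, Object obj) {
        constructed.put(key, obj);
    }

    // Called once, after all files are loaded.
    void resolveAll() {
        for (DeferredRef ref : refs.values()) {
            Object obj = constructed.get(ref.getKey());
            if (obj != null) {
                ref.resolve(obj);
            }
        }
    }
}
```

In this sketch a token can call request("Undead") long before the Undead template is parsed, and every request for the same key receives the same reference instance.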
Detecting Data Issues
Duplicate Keys
Duplicate keys are detected at load, regardless of their source file, so no reference should be ambiguous.
Handling unconstructed references
The first issue that might be encountered is that references might be made to objects that don't exist. This problem cannot be detected until the entire load operation is complete. The Rules Persistence System makes a call to the validate() method of ReferenceContext to test whether any references were made where the appropriate referred-to object was not constructed during data persistence file load. In order to provide for minimal functionality without truly understanding the reference, PCGen constructs a dummy (empty) object with the given identifier.
References which are made to objects that are not loaded can be detected and reported to the user as an LST load error. Under conditions where such a reference is harmless (because it refers to items in an optional dataset), the FORWARDREF token (PCC files) can be used to suppress the LST load error.
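A minimal sketch of this validation step is shown below. ValidationSketch and all of its methods are hypothetical; the sketch assumes that a dummy object is built for every unconstructed key, including keys whose error is suppressed, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of post-load validation; not PCGen's ReferenceContext.
final class ValidationSketch {
    private final Set<String> referencedKeys = new LinkedHashSet<>();
    private final Map<String, Object> constructed = new LinkedHashMap<>();
    private final Set<String> forwardRefs = new HashSet<>();

    void noteReference(String key) {
        referencedKeys.add(key);
    }

    void noteConstructed(String key, Object obj) {
        constructed.put(key, obj);
    }

    // Models a FORWARDREF entry: a missing object for this key is tolerated.
    void allowForwardRef(String key) {
        forwardRefs.add(key);
    }

    // Runs only after the whole load is complete. Returns the load errors and
    // builds a placeholder for every referenced key that was never constructed.
    List<String> validate() {
        List<String> errors = new ArrayList<>();
        for (String key : referencedKeys) {
            if (!constructed.containsKey(key)) {
                if (!forwardRefs.contains(key)) {
                    errors.add("Unconstructed reference: " + key);
                }
                // Assumption: the dummy is created whether or not the error is suppressed.
                constructed.put(key, "DUMMY:" + key);
            }
        }
        return errors;
    }

    boolean isConstructed(String key) {
        return constructed.containsKey(key);
    }
}
```

After validate() returns, every referenced key has some object behind it, so a later resolution step never meets a missing target.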
Empty Group References
A Group reference that turns out to contain no objects can flag a warning. This serves as LST error checking to catch accidental typos in a group identifier.
Code Interaction on References
Resolving References
The References constructed during data persistence file load must be resolved before they are used during "runtime". Therefore, the Rules Persistence System is responsible for resolving any references after a collection of Campaigns is loaded. This resolution is driven through the resolveReferences() method of LoadContext. Due to the construction of dummy objects during the validate() step, resolveReferences() must be called after validate().
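The ordering constraint between the two calls can be made explicit with a small state check. This is a sketch only: LoadPhaseSketch is a hypothetical class, and nothing here claims that PCGen's LoadContext enforces the order this way (or at all).

```java
// Hypothetical sketch: enforces "validate first, then resolve" with a phase flag.
final class LoadPhaseSketch {
    private enum Phase { LOADING, VALIDATED, RESOLVED }

    private Phase phase = Phase.LOADING;

    // In a real system this step would also construct the dummy objects.
    void validate() {
        if (phase != Phase.LOADING) {
            throw new IllegalStateException("validate() already called");
        }
        phase = Phase.VALIDATED;
    }

    // Resolution depends on the dummies built by validate(), so it must come second.
    void resolveReferences() {
        if (phase != Phase.VALIDATED) {
            throw new IllegalStateException("resolveReferences() requires validate() first");
        }
        phase = Phase.RESOLVED;
    }

    boolean isResolved() {
        return phase == Phase.RESOLVED;
    }
}
```

Calling resolveReferences() on a fresh instance fails immediately, which turns an ordering mistake into an early, visible error rather than a missing object at runtime.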
Interacting with Game Mode
Since References can be constructed well before they are resolved, references allow Game Mode information to be loaded once, thus preventing re-parsing of those files when a different set of sources is loaded.
Types of References
CDOM References are effectively pointers to CDOM Objects. A Single reference directly refers to a single object (something like referring to the Feat Dodge), and it follows the pointer analogy very closely. Group references, which can refer to multiple objects (something like TYPE=Foo), effectively serve as a pointer to a group of one or more objects.
CDOM References are created and are aware of their underlying contents during the load of PCC/LST files, and with some minor exceptions, are not dynamic (meaning they do not change at runtime).
CDOM Reference objects can be found in the pcgen.cdom.reference package.
Simple References
Simple References are references that refer to objects. This includes most CDOM Objects within PCGen.
Single References
Single references refer to a single object. These can be resolved during runtime to return the underlying CDOM Object.
Group References
Group references refer to one or more objects. These can share various characteristics, such as a TYPE, or be universal for a given type of object by using the ANY reference. In limited cases, pattern matching references are also available.
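A TYPE-based group reference can be sketched as an object that remembers its own name and learns its members only after load. TypedItem, TypeGroupRef, getMembers and isEmpty are hypothetical names for this example; getLSTformat is named after the method mentioned in this document, but the signature here is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical loaded object with a KEY and a set of TYPE values.
final class TypedItem {
    final String key;
    final Set<String> types;

    TypedItem(String key, Set<String> types) {
        this.key = key;
        this.types = types;
    }
}

// Hypothetical group reference: it keeps the name "TYPE=Foo" for writing
// back to LST and only gains members when resolve() runs after load.
final class TypeGroupRef {
    private final String type;
    private List<TypedItem> members = Collections.emptyList();

    TypeGroupRef(String type) {
        this.type = type;
    }

    // Text stored in the LST file; the members are never expanded here.
    String getLSTformat() {
        return "TYPE=" + type;
    }

    // Called once all objects are constructed.
    void resolve(List<TypedItem> allItems) {
        List<TypedItem> found = new ArrayList<>();
        for (TypedItem item : allItems) {
            if (item.types.contains(type)) {
                found.add(item);
            }
        }
        members = Collections.unmodifiableList(found);
    }

    List<TypedItem> getMembers() {
        return members;
    }

    // An empty group after resolution may indicate a typo in the TYPE name.
    boolean isEmpty() {
        return members.isEmpty();
    }
}
```

Keeping the name separate from the members is what allows the same reference to serve both the runtime (which asks for members) and an editor (which writes the name back unchanged).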
Transparent References
Game Mode and PCC files are loaded when PCGen is launched. When these files refer to objects that would be loaded with future data, a Transparent Reference is created. These are special references that can be loaded multiple times (vs. other References that cannot be loaded more than once).
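One way to picture a reference that can be loaded more than once is a stable handle that delegates to a replaceable source. TransparentRefSketch is a hypothetical class written for this illustration, and the use of a Supplier as the underlying source is an assumption, not a description of PCGen's Transparent Reference classes.

```java
import java.util.function.Supplier;

// Hypothetical transparent reference: Game Mode data holds this handle once,
// while the source behind it is replaced each time a set of data is loaded.
final class TransparentRefSketch<T> {
    private final String key;
    private Supplier<T> delegate; // null until a data set has been loaded

    TransparentRefSketch(String key) {
        this.key = key;
    }

    String getKey() {
        return key;
    }

    // May be called once per data load; a later load replaces the earlier one.
    void load(Supplier<T> newDelegate) {
        this.delegate = newDelegate;
    }

    T get() {
        if (delegate == null) {
            throw new IllegalStateException("No data loaded for " + key);
        }
        return delegate.get();
    }
}
```

The object holding the handle never needs to be re-parsed: only the delegate changes when a different set of sources is loaded.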
Direct References
References are used in multiple ways:
- A Reference can be used directly: References can be stored in ListKey or ObjectKey locations within a CDOM Object.
- A Reference can be used as an index: i.e. referring to a list of objects, such as the reference constructed from the CLASSES token in the Spell LST file referring to ClassSpellLists.
However, a single infrastructure supports all lists, and a list may be either static (already known) or not yet constructed at LST load. In order to share infrastructure with the unknown lists, even the known lists have to be wrapped in a CDOM Reference. For this reason, there is the ability to create Direct References. These CDOM References do not require resolution, as they are passed the single CDOM Object they will contain at the time the Direct Reference is constructed.
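The wrapping described here can be sketched with a shared interface and a reference that receives its target in the constructor. ObjectRef and DirectRefSketch are hypothetical names, and the two-argument constructor is an assumption made for the example.

```java
import java.util.Objects;

// Hypothetical common interface so list-handling code can treat every
// reference the same way, whether or not it needed resolution.
interface ObjectRef<T> {
    T get();

    String getLSTformat();
}

// Hypothetical direct reference: the target is supplied at construction,
// so there is no unresolved state and no resolution step.
final class DirectRefSketch<T> implements ObjectRef<T> {
    private final T target;
    private final String lstFormat;

    DirectRefSketch(T target, String lstFormat) {
        this.target = Objects.requireNonNull(target, "target");
        this.lstFormat = Objects.requireNonNull(lstFormat, "lstFormat");
    }

    @Override
    public T get() {
        return target;
    }

    @Override
    public String getLSTformat() {
        return lstFormat;
    }
}
```

Code that accepts an ObjectRef does not need to know whether it was handed a direct reference or one that was resolved after load.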
Lifecycle of a Reference
- Transparent Reference Creation: Game Mode and PCC files are loaded when PCGen is launched. When these files refer to objects that would be loaded with future data, a Transparent Reference is created. References are constructed by a Game Mode-based LoadContext, so that multiple references to an object of the same Class and Key will use the same Reference.
- Reference Creation: When LST files are loaded, and CDOM Objects are referred to in the data, a CDOM Reference is created. References are constructed by the LoadContext, so that multiple references to an object of the same Class and Key will use the same Reference.
- Disposal on FORGET: LST files allow certain objects to be "forgotten", meaning they are disposed of from the loaded data, and cannot be used at runtime. However, given that the LST load process is single-pass, the FORGET is encountered well after References have been created for objects that were referred to by the forgotten object. Since the object is not loaded, it is appropriate to ignore any of those references, so the Token/Loader system includes a mechanism to dispose of those CDOM References when a FORGET is encountered.
- Loading of References (Runtime): If the LST load occurred in support of runtime processing of characters, loading is performed. This is not done if the LST load was performed in support of the editor system (not yet performed in PCGen 5.16). Once load of the data is complete, References are resolved. For Single References, this involves loading the Reference with the object it refers to. For deterministic Group References, this also involves loading with References. Some Group References are processed dynamically, meaning their contents are determined at runtime.
- Resolution (Runtime): If the LST load occurred in support of runtime processing of characters, resolution is performed. This is not done if the LST load was performed in support of the editor system (not yet performed in PCGen 5.16). When a PlayerCharacter is being processed, the various CDOM References are resolved to determine the objects underlying each CDOM Reference.
- Write: Objects may be written back to an LST file from CDOM References. Each CDOMReference has a getLSTformat() method that provides the unique text identifying the reference, which can be stored in a persistent state (in an LST file).
Characteristics/Weaknesses of the defined solution
- Currently, the LoadContext system uses WeakReferences to control the Disposal of CDOM Reference objects when FORGET is encountered. In theory this does not guarantee that irrelevant CDOM Reference objects do not produce load errors due to missing information. In practice, this error is unlikely to be encountered.
- Currently, BONUS, CHOOSE, and the PRExxx tokens have not been converted to the new Token/Loader system and do not yet provide all of the benefits outlined above.
- Currently CDOM References are classes and are not provided into CDOM Objects as an Interface. In order to reduce potential for corruption of CDOMReferences, the single references are controlled to allow them to only be resolved once. This is done without providing a method for resetting a CDOM Reference. This is done to provide a strict mechanism to prevent errors and abuse of the CDOM Reference objects during runtime (this can be thought of as a short-term overly strict system to train developers and force separation between load and runtime). Once we have a more stable and modular core in place, this strict mechanism may be relaxed and/or exchanged for interfaces. Currently, this strict mechanism makes Game Mode references slightly more complex than other references.
- Some References are resolved at runtime. Theoretically, this is a performance issue, but practically, the dynamic references are rare enough that this is not an issue. The core of PCGen has much larger performance issues that should be addressed first before this is addressed as a performance issue. It may turn out to be unnecessary optimization, so addressing this weakness is currently not scheduled for any release.
- Resolution may be an issue. This is mostly a theoretical issue at this point, but the dereferencing driven by the use of CDOM Reference objects is slightly slower than direct objects. This is considered a minor performance issue, and unless demonstrated as a performance issue, will not be addressed. There are some concerns about references in relation to lists (such as adding Spells to a ClassSpellList) and the complexity required to access the underlying information.
- In certain conditions, duplicate objects are not hazardous (this is possible because some objects are retrieved during runtime in a way that provides additional context beyond the KEY to establish uniqueness). However, this is an artifact that limits the capabilities of PCGen and should be eliminated. In the meantime, for the limited cases where duplicate objects are allowed, the ALLOWDUPES token is available in PCC files.
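The "resolved only once, with no reset" restriction on single references from the list above can be sketched as a simple guard. ResolveOnceRef is a hypothetical class for illustration and does not reproduce PCGen's single reference classes.

```java
import java.util.Objects;

// Hypothetical single reference that may be resolved exactly once
// and deliberately offers no method to reset it.
final class ResolveOnceRef<T> {
    private final String key;
    private T target; // null means "not yet resolved"

    ResolveOnceRef(String key) {
        this.key = key;
    }

    // A second call is treated as a programming error rather than an update.
    void resolve(T obj) {
        if (target != null) {
            throw new IllegalStateException("Reference already resolved: " + key);
        }
        target = Objects.requireNonNull(obj, "obj");
    }

    T get() {
        if (target == null) {
            throw new IllegalStateException("Reference not resolved: " + key);
        }
        return target;
    }
}
```

A guard like this keeps load-time and runtime code apart, at the cost noted above: a reference that must survive several data loads, such as a Game Mode reference, needs a different mechanism.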
Future Work
The ALLOWDUPES feature should be removed, as it is very difficult to work around.