CDOM References Concept Document

From PCGen Wiki
Revision as of 11:12, 1 April 2009 by James


Overview

CDOM References are effectively pointers to CDOM Objects. A Single reference refers directly to a single object (such as the Feat Dodge), and it follows the pointer analogy very closely. A Group reference, which can refer to multiple objects (such as TYPE=Foo), effectively serves as a pointer to one or more objects.

CDOM References are created, and become aware of their underlying contents, during the load of PCC/LST files. With some minor exceptions, they are not dynamic (meaning they do not change at runtime).

CDOM Reference objects can be found in the pcgen.cdom.reference package.

Types of References

Simple References

Simple References are references that refer to non-categorized objects. This includes most CDOM Objects within PCGen; the major exception is Ability objects, which are Categorized. A Simple Reference cannot refer to a Categorized CDOM Object.

Categorized References

Categorized References are references that refer to categorized objects. This includes Ability objects. A Categorized Reference cannot refer to a non-categorized CDOM Object.

Single References

Single references refer to a single object. These can be resolved during runtime to return the underlying CDOM Object.

Group References

Group references refer to one or more objects. These objects can share various characteristics, such as a TYPE, or the reference can be universal for a given type of object by using the ANY reference. In limited cases, pattern matching references are also available.
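As a rough illustration of the difference between a Single reference and a Group reference, the following sketch may help. All class and method names here (SketchObject, SketchRef, SketchSingleRef, SketchTypeRef, resolveAgainst) are hypothetical and are not the classes in pcgen.cdom.reference; the sketch only assumes that objects have a Key and a TYPE.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a loaded CDOM Object (not PCGen's class).
final class SketchObject {
    final String key;
    final String type;

    SketchObject(String key, String type) {
        this.key = key;
        this.type = type;
    }
}

// Every reference can report the objects it currently points at.
interface SketchRef {
    Collection<SketchObject> getContainedObjects();
}

// Single reference: names one object by key and points at it once resolved.
final class SketchSingleRef implements SketchRef {
    private final String key;
    private SketchObject target;

    SketchSingleRef(String key) {
        this.key = key;
    }

    void resolveAgainst(Map<String, SketchObject> loadedByKey) {
        target = loadedByKey.get(key);
    }

    @Override
    public Collection<SketchObject> getContainedObjects() {
        return target == null
            ? Collections.<SketchObject>emptyList()
            : Collections.singletonList(target);
    }
}

// Group reference: points at every loaded object of the given TYPE.
final class SketchTypeRef implements SketchRef {
    private final String type;
    private final List<SketchObject> members = new ArrayList<>();

    SketchTypeRef(String type) {
        this.type = type;
    }

    void resolveAgainst(Map<String, SketchObject> loadedByKey) {
        members.clear();
        for (SketchObject obj : loadedByKey.values()) {
            if (obj.type.equals(type)) {
                members.add(obj);
            }
        }
    }

    @Override
    public Collection<SketchObject> getContainedObjects() {
        return Collections.unmodifiableList(members);
    }
}
```

Both kinds of reference can be created before any object exists; only the later call to resolveAgainst needs the loaded data.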

Transparent References

Game Mode and PCC files are loaded when PCGen is launched. When these files refer to objects that would be loaded with future data, a Transparent Reference is created. These are special references that can be loaded multiple times (vs. other References that cannot be loaded more than once).

Direct References

References are used in multiple ways:

  1. References can be used directly: References can be stored in ListKey or ObjectKey locations within a CDOM Object.
  2. References can be used as an index: The Reference refers to a list of objects, such as the CLASSES token in the Spell LST file referring to ClassSpellLists.

However, a single infrastructure supports lists, and those lists may be either static or not yet constructed at LST load. The static references are known lists, yet in order to share infrastructure with unknown lists, even the known lists have to be wrapped in a CDOM Reference. For this reason, it is possible to create Direct References. These CDOM References do not require resolution, as they are passed the single CDOM Object they will contain at the time the Direct Reference is constructed.
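The idea of wrapping a known list so that it shares infrastructure with not-yet-constructed lists could look roughly like the sketch below. The names (SketchSpellList, SketchListRef, SketchDirectRef, SketchDeferredRef, SketchSpellIndex) are assumptions made for illustration and do not reflect PCGen's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a spell list object (not PCGen's ClassSpellList).
final class SketchSpellList {
    final String name;
    SketchSpellList(String name) { this.name = name; }
}

// The shared list infrastructure only sees this interface.
interface SketchListRef {
    SketchSpellList get();
}

// Direct reference: handed its object at construction, nothing to resolve.
final class SketchDirectRef implements SketchListRef {
    private final SketchSpellList list;
    SketchDirectRef(SketchSpellList list) { this.list = list; }
    @Override public SketchSpellList get() { return list; }
}

// Deferred reference: the list is not constructed yet at LST load.
final class SketchDeferredRef implements SketchListRef {
    private final String key;
    private SketchSpellList list;
    SketchDeferredRef(String key) { this.key = key; }
    void resolve(SketchSpellList resolved) { this.list = resolved; }
    @Override public SketchSpellList get() {
        if (list == null) {
            throw new IllegalStateException("Unresolved list: " + key);
        }
        return list;
    }
}

// One index structure serves both kinds of reference.
final class SketchSpellIndex {
    private final Map<SketchListRef, List<String>> spellsByList = new LinkedHashMap<>();

    void add(SketchListRef ref, String spell) {
        spellsByList.computeIfAbsent(ref, r -> new ArrayList<>()).add(spell);
    }

    // Call only after every deferred reference has been resolved.
    List<String> spellsFor(SketchSpellList list) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<SketchListRef, List<String>> e : spellsByList.entrySet()) {
            if (e.getKey().get() == list) {
                result.addAll(e.getValue());
            }
        }
        return result;
    }
}
```

The index never needs to know which kind of reference it holds, which is the point of wrapping the known list.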

Lifecycle of a Reference

  1. Transparent Reference Creation: Game Mode and PCC files are loaded when PCGen is launched. When these files refer to objects that would be loaded with future data, a Transparent Reference is created. References are constructed by a Game Mode-based LoadContext, so that multiple references to an object of the same Class and Key will use the same Reference.
  2. Reference Creation: When LST files are loaded, and CDOM Objects are referred to in the data, a CDOM Reference is created. References are constructed by the LoadContext, so that multiple references to an object of the same Class and Key will use the same Reference.
  3. Disposal on FORGET: LST files allow certain objects to be "forgotten", meaning they are disposed of from the loaded data, and cannot be used at runtime. However, given that the LST load process is single-pass, the FORGET is encountered well after References have been created for objects that were referred to by the forgotten object. Since the object is not loaded, it is appropriate to ignore any of those references, so the Token/Loader system includes a mechanism to dispose of those CDOM References when a FORGET is encountered.
  4. Loading of References (Runtime): If the LST load occurred in support of runtime processing of characters, loading is performed. This is not done if the LST load was performed in support of the editor system (not yet performed in PCGen 5.16). Once the load of the data is complete, References are resolved. For Single References, this involves loading the Reference with the object it refers to. For deterministic Group References, this also involves loading with References. Some Group References are processed dynamically, meaning their contents are determined at runtime.
  5. Resolution (Runtime): If the LST load occurred in support of runtime processing of characters, resolution is performed. This is not done if the LST load was performed in support of the editor system (not yet performed in PCGen 5.16). When a PlayerCharacter is being processed, the various CDOM References are resolved to determine the objects underlying each CDOM Reference.
  6. Write: Objects may be written back to an LST file from CDOM References. Each CDOMReference has a getLSTformat() method that provides the unique identifier for the reference, which can be stored in a persistent state (in an LST file).
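The creation, loading, resolution, and write steps above can be sketched in a few lines. This is a minimal illustration, not the LoadContext implementation: SketchContext, SketchLifecycleRef, SketchLanguage, register, and loadReferences are hypothetical names, and only getLSTformat() is taken from the text above. Disposal on FORGET is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical object type used as the Class part of the identity.
final class SketchLanguage {
    final String key;
    SketchLanguage(String key) { this.key = key; }
}

// Hypothetical reference that starts empty and is loaded after the LST pass.
final class SketchLifecycleRef {
    private final String key;
    private Object target;

    SketchLifecycleRef(String key) { this.key = key; }

    void load(Object obj) { this.target = obj; }
    boolean isLoaded() { return target != null; }
    Object resolve() { return target; }

    // Persistent form written back to an LST file.
    String getLSTformat() { return key; }
}

// Hypothetical stand-in for the part of a LoadContext that hands out references.
final class SketchContext {
    private final Map<String, SketchLifecycleRef> refs = new LinkedHashMap<>();
    private final Map<String, Object> objects = new LinkedHashMap<>();

    private static String id(Class<?> cl, String key) {
        return cl.getName() + "|" + key;
    }

    // Creation: the same Class and Key always yields the same reference.
    SketchLifecycleRef getReference(Class<?> cl, String key) {
        return refs.computeIfAbsent(id(cl, key), k -> new SketchLifecycleRef(key));
    }

    // Called when the object itself is encountered in the data.
    void register(Class<?> cl, String key, Object obj) {
        objects.put(id(cl, key), obj);
    }

    // Loading: after the single pass, give each reference its object.
    // Returns the number of references that could not be loaded.
    int loadReferences() {
        int unresolved = 0;
        for (Map.Entry<String, SketchLifecycleRef> e : refs.entrySet()) {
            Object obj = objects.get(e.getKey());
            if (obj == null) {
                unresolved++;
            } else {
                e.getValue().load(obj);
            }
        }
        return unresolved;
    }
}
```

A reference to "Elven" can be requested before the Elven language is registered; it only becomes usable once loadReferences has run.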


The Origin of References

Background Concepts

Keys: Objects (such as Races, Languages, etc.) which are loaded from LST files need to have a unique identifier so that relationships between objects can be created without ambiguity.

Category: In addition to a Key, Ability objects have a second part to their unique identifier, called a Category. This is usage #1 of Ability Categories; see the Ability Category explanation.

The Challenge

When data is loaded from PCC/LST files, there is an order of operations challenge. (For developers, this is similar to the challenges that a compiler faces when compiling source code.)

Specifically:

  1. Objects can reference each other (e.g. a Race setting available starting Languages). This is a challenge because circular references are likely (such as a restriction on the Language to limit it to the particular Race that allows it as a starting Language).
  2. Objects can use KEY and CATEGORY anywhere on a given line. This creates an (intra-line) indexing challenge that exacerbates the circular reference problem and increases the difficulty of resolving that issue.

Additionally, since the purpose of the transition to the CDOM design is to support an integrated LST Editor that shares a load framework with the runtime environment:

  1. CDOM References must support a round trip to LST files. This means that group references (e.g. TYPE=Foo) must know the reference name and not universally expand to their contained components (yet must be able to return the contained components at runtime).
  2. Sorting of CDOMReferences must be deterministic (PCGen 5.16 solves this by alphabetizing the lists of CDOMReferences in the tokens).
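The two editor requirements above can be illustrated with a small sketch. It assumes a group reference stores its own name (TYPE=Martial) alongside its members, and that "alphabetizing" means plain String ordering of the LST formats; both are assumptions, as are the names SketchWritableRef, SketchKeyRef, SketchGroupRef, and SketchWriter.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical view of a reference that can be written back to LST.
interface SketchWritableRef {
    String getLSTformat();
    List<String> getContainedKeys();
}

// Single reference: its LST form is simply the key.
final class SketchKeyRef implements SketchWritableRef {
    private final String key;
    SketchKeyRef(String key) { this.key = key; }
    @Override public String getLSTformat() { return key; }
    @Override public List<String> getContainedKeys() {
        return Collections.singletonList(key);
    }
}

// Group reference: remembers its name so it is not expanded on write.
final class SketchGroupRef implements SketchWritableRef {
    private final String type;
    private final List<String> memberKeys;
    SketchGroupRef(String type, List<String> memberKeys) {
        this.type = type;
        this.memberKeys = new ArrayList<>(memberKeys);
    }
    @Override public String getLSTformat() { return "TYPE=" + type; }
    @Override public List<String> getContainedKeys() {
        return Collections.unmodifiableList(memberKeys);
    }
}

final class SketchWriter {
    // Sorting the LST forms makes the output independent of input order.
    static String unparse(Collection<? extends SketchWritableRef> refs, String separator) {
        List<String> parts = new ArrayList<>();
        for (SketchWritableRef ref : refs) {
            parts.add(ref.getLSTformat());
        }
        Collections.sort(parts);
        return String.join(separator, parts);
    }
}
```

The group reference still returns its members through getContainedKeys at runtime, but the written form stays as the author typed it.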

The Solution in PCGen 5.16

In order to keep the data structured in a way that keeps information stored in a location close to where it is addressed in the rules, addressing the order of operations challenges is a code issue, not a data structure issue.

In PCGen 5.14 and earlier, this problem was solved by storing information within PCGen as Strings, and then resolving those Strings against unique identifiers for each object (generally the Key). This led to a number of issues, including performance issues due to String parsing, as well as failure to catch some errors until runtime.

There are various potential solutions to such problems, and some of these are covered in Rebuild of the Token/Loader System. These were addressed in PCGen with the integration of a new Token/Loader system during the PCGen 5.16 cycle. While there are solutions that would drive a two-pass parse of the data, the solution implemented in PCGen 5.16 limits the load of data to a single pass on the LST files. Thus PCGen uses a series of references to allow the data at LST load to refer to objects which have not been encountered within the data. After the load is complete, the references are resolved to their underlying objects.

Benefits of the defined solution

  1. Duplicate keys are detected at load, regardless of their source file, so no reference should be ambiguous.
  2. Group references can flag a warning if they do not contain any contents. This serves as LST error checking to catch accidental typos in a group identifier.
  3. Since References can be constructed well before they are resolved, references allow Game Mode information to be loaded once, thus preventing re-parsing of those files when a different set of sources is loaded.
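The first two benefits amount to simple checks that become possible once all objects are indexed at load. The sketch below shows one way such checks might look; SketchLoadChecker and its methods are hypothetical, and the message wording is invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical load-time checker; not PCGen's actual error reporting.
final class SketchLoadChecker {
    private final Map<String, String> sourceByKey = new LinkedHashMap<>();
    private final Map<String, List<String>> keysByType = new LinkedHashMap<>();
    private final List<String> messages = new ArrayList<>();

    // Records an object and reports a duplicate key from any source file.
    void addObject(String key, String type, String sourceFile) {
        String previous = sourceByKey.putIfAbsent(key, sourceFile);
        if (previous != null) {
            messages.add("Duplicate key " + key + " in " + sourceFile
                + " (first seen in " + previous + ")");
            return;
        }
        keysByType.computeIfAbsent(type, t -> new ArrayList<>()).add(key);
    }

    // Warns when a TYPE= group reference matches nothing (likely a typo).
    void checkGroup(String type) {
        if (!keysByType.containsKey(type)) {
            messages.add("Empty group reference TYPE=" + type);
        }
    }

    List<String> getMessages() {
        return messages;
    }
}
```

In this sketch, a reference to TYPE=Genral produces a warning at load rather than silently matching nothing at runtime.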

Characteristics/Weaknesses of the defined solution

This section refers to PCGen 5.16

  1. Currently, the LoadContext system uses WeakReferences to control the Disposal of CDOM Reference objects when FORGET is encountered. In theory this does not guarantee that irrelevant CDOM Reference objects do not produce load errors due to missing information. In practice, this error is unlikely to be encountered. However, an ideal structure would design around this problem. This issue is partially a function of the method of storage used in PCGen 5.16 (the Graph structure envisioned for CDOM is not in place), and partially a function of using a one-pass load. There are multiple solutions to this problem, but since it's mostly a theoretical issue, it is not being addressed as a critical issue.
  2. Currently, BONUS, CHOOSE, and the PRExxx tokens have not been converted to the new Token/Loader system and do not yet provide all of the benefits outlined above. A CHOOSE rebuild has been proposed as part of the PCGen 5.17 cycle (which leads to the next stable release, either 5.18 or 6.0).
  3. Currently, CDOM References are classes and are not exposed to CDOM Objects as an Interface. In order to reduce the potential for corruption of CDOMReferences, the single references are controlled so that they can only be resolved once, and no method is provided for resetting a CDOM Reference. This provides a strict mechanism to prevent errors and abuse of the CDOM Reference objects during runtime (this can be thought of as a short-term overly strict system to train developers and force separation between load and runtime). Once we have a more stable and modular core in place, this strict mechanism may be relaxed and/or exchanged for interfaces. Currently, this strict mechanism makes Game Mode references slightly more complex than other references.
  4. Some References are resolved at runtime. Theoretically, this is a performance issue, but practically, the dynamic references are rare enough that this is not an issue. The core of PCGen has much larger performance issues that should be addressed first before this is addressed as a performance issue. It may turn out to be unnecessary optimization, so addressing this weakness is currently not scheduled for any release.
  5. Resolution may be an issue. This is mostly a theoretical issue at this point, but the dereferencing driven by the use of CDOM Reference objects is slightly slower than using direct objects. This is considered a minor performance issue, and unless demonstrated as a performance issue, will not be addressed. There are some concerns about references in relation to lists (such as adding Spells to a ClassSpellList) and the complexity required to access the underlying information. While this is a recognized issue, this pain point will be addressed in a near-future release as the PlayerCharacter data structure is changed to a Graph-based structure.
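The resolve-once restriction described in item 3 above can be pictured as a reference that accepts exactly one resolution and offers no reset. The sketch below is illustrative only: SketchOnceRef and its method names are assumptions chosen for the example, not a copy of the PCGen classes.

```java
// Hypothetical single reference that may be resolved exactly once.
final class SketchOnceRef<T> {
    private final String key;
    private T target;

    SketchOnceRef(String key) {
        this.key = key;
    }

    // Accepts the underlying object; a second call is treated as an error.
    void addResolution(T obj) {
        if (obj == null) {
            throw new IllegalArgumentException("Cannot resolve " + key + " to null");
        }
        if (target != null) {
            throw new IllegalStateException("Reference " + key + " already resolved");
        }
        target = obj;
    }

    // Returns the underlying object; fails loudly if used before load completes.
    T resolvesTo() {
        if (target == null) {
            throw new IllegalStateException("Reference " + key + " not yet resolved");
        }
        return target;
    }

    // Deliberately no reset method: load and runtime stay separated.
}
```

A reference that must be loaded more than once (the Game Mode case) cannot use this class as written, which is the extra complexity the item mentions.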