Comprehensive Chemical Exposure Framework

A.1 Data Dictionary File

The Data Dictionary File consists of three types of information: 1) parameter declarations, 2) table declarations, and 3) parameter relationships. A Data Dictionary File is a comma-delimited text file that contains the metadata about the data in a particular dataset. The first line of the file contains the number of parameters listed in the file, the name of the dataset, and a universal reference location. The second line contains the parameter field header, which is outlined in Table A1.1.
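As a rough illustration of the two-line layout described above, the following Python sketch reads the first two lines of a Data Dictionary File. The sample content, the header field names, the reference location, and the names `DictionaryHeader` and `read_dictionary_header` are all hypothetical; the actual header fields are those defined in Table A1.1.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DictionaryHeader:
    parameter_count: int      # number of parameters listed in the file
    dataset_name: str         # name of the dataset the metadata describes
    reference_location: str   # universal reference location
    field_names: list         # parameter field header (second line)


def read_dictionary_header(text):
    """Parse the first two lines of a comma-delimited Data Dictionary File."""
    rows = csv.reader(io.StringIO(text))
    first = next(rows)
    second = next(rows)
    count, name, location = (cell.strip() for cell in first[:3])
    return DictionaryHeader(int(count), name, location,
                            [cell.strip() for cell in second])


# Hypothetical file content; the real field names come from Table A1.1.
SAMPLE = (
    "2,ChemList,http://example.org/ccef\n"
    "Name,Description,Units\n"
)

header = read_dictionary_header(SAMPLE)
print(header.dataset_name, header.parameter_count, header.field_names)
```

Using the standard `csv` module rather than splitting on commas by hand keeps quoted fields that contain commas intact, which seems likely to matter for free-text description fields.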

Data Dictionary Files provide the metadata describing attributes of the actual data in datasets. The datasets contain the actual numbers that are consumed and produced by each model and database using the Data Dictionary File metadata formats, as illustrated by Tables A1.1 through A1.3. Table A1.1 defines the data fields associated with a typical Data Dictionary File. Tables A1.2 and A1.3 illustrate the application of Table A1.1 to the parameters describing the chemical list (including degradation/decay products) and chemical toxicity information. If a parameter is indexed to a parameter in another Data Dictionary File, then the index contains the name of the other Data Dictionary File followed by an extension naming the other parameter. For example, the Inhalation Cancer Potency Factor in Table A1.3 is a function of chemical; therefore, it is indexed to the chemical CAS ID in the ChemList Data Dictionary File (i.e., ChemList.CAS). By providing indices and referenced parameters, the information only has to be stored once and an understandable mapping is provided for the user. The Data Dictionary File tables also identify those parameters that exhibit statistical variation and can be represented by a distribution in a sensitivity/uncertainty analysis (e.g., Monte Carlo simulation); see the stochastic column in Tables A1.2 and A1.3.
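The cross-file index notation (e.g., ChemList.CAS) can be sketched as a simple lookup. This is a minimal Python sketch under stated assumptions: the in-memory `datasets` layout, the function names, the CAS identifiers, and the numeric values are placeholders for illustration and are not taken from the framework or from any toxicity database.

```python
def split_index(index):
    """Split an index such as 'ChemList.CAS' into (dictionary, parameter)."""
    dictionary, separator, parameter = index.partition(".")
    if not separator or not dictionary or not parameter:
        raise ValueError(f"not a cross-dictionary index: {index!r}")
    return dictionary, parameter


def resolve_index(index, datasets):
    """Return the values of the parameter that an index refers to."""
    dictionary, parameter = split_index(index)
    return datasets[dictionary][parameter]


# Hypothetical datasets keyed by Data Dictionary File name.
# The CAS identifiers are placeholders, not real registry numbers.
datasets = {
    "ChemList": {"CAS": ["CAS-0001", "CAS-0002"]},
}

cas_ids = resolve_index("ChemList.CAS", datasets)

# A chemical-dependent parameter is stored once per chemical, keyed by the
# resolved index values. The numbers are placeholders, not potency factors.
inhalation_cpf = dict(zip(cas_ids, [0.1, 0.2]))
print(inhalation_cpf["CAS-0002"])
```

Because the toxicity values are keyed by the resolved index rather than by a copied chemical list, the list itself only needs to be stored in one dataset.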

The Data Dictionary Files are an effective mechanism for transferring information between components. The Data Dictionary Files can be cataloged into five categories, although the design allows for expansion:

  1. System Delineation Data Dictionary Files - Metadata that is developed and maintained by the system.
    • a. Start-Up Data Dictionary File - Metadata that describes the information necessary for the User Interface set-up, such as system path and run names, formatting information (e.g., font, color, and size), screen and line colors (i.e., both background and foreground), interface flags indicating items such as visible logo image and visible identifier (e.g., American Chemistry Council), and window size and location. The dataset for this Data Dictionary File is initially populated with default settings, many of which can be modified by the user through the customize option in the user interface. The Start-Up Data Dictionary File contains all the necessary information for the user interface initialization settings and domain. Table A1.4 illustrates a Start-Up Data Dictionary File.
    • b. Simulation Data Dictionary File - Metadata that contains the necessary information to reproduce any particular conceptual site model within the Comprehensive Chemical Exposure Framework. This dataset contains module names, identifiers, dataset names and locations, module status, linkages, module locking information, and simulation comments. Table A1.5 illustrates a Simulation Data Dictionary File.
  2. Module Description Data Dictionary File - Metadata associated with the information describing the model and supporting information about the model (e.g., who to contact for more information, the input and boundary condition Data Dictionary Files consumed and the boundary condition Data Dictionary Files produced by the model, and how the model fits into the system). This is a system-maintained Data Dictionary File whose corresponding dataset is populated by the model developer. Table A1.6 illustrates a Module Description Data Dictionary File.
  3. Input Data Dictionary File - Metadata associated with the information required as user-supplied input to a model. Input Data Dictionary Files written by a module developer are specific to each model. Table A1.7 illustrates an Input Data Dictionary File that is unique to the Surface Impoundment Module in the Hazardous Waste Identification Rule Project. Note that an Input Data Dictionary File, as with all Data Dictionary Files, can be accepted and understood system-wide, or it can be unique to a particular model and only understood by that model, as illustrated in Table A1.7. A system-supported Input Data Dictionary File is maintained by the system, while unique Input Data Dictionary Files are developed and maintained by the model developer.
  4. Boundary Condition Data Dictionary Files - Metadata defining data consumed by a model originating from an upstream model or database.
    • a. Model Dictionary File - Metadata associated with the information that is passed from a producing model to a consuming model. Model Dictionary Files represent the output results from the model. Table A1.8 illustrates an output Model Dictionary File unique to the Surface Impoundment Module in the Hazardous Waste Identification Rule Project.
    • b. Database Data Dictionary File - Metadata associated with the mapping of information between the database and the system.
  5. Sensitivity/Uncertainty Data Dictionary Files - Metadata defining the statistical information associated with the stochastic data.
    • a. Seed Data Dictionary File - Metadata defining the starting seed number associated with the random number generator for a Monte Carlo simulation. Table A1.9 illustrates a Seed Data Dictionary File.
    • b. Iteration Data Dictionary File - Metadata defining the current iteration of the simulation. Table A1.10 illustrates an Iteration Data Dictionary File.
    • c. Sampled Values Data Dictionary File - Metadata defining the model inputs that are stochastic and available for sampling. Table A1.11 illustrates a Sampled Values Data Dictionary File.
    • d. Summary Values Data Dictionary File - Metadata defining the model outputs that are summarized as part of the statistical results. Table A1.12 illustrates a Summary Values Data Dictionary File.
    • e. Stochastic Data Dictionary File - Metadata defining the distribution and attributes associated with the stochastic parameters. Table A1.13 illustrates a Distribution Data Dictionary File for a Normal Distribution.
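
The roles of the seed, the iteration counter, and a Normal distribution in the list above can be sketched as follows. This is a minimal Python illustration, not the framework's sampling code: the function name `run_iterations` and the seed, mean, and standard deviation values are hypothetical.

```python
import random


def run_iterations(seed, iterations, mean, std_dev):
    """Return (iteration number, sampled value) pairs for one seeded run."""
    # A private generator, so the seed fully determines the sequence and
    # the global random state is left untouched.
    rng = random.Random(seed)
    return [(i, rng.gauss(mean, std_dev)) for i in range(1, iterations + 1)]


# Hypothetical seed and distribution attributes.
run_a = run_iterations(seed=12345, iterations=5, mean=10.0, std_dev=2.0)
run_b = run_iterations(seed=12345, iterations=5, mean=10.0, std_dev=2.0)

# The same starting seed reproduces the same sampled values.
print(run_a == run_b)
for iteration, value in run_a:
    print(iteration, round(value, 3))
```

Recording the starting seed alongside the iteration number is what allows any single realization of a Monte Carlo run to be reproduced later.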
The major requirement of Data Dictionary Files associated with model output is that the associated datasets must be complete. Because many databases are incomplete to begin with, datasets associated with Database Data Dictionary Files do not have to be complete. It is the model developer’s responsibility to deal with incomplete datasets from databases (i.e., shared responsibility).
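
The completeness rule above could be enforced along these lines. This Python sketch assumes a dataset is a simple mapping from parameter name to value and that a missing or `None` entry means "incomplete"; the function names and the `source` labels are hypothetical.

```python
def missing_parameters(declared, dataset):
    """List declared parameters that have no value in the dataset."""
    return [name for name in declared if dataset.get(name) is None]


def check_dataset(declared, dataset, source):
    """Reject incomplete model output; tolerate incomplete database data.

    `source` is either "model" or "database" (labels assumed here).
    Returns the missing parameter names so the consuming model can
    decide how to handle gaps in database data.
    """
    missing = missing_parameters(declared, dataset)
    if missing and source == "model":
        raise ValueError(f"model output is incomplete, missing: {missing}")
    return missing


declared = ["depth", "area"]  # hypothetical parameter names
gaps = check_dataset(declared, {"depth": 2.0}, source="database")
print(gaps)
```

Returning the list of gaps for database-sourced data, instead of raising, reflects the stated division of responsibility: the consuming model decides what to do about them.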

With the advent of standardized Data Dictionary Files, an Application Programming Interface can be developed for the Comprehensive Chemical Exposure Framework to support four areas: 1) coordinating and managing the inputs and outputs between components (e.g., range checking of parameters, data retrieval, data storage, units checking, opening and closing data sources, and metadata functions such as cardinality, units, and definitions); 2) Read and Write functionality (e.g., error handling, command line functions, producer-consumer relationships, conceptual site model security, model selection, run calls between models, and documenting user comments); 3) units conversion, so each model can work with its own units without concern for unit conversion errors; and 4) the conceptual site model graphical user interface (e.g., drag & drop functionality and a tiered-icon palette).
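
Two of the services named above, range checking and units conversion, might look roughly like this. This is a minimal Python sketch rather than the framework's Application Programming Interface: `ParameterSpec`, `convert`, `read_value`, the parameter name, and its range are hypothetical, and only a few length units are covered.

```python
from dataclasses import dataclass

# Conversion factors from each unit to the base unit (metres).
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}


def convert(value, from_units, to_units):
    """Convert a length between any two units in TO_METRES."""
    return value * TO_METRES[from_units] / TO_METRES[to_units]


@dataclass
class ParameterSpec:
    name: str
    units: str       # units in which the range is declared
    minimum: float
    maximum: float


def read_value(spec, stored_value, stored_units, wanted_units):
    """Range-check a stored value, then return it in the caller's units."""
    in_spec_units = convert(stored_value, stored_units, spec.units)
    if not spec.minimum <= in_spec_units <= spec.maximum:
        raise ValueError(
            f"{spec.name}={in_spec_units} {spec.units} is outside "
            f"[{spec.minimum}, {spec.maximum}]"
        )
    return convert(in_spec_units, spec.units, wanted_units)


# Hypothetical parameter declaration: a depth between 0 and 100 m.
depth = ParameterSpec("depth", "m", 0.0, 100.0)
print(read_value(depth, 0.05, "km", "m"))
```

Routing every read through one function of this kind is what would let each model request values in its own units while the check and the conversion happen in a single place.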

As noted earlier, if the producing component’s output Data Dictionary Files match the consuming component’s input Data Dictionary File requirements, then the two components can communicate. Figure A1.1 (Communication between models) illustrates the linkage of two models, with each model receiving input from a database (i.e., Database Data Dictionary File) and the user (i.e., Input Data Dictionary File), and Model 2 receiving upstream boundary conditions from the upstream model (i.e., Model Data Dictionary File). In this case the models can communicate because all upstream model boundary condition requirements of Model 2 are met by the information produced by Model 1.
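
The matching rule described above reduces to a set comparison. The following Python sketch assumes each model simply declares the names of the Data Dictionary Files it produces and requires; the function `can_link` and the dictionary names are hypothetical.

```python
def can_link(produced, required):
    """Check whether a producer satisfies a consumer's boundary conditions.

    Returns (True, []) when every required Data Dictionary File is
    produced, otherwise (False, sorted list of the missing ones).
    """
    missing = sorted(set(required) - set(produced))
    return not missing, missing


# Hypothetical Data Dictionary File names for two linked models.
model_1_produces = ["WaterFlux", "ChemicalFlux"]
model_2_requires = ["ChemicalFlux"]

ok, missing = can_link(model_1_produces, model_2_requires)
print(ok, missing)
```

A producer may emit more than the consumer needs; only unmet requirements on the consuming side prevent the link.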

The Application Programming Interface and Data Dictionary File design allows for the Plug & Play feature, which is the most important feature of the design. By ensuring Plug & Play, the CCEF inherently includes the ability to

  • Link any type of model, database, or framework into the system to communicate with any other component
  • Allow model developers, government organizations, private companies, etc. to incorporate their own models and databases into the system without the necessity of going through the system developer as a middle-man
  • Ensure backward compatibility between legacy models and databases, and new models and databases
  • Allow linkage specifications to change with time and company
  • Link to remote databases or models
  • Link to remote frameworks, without having to integrate the remote frameworks into the Comprehensive Chemical Exposure Framework
  • Link to remote databases and only download the information necessary for the assessment
  • Integrate change into the system without having to redesign the components that are already included in the system.