
Using Dictionaries to Manage Data Within a Modeling Framework System



Background

The traditional approach to managing data within modeling systems was to connect specific models directly (i.e., hardwire the models to each other).  Each connection reflected the data needs of the consuming model, which gave efficient data transfer and dynamic feedback between models.  Figure 1 illustrates this traditional approach for linking models and managing data.

Figure 1.  Traditional Approach to Managing Data in Modeling Systems

Unfortunately, as the complexity of the modeled problem increases, so does the complexity of the data management, and this approach quickly becomes unmanageable.  It also means that the user cannot add new models, parameters, data requirements, databases, or other components without modifying the entire system and revamping older models.

As modeling systems evolved, some developers took advantage of advances in "object-oriented" modeling to allow models entering the system to agree on a data transfer protocol (Figure 2).  This nontraditional approach identified system data specifications to which models had to adhere when passing information between model types and databases.  Pre- and post-processors allowed older models to remain unchanged and made it easier to connect these models directly into the system.  Because the original code was not modified, the entire legacy of testing and validation for each model and database remained valid, which enhanced quality control.  The approach also simplified the management and modification of multiple models (Whelan et al. 2001. PNNL-13453).

Figure 2.  Nontraditional Approach to Managing Data in Modeling Systems

Unfortunately, ensuring that all model developers followed a rigid set of data specifications, while attractive in theory, proved difficult in practice.  Following a file specification required each developer to implement both the format and the data content, so every developer in effect recoded the specification.  This approach worked, but it increased code maintenance and made each developer's interpretation of the file specification critical.  Many problems can arise when different individuals interpret the specification slightly differently.

In addition, many modeling frameworks required that the specification be built on the needs of upstream models (that is, models that run earlier in the analysis process).  These upstream models might produce three output files, while only two would be needed by models running later (downstream models).  For example, a chemical database might provide information on 17 different constituents, while the health effects model that runs later in the analysis needs information on only the three constituents that could be linked to cancer.  This specification approach, then, results in an excess of data moving through the system, which increases processing and analysis time.

FRAMES Version 2.0 gives model developers the flexibility to use standard requirements (i.e., dictionaries, or DICs) to minimize the data produced and consumed between models.  A model developer can still have a model produce additional data that a specific downstream model does not consume; a different model might require these data in future assessments.  With this flexibility to produce a wide range of output datasets, models can link to and communicate with a larger set of other models and can be applied in a wider set of scenarios.
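The idea of producing a broad dataset while passing on only what a consumer declares can be sketched in a few lines of Python.  This is an illustration only, not FRAMES code: the function `select_required`, the constituent names, and the `half_life_days` property are all hypothetical placeholders chosen to mirror the 17-constituent example above.

```python
from __future__ import annotations


def select_required(produced: dict[str, dict[str, float]],
                    required: list[str]) -> dict[str, dict[str, float]]:
    """Return only the records the consumer asked for; fail if any is absent."""
    missing = [name for name in required if name not in produced]
    if missing:
        raise KeyError(f"producer does not supply: {missing}")
    return {name: produced[name] for name in required}


# Placeholder data: 17 constituents, each with one made-up property.
database_output = {
    f"constituent_{i:02d}": {"half_life_days": float(i)} for i in range(1, 18)
}

# The hypothetical health effects model declares the three it needs.
health_model_needs = ["constituent_03", "constituent_08", "constituent_15"]

passed_on = select_required(database_output, health_model_needs)
print(len(database_output), "produced;", len(passed_on), "passed downstream")
```

The producer is free to keep all 17 records available for other consumers; only the declared subset travels to this particular downstream model.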

FRAMES 2.0 also utilizes an Application Programming Interface (API) to manage data within the system.  This API coordinates and manages the input and output between components.

In FRAMES, "what" is being stored is distinct from "how" it is stored.  Thus, the API can provide the following functions:

  • Range checking of parameters (ensures that parameters remain within acceptable bounds)

  • Data retrieval

  • Data storage

  • Units checking (ensures that the parameters are affiliated with the correct units)

  • Opening and closing of data sources

  • Read/write abilities (e.g., alerting users to errors, issuing command lines to instruct models how and when to run, identifying which models produce and consume data from other models, selecting models to be used in the analysis, and documenting user comments)

  • Units conversion (allows models to use whatever units they require)

  • Graphical user interface (e.g., being able to place and connect models by dragging icons and dropping them onto a work space, providing a palette of models among which to choose, etc.).
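Three of the functions listed above (units checking, units conversion, and range checking) can be sketched together in Python.  This is a minimal sketch under stated assumptions, not the FRAMES API: the names `ParameterSpec`, `convert`, and `store`, the unit table, and the `well_depth` parameter are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical conversion factors to a base unit (metres); not taken from FRAMES.
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}


@dataclass(frozen=True)
class ParameterSpec:
    """Hypothetical description of one parameter: its unit and allowed range."""
    name: str
    unit: str
    minimum: float
    maximum: float


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Units checking and conversion: reject unknown units, then convert."""
    for unit in (from_unit, to_unit):
        if unit not in TO_METRES:
            raise ValueError(f"unknown unit: {unit!r}")
    return value * TO_METRES[from_unit] / TO_METRES[to_unit]


def store(spec: ParameterSpec, value: float, unit: str) -> float:
    """Convert the value to the spec's unit, range-check it, and return it."""
    converted = convert(value, unit, spec.unit)
    if not spec.minimum <= converted <= spec.maximum:
        raise ValueError(
            f"{spec.name}={converted} {spec.unit} is outside "
            f"[{spec.minimum}, {spec.maximum}]"
        )
    return converted


# A model that works in centimetres can still supply a value declared in metres.
depth = ParameterSpec(name="well_depth", unit="m", minimum=0.0, maximum=500.0)
stored_depth = store(depth, 2500.0, "cm")
```

Because the checks sit in one shared layer, each model can keep its own working units, and an out-of-range or mislabeled value is rejected before any consuming model sees it.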

The key to successful data management within a modeling system, then, is twofold: the producing component must provide information that meets the needs of the consuming component in a form that both components recognize, and the system must provide a mechanism for ensuring the compatibility of that form.  Figure 3 illustrates this approach to managing data within a modeling system, with the arrows representing the transfer of data through one or many DICs.

Figure 3.  Using Dictionaries (indicated by arrows) to Manage Data in Modeling Systems

In FRAMES, this shared responsibility for managing data is based on datasets, which are described in DIC files.  These files can be created by hand or through the use of the API.

Figure 4 provides a more detailed example of the use of DICs by FRAMES.  The model shown in the figure can accept information from three different source types: user-defined input provided through a user interface, data supplied from a database, and data provided by a model that runs earlier in the analysis process.  When the output dataset (i.e., Model DICs as Output in Figure 4) from an upstream model provides enough information to satisfy the input required by a downstream model (e.g., Model 1 in Figure 4), then the data from the upstream model can be successfully and seamlessly transferred to the downstream model.
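The condition that an upstream output dataset "provides enough information" for a downstream model can be read as a subset test on the described fields.  The Python sketch below illustrates that reading only; it does not reproduce the DIC file format, and `Field`, `missing_fields`, `can_connect`, and the example parameter names and units are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical dictionary entry: a parameter name and its unit."""
    name: str
    unit: str


def missing_fields(produced: set[Field], required: set[Field]) -> set[Field]:
    """Return the required fields that the producer does not supply."""
    return required - produced


def can_connect(produced: set[Field], required: set[Field]) -> bool:
    """A link is valid when every required field is produced."""
    return not missing_fields(produced, required)


# Made-up dictionaries for an upstream model and a downstream model.
upstream_output = {
    Field("concentration", "mg/L"),
    Field("flow_rate", "m3/s"),
    Field("temperature", "degC"),
}
downstream_input = {
    Field("concentration", "mg/L"),
    Field("flow_rate", "m3/s"),
}

print(can_connect(upstream_output, downstream_input))
```

In this sketch a field matches only when both name and unit agree.  A fuller version would let a units conversion step reconcile fields that share a name but differ in units, rather than reporting them as missing.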

Figure 4.  How Dictionaries Manage Datasets

The following sections provide additional information on DIC files, covering both how to understand them and how to create them.

