XMDR

eXtended MetaData Registry (XMDR) Project

XMDR Use Cases
Specific Use Cases

Specific use cases are being solicited from all our collaborators, and will be included here as they come in. It has been suggested that we use the GBIF Use Case Template, or something close to it, to describe them. We may also decide to include some or all of the use cases captured in each of the following related works found on the web:

Note that we don't necessarily want to provide all of the functionality required for all of these use cases; rather, we want a story for how the XMDR fits (or doesn't) in each use case.

High-level Use Cases

These are the high-level use cases, not specific to any particular domain but hopefully together "covering" the Specific Use Cases above. Again we do not expect XMDR to provide all of the support for these very broad use cases, but we do want to develop a clear story telling what role(s) XMDR can and will play in each of them. Furthermore, while for completeness these use cases include very generic functionality required of any MDR, we are particularly concerned in this project with functionality not yet provided by most, if any, existing MDRs, and most especially the functionality required to fully support the use of thesauri, taxonomies, and ontologies and various types of data and inter-schema mappings.

In this use case, a single application is being developed or maintained and the MDR is being used to track and manage all or some of the data elements, properties, concepts, classes, domains, contexts, and classification schemes (hereafter "administered items") being used and/or developed for that application. The MDR's most important function in this use case is the ability to centrally collect and track all of these administered items within the development or administration team. This is the simplest high-level use case.

In this use case, multiple applications are being developed or maintained and the MDR is being used to track and manage all or some of the data elements, properties, concepts, classes, domains, contexts, and classification schemes being used and/or developed for the applications. The MDR is used to foster consistency of data between the applications. The most important function in this use case is the ability to centrally collect and track all of these administered items and other metadata within and between the development or administration team(s).

In this use case, data needs to be either migrated from one system to another, or exchanged (on an ongoing basis) between two different systems. The key is that there are two systems involved, each with its own set of metadata, potentially registered in two different MDRs.

In this use case, an organization wants to provide an integrated portal or data warehouse for information obtained from multiple systems or sources within the organization, typically in order to address a particular task, problem, or user community. This is no longer pair-based integration, and may require the development and maintenance of a common data transformation model in order to succeed. In some ways this use case is much like a set of pair-wise integrations of the various systems with the portal or warehouse, but the portal or warehouse will typically have a larger and more volatile set of metadata than any typical single application.

In this use case, an organization wants to provide an integrated portal or data warehouse for information obtained from multiple systems or sources, not all of which are internal to the organization. In this use case much of the metadata may be beyond the control of the Registration Authority for the MDR, in which case change management may be particularly challenging.

In this use case, one wants to integrate an open-ended set of resources, each of which may be either a single application or a portal or data warehouse. Because of the much greater scale involved, the implementation strategy needs to be much more top-down rather than bottom-up, i.e., integration needs to be driven by published standards, protocols and APIs (often starting with a lowest common denominator of functionality) rather than attempting to fully leverage all the functionality of all the resources being integrated.

Some examples of existing online transaction networks are travel reservation networks, securities trading networks, electronic banking networks, utility grids, and electronic purchasing or supplier networks. Online transaction networks are also under development in a number of additional sectors, most notably healthcare services and reimbursement.

Data grids are a nascent development primarily being pursued within various scientific research communities; very few could be said to yet exist, but quite a few are under active development and they are of great interest to many of the project participants. They are being grouped with online transaction networks because the integration challenges and adopted implementation strategies appear (at first blush, at least) to be very similar.

This most ambitious use case is as yet only vaguely defined, but the general concept is a "Grid" which is not limited to a particular domain or area of discourse, but within which resources can nonetheless be dynamically discovered, navigated, and utilized by software agents and/or other client systems via some common upper ontology(s) or other intermediating semantic metamodel(s).

Mid-level Use Cases

Drilling down one level of abstraction, these are the clusters of functionality common to one or more of the high-level use cases. Note that, while for completeness these use cases include very generic functionality required of any MDR, we are particularly concerned in this project with functionality not yet provided by most, if any, existing MDRs, and most especially the functionality required to fully support the use of thesauri, taxonomies, and ontologies and various types of data and inter-schema mappings.

A developer registers metadata that includes complex terminological and concept structures and ontologies in addition to the other types of administered items in the 11179 metamodel.

A developer or supervisor locates and retrieves part or all of a terminology/concept structure and related administered items, e.g., the definition of any data element, property, concept, class, domain, context, classification scheme, or ontology (hereafter "registered item"), along with the identification of the steward and registrar responsible for it.

As a system is used, gaps, errors, and duplication will be discovered in the metadata sets being used. As these discoveries are made metadata will need to be added and/or updated within the registry, in such a way that:

  • integrity of pre-existing data will not be compromised
  • registration of the new metadata will not require more of a delay in collection of new data than is acceptable for the given application
  • unnecessarily burdensome constraints are not imposed on the degree and manner of evolution of the metadata

Such an update may be either done by a developer or supervisor, or done (in part, at least) by a client application through an update API.

Developers and supervisors track, coordinate, and maintain versions of the administered items, along with the identification of the steward and registrar responsible for each.
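One way to reconcile version tracking with the requirement that pre-existing data not be compromised is an append-only version history, in which older versions are never overwritten and data can keep referring to the version it was collected under. The following is a minimal sketch under that assumption, not a description of the XMDR implementation; the class names, fields, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemVersion:
    """One immutable version of an administered item (hypothetical fields)."""
    item_id: str
    version: int
    definition: str
    steward: str


class VersionedRegistry:
    """Append-only store: updates add a version and never rewrite old ones."""

    def __init__(self):
        self._history = {}  # item_id -> list of ItemVersion, oldest first

    def update(self, item_id, definition, steward):
        versions = self._history.setdefault(item_id, [])
        new = ItemVersion(item_id, len(versions) + 1, definition, steward)
        versions.append(new)
        return new

    def get(self, item_id, version=None):
        """Return the requested version, or the latest when version is None."""
        versions = self._history[item_id]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{item_id} has no version {version}")
        return versions[version - 1]


# Illustrative values only.
versioned = VersionedRegistry()
versioned.update("patient_age", "Age in years at admission", steward="steward_a")
versioned.update("patient_age", "Age in completed years at admission", steward="steward_b")
print(versioned.get("patient_age", version=1).definition)
print(versioned.get("patient_age").steward)
```

Data recorded against version 1 can still be interpreted with the version 1 definition after version 2 has been registered.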

A developer (or supervisor) detects and, if possible, rectifies redundancies within the registry's inventory, for example, multiple entries within a single thesaurus or taxonomy for the same meaning, or multiple value domains with the same set of values, or with nearly the same sets of values where using the same set would make more sense.
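As an illustration of how identical or nearly identical value domains might be flagged for review, the sketch below compares every pair of domains using Jaccard similarity (size of the intersection divided by size of the union). The similarity measure, the threshold, and the sample domains are assumptions made for this example; nothing here is prescribed by the use case.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Return |a & b| / |a | b|, or 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def find_redundant_domains(domains: dict, threshold: float = 0.75) -> list:
    """Return (name_a, name_b, similarity) for pairs at or above threshold."""
    frozen = {name: frozenset(values) for name, values in domains.items()}
    hits = []
    for a, b in combinations(sorted(frozen), 2):
        score = jaccard(frozen[a], frozen[b])
        if score >= threshold:
            hits.append((a, b, score))
    return hits


# Hypothetical value domains, invented for illustration.
domains = {
    "state_codes_v1": {"CA", "NV", "OR", "WA"},
    "west_coast_states": {"CA", "NV", "OR", "WA"},
    "state_codes_v2": {"CA", "NV", "OR", "WA", "AZ"},
    "yes_no": {"Y", "N"},
}

for a, b, score in find_redundant_domains(domains):
    print(f"{a} ~ {b}: {score:.2f}")
```

A score of 1.0 marks an exact duplicate; lower scores above the threshold are candidates for a human decision on whether a single shared domain would make more sense.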

It is common for an organization to accumulate a number of concepts with similar but significantly differing definitions. This is especially the case when legislation or government regulation is involved, i.e., when particular concepts are defined differently in different pieces of legislation and regulation. In such cases it becomes imperative that the particular definition being used in any one context is kept explicit, and perhaps even that additional notes/guidelines/etc. for determining which one to use or is being used in any particular context also be registered in the MDR. Very similar issues come up with the association between names and concepts. The MDR may also be called upon to support a harmonization project, in which an organization attempts to reduce the number of such conflicting definitions and/or names by standardizing them wherever possible.
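Keeping the definition in use explicit can be pictured as keying each definition by both the concept name and the context in which it applies. The sketch below assumes that simple model; the class, the context labels, and the definitions are invented for illustration and do not come from any actual legislation or from the 11179 metamodel.

```python
class ContextualDefinitions:
    """Store definitions keyed by (concept name, context)."""

    def __init__(self):
        self._defs = {}  # (name, context) -> definition text

    def register(self, name, context, definition):
        self._defs[(name, context)] = definition

    def lookup(self, name, context):
        """Return the definition for this context; never fall back silently."""
        key = (name, context)
        if key not in self._defs:
            raise KeyError(f"no definition of {name!r} registered for {context!r}")
        return self._defs[key]

    def contexts_for(self, name):
        """List every context in which the named concept has a definition."""
        return sorted(c for (n, c) in self._defs if n == name)


# Hypothetical contexts and definitions.
definitions = ContextualDefinitions()
definitions.register("dependent", "tax_rule_x", "A child under 19 living in the household")
definitions.register("dependent", "benefit_rule_y", "A child under 26, regardless of residence")

print(definitions.contexts_for("dependent"))
print(definitions.lookup("dependent", "tax_rule_x"))
```

Raising an error for an unregistered context, rather than returning some default definition, is a deliberate choice here: it forces the caller to say which definition is meant. A harmonization project could start from `contexts_for` to list the competing definitions of a name.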

A developer records a known mapping or set of interrelationships within a terminology/concept structure, between two terminologies/concept structures, between a terminology/concept structure and other administered items, or directly between two administered items (without any explicit mediating concept(s)). If the MDR is also extended to support registration of full-blown schemas, then support will also be required for registration of inter-schema mappings.
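A direct mapping between two administered items can be sketched as a typed link from a source identifier to a target identifier. The example below assumes such a model; the `Mapping` and `MappingRegistry` names, the relation labels, and the scheme identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mapping:
    source: str    # identifier of the source item, e.g. "schemeA:M"
    target: str    # identifier of the target item
    relation: str  # e.g. "equivalent", "broader", "narrower"


@dataclass
class MappingRegistry:
    _by_source: dict = field(default_factory=dict)

    def register(self, mapping: Mapping) -> None:
        """Record a mapping, indexed by its source identifier."""
        self._by_source.setdefault(mapping.source, []).append(mapping)

    def translate(self, source: str, relation: str = "equivalent") -> list:
        """Return targets linked to source by the given relation."""
        return [m.target for m in self._by_source.get(source, [])
                if m.relation == relation]


# Hypothetical coding schemes and codes.
registry = MappingRegistry()
registry.register(Mapping("schemeA:M", "schemeB:1", "equivalent"))
registry.register(Mapping("schemeA:F", "schemeB:2", "equivalent"))
registry.register(Mapping("schemeA:U", "schemeB:9", "broader"))

print(registry.translate("schemeA:M"))
print(registry.translate("schemeA:U", relation="broader"))
```

Recording the relation type alongside each link lets a client distinguish exact equivalences, which are safe for automatic translation, from looser correspondences that may need review.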

A developer or supervisor navigates within a thesaurus or taxonomy, between thesauri or taxonomies, between data elements and concepts or between data elements in different systems via associated concepts and properties. Alternatively, an application uses the MDR to support similar navigation within that application (by an end user of that application), most often within a thesaurus or taxonomy. An information retrieval application might also use a navigation API to support query expansion and/or clustering of result sets by semantic neighborhood.
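Query expansion by semantic neighborhood can be sketched as a bounded breadth-first walk over thesaurus links. The example below assumes the thesaurus is available as a plain adjacency mapping from each term to its related terms; the function name, the `max_hops` parameter, and the sample terms are hypothetical and do not describe an actual navigation API.

```python
from collections import deque


def expand_query(term, neighbors, max_hops=1):
    """Return term plus every term reachable within max_hops links."""
    seen = {term}
    order = [term]
    queue = deque([(term, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not walk past the hop limit
        for nxt in neighbors.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, hops + 1))
    return order


# Hypothetical thesaurus links (broader, narrower, and related terms mixed).
thesaurus = {
    "vehicle": ["car", "truck"],
    "car": ["vehicle", "sedan"],
    "truck": ["vehicle"],
    "sedan": ["car"],
}

print(expand_query("car", thesaurus))
print(expand_query("car", thesaurus, max_hops=2))
```

Terms are returned in order of increasing distance from the original term, so a retrieval application could weight nearer terms more heavily.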

A developer or supervisor aggregates (i.e., selects) registered items in terms of some linked thesaurus or taxonomy, or an application uses the MDR to support aggregation. The most common uses of such aggregation are in support of information retrieval (in an operation often called "explode") and in support of OLAP. There may be two flavors of aggregation: one which returns all the descendant nodes of a root, and one which returns only the leaf nodes within a subtree.
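The two flavors of aggregation can be sketched as follows, assuming the taxonomy is a strict tree (no cycles, one parent per node) stored as a mapping from each node to its list of children. The function names and the sample taxonomy are hypothetical.

```python
def descendants(tree, root):
    """All nodes below root (root excluded), in depth-first pre-order."""
    result = []
    stack = list(reversed(tree.get(root, [])))
    while stack:
        node = stack.pop()
        result.append(node)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(tree.get(node, [])))
    return result


def leaves(tree, root):
    """Only the childless nodes in the subtree rooted at root."""
    if not tree.get(root):
        return [root]  # a childless root is its own only leaf
    return [n for n in descendants(tree, root) if not tree.get(n)]


# Hypothetical taxonomy: node -> children.
taxonomy = {
    "animal": ["mammal", "bird"],
    "mammal": ["dog", "cat"],
    "bird": ["sparrow"],
}

print(descendants(taxonomy, "animal"))
print(leaves(taxonomy, "animal"))
```

The first flavor suits "explode"-style retrieval, where items indexed at any level of the subtree should match; the second suits roll-ups in which only the most specific categories carry data.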

A software agent or other client application queries an MDR in order to find resources of potential applicability to a particular task. Of particular interest is discovery of mappings between coding schemes, data formats, and schemas. Also of strong interest is semantic grounding for UDDI and other directory services.

A client application pulls metadata from the MDR in order to provide online help for an application end user. For example, it may provide the user with a more extensive definition for a form data element or a domain value concept than is provided directly by the application.

Version 6, minor copy edits by Frank Olken, 2006-05-08. Version 5, edits and links updated by Frank Olken, 2006-03-07. Version 4, edited by Kevin D. Keck, August 20, 2004. Maintained by Kevin D. Keck, Lawrence Berkeley National Laboratory, kdkeck@lbl.gov, and by Frank Olken, Lawrence Berkeley National Laboratory, olken@lbl.gov. Last updated: 2006-05-08 4:24 PM PST by Frank Olken.

© 2007, Lawrence Berkeley National Laboratory
maintained by Karlo Berket
Credits: The research and development of the eXtended MetaData Registry is supported by a variety of participating organizations.