XMDR Content Survey Tabular Template v0011

Frank Olken (LBNL)
2004-12-10

I. Introduction and Usage

This template is intended for characterizing candidate metadata content for inclusion in the XMDR (Extended Metadata Registry) prototype.

Note that we have implicitly assumed that the metadata resources are unclassified.

For each candidate metadata resource, surveyors should simply edit this web page, deleting the introduction and replacing the comment text for each attribute with the appropriate text for the specific metadata resource. Please use the terminology suggested. Please complete the survey in English.

Upon completion, email the web page to olken@lbl.gov. Please put "XMDR Content Survey - [specific metadata resource name]" in the subject line, e.g., "XMDR Content Survey - UMLS". I will collect and post the individual web pages on our XMDR web site.

If an attribute is unknown, simply write "Unknown". If an attribute is not applicable, write "Not Applicable".

Comments, proposed revisions, etc. should also be emailed to Frank Olken at olken@lbl.gov.

Version 6 reflects changes proposed by Gail Hodge.

We anticipate that surveyors will initially complete section II (Core Content Characterization). Sections III and IV will be completed later for those resources deemed appropriate for inclusion in the prototype registry.

II. Core Content Characterization

The following attributes of the content characterization will be supplied by XMDR staff.

Attributes marked with an asterisk (*) are considered mandatory.

Core Content Characterization
Attribute Replace comment with value
Title*

The title (name) of the metadata resource. Taken from Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Acronym

The acronym (if any) of the title (name) of the metadata resource.

** One of the following (Web page(s), Identifier, or Contact Information) is mandatory.

Web page(s)**

URL or URI for a web page which describes and/or provides access to the metadata resource.

Identifier**

Unique identifier for the metadata resource, e.g., URN, URI, URL, ISBN, ISSN, DOI, etc.

Contact Information**

Name of contact person, mailing address, email address, phone number, etc. for this metadata resource.

Inclusion Rationale*

Why should we include this metadata content in the XMDR prototype? Which sponsoring organizations of XMDR are likely to be interested?

Subject*

Indicate the application domain(s) of the metadata resource, chosen from the list below. Taken from Dublin Core Metadata Element Set, Version 1.1: Reference Description.

  • agricultural
  • environmental
  • biological, biomedical, medical or healthcare
  • chemical
  • demographic
  • defense or military science
  • economic
  • geographic information systems (GIS) (indicate 2D, 2.5D, 3D)
  • oceanographic
  • weather or climate
  • energy related
  • space sciences
  • administrative, including organizational and personal names
  • legal
  • general (e.g., Library of Congress subject headings)
  • other (not to be used in place of "general")

Kind of Metadata*

Indicate the type(s) of metadata included in the data set. Cf. the "type" attribute from the Dublin Core Metadata Element Set, Version 1.1: Reference Description.

  • data element characterization
    • definitions - natural language, logic-based
    • types
    • dimensionality / measurement units
  • Term Lists
    • Authority Files
    • Glossaries
    • Gazetteers
    • Dictionaries
    • code sets - e.g., country codes, CAS numbers, airport codes
  • Classification and Categorization
    • Subject Headings
    • Classification/Categorization Schemes and Taxonomies
  • Relationship Groups
    • Thesauri
    • Semantic Networks
    • partonomies (part-of relationships) - geographic, organizational, anatomic, manufactured, ...
    • Ontologies
    • schemas - e.g., for databases, messages, file formats, etc.
    • matchings, mappings - across terminologies, schemas, ...

Size statistics (estimated)*

How big is the data set: number of terms or concepts (nodes), number of relationships (edges), number of constraints, size in bytes (in various formats/compressions), size in bytes of internal representations?

Initial Submitter*

Name of person who filled out this survey for this metadata resource in the initial phase. Also include email address.

Date of Initial Survey*

Date that the initial phase of the survey was completed or updated. Use ISO 8601 date format, i.e., yyyy-mm-dd.
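
The Section II rules (attributes marked with an asterisk are mandatory, at least one of Web page(s), Identifier, or Contact Information must be given, and the survey date is in yyyy-mm-dd form) could be checked mechanically once a completed survey has been transcribed into key/value form. The following Python sketch shows one possible way to do this. The dictionary layout, the function names, and the example values are assumptions made for illustration and are not part of the XMDR template. The sketch also assumes that "Unknown" and "Not Applicable" count as filled-in values for asterisked attributes, but not for the "one of" rule.

```python
import re
from datetime import date

# Attribute names follow Section II; the dict-based record is an assumption.
MANDATORY = [
    "Title", "Inclusion Rationale", "Subject", "Kind of Metadata",
    "Size statistics (estimated)", "Initial Submitter", "Date of Initial Survey",
]
ONE_OF = ["Web page(s)", "Identifier", "Contact Information"]
PLACEHOLDERS = {"Unknown", "Not Applicable"}
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_filled(record, name):
    """True if the attribute has any non-blank text (placeholders count)."""
    return bool(str(record.get(name, "")).strip())


def check_core_survey(record):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing mandatory attribute: {n}" for n in MANDATORY
                if not is_filled(record, n)]
    # At least one of the three ** attributes must carry a real value.
    if not any(is_filled(record, n) and str(record[n]).strip() not in PLACEHOLDERS
               for n in ONE_OF):
        problems.append("need one of: " + ", ".join(ONE_OF))
    value = str(record.get("Date of Initial Survey", "")).strip()
    if value and value not in PLACEHOLDERS:
        try:
            if not ISO_DATE.fullmatch(value):
                raise ValueError(value)
            date.fromisoformat(value)  # rejects impossible dates such as 2004-02-30
        except ValueError:
            problems.append(f"date is not yyyy-mm-dd: {value}")
    return problems


# Hypothetical survey record, for illustration only.
example = {
    "Title": "Example Thesaurus",
    "Web page(s)": "http://example.org/thesaurus",
    "Inclusion Rationale": "Unknown",
    "Subject": "general",
    "Kind of Metadata": "Thesauri",
    "Size statistics (estimated)": "Unknown",
    "Initial Submitter": "A. Surveyor <surveyor@example.org>",
    "Date of Initial Survey": "2004-12-10",
}
print(check_core_survey(example))  # prints []
```

Returning a list of problems rather than raising on the first one lets a surveyor see everything that still needs attention in a single pass.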

III. Supplementary Content Characterization

The following attributes of the content characterization will be completed by XMDR staff. These attributes will be collected for those content collections which are considered to be high priority for inclusion in the prototype.

Attributes marked with an asterisk (*) are considered mandatory.

Supplementary Content Characterization
Attribute Replace comment with value
Date*

Publication date of most recent version of the metadata resource in ISO 8601 Date Format: YYYY-MM-DD. Taken from Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Creator

Organization or person(s) responsible for the creation (authoring) of the metadata resource. Taken from Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Publisher*

Organization or person(s) responsible for publishing / distributing the metadata resource. Note that we do not differentiate here between publishers and distributors. Taken from Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Description*

Additional textual description of the metadata resource.

Language(s)*

Language(s) of the content of the metadata resource. Is the metadata resource multi-lingual (e.g., Thesauri)? Taken from the Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Graph-theoretic Classification*

Indicate type of metadata according to the following graph-theoretic classification scheme:

  • tree (e.g., Dewey Decimal Classification) - in a directed graph, every node, except the root (which has no parents), has exactly one parent node
  • faceted classification (e.g., LOINC, ... )
  • directed acyclic graph (DAG) (e.g., instance-of relations) - also known as an acyclic digraph
  • partial order (e.g., is-a and part-of relations) - partial order = DAG + transitivity
  • lattice (temporal intervals, sets)
  • general simple directed graph (e.g., UMLS) - may contain cycles
  • nested graph - nodes may contain subgraphs, containment relation forms a tree, edges do not penetrate containers
  • directed hypergraph - edges connect sets of nodes
  • compound graph - "meta-nodes" contain subgraphs, which may overlap, containment relation forms DAG, edges may penetrate containers
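
For resources whose relationships are available as a simple list of (parent, child) pairs, the first few categories in the list above can be told apart automatically. The Python sketch below is a minimal illustration, assuming a simple directed graph with no duplicate edges; the function name `classify_digraph` is hypothetical. It covers only "tree", "DAG", and "general directed graph". The other categories (faceted, lattice, nested, hypergraph, compound) need more structure than an edge list provides.

```python
from collections import defaultdict, deque


def classify_digraph(edges):
    """Classify a directed graph given as (parent, child) pairs.

    Returns "tree", "DAG", or "general directed graph".
    """
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    # Kahn's algorithm: repeatedly remove nodes with no remaining parents.
    remaining = {n: indegree[n] for n in nodes}
    queue = deque(n for n in nodes if remaining[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if removed < len(nodes):
        return "general directed graph"  # some nodes lie on or behind a cycle

    # Acyclic: a tree has one root and exactly one parent for every other node.
    roots = [n for n in nodes if indegree[n] == 0]
    if len(roots) == 1 and all(indegree[n] == 1 for n in nodes if n != roots[0]):
        return "tree"
    return "DAG"


print(classify_digraph([("root", "a"), ("root", "b"), ("a", "c")]))          # tree
print(classify_digraph([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]))    # DAG
print(classify_digraph([("a", "b"), ("b", "a")]))                            # general directed graph
```
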
Format / Schema(s)*

In what file formats (ASN.1, XML, RDF, KIF, HDF5, netCDF, ...) is the metadata resource available? What schemas are used (e.g., for XML, RDF, ...)?

Media / Download*

Is the metadata resource available for download? What are the principal / mirror download sites? What media types (CDROM, DVD, ...) are available (if any)?

Constraint Specifications

Does the metadata resource include constraints? What kinds of constraints (keys, foreign keys, ...)? How are constraints specified (SQL, logic, RuleML, SWRL, Object Constraint Language, ...)?

Protocol(s)

What protocols does the metadata publisher support for download or other access (e.g., FTP, HTTP, REST, SOAP, UDDI, LDAP)?

Licensing Issues*

Is the resource open source, public domain, licensed for academic use, under a proprietary license, ...? Is a license agreement required? What is the cost of the license? Can content be redistributed or posted to the web?

Export restrictions

Are there restrictions on export / distribution?

Subsets

Is there some subset of the metadata resource which would be of interest? Can we request / extract / query this subset from the originating site, or will we have to obtain the complete metadata resource content and then perform the subset extraction query processing on our system?

Versions, Updates

What is the current version number of the metadata resource? How often is the metadata resource updated? How are updates named / distributed / propagated?

Documentation

What documentation is available? Where and how can it be obtained? In what format is it provided?

Character Set Encoding*

Is the data set encoded in ASCII, Unicode (and if so what character encoding UTF-8, UTF-16, ...), or other?
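
When the publisher does not state the encoding, a rough answer can often be obtained by inspecting the bytes of a downloaded file. The Python sketch below is a heuristic only, and the function name `guess_encoding` is hypothetical. It recognizes byte order marks, pure ASCII, and valid UTF-8; it cannot identify UTF-16 without a byte order mark, nor tell legacy 8-bit encodings (e.g., ISO 8859-1) apart, so those are reported as "other/unknown".

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Heuristically label raw bytes for the Character Set Encoding attribute."""
    # Check UTF-32 first: its little-endian BOM begins with the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "UTF-32 (BOM present)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 (BOM present)"
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "other/unknown"


print(guess_encoding(b"country codes"))            # ASCII
print(guess_encoding("café".encode("utf-8")))      # UTF-8
print(guess_encoding("café".encode("latin-1")))    # other/unknown
```

Every ASCII file is also valid UTF-8, so the ASCII test runs first in order to report the narrower answer.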

Measurement units

What system of measurement units (e.g., SI, cgs, US customary, ...) is used (if any)? How are they encoded?

Dataset / Standards Dependencies

Indicate any dependencies of this metadata resource on other metadata resources or standards, e.g., country codes, terminologies, chemical or biological nomenclature standards, etc.

Related Datasets

Other similar or related metadata resources.

Software tools

What software tools are available to parse, load, convert, browse, edit, ... this metadata resource (or this type of resource)?

Audience(s)

Who is the primary intended audience for this data set? Expert researchers, DBAs, scientific users, agency staffers, librarians, statisticians, teachers, general public, college students, high school students, ...

Citation

How should the metadata resource be cited in publications, etc.? Note that the preferred bibliographic citation is often a publication rather than the web site for a resource.

Surveyor*

Person who filled out this section of the survey for this metadata resource. Also include contact information.

Date of Survey*

Date this survey was completed / updated for this metadata resource.

IV. Content Characterization by XMDR Staff

The following data elements are to be supplied by XMDR project staff/collaborators.

Attributes marked with an asterisk (*) are considered mandatory.

Content Characterization by XMDR Staff
Attribute Replace comment with value
MDR Participant Expertise

Names of persons (if any) on XMDR project (and contact info) who are familiar with this data set. XMDR participant organizations who have copies of this metadata resource.

MDR Evaluator*

Names of persons on XMDR project (and contact info) who evaluated this metadata resource for inclusion (e.g., if the content survey was completed by someone at the content repository).

Inclusion Priority*

Priority suggested for acquisition and ingestion of this metadata resource.


Maintained by Frank Olken at Lawrence Berkeley National Laboratory (olken@lbl.gov). Last updated: 2004-12-10, Friday, 11:05 AM PST
