
The Metadata Mystique
By Adrienne Tannenbaum

Abstract
"Metadata" is not a globally understood term. There is no doubt that everyone views yesterday’s metadata as insufficient. In fact, to a large extent, yesterday’s metadata is often perceived to be non-existent in that it rarely meets the needs of the current data warehouse world. But recreating it today often adds to the problem.

More important, there appear to be no standard tools or methodologies for dealing with metadata, so its definition and treatment are often left to the interpretation of individual warehouse professionals. Because these professionals are usually primarily responsible for the data warehouse itself, metadata is often an "also-ran" in terms of emphasis. Those "silver bullet" repository tools never seem to handle metadata the way we think they should, and by the time the team is ready to acknowledge metadata’s importance, it has become much easier to resort to a repository tool’s interpretation of the way metadata should be treated. Hence, we continue our circular metadata mystique.

It is time for a metadata clarification. This article will address metadata in its practical entirety: what it is and how its importance begins. Once it exists, it never goes away, and because the data warehouse's effectiveness is directly fueled by the metadata connection, metadata is a crucial component of the entire data warehouse development lifecycle. Data warehouses cannot survive without accurate metadata, and accurate metadata is useless if it is not easily accessible.

Metadata Defined

Definitions of IT terms typically start with theory but evolve to represent deployment. In the case of metadata, the theoretical definition always discusses data about data, or any vague representation of a documented aspect of data. To the novice, metadata is understood to consist of the standard ‘data dictionary’ entries, usually managed and controlled by a Data Administration group, which over time has become a Data Management organization.

These ‘data dictionary’ entries always included data element names, definitions, physical attributes such as length, data type, ranges of allowed values (often called domains), associated file/database/program names, and/or the names and contact information of responsible employees. Generally speaking, documentation that originated in a data dictionary tended to be physical in nature and was authored by development personnel.
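
As a rough illustration of the kind of entry described above, the following Python sketch models a single data dictionary record. The class name, field names, and the sample values (CUST_STATUS, CUSTMAST, the status codes) are all hypothetical and chosen only for illustration; they do not come from any particular dictionary product.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataDictionaryEntry:
    """One hypothetical data dictionary record (all field names are assumptions)."""
    name: str                              # data element name
    definition: str                        # business or technical definition
    length: int                            # physical attribute: maximum length
    data_type: str                         # physical attribute: e.g. "CHAR"
    allowed_values: Tuple[str, ...] = ()   # the element's domain, if restricted
    sources: Tuple[str, ...] = ()          # associated file/database/program names
    contact: Optional[str] = None          # responsible employee or group

    def accepts(self, value: str) -> bool:
        """Check a value against the recorded length and domain."""
        if len(value) > self.length:
            return False
        # An empty domain means no restriction on the allowed values.
        return not self.allowed_values or value in self.allowed_values


# Illustrative entry; every value here is invented for the example.
entry = DataDictionaryEntry(
    name="CUST_STATUS",
    definition="Current status of a customer account",
    length=1,
    data_type="CHAR",
    allowed_values=("A", "I", "C"),
    sources=("CUSTMAST",),
    contact="Data Administration",
)
```

Note how nearly every field is physical in nature, which matches the observation that such entries were usually authored by development personnel.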

The definition of metadata expanded in scope and became a more widely accepted and expected set of information when data responsibility became multi-faceted and applications began servicing multiple functional organizations. Hence the definition began including aspects of data that go well beyond its physical characteristics. Organizations began developing metamodels: logical data models that depict the interrelationships of all pertinent metadata. These illustrations formed the basis of an extended metadata definition, still in many cases theoretical, as follows:

The detailed description of data instances. Depending on the types of data populated, metadata can range from simple database field names, lengths, and characteristics, to the underlying tool constructs (http://www.dbdsolutions.com/DBDS). Simply stated, (metadata) is the definition, format, and characteristics of populated data.

Metadata Deployed

Based on the wide variation of 'meta-understanding', it is only reasonable that deployed metadata scenarios be quite different in terms of origins, scopes, objectives, and technical implementations. Equally confusing is the variation in vendor options available in today's data warehouse marketplace to assist us with our metadata requirements.

So when we look at metadata based on the way it is actually deployed, the definition becomes quite different. Consider that metadata components can originate during one or more of the following time-based data warehouse eras:

  • Yesterday, the world of legacy applications which fuels the majority of our data warehouse efforts.
  • Today, during data warehouse development, when we create and/or re-define metadata to represent a data warehouse perspective.
  • Tomorrow, when our metadata changes based on its interpretations by non-data warehouse developers and/or the tools and applications which need to access and represent it.

Metadata and Data Warehouse Development

The most ironic characteristic of failed data warehouse metadata is the fact that virtually all metadata is an obvious byproduct of each data warehouse development phase. In many situations the outputs from one phase function as inputs to the next and so on. Unfortunately, most data warehouse teams do not consider metadata requirements from a deployment point of view until the latter phases of implementation.

Bridging Today's Metadata Gaps

How does one interchange metadata amongst today's tools and remain sane? Simply speaking, metadata requirements must be clearly identified during the planning aspects of any data warehouse effort. A great beginning involves the modeling of metadata following a methodology which is very similar to that of modeling data:

  • Identify metadata requirements
  • Organize metadata requirements by beneficiary type
  • Categorize metadata based on where it will be needed:
    • Common Metadata
    • Specific Metadata (one beneficiary category only)
    • Unique Metadata (one beneficiary alone!)
  • Create the 'metamodel'
  • Relate metamodel instances to the metamodels of your planned data warehouse architecture
  • Develop a 'metadata' flow by identifying the sources for each implemented metadata instance
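
The categorization step in the list above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the beneficiary names, the beneficiary categories, and the sample requirements are hypothetical, and "Common" is interpreted here as metadata needed by more than one beneficiary category, which the list itself does not spell out.

```python
from collections import defaultdict

# Hypothetical beneficiaries mapped to hypothetical beneficiary categories.
BENEFICIARY_CATEGORY = {
    "etl_developer": "developer",
    "dba": "developer",
    "sales_analyst": "business_user",
    "finance_analyst": "business_user",
}


def categorize(beneficiaries):
    """Classify one metadata requirement by who needs it."""
    if not beneficiaries:
        raise ValueError("a metadata requirement needs at least one beneficiary")
    if len(beneficiaries) == 1:
        return "unique"        # one beneficiary alone
    categories = {BENEFICIARY_CATEGORY[b] for b in beneficiaries}
    if len(categories) == 1:
        return "specific"      # one beneficiary category only
    return "common"            # assumption: spans several beneficiary categories


# Illustrative requirements, each with the set of beneficiaries that need it.
requirements = {
    "element definition": {"etl_developer", "dba", "sales_analyst", "finance_analyst"},
    "source file name": {"etl_developer", "dba"},
    "fiscal calendar rule": {"finance_analyst"},
}

# Group the requirement names under common / specific / unique.
by_category = defaultdict(list)
for name, who in requirements.items():
    by_category[categorize(who)].append(name)
```

Grouping requirements this way before building the metamodel makes it easier to see which metadata must be shared across tools and which can stay local to a single beneficiary.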

As each step progresses, more metadata details will unfold, and the issues identified earlier will surface. However, by clearly understanding and relating the requirements to what can be implemented, usable, accessible metadata is more likely to result.

For more information on "The Metadata Mystique" please refer to The Journal of Data Warehousing, Volume 3 Number 4 Winter 1998.

Copyright 1999-2006 Database Design Solutions, Inc. - All Rights Reserved