Metadata and Its Importance in a Data Driven World
Last Updated February 6, 2015
In most information technology usages, the prefix of meta conveys “an underlying definition or description.” So it is that, at its most basic, metadata is data about data.
More precisely, however, metadata describes data containing specific information like type, length, textual description and other characteristics.
Considering the scope of data to which it applies – from document files, images and videos to spreadsheets and webpages, along with the explosion in data quantity, it’s not surprising that understanding and managing metadata effectively has become an IT business priority. Understanding the structure, limitations, definition and description of data protects against misinterpretation or misuse and helps ensure database integrity.
Types of Metadata
Metadata falls into three main categories:
- Descriptive, or used for discovery and identification and including such information as title, author, abstract and keywords.
- Structural, which shows how information is put together – page order to chapters, for example.
- Administrative, which enables better resource management by showing such information as when and how the resource was created. Two types of administrative metadata are those that deal with intellectual property rights and preservation metadata, used to archive and preserve a resource.
The Purposes Metadata Serves
Metadata serves a variety of purposes, with resource discovery one of the most common. Here, it can be compared to effective cataloging, which includes identifying resources, defining them by criteria, bringing similar resources together and distinguishing among those that are dissimilar.
It also is an effective means of organizing electronic resources, which is an important use given the growth in Web-based resources. Typically, links to resources have been organized as lists and built as static webpages, with the names and resources hardcoded in HTML. A more efficient practice, however, is to use metadata to build these pages. For Web purposes, the information can be extracted and reformatted through use of software tools.
Another use of metadata is as a means of facilitating interoperability and integrating resources. Using metadata to describe resources enables its understanding by humans as well as machines. This permits the most effective levels of interoperability, or how data is exchanged among many systems with disparate operating platforms, data structures and interfaces. In turn, it facilitates resource searches across the network.
Metadata also facilitates digital identification via standard numbers that uniquely identify the resource the metadata defines. Along these lines, another practice is to combine metadata so that it acts as a set of identifying data that differentiate objects or resources, supporting validation needs.
Finally, metadata is an important way to protect resources and their future accessibility. It’s a critical concern given the fragility of digital information and its susceptibility to corruption or alteration. For archiving and preservation purposes, it takes metadata elements that track the object’s lineage, and describe its physical characteristics and behavior so it can be replicated on technologies in the future.
How Metadata Pays Off
Investing in metadata development can create benefits in three key areas:
- It can extend data longevity. The life-span of a typical data set can be very short, often because missing or unavailable relevant metadata renders it useless. When comprehensive metadata is developed and maintained, it counters typical data entropy and degradation.
- It also facilitates data reuse and sharing. Metadata is key to ensuring that data which is highly detailed or complicated is more easily interpreted, analyzed and processed by the data’s originator and others.
- Metadata is essential for maintaining historical records of long-term data sets, making up for inconsistencies that can occur in documenting data, personnel and methods. Comprehensive metadata can also enable data sets designed for a single purpose to be reused for other purposes and over the longer term.
Developing and maintaining metadata can be an expensive proposition. There are costs associated with editing and publishing data and metadata. Their long-term stewardship and maintenance can also be burdensome. Yet, metadata is an investment that may not be optional in an era when information is critical to the life force of an organization.