From IT vendors trying to sell you service-oriented architecture (SOA) or your own private cloud, to organizations trying to merge disparate IT systems resulting from a merger or acquisition, the battle to break down information silos rages all around us. Organizations are trying to manage more data than ever before, and the number of data sources, both internal and external, keeps growing.
Organizations have spent millions of dollars over the span of several years creating their existing silos, and it is unrealistic to expect them to just throw this investment away. Many data silos have been created for specific purposes, optimized for specific departments or lines of business. In addition, public policy or government regulations have driven the creation of specific data silos and govern how that information is utilized. It is very easy to look back and say how the system should have been architected, but you have to deal with what the system looks like today, warts (data silos) and all.
So that leaves organizations trying to figure out the best way to break down the vertical data silos and harmonize their data horizontally into a single unified view of the enterprise. In a previous post, I discussed the concept of data discovery vs. service discovery and the role of a metadata catalog as part of this process. The obvious key to the metadata catalog is the metadata (data about the data).
Metadata is key to navigating the data silos within your organization. Metadata can help organizations understand what data exists so that they can better share and reuse data to minimize redundancy. Organizations that have an effective metadata strategy are seeing:
- reduced software development costs
- faster time to market of new capability
- greater data fidelity
- reduced redundancy
Because metadata is key to being able to navigate your data silos, it is important to capture the right metadata to ensure the data is discoverable. The key to good metadata design is capturing the metadata fields that increase the value of the data to the consumer. So how do you know what fields will provide the most value? One approach is to reverse engineer the problem by asking questions like “How do users get value out of the data?” and then working your way backward from there. I have found that most good metadata formats contain some aspects of the following attributes:
- Creation Information (Title, Author, Contributors, Creation Date, etc.)
- Descriptive Information (Language, Geospatial Coverage, Temporal Coverage, General Description, etc.)
- Link to Source Data
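To make the three attribute groups above concrete, here is a minimal sketch of what a catalog record could look like in Python. The class name, field names, example URLs, and the `search` helper are all hypothetical, chosen only for illustration; a real catalog would follow whatever metadata format your organization adopts.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical catalog entry covering the three attribute groups."""
    # Creation information
    title: str
    author: str
    creation_date: date
    # Link to source data (example.com URLs below are placeholders)
    source_url: str
    contributors: List[str] = field(default_factory=list)
    # Descriptive information
    language: Optional[str] = None
    geospatial_coverage: Optional[str] = None
    temporal_coverage: Optional[str] = None
    description: str = ""


def search(catalog, keyword):
    """Return records whose title or description mentions the keyword."""
    needle = keyword.lower()
    return [
        record for record in catalog
        if needle in record.title.lower() or needle in record.description.lower()
    ]


# Two records describing data that lives in different silos.
catalog = [
    MetadataRecord(
        title="Quarterly Sales by Region",
        author="Finance Department",
        creation_date=date(2012, 1, 15),
        source_url="https://example.com/data/sales-q4",
        language="en",
        description="Regional sales totals from the finance silo.",
    ),
    MetadataRecord(
        title="Customer Support Tickets",
        author="Support Team",
        creation_date=date(2012, 2, 1),
        source_url="https://example.com/data/tickets",
        language="en",
        description="Ticket history exported from the help desk system.",
    ),
]

matches = search(catalog, "sales")
for record in matches:
    # The record does not hold the data itself, only where to find it.
    print(record.title, "->", record.source_url)
```

The point of the sketch is that the consumer searches the descriptive fields and then follows the link to the source data, so the data itself never has to leave its silo.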
If you are struggling with where to start with metadata for your organization, you can check out the Dublin Core Metadata Initiative. If you are doing metadata work for the Department of Defense, then you can also check out the Department of Defense Discovery Metadata Specification (DDMS).
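As a rough illustration of how the attributes listed earlier line up with Dublin Core, the sketch below maps a plain dictionary onto element names from the 15-element Dublin Core Metadata Element Set (title, creator, contributor, date, language, coverage, description, identifier). The input field names, the sample values, and the choice of which element each field maps to are my own assumptions, not part of the standard.

```python
# Left side: hypothetical in-house field names.
# Right side: element names from the Dublin Core Metadata Element Set.
DUBLIN_CORE_MAP = {
    "title": "title",
    "author": "creator",
    "contributors": "contributor",
    "creation_date": "date",
    "language": "language",
    "geospatial_coverage": "coverage",
    "temporal_coverage": "coverage",
    "description": "description",
    "source_url": "identifier",
}


def to_dublin_core(record):
    """Translate an in-house record (a dict) into Dublin Core elements.

    Every Dublin Core element is optional and repeatable, so each element
    maps to a list of string values.
    """
    dc = {}
    for field_name, value in record.items():
        element = DUBLIN_CORE_MAP.get(field_name)
        # Skip fields with no Dublin Core equivalent and empty values.
        if element is None or value in (None, "", []):
            continue
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(str(v) for v in values)
    return dc


record = {
    "title": "Quarterly Sales by Region",
    "author": "Finance Department",
    "contributors": ["Regional Sales Office"],
    "creation_date": "2012-01-15",
    "language": "en",
    "geospatial_coverage": "North America",
    "temporal_coverage": "2011-10/2011-12",
    "description": "",
    "source_url": "https://example.com/data/sales-q4",
    "internal_id": 42,  # no Dublin Core equivalent in this mapping
}

dc_record = to_dublin_core(record)
for element, values in dc_record.items():
    print(element, values)
```

Both coverage fields land in the single `coverage` element, which is one reason a standard vocabulary helps: consumers only need to know one set of names regardless of which silo the record came from.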
Metadata is a key part of the enterprise information ecosystem required to navigate the data silos across the organization. Visibility into where multiple sources of data can be found, and an understanding of how they are related, increase confidence in the information used to make critical business decisions. This in turn enables companies to strategically manage their business in an increasingly competitive environment.