MDM – Master Data Management
Some types of data are shared across systems. One of the best examples of data that is shared across systems is customers. They can appear in a lot of different systems, perhaps a CRM system and a sales system. This means there is a risk of conflicting information. If the CRM system has one customer address, and the sales system another, which one is the correct?
In master data management discussions the phrase “single version of the truth” is often used. What it means is that there must be one reliable source for information regarding, for instance, customers.
Not all types of information are suitable for a MDM solution. Borrowing terminology from data warehouse concepts, dimensions can be managed in a MDM solution. Facts are not suitable in a MDM solution and are better managed by the traditional OLTP systems.
A MDM solution is also a good place to centralize data quality improving activities.
A MDM solution has to be architected so that some systems publish information to the MDM solution, and other systems subscribe to the information. Some systems can be both publisher and subscriber.
Note that normally in a MDS solution, each source system has its own copy of the data. There are solutions where the master data is removed from all source systems and they instead have to look up that information from a centralized master data hub. However, those solutions require rather large changes in the source systems that may not be practical.
In this blog post I’ll briefly describe Microsoft’s solution for MDM – SQL Server 2012 Master Data Services, or MDS. I’ll explain the central concepts in MDS, but I won’t show the steps to actually use MDS. That may be another blog post.
MDS – Master Data Services
MDS was first released in SQL Server 2008 R2. There were rather large changes in the SQL Server 2012 release, so this blog post will cover the 2012 version only.
MDS is a framework assisting in building the different parts of a MDM solution from a central web site portal. From there most of the administration regarding the MDS solution can be handled. There is also an interface to manage the members (the “rows”) in the models.
A model is a container for grouping related entities. A model has one or more master entities. So a “Product” model would probably have a “Product” entity as well.
Entities are similar to database tables. An entity has several attributes.
An attribute is similar to a column in a database table. Attributes describe the entity. There are two mandatory attributes in all entities, code and name. Code is used as a key and the name gives meaning to the attribute. Attributes can be one of three different types:
Free-form – these attributes are only limited by their data type.
Domain-based – similar to a relation in a database or a drop down of possible values. The possible values for these attributes are coming from another entity. A common example is product that has a domain-based attribute product group.
File – this type of attribute stores a file, for instance a cv or a picture.
Sometimes called instances. Members are similar to a row in a database table. This is the actual data, for instance a product. There are two types of members:
Leaf members – the “standard” type of member and is the actual “row” that contains the data.
Consolidated members – used in collections and hierarchies to group other members.
Hierarchies are used to store and validate relations between data. There are two types of hierarchies in MDS:
Derived – A derived hierarchy is derived from the existing relationships between members. Built from the actual data in domain based attributes. More on derived hierarchies can be found here.
Explicit – An explicit hierarchy is created and maintained manually within an entity. Uses consolidated members to group other members. More on explicit hierarchies can be found here.
A collection is a grouping of other members. The grouping is “flat”, meaning that no hierarchical structures can be built. More on collections can be found here.
Business Rules are used to make sure the data members are accurate and complete. Business rules are defined using IF…THEN structures in the business rule editor included in MDS.
For each entity that is created in MDS, a physical staging table is created in the MDS database. This table can be used as a data flow destination to load members in an SSIS package. The type of import is controlled with the Import Types, that covers all scenarios (insert, update, delete, purge, merge).
Systems that need the information stored in the MDS database can subscribe to the information via subscription views. Subscription views are created from within MDS.
There is an Excel plugin available that lets data stewards and other users to manage the members of entities in MDS. New members can be created from within the Excel plugin.