Basics of Master Data Management(MDM) Testing
This article is about Master Data Management testing. It’s not specific to any MDM tool, here I tried to describe in general what we do while testing a MDM implementation project. If you want to know about any specific MDM tool, please leave a comment or drop me a query , I will try my best to answer you.
What is Master Data management?
Master Data Management (MDM) is the technology to create and maintain master data of any organization. MDM tools generates a golden record (Master Data) from the information (Data) gathered from different source systems .Master Data may include data about customers, employees, inventory, suppliers, territory etc.
Why Master Data Management?
Suppose a customer of a mobile phone company moved from address A to address B. After moving the customer applied to change his billing address and got confirmation that the billing address has been changed. But after that the customer did not receive a bill for few months. One day, the customer noticed that his phone connection has been gone. The customer then went to the company office and verifies that they have the new address, and the company billing department verifies that the address on file is address B. The customer asks for a copy of the bill, to pay the amount. After few more weeks without a bill, the customer changed his connection to another company. The mobile operator company has lost a customer forever. However, if the company could track the Addresses and accordingly sent the bill then the situation might not be got so bad. Here comes the importance of using Master Data Management.
Testing a MDM implementation project:
Testing MDM project is not like testing a traditional data warehouse project. In MDM projects we have to test features of MDM tool which is being used and also the normal data migration testing .Here I will describe the normal data migration testing principles and the most common features of MDM. These features may vary in different MDM tools after reading this article you will be able to test basic features of any MDM tool. In MDM testing generally we test the backend data. I will describe the database testing here if you want to know for a specific tool’s UI testing please leave a comment I will try my best to help you.
While testing a MDM implementation we follow below testing approaches:
- Data migration testing
- Data cleansing testing
- Data standardization testing
- Data consolidation testing
Data migration testing:
Inside MDM hub there might one or more layer (depends upon the MDM tool). After data loads from Source system to MDM hub through ETL (Extract Transform and Load) we have to verify that all data have been properly loaded or not. This is the first level of verification.
Pre-requisites to start this verification:
- ETL job run successfully
- Test environment is ready
Process to test this migration:
- If source is flat file we generally do manual verification based on sampling. Sometimes we use Microsoft access database.
- If source is a Database, we develop and run SQL minus queries with database link.
Inside MDM hub if more than one layer is available, then after loading each layer we have to test them to verify data has been loaded properly or no. But in this case Database remains same so we generally run SQL minus queries for testing.
Data Cleansing and Standardization testing:
To create Master Data it’s required to cleanse and standardize data. For example: Suppose from one source system some data is been loaded, which have some special character in first name. MDM tool removes these special characters and loads the fresh and clean data. This process is called data cleansing. To test data cleansing functionality we develop SQL queries with regular expression checking.
Most of the time data loads from source systems are not in standardized form. For example: a spelling mistake in address line one. There are a lot of third party tools available in market by which data standardization can be done. Depending on the client requirement and the tool which is being used to standardize data, testing approach is different. But for all cases SQL based testing is the best approach.
Data consolidation testing:
Data consolidation process is the most important part of MDM. In this process the MDM hub search for same records (which are matched) and merge them. This match and merge process is very complex and varies upon client requirements. In general there are two types of match process: Exact match and Fuzzy match.
- The match rule where the exact values of the columns need to be matched. Any small discrepancies in between is not considered i.e. if any small difference is observed between two values, then theses records are not going to be matched.
- Suitable for columns containing consistent and predictable patterns. Exact match columns match only on identical data. Examples include IDs, postal codes, industry codes, or any other well-defined piece of information
- There are three kind of Exact Match
Null Match, Non-Equal Match, Segment Match
- By, default, NULL is not regarded as being the same as NULL.
- Valid only for Exact Match
- The values only match when both are null.
- This prevents equal values in a column from matching with each other.
- A typical scenario where this is used is to prevent the records coming from a specific source system from matching with each other.
- This allows a match rule to be limited to a specific subset of data.
- Different match rules can use different segment values. So segment matching is per rule basis.
- Probabilistic match happens.
- Suitable for columns containing data that varies in spelling, abbreviations, word sequence, completeness, reliability, and other inconsistencies. Examples include street addresses and names of people or organizations.
Example: The ‘Full Name’ field of a customer record having data as ‘Mr. Ramesh Agrawal’ may result in the fuzzy match with another Customer record having ‘Full Name’ field data as ‘Mr. Ramesh Agarwala’. In this example, the 2 names are not exactly the same but they are similar, so it will trigger a fuzzy match rule.
Merge Process\Consolidation Process:
After match pairs have been identified in the match process, consolidation is the process of consolidating data from matched records into a single, master record.
Consolidation Process can happen in two ways:
- Auto Merge 2) Manual Merge
After running the merge job records gets merge and one gold copy remains in the database.
The records that satisfy the manual match rule are queued for manual merge and a task is created for Data Steward. Data Steward then login to the hub, identify the matched records, verify them and then manually merge them.
Pre-requisites to start Match-Merge verification:
- Match-Merge job run successfully
- Test environment is ready
Process to test this migration:
- Testers first ask developers to run the match job.
- Then by running SQL queries testers identify which records are being matched and are ready to be merged. And also check that the records which are being matched are matched by proper business rule or not.
- Then developer run the merge job
- Now testers test whether all the data, which were matched before are merged successfully or not.