ETL and Data Migration Testing Methodology

Testing of ETL (Extract, Transform and Load) process is not same as traditional software testing. It’s a backend testing and there are various methods of testing ETL process. As this is mainly a backend testing, testers use mainly SQL (Structured Query Language) to test ETL.

We can split the whole process into 3 parts:

  1. Testing of Pre-Migration Stage
  2. Testing of Migration phase and staging tables
  3. Testing of Post-migration phase
  4. Testing of Extract files for Downstream systems

etl

 

Testing of Pre-Migration Stage:

To verify the pre-migration stage area, some steps need to be followed. These are common for all migration projects and depend on project scope. Follow the steps below to understand the whole pre-migration stage verification process.

Source Tables/Files format: In most cases source is old systems like mainframe and developers extracts flat files from the source systems and load them in the staging area by ETL process. Source can be flat files as well as database tables depend on project. These sources need to be validated. Generally testers check source file data length (if source is flat file) or Table structure (if source is database).

Pre-stage or landing table validation: To ensure the data load is done properly do the below testing in pre-stage or landing area.

  1. Table structure check.
  2. Data count check (Source data count should be same as target data count)
  3. Rejected record check (verify as per reject logic. If reject logic is present then the above data count check step should be source count = target count + reject count)
  4. Primary key check.
  5. Generally no transformation logic is implemented in this step. If any transformation logic is implemented, then it should be tested.

Additional checks: If pre-stage or landing area load is incremental load then the resume mechanism needs to be tested.

Testing of Migration phase and staging tables:

The most important area of testing an ETL project is testing of the data migration from pre-staging or landing tables to staging tables. All the complex transformation logics are implemented here.

Testers test this data migration by writing complex SQL queries. The following testing processes are done in this step.

  1. Staging table or target table structure check. Verify data type of each column of each table.
  2. Record count of source should be equal to record count of target, if there is no rejection.
  3. Unique key column check.
  4. Verification of referential integrity.
  5. Checking of Migration scripts.
  6. Data transformation rules.
  7. Data cleansing rules.
  8. Redundant or conflicting methodologies.

Testing of Post-migration phase:

Verification of post migration includes the testing target area of migration phase. Testing that needs to be done in this step should include:

  1. Verification of Data Transformation rules.
  2. Duplicate Record Check.
  3. Record count between source & target.
  4. Referential Integrity (primary key/foreign key validation).
  5. Reject Verification.
  6. Default value check.
  7. History Table.
  8. Lookup tables.
  9. Unique Key of each record.
  10. Table/View Structure.
  11. Sample records. (not needed if full record set can be tested)

Testing of Extract files for Downstream systems:

In this step testers need to test the files which are generated to feed the downstream systems. As this process also includes an ETL process, all the possible ETL verifications need to be done here.

  1. Data Transformation
  2. Duplicate Record.
  3. Count Verification.
  4. Reject Verification.
  5. Default value check.
  6. Table/View Structure.
  7. Check Header & Trailer files (extract generated in flat file or other format).

Arijit Naskar

Arijit Naskar is a Software Developer and QA, also founder of alltechstories.com a popular tech tutorial & how-to website since 2014. He loves to explore new technologies and writing blogs.

You may also like...