Automated approach to data mapping and integration is need of the hour
Data mapping is an important step in the data integration process and critical
to the success of a data warehouse
project. Today automated approach to data mapping and integration is more
important than even before as developers and architects seek highly effective
BI solutions to gain quick granular insights into customer’s needs. ETL is one
part of the processes that successful BI effort requires. Thanks to increased
focus on data governance, the data
quality is no longer a nice to have, in fact it’s a must have.
The success
of a data integration project to a large extent depends on the success of data
mapping and data discovery efforts. Again the success of data mapping phase depends
on employing the right tools and approach. Currently there are tools for ETL,
QA, and Profiling but there are not enough tools to automate the data mapping
process, a critical step in the data migration process. This data
mapping process is largely carried out using excel based approach. SQL query
tools or other home grown tools are used to analyze data, feed data into excel
sheets, create requirements and business rules.
Manual work and fraught with errors
Before an ETL developer can actually start working
on the ETL design process they need to access the data, they use in the
process. This data is essential to effectively plan the ETL migration. Generally spread sheets or other such documents are
extensively used to accomplish the task. But this predominantly is a manual
work and fraught with errors. Though spreadsheets are widely used during data
migration tasks but they are insufficient when it comes to handling a huge data
mapping task.
Where the
integration efforts are minimal, it may not be a problem. But imagine organizations that have gone
through a merger or acquisition, needless to say they are going to have large
chunks of data at disposal, so the conventional way of mapping data through
excel or spreadsheet is not recommended. A big ETL migration task generally
requires larger manual teams at work to handle the mammoth task. The biggest
drawback with this excel based approach is that there are no set standards ways
as far as business rules are concerned. It is a fact that business rules are
subject to change and this creates a problem. The bigger the data integration
project, the more complex the manual mapping process becomes.
Why excel based mapping is not a
recommended approach
Conventional
approach involves extensive manual intervention throughout the task which
increases the risk of errors including duplication, wrong entry, and incomplete
data to name a few. Imagine multiple teams involved in mapping data, it can
lead to more error ridden mappings because of data entry errors and overwrites.
Excel sheet or any spreadsheet is generally designed as a reporting tool and
mainly used for analytical purpose. An excel sheet is not designed for that
purpose. Further frequent changes or tweaks can make it difficult to track the
original source baseline. Remember faulty data mapping can be
counterproductive.
Conclusion
The need of
the hour is an automated approach to data mapping and integration that can help
centrally manage and track the mappings via the workflow and change process,
minimizing risk and prevent errors and helping developers gain central access
to coding requirements.