Planning a successful data migration – part one

Careful, considered planning is key to achieving a successful data migration, but it is often neglected. Looking at all the data migration projects we have been engaged in over the past five years, in 63% of cases we were brought in because of a failed implementation, suggesting that roughly two out of three software implementations fail to plan properly for data migration and suffer as a result.

Applying best practice is critical. Our recommended approach to data migration is a simplification of the Johnny Morris model, which is split into four stages: discover, design, test and deliver. In this blog, we’ll take a look at how to get the first two stages right.

DISCOVER

Landscape analysis

Before you make any decisions on your chosen solution or approach, you need to conduct a landscape analysis: identifying, cataloguing and measuring all of the data and data processes in your organisation. This ensures your new solution will accommodate existing data assets and core processes.

Use questionnaires, process maps and IT audits to identify and document all data stores, including any unregistered data stores such as spreadsheets or databases held by departments or external suppliers.

Next, you need to know what data exists in each data store: identify the format, quantity, age, range and general health of your data. Then create a measure of data completeness and quality to determine any gaps, issues or duplication. Finally, pull all of the above together so you can see the relationships and processes that occur between data stores.
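By way of illustration, here’s a minimal sketch of that kind of profiling in Python with pandas. The file and column names are hypothetical stand-ins for one of your legacy exports:

```python
import pandas as pd

# Hypothetical export from one legacy data store
df = pd.read_csv("customers.csv")

# Completeness: proportion of populated values per column
completeness = 1 - df.isna().mean()
print(completeness.sort_values())

# Duplication: rows sharing what should be a unique business key
dupes = df[df.duplicated(subset=["customer_id"], keep=False)]
print(f"{len(dupes)} rows share a customer_id with another row")

# Age and range: min/max of a date column as a quick freshness check
df["created_date"] = pd.to_datetime(df["created_date"], errors="coerce")
print(df["created_date"].agg(["min", "max"]))
```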

Stakeholder engagement

Your next job is to identify all potential stakeholders across your enterprise, including crucial suppliers, partners or even customers that should be consulted. Prioritise the list based on the stakeholder’s significance to the project. This will help you define a communication strategy.

Scoping

Scope out the legacy data and processes that need to be accounted for in your end solution. This should be high level and easily digestible by any member of the project team.

Planning and governance

Identifying, implementing and managing Data Quality Rules (DQRs) helps you to address data accuracy, formatting and completeness issues before you migrate. DQRs are a set of business rules that govern each piece of data. They are defined by you and should be appropriate to the data type, the specifics of your data model and where the data originates. It’s important that these rules are annotated in your migration controller code (more on that later).
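To make that concrete, here’s a minimal sketch of what DQRs might look like in code. The rule IDs, field names and checks are illustrative assumptions, not a prescribed format:

```python
import re

# Each DQR pairs a business rule with a check; failures are logged
# against the rule ID so issues trace back to the annotations in the
# migration controller code.
DQRS = [
    ("DQR-001", "Email must be well formed",
     lambda rec: re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$", rec.get("email") or "")),
    ("DQR-002", "Postcode is mandatory",
     lambda rec: bool(rec.get("postcode"))),
    ("DQR-003", "Account balance must be numeric",
     lambda rec: isinstance(rec.get("balance"), (int, float))),
]

def apply_dqrs(record):
    """Return (rule_id, description) for every rule the record fails."""
    return [(rid, desc) for rid, desc, check in DQRS if not check(record)]

# This sample record fails all three rules
print(apply_dqrs({"email": "jo@example", "postcode": "", "balance": "12.50"}))
```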

Commissioning 

Next, you need to think about how your migration is commissioned. The process from start to finish requires very specific data skills, so if you’re planning to carry out the work in-house, you need to be confident that you have the skills, expertise and resource available to dedicate to it. If not, you could bring in contractors to work alongside your internal team; however, this requires careful management and can be costly. Alternatively, you could outsource the work to a specialist business or to your systems integrator (SI) (although check that they’re not outsourcing this function themselves and marking up the cost).

DESIGN

Once you’ve selected your new technology platform and potential implementation partners, it’s time to turn your attention to design. 

Approach

First you need to decide on your preferred approach. Are you going to take a ‘big bang’ approach, where all the data is extracted from your legacy systems and loaded onto your new platform in one go? This makes your new application available to users straight away, but requires everything to be complete before you press go. Or are you going to take a phased approach, where data and processes are migrated one segment at a time? This requires parallel running of your new application and your legacy ones, and it isn’t always as straightforward as it seems: it’s rare to have distinct business functions with no overlap in legacy data. A cut-over facility can help overcome this issue, but it can get complex.

Mapping

To effectively plot the course of your transformation exercise, you’ll need a map which takes the following into account:

  1. The direction of data travel – showing where a piece of data originates and where you plan to send it.
  2. The transformation that needs to take place.
  3. The order of transformations, including whether some must take place before others can proceed.
  4. The verification rules to reconcile the data.

You should start with ‘right to left’ mapping: begin with the target system and work back to find the source data. Then move on to ‘left to right’ mapping, where data starts in the legacy platform and is pushed via a migration controller to the new system. This requires another technique called gap analysis, which we’ll cover next, but for more guidance and tips on mapping, refer to our whitepaper.
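To make the mapping concrete, it can help to capture it as a reviewable specification covering all four points above before any transformation code is written. A minimal sketch in Python, with hypothetical source and target field names:

```python
# Each entry records direction (source -> target), the transformation,
# the order it must run in, and the rule used to verify/reconcile it.
FIELD_MAP = [
    {
        "source": "legacy_crm.cust_nm",
        "target": "platform.customer.full_name",
        "transform": lambda v: v.strip().title(),
        "order": 1,
        "verify": "row counts match and no null full_name values",
    },
    {
        "source": "legacy_crm.dob",
        "target": "platform.customer.date_of_birth",
        "transform": lambda v: v,  # pass-through; format enforced by a DQR
        "order": 2,  # depends on the customer record created in step 1
        "verify": "distinct date_of_birth count matches source",
    },
]

# Walk the map in dependency order
for entry in sorted(FIELD_MAP, key=lambda e: e["order"]):
    print(f"{entry['source']} -> {entry['target']} ({entry['verify']})")
```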

Gap analysis

The purpose of this is to review all the data that is being left behind by your ‘right to left’ plan. Consider:

  • What data should populate any unpopulated fields in the new platform? The subject matter expert and SI should be able to provide an overview of the data they expect to be in there, and locate it if it already exists. If not, it’s probably a new field that you’ll gather data for once the application has launched. 
  • What legacy data hasn’t been accounted for in the new platform? It’s almost impossible to capture and cater for every requirement, no matter how thorough you’ve been, so make sure you build a phase for these corrections into your project plan. However, it is likely that the gap analysis will also identify lots of redundant legacy data that you no longer need. If this is the case, simply park it in your data warehouse or data lake so you can access it if you ever need it again.

In both the above cases, the gaps can be solved with the DQR programme. Once you’ve gone through that process you should be in a position to start the development work.
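At its simplest, the first pass of a gap analysis can be automated by comparing field inventories on each side of the mapping. A minimal sketch, with illustrative field names:

```python
# Fields present in each system (illustrative names)
legacy_fields = {"cust_nm", "dob", "fax_number", "telex_code"}
target_fields = {"full_name", "date_of_birth", "mobile", "email"}

# Fields already covered by the 'right to left' mapping
mapped_source = {"cust_nm", "dob"}
mapped_target = {"full_name", "date_of_birth"}

# Target fields with no source: candidates for post-launch data capture
unpopulated = target_fields - mapped_target
print("Unpopulated in new platform:", unpopulated)

# Legacy fields not carried over: candidates for the warehouse/lake
left_behind = legacy_fields - mapped_source
print("Legacy data left behind:", left_behind)
```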

Development

The design stage should have given the developers everything they need to produce the transformation code, but they may encounter unforeseen issues or nuances. The DQR programme must be highly responsive during this phase to overcome these.

Next comes the creation of an application called a migration controller. Its first step is to extract all the data from the legacy sources. Consider using a staging database so you keep a local copy of the data; then you only have to grab the latest differential when performing the migration, not the full file. This is less time-consuming and easier to program.
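A minimal sketch of that differential pattern, using SQLite as a stand-in staging database and assuming the legacy source exposes a last-modified timestamp:

```python
import sqlite3
from datetime import datetime, timezone

staging = sqlite3.connect("staging.db")
staging.execute("""CREATE TABLE IF NOT EXISTS customers (
    id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)""")
staging.execute("""CREATE TABLE IF NOT EXISTS watermarks (
    table_name TEXT PRIMARY KEY, last_extracted TEXT)""")

def extract_differential(source_rows):
    """Upsert only rows changed since the last run, then move the watermark."""
    row = staging.execute(
        "SELECT last_extracted FROM watermarks WHERE table_name='customers'"
    ).fetchone()
    watermark = row[0] if row else "1970-01-01T00:00:00"

    # ISO-8601 timestamps compare correctly as strings
    changed = [r for r in source_rows if r["updated_at"] > watermark]
    staging.executemany(
        "INSERT OR REPLACE INTO customers VALUES (:id, :name, :updated_at)",
        changed)
    staging.execute(
        "INSERT OR REPLACE INTO watermarks VALUES ('customers', ?)",
        (datetime.now(timezone.utc).isoformat(),))
    staging.commit()
    return len(changed)
```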

The next step is the development of the transformation code, which should cover your mapping design and include built-in reconciliation checks throughout, raising error notifications for any issues.
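As an illustration of the reconciliation idea (a sketch, not a prescribed design), every extracted row should be accounted for as either loaded or rejected for DQR review; anything else is silent data loss:

```python
class ReconciliationError(Exception):
    """Raised when counts fail to reconcile across a migration step."""

def reconcile(rule_id, source_count, loaded_count, rejected_count):
    """Check that loaded + rejected rows add back up to the source count."""
    if loaded_count + rejected_count != source_count:
        raise ReconciliationError(
            f"{rule_id}: {source_count} extracted, but only "
            f"{loaded_count} loaded and {rejected_count} rejected")

# Example: 10,000 rows extracted, 9,990 loaded, 7 rejected -> 3 missing
try:
    reconcile("MAP-014", 10_000, 9_990, 7)  # hypothetical mapping rule ID
except ReconciliationError as err:
    print(err)
```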

Load

Finally comes the loading of the data, which requires close collaboration with your SI. It’s likely that a number of processes, automated tasks and stored procedures will need to be switched on and off as part of any load, which can be programmed into the migration controller’s load scripts.
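A sketch of that switching pattern; the trigger statements shown are SQL Server style and the table name is hypothetical, so adjust for whatever your target platform and SI require:

```python
from contextlib import contextmanager

@contextmanager
def bulk_load_window(cursor, table):
    """Disable triggers around a bulk load, restoring them even on failure."""
    cursor.execute(f"ALTER TABLE {table} DISABLE TRIGGER ALL")
    try:
        yield
    finally:
        cursor.execute(f"ALTER TABLE {table} ENABLE TRIGGER ALL")

# Usage with a hypothetical DB-API connection:
# with bulk_load_window(conn.cursor(), "dbo.customers"):
#     run_load_scripts(conn)
```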

We hope you found our whistle-stop tour of the discover and design stages helpful. If you’re ready to test and deliver your migration, head to part two of this blog, or for more depth and insight on the steps above, take a look at our whitepaper.

A Guide for Data Migration Success

Did you know that, according to Experian, issues with data migration mean only 46% of new database implementations are delivered on time? Or that an incredible 74% of projects go over budget?