The Three Biggest Challenges in the Data Migration Process
Steve Winkler | August 16, 2016
Every organization eventually faces the problem of moving large amounts of information from older systems and legacy platforms to more cost-effective, modernized solutions. Managing a data migration process can be extremely complex, and it presents a broad (and relatively consistent) set of challenges involving the data itself, the platforms, and planning and execution. These challenges underscore an unfortunate fact: IT solutions vendors sometimes underestimate the difficulty of seeding modernized systems with existing legacy data, leaving government customers with a complex migration problem after they’ve already contracted for the new solution.
1. The size, complexity, and condition of the data
One of the most central challenges is the sheer volume of data that needs to be moved. Storage used to be expensive, but this is no longer the case. Given the relatively low cost of data storage compared to the risks of not keeping information, program managers will usually err on the side of storing more information and keeping it longer. As a result, government agencies typically maintain vast quantities of data in most of the systems they operate. Depending on the agency, that may mean multiple terabytes of data that need to be moved from legacy systems into a new system. Moving that much data during modernization places demands on bandwidth, storage space, personnel, and many other resources.
The complexity of the existing and target information models can also be a factor. Simply obtaining a detailed understanding of the existing legacy data takes time and effort. Even when a government organization has implemented mature data governance procedures and diligently maintained data reference models describing the information in legacy data platforms, the actual understanding of the data itself can remain high-level or incomplete. Reference models can be misleading: they tell us what data was supposed to be there (at one point in time), but not what is really there now. This is why it’s smart to use data reference models as a starting point, and then review the actual data to fill in the gaps.
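As a rough illustration of reviewing actual data against a reference model, here is a minimal Python sketch. Everything in it is hypothetical: the field names, the expected types and lengths, and the idea that records arrive as dictionaries are assumptions made for the example, not a description of any particular agency's model or tooling.

```python
# Hypothetical reference model: field name -> (expected type, max length).
# A real model would come from the agency's own data governance documents.
REFERENCE_MODEL = {
    "case_id": (int, None),
    "last_name": (str, 30),
    "state": (str, 2),
}


def find_model_violations(records, model=REFERENCE_MODEL):
    """Return (row index, field, reason) for each value that departs from the model."""
    violations = []
    for index, record in enumerate(records):
        for field, (expected_type, max_length) in model.items():
            if field not in record or record[field] is None:
                violations.append((index, field, "missing"))
                continue
            value = record[field]
            if not isinstance(value, expected_type):
                violations.append((index, field, "unexpected type"))
            elif max_length is not None and len(value) > max_length:
                violations.append((index, field, "too long"))
        # Fields present in the data but absent from the model are also worth a look.
        for field in sorted(record.keys() - model.keys()):
            violations.append((index, field, "not in model"))
    return violations


records = [
    {"case_id": 1, "last_name": "Smith", "state": "VA"},
    {"case_id": "A-17", "last_name": "Jones", "state": "Virginia", "fax": "n/a"},
]
for violation in find_model_violations(records):
    print(violation)
```

The point of a check like this is less the code than the list it produces: every entry is a place where the model and the data disagree, and someone has to decide which one is right.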
Another challenge in the data migration process is that there is often a lack of visibility into the condition and quality of the data being moved. Some of the data an agency needs to move may have been created many years or even decades ago — using systems and applications that are no longer supported and in repositories that have not been consistently checked for accuracy or compatibility. Consequently, even if the data can be mapped to the new platform, it may not be possible to actually move it where it needs to go.
Date information is a notorious example of this problem. Modernized databases have consistent definitions for valid dates, times and timestamps, and new technology systems frequently leverage these definitions in IT applications. But what happens when you encounter the date “000033”? No matter how you transpose or rearrange the digits, it’s not going to translate to a native date value. So then what? Should you throw away the entire record, or perhaps just that one field? Should you correct the anomaly in the legacy system before converting? Performing detailed analysis of the existing data at the very beginning of migration planning is a must. Only then will stakeholders have the ability to determine how best to deal with anomalous data.
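To make the date problem concrete, here is a small Python sketch of one way to triage legacy date strings early in planning. The candidate formats and the sample values are assumptions chosen for illustration; a real project would derive the formats from analysis of the source system, and the order in which they are tried is itself a judgment call, since a six-digit string can be valid under more than one layout.

```python
from datetime import datetime

# Hypothetical six-digit legacy layouts, tried in order.
CANDIDATE_FORMATS = ("%m%d%y", "%y%m%d", "%d%m%y")


def parse_legacy_date(raw):
    """Return a date if raw matches one of the candidate formats, else None."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def triage_dates(values):
    """Split raw values into parsed dates and anomalies set aside for review."""
    parsed, anomalies = [], []
    for raw in values:
        result = parse_legacy_date(raw)
        if result is None:
            anomalies.append(raw)
        else:
            parsed.append(result)
    return parsed, anomalies


parsed, anomalies = triage_dates(["081616", "000033", "123199"])
print(anomalies)  # the values no candidate format could interpret
```

Note that the sketch only separates the anomalies; it deliberately does not decide what to do with them. Whether to drop the record, blank the field, or fix the source is a stakeholder decision, and the list of anomalies is what lets them make it.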
The volume of data can also make it nearly impossible to evaluate existing data using only manual processes. Fortunately, there are a variety of tools and strategies that can effectively automate aspects of the processes of reviewing, cleaning, and verifying the data. In fact, one of the first places to look is in the embedded tools that are included with the new platform, combined with ubiquitous scripting techniques available on pretty much every platform.
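As an example of the kind of simple scripting this refers to, the following Python sketch profiles each column of an extract: how many rows it has, how many values are blank, how many distinct values appear, and which value is most common. The sample data and column names are invented for the example, and the sketch assumes the legacy data can be exported as CSV. It also keeps every distinct value in memory, so on very large extracts it would need to run on a sample.

```python
import csv
import io
from collections import Counter


def profile_columns(rows):
    """Summarize each column: row count, blank count, distinct values, most common value."""
    stats = {}
    for row in rows:
        for column, value in row.items():
            entry = stats.setdefault(column, {"rows": 0, "blank": 0, "values": Counter()})
            entry["rows"] += 1
            text = (value or "").strip()
            if text:
                entry["values"][text] += 1
            else:
                entry["blank"] += 1
    return {
        column: {
            "rows": entry["rows"],
            "blank": entry["blank"],
            "distinct": len(entry["values"]),
            "most_common": entry["values"].most_common(1),
        }
        for column, entry in stats.items()
    }


# Hypothetical extract; in practice this would be a file exported from the legacy system.
sample = "id,state\n1,VA\n2,\n3,VA\n"
profile = profile_columns(csv.DictReader(io.StringIO(sample)))
for column, summary in profile.items():
    print(column, summary)
```

A profile like this will not find every problem, but a column that is mostly blank, or one with far more distinct values than expected, is usually a good place to start looking.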
2. Understanding the technology platform (both new and old)
A second challenge is the technology platform, for both the legacy and target systems. An organization may not have staff on hand with the requisite knowledge of one or both of these systems. A related concern is that the proprietary nature of many older legacy systems can make it very difficult to use standard tooling to connect to data repositories, understand the information that is housed there, and efficiently pull the information into the new platform. Commercial industry tools can help here, but those tools can become very expensive for highly proprietary legacy platforms.
These challenges mean that at various stages throughout the transformation program, there may be a need to engage with subject matter experts with thorough knowledge of both existing and target platforms.
3. Blind spots in planning and scheduling
A third potential challenge in data migration is that too little time is allocated to critical steps. This isn’t entirely surprising, since an agency may undertake a major data transition only rarely. The people who worked on the last data migration effort may have moved on to other agencies or roles, or may have even retired. In any case, the lessons learned in the previous migration may be only partially relevant to the current project.
Another problem is that data migration scope is rarely considered at the outset of modernization programs, and trying to implement data migration requirements at the 11th hour is a recipe for disaster. Most technology vendors tend to do a great job of selling you the next new thing, but they’re far less concerned about making sure you can actually use it. They make promises about how easily you’ll be able to transition existing information into the new platform without fully understanding the data you already have. Effective planning, however, builds in the time and resources, early in the lifecycle, to test whether the process and technology will actually work when you “flip the switch.”
Overcoming the obstacles
In short, there are many moving parts involved in an effective data migration process, and a plan that does not adequately address them may impact the program’s ability to modernize on schedule, or even worse, interrupt mission-critical work during deployment. It pays to treat data migration as its own subsystem or program track, allowing you to budget appropriate resources for it, and sometimes even to manage it as a separate effort within the modernization program.