How to prepare for a data warehouse migration

Data warehouses are at the heart of an organisation’s decision-making process which is why many businesses are moving away from the siloed approach of traditional data warehouses to a modern data warehouse that provides advanced capabilities to meet changing requirements.
We have looked at how to tackle data warehouse migration challenges. In this post, we’ll explore a few important questions that come up during the initial preparation and discovery phases including the impact of modernising a data warehouse in real life and how you can better prepare for and plan your migration to a modern data warehouse.
The preparation phase
An enterprise data warehouse has many stakeholders with a wide range of use cases. Therefore, it’s important to identify and involve the key stakeholders early in the process to make sure they’re aligned with the strategic goals. They can also help identify gaps and provide insight into potential use cases and requirements which can help prioritise the highest impact use cases and identify associated risks. These decisions can then be approved and aligned with business metrics which usually revolve around three main components:
- People
To make sure you’re getting input and buy-in for your migration, start with aligning leadership and business owners. Then, explore the skills of the project team and end-users. You might identify and interview each functional group within the team by conducting workshops, hackathons and brainstorming sessions. Remember while discussing issues to consider how to secure owner sign-off by setting success criteria and KPIs, such as:
- Time saved
- Time to create new reports
- Reporting usage increase
- Talent acquired through innovation
- Technology
By understanding the current technical landscape and classifying existing solutions to identify independent workloads, you can more easily separate upstream and downstream applications to further drill down into their dependency on specific use cases. For example, you can cluster and isolate different ETL applications/pipelines based on different use cases or source-systems being migrated to reduce the scope as well as underlying risks. Similarly, you can couple them with upstream applications and make a migration plan which moves dependent applications and related data pipelines together.
In addition to understanding current migration technologies, it’s key that you are clear on what you are migrating. This includes identifying appropriate data sources with an understanding of your data velocity, data regionality and licensing, as well as identifying business intelligence (BI) systems with current reporting requirements and desired modernisations during the migration. For example, you might want to move that daily report about sales to a real-time dashboard. You might also want to decide if any upstream or downstream applications should be replaced by a cloud-native application and could be driven by KPIs below:
- TCO of new solution vs. functionality gains
- Performance improvements and scalability
- Lower manageability
- Risk of lock-in vs. using open source
- Process
By discussing your process options, you can uncover dependencies between existing components and data access and governance requirements, as well as the ability to split migration components. For example, you should evaluate license expiration dependencies before defining any migration deadlines. Processes should be established to make effective decisions during migration and ensure optimal progress inline, using KPIs such as:
- Risk of data leakage and misuse
- Revenue growth per channel
- New services launched vs. cost of launching them
- Adoption of ML-driven analytics
A strong understanding of the processes you intend to put in place can open up new opportunities for growth. For example, a well-known e-commerce retailer wanted to drive product and services personalisation. Their existing data warehouse environment did not provide predictive analytics capabilities and required investments in new technology. BigQuery ML allowed them to be agile and apply predictive analytics, unlocking increased lifetime value, optimised marketing investment, improved customer satisfaction and increased market share.
The discovery phase
The discovery process is mainly concerned with two areas: business requirements and technical information.
1. Understanding business requirements
The discovery process of a data warehouse migration starts with understanding business requirements and usually has a number of business drivers. Replacing legacy systems has implications in many fronts, ranging from new team skill set requirements to managing ongoing license and operational costs. For example, upgrading your current system might require all of your company’s data analysts to be re-trained, as well as new additional licenses to be purchased. Quantifying these requirements and associating them with costs, will allow you to make a pragmatic, fair assessment of the migration process.
On the other hand, proposing and validating potential improvement gains by identifying gaps in the current solution will add value. This can be done by defining an approach to enhance and augment the existing tools with new solutions. For example, for a retailer, the ability to deliver new real-time reporting will increase revenue, since it provides significant improvements in forecasting and reduced shelf-outs.
This retailer realised that shelf-outs were costing them millions in lost sales. They wanted to find an effective solution to predict inventory needs accurately. Their legacy data warehouse environment had reached its performance peak, so they wanted a cloud offering like BigQuery to help them analyse massive data workloads quickly. As a result of migrating, they were able to stream terabytes of data in real-time and quickly optimise shelf availability to save on costs and get other benefits like:
- Incremental revenue increase with reduced shelf-outs
- 2x accuracy vs. previous predictive model
Business challenges that were previously perceived as too difficult to solve can be identified as new opportunities by re-examining them using new technologies. For example, the ability to store and process more granular data can aid organisations in creating more targeted solutions. A retailer may look into seasonality and gauge customer behaviour if Christmas Day falls on a Monday versus another day of the week. This can only be achieved with the ability to store and analyse increased amounts of data spanning across many years.
Educating your users is key to any technology modernisation project. In addition to learning paths defined above, this can be done by defining e-Learning plans for self-study. In addition, staff should have time to be hands-on and start using the new system to learn by doing. You can also identify external specialised partners and internal champions early on to help bridge that gap.
2. Technical information gathering
In order to identify the execution strategy, you’ll want to answer the following question: Will your migration process focus on a solution layer or an end-to-end lift-and-shift approach? Going through some of the points below can make this decision simpler:
- Identify data sources for up and downstream applications
- Identify datasets, tables and schemas relevant for use cases
- Outline ETL/ELT tools and frameworks
- Define data quality and data governance solutions
- Identify Identity and Access Management (IAM) solutions
- Outline BI and reporting tools
Further, it is important to identify some of the functional requirements before making a decision around buy or build. Are there any out-of-the-box solutions available in the market that meet the requirements or will you need a custom-built solution to meet the challenges you’ve identified? Make sure you know whether this project is core to your business and would add value before deciding on the approach.
Once you’ve concluded the preparation and discovery phase, you’ll have some solid guidance on which components you’ll be replacing or refactoring with a move to a cloud data warehouse.
Leave a Reply