Move your data and metadata to the cloud or between clusters, with no downtime and no service disruption
“WANdisco's uniqueness lies in how it packages Hadoop data migration as a fully hands-off service. Moving data under active change is delicate, and organizations don't want to use their best IT people on it. WANdisco's Data Migrator handles everything in the background and doesn't require expertise from the customer. It's as close to a silver bullet as you can find for this type of project.”
Merv Adrian, Gartner Research Vice President of Data and Analytics
What is Data Migrator?
Migrate from Hadoop to cloud without disruption or downtime.
Data Migrator is a fully automated cloud migration solution that migrates HDFS data and Hive metadata to the cloud, even while those data sets are under active change. It is fully self-service, requiring no WANdisco expertise, and it requires zero changes to applications or business operations. Migrations of any scale can begin immediately and run while the source data is under active change, without production system downtime, business disruption, or risk of data loss.
Administrators can easily deploy the solution and begin migrating data lake content to the cloud immediately. It is entirely non-intrusive and requires zero changes to applications or to cluster and node configuration or operation.
Leveraging WANdisco's live data capabilities, data migration can occur while the source data is under active change, without requiring any production system downtime or business disruption, supporting complete and continuous data migration.
Data Migrator accommodates data migration at any scale, from terabytes to exabytes, with no risk of data loss.
Benefits of Data Migrator
Data Migrator enables you to transition to a live data environment that makes your data globally available, accurate, and protected, avoiding the costs of a manual migration and the data silos that emerge when data cannot be kept consistent.
- No need for downtime of on-premises production clusters
- Immediate availability of migrated data
- High scalability and performance for migration at any scale
Complete and Continuous Migration
- Data migration with single pass of source storage
- Ongoing migration of any subsequent data changes
- Ensures zero data loss of source data and changes
- Minimizes the need for IT resource involvement
- Automated migration without custom code maintenance
- Faster time-to-value and adoption of AI and ML
Data Migrator Automates Cloud Migration
Zero Business Disruption, Zero Risk, and Best Time-to-Value
Quick deployment and operation:
Data Migrator is installed on an edge node of your Hadoop cluster. Deployment can be performed in minutes without impact to current operations, so users can begin migrations immediately.
Self-service user experience:
Migrations are designed to be easy to configure and perform, requiring only a simple definition of your target environment while giving you full control over exactly which data to migrate and which to exclude.
Complete and continuous migration:
Migrates existing data sets with a single pass through the source storage system, eliminating the overhead of repeated scans, while also supporting continuous migration of any ongoing changes from source to target with zero disruption to current production systems.
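The single-pass-plus-continuous approach described above can be sketched conceptually. This is an illustrative model only, not WANdisco's actual implementation: the in-memory dictionaries standing in for storage systems and all function names here are assumptions for the example.

```python
# Illustrative sketch of single-pass migration with continuous change
# capture. The dict-based "filesystems" and names are hypothetical;
# Data Migrator's real implementation is proprietary.

def migrate(source, target, change_log):
    """Copy every path in one pass, then replay changes captured
    while the scan was running so the target converges on the source."""
    # Single pass over the source storage: no repeated rescans.
    for path, content in list(source.items()):
        target[path] = content

    # Apply changes that occurred during (and after) the initial scan.
    while change_log:
        op, path = change_log.pop(0)
        if op == "put":
            target[path] = source[path]
        elif op == "delete":
            target.pop(path, None)


# Usage: a writer stays active while migration proceeds; its activity
# is captured as events rather than forcing another full scan.
source = {"/logs/day1": b"a"}
target = {}
changes = []

source["/logs/day2"] = b"b"          # change under active use
changes.append(("put", "/logs/day2"))  # captured, not rescanned

migrate(source, target, changes)
```

The key design point this models is that ongoing changes are tracked as they happen, so only one full scan of the source is ever needed.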
Hadoop data and Hive metadata migration:
Supports migration of HDFS data and Hive metadata to public cloud, as well as to other on-premises environments.
Multiple source and target systems support:
Supports Hadoop distributions with HDFS v2.6 and higher as source systems, and all leading cloud service providers and other select ISVs such as Databricks and Snowflake as target systems. See the Data Migrator documentation and release notes for details.
Migration at any scale:
Migrates big data sets at any scale, from terabytes to multiple petabytes, without impact to current production environments. Begin risk-free with small migrations and scale up to multi-petabyte initiatives without any additional installation requirements.
Browser-based user interface:
Users can leverage the WANdisco UI, a browser-based user interface that allows them to manage the full data migration (data and metadata) from a single management console.
Migrations can also be managed through a comprehensive and intuitive command-line interface or using the self-documenting REST API to integrate the solution with other programs as needed.
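As one way to picture REST-based integration, the sketch below assembles a request that could create a migration programmatically. The endpoint path, port, and payload fields are hypothetical placeholders invented for this example; the product's self-documenting REST API defines the real contract.

```python
# Illustrative sketch of driving a migration over a REST API.
# The URL, port, and JSON fields below are assumptions, not the
# documented Data Migrator API.
import json
from urllib.request import Request

def build_migration_request(base_url, name, source_path, exclusions=()):
    """Assemble an HTTP POST that would define a new migration."""
    payload = {
        "name": name,                    # migration identifier
        "path": source_path,             # source directory to migrate
        "exclusions": list(exclusions),  # data to leave out
    }
    return Request(
        url=f"{base_url}/migrations",    # hypothetical endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_migration_request(
    "http://edge-node:18080",            # assumed host and port
    name="warehouse-to-cloud",
    source_path="/data/warehouse",
    exclusions=["/data/warehouse/tmp"],
)
# The request could then be sent with urllib.request.urlopen(req).
```

Because the API is self-documenting, a client like this can be generated or validated against the live service rather than hand-maintained.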
Configurability and control:
Configure migrations to meet your organization's specific needs, including standard configuration such as defining sources, targets, and the data to be migrated, as well as advanced capabilities such as path mapping and network bandwidth management controls.
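To illustrate the idea behind path mapping, the sketch below rewrites source paths onto a different target layout using prefix rules. The rule format and function name are assumptions made for this example, not Data Migrator's configuration syntax.

```python
# Illustrative sketch of path mapping: relocating source paths onto a
# different target layout. The (source_prefix, target_prefix) rule
# format is an assumption, not the product's configuration syntax.

def map_path(path, mappings):
    """Rewrite `path` using the first matching prefix rule;
    unmatched paths pass through unchanged."""
    for src_prefix, dst_prefix in mappings:
        if path.startswith(src_prefix):
            return dst_prefix + path[len(src_prefix):]
    return path

# Hypothetical rules moving on-premises layouts into a cloud data lake.
rules = [
    ("/data/warehouse", "/lake/curated"),
    ("/data/raw", "/lake/landing"),
]

mapped = map_path("/data/warehouse/sales/2023", rules)
```

A mapping like this lets the target adopt a new directory convention without touching the source cluster's layout.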
Metrics and monitoring:
Stay updated on migration jobs with health and status metrics that estimate migration completion, email notifications, and real-time usage insights that promote hands-off operation.
The Data Migrator Approach
Only Data Migrator is able to move data lake content to the cloud immediately, at scale, with no application downtime and no risk of data loss, even when data sets are under active change.
Other approaches to large-scale Hadoop data migration rely on repeated iterations in which source data is copied but ongoing changes are not taken into account. They require significant up-front planning and impose operational downtime whenever data must be migrated completely.