18 Sep 2019
David Richards, Co-founder & CEO, WANdisco

WANdisco LiveAnalytics and Databricks: Running Cloud-based Analytics While Ingesting Data from Hadoop On-premises

When we introduced LiveMigrator in June, our first customers were delighted. They were able to move petabytes of unstructured data in a single pass from on-premises data centers to the cloud without blocking. Their applications could continue to access data in on-premises environments, even as their data was moving to the cloud. We had solved a problem that had vexed customers for some time and provided an accelerated path to cloud adoption.

But after the initial excitement, our customers began asking if we could go a step further. They pointed out that while moving the data overcame a huge challenge, they also wanted to run analytics continuously while moving to new Spark-based analytics in the cloud, and to avoid inconsistencies caused by metadata transformation. So, while they were eager to modernize to a Spark-based analytics platform, they were worried about the potential delay in transforming the data and about the new data being ingested during the migration process. Their end goal was to derive business insights and competitive advantage from machine learning and AI running in the cloud, drawing on all of their data. Previously, the only way to do this was through batch-based data transfers, which led to inconsistent analytics results based on stale data.

To address this challenge, WANdisco looked for a partner that could help give our customers stronger and more up-to-date business intelligence results, and Databricks was a natural fit. The company was founded by the original creators of Apache Spark and had grown rapidly due to the platform’s in-memory processing in the cloud. WANdisco coordinated its efforts with Databricks to create LiveAnalytics, a platform that gives enterprises the capacity to seamlessly run cloud-based analytics applications as data is processed on-premises, providing much higher performance levels at a lower cost.

With LiveAnalytics, Databricks users don’t have to wait for their metadata to move from on-premises to the cloud. Enterprises need not compromise their application infrastructure, and no changes need to be made to the data pipeline. Enterprises can use Databricks to simply and immediately run analytics on petabytes of on-premises data as though it were already in the cloud. Data can continue to be processed during and after migration, without waiting for a full-scale migration to complete. In other words, both migrated and migrating data are immediately and continuously available for analysis.

What does this mean for enterprises? Most importantly, businesses can continue their operations as usual as they make the transition from Hadoop to the cloud. By minimizing disruption when migrating between Hadoop and non-Hadoop environments, LiveAnalytics speeds enterprise adoption of ML and AI. Enterprises can benefit from the modern data analytics infrastructure that Spark provides and gain business insights to better compete in a data-driven world.

Watch this video to see how LiveAnalytics works and enables continuous analytics as data migrates to the cloud.

About the author

Since co-founding WANdisco in Silicon Valley in 2005, David has led the company to rapid international expansion, opening offices in the UK, Japan and China. David led WANdisco to a successful listing on the London Stock Exchange (WAND:LSE) and shortly afterward brought about the acquisition of AltoStor, which accelerated the development of WANdisco's first products for the big data market. David holds a BSc in Computer Science from the University of Huddersfield. In 2017, David was awarded an Honorary Doctorate by Sheffield Hallam University in recognition of his role as a champion of British technology and a passionate advocate of entrepreneurship.

