WANdisco Announces Support for In-Memory Data Processing Technologies, Spark and Shark
San Jose, CA (Hadoop Summit, Booth 50) – June 26, 2013 – WANdisco (LSE: WAND), a provider of high-availability software that helps global enterprises meet the challenges of Big Data and distributed software development, today announced a technology preview of Spark and its associated data warehousing system, Shark, available as an add-on to WANdisco Distro 3.6.
Spark offers a new computation model and significant performance improvements over MapReduce by keeping data in memory for fast iterative queries. In initial testing, Spark ran up to 100 times faster than MapReduce. Spark can read from and write to any Hadoop-supported filesystem.
Shark, which is designed for compatibility with existing Apache Hive deployments, is a large-scale data warehousing system that runs on Spark. Shark addresses Hive limitations without the need for additional development by supporting standard SQL as well as Hive’s query language (HiveQL), metastore, serialization formats, and user-defined functions. Shark also allows users to cache their data sets in memory, improving efficiency and query performance.
"Spark and Shark represent the next leap forward in data processing performance," said David Richards, CEO of WANdisco. "With this technology preview, users will be able to vastly improve the efficiency of their existing Hadoop deployments without the need for additional configuration or expenditure."
Spark for YARN is still in development and will be made available once it has undergone further quality assurance testing.
"WANdisco is showing strong support of Spark and Shark," said Ion Stoica, co-director, of AMPlab of UC Berkeley Electrical Engineering and Computer Sciences, where Spark, Shark and Mesos have been developed. "This helps to validate the technology in the marketplace and reduces data analysis time, allowing businesses to bring interactive data analytics into Big Data space."
Spark will be available as a free download with WANdisco Distro (WDD) in 30 to 60 days.