26 JUNE 2013
WANdisco announces support for In-Memory Data Processing Technologies, Spark and Shark
WANdisco (LSE: WAND), a provider of high-availability software for global enterprises to meet the challenges of Big Data and distributed software development, today announced a technology preview of the Spark application and its associated data warehousing system, Shark, both available as an add-on to WANdisco Distro 3.6.
Spark offers a new computation model and significant performance enhancements over MapReduce by keeping data in memory to perform fast iterative queries. In initial testing, Spark has delivered speeds up to 100 times faster than MapReduce. Spark can read from and write to any Hadoop-supported filesystem.
Shark, which is designed for compatibility with Apache Hive, is a large-scale data warehousing system that runs on Spark. Shark addresses Hive limitations without the need for additional development by supporting standard SQL as well as Hive’s query language (HiveQL), metastore, serialization formats, and user-defined functions. Shark also allows users to cache their data sets in memory, enabling increased efficiency while providing maximum performance.
"Spark and Shark represent the next leap forward in data processing performance," said David Richards, CEO of WANdisco. "With this technology preview, users will be able to vastly improve the efficiency of their existing Hadoop deployments without the need for additional configuration or expenditure."
Development of Spark for YARN is still in progress and will be made available as soon as it has undergone further quality assurance testing.
"WANdisco is showing strong support of Spark and Shark," said Ion Stoica, co-director, of AMPlab of UC Berkeley Electrical Engineering and Computer Sciences, where Spark, Shark and Mesos have been developed. "This helps to validate the technology in the marketplace and reduces data analysis time, allowing businesses to bring interactive data analytics into Big Data space."
Spark will be available as a free download with WANdisco Distro (WDD) in 30 to 60 days.
WANdisco is the world leader in Active Data Replication™. Its patented WANdisco Fusion technology enables the replication of continuously changing data to the cloud and on-premises data centers with guaranteed consistency, no downtime and no business disruption. It also allows distributed development teams to collaborate as if they were all working in one location. WANdisco has an OEM agreement with IBM as well as partnerships with Amazon Web Services, Cisco, Google Cloud, Hewlett Packard Enterprise, Microsoft Azure, and Oracle to resell its patented technology. WANdisco also works directly with Fortune 1000 companies around the world to ensure their data can give them the real insight they need.
For additional information, please visit www.wandisco.com.