WANdisco LiveHive for Apache Hive

Consistent Hive metadata

What is Apache Hive?

Apache Hive is a data warehouse system built on top of Apache Hadoop. It supports querying, analysis and reporting of massive datasets distributed across various systems, file stores and databases.

Hive provides an abstraction for applications that need structured access to data residing in a Hadoop cluster. Ad-hoc querying, summarization and other data analysis tasks can be performed using high-level constructs, including queries written in HiveQL, Hive's SQL-like query language.

What is WANdisco LiveHive?

Consistent Hive metadata

The WANdisco Fusion Plugin for Live Hive extends the capabilities of WANdisco Fusion to allow your Hive infrastructure to participate fully in a LiveData platform. Give your Hadoop clusters a shared Hive metastore without the cost of single points of failure, degraded performance or administrative headaches. Replicate Hive metadata as it changes in any cluster, with strong consistency among all environments, and selective replication based on matching databases, tables and file system locations.

Always consistent queries

Share the same Hive definitions across multiple environments, regardless of where and when changes are made. Dramatically simplify the configuration of metadata replication with a LiveData platform, so that all applications have access to the same Hive tables wherever they are required.

What is WANdisco LiveHive for Apache Hive?

Guaranteed data consistency

Query your Hive data from any cluster and get the same results every time, everywhere. Ingest data, alter tables, create new Hive representations and maintain consistent results at all times.

Always available

Never worry about periods when Hive representations differ among clusters because of periodic replication. Replicate your changes as they occur, without conflict among environments.

Automatic recovery

Recover from network or system outages automatically without the risk of introducing metadata inconsistencies. Accommodate your planned and unplanned outages with ease, and reduce administration costs.

Simple administration and integration

Extend an existing WANdisco Fusion deployment with the WANdisco Fusion Plugin for Live Hive without downtime or disruption. Take advantage of LiveData replication for Hive metadata without changing Hive applications or each cluster’s Hive metastore. Use simple replication rules to define which Hive databases, tables and file system locations are replicated with strong consistency.
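
To make the idea of selective replication concrete, the following is a minimal Python sketch of how pattern-based rules could select which Hive databases and tables are replicated. It is an illustration only: the `ReplicationRule` class, its fields, the `is_replicated` helper and the glob-style patterns are all hypothetical, and none of them represent the actual rule syntax or API of the WANdisco Fusion Plugin for Live Hive.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ReplicationRule:
    """Hypothetical rule: glob patterns for a Hive database and table name."""
    database_pattern: str
    table_pattern: str

    def matches(self, database: str, table: str) -> bool:
        # A table is selected only when both its database and its own name
        # match the rule's patterns (case-sensitive glob matching).
        return (fnmatchcase(database, self.database_pattern)
                and fnmatchcase(table, self.table_pattern))


def is_replicated(rules, database, table):
    """Return True if at least one rule selects the given table."""
    return any(rule.matches(database, table) for rule in rules)


# Example rules (assumed names): replicate every table in "sales", and only
# the "clicks_*" tables in any database whose name starts with "web_".
rules = [
    ReplicationRule("sales", "*"),
    ReplicationRule("web_*", "clicks_*"),
]

print(is_replicated(rules, "sales", "orders"))
print(is_replicated(rules, "web_eu", "clicks_2019"))
print(is_replicated(rules, "web_eu", "sessions"))
```

In this sketch, metadata changes for tables that no rule selects would stay local to the cluster where they were made. A rule for file system locations could be modeled the same way by adding a path pattern.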


Supported Environments

Hadoop

  • CDH 5.9+
  • HDP 2.6+

Operating Systems

  • RHEL 6.1+ (x86-64)
  • CentOS 6, 7 (x86-64)
  • Ubuntu 12.04, 14.04 (x86-64)
  • SLES 11+ (x86-64)

Apache Hive Replication Across Cloud Environments

The Live Hive Proxy is a WANdisco service that is deployed with Live Hive, acting as a proxy for applications that use a standalone Hive Metastore. The service coordinates actions performed against the Metastore with actions within clusters in which associated Hive metadata are replicated.

Below is a video demonstration of the WANdisco Fusion LiveHive plugin in action.

Schedule a 1-on-1 consultation to see how we can help your enterprise.

Request a free trial and demonstration of WANdisco's products now.