Hive Installation and Quick Start Guide

May 20, 2017

1. Objective

This Hive tutorial walks through simple steps for installing and running Hive on Ubuntu. Hive is a data warehousing infrastructure built on top of Hadoop. This quick start will help you set up and configure Hive and run several HiveQL queries to learn its basic concepts.

2. Introduction

Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides data summarization, querying, and ad-hoc analysis. To run Hive successfully, Java and Hadoop must therefore already be installed and working on your Linux OS. For the installation procedure for Java and Hadoop, refer to the Hadoop installation guide.

3. Hive Installation

To install Hive on your system, follow the steps below and run them on your Linux OS:

3.1. Download Hive

In this tutorial we will use hive-0.13.1-cdh5.3.2 (you can also use any later version of Hive). Download Hive from the following link: http://apache.petsads.us/hive/hive-0.13.1-cdh5.3.2/ apache-hive-0.13.1-cdh5.3.2.tar.gz. The file is saved to your Downloads directory.

After Hive downloads successfully, you will see the following:

apache-hive-0.13.1-cdh5.3.2 hive-0.13.1-cdh5.3.2.tar.gz

3.1.1. Untar the file

Move the downloaded archive to your home directory and extract it by running the command below:

$ tar zxvf hive-0.13.1-cdh5.3.2.tar.gz
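As a small sketch of the extraction step, the helper below unpacks a .tar.gz archive into a target directory and reports a missing archive instead of failing silently. The function name extract_archive and the use of $HOME as the destination are assumptions made for illustration; the archive name is the one used in this tutorial.

```shell
# Hypothetical helper: unpack a .tar.gz archive into a destination directory.
extract_archive() {
    archive="$1"
    dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # x = extract, z = gunzip, f = read the named file, -C = unpack inside dest
    tar -xzf "$archive" -C "$dest"
}

# Assumed location: the archive was moved to the home directory as described above.
if [ -f "$HOME/hive-0.13.1-cdh5.3.2.tar.gz" ]; then
    extract_archive "$HOME/hive-0.13.1-cdh5.3.2.tar.gz" "$HOME"
else
    echo "Archive not found in $HOME; move it there first."
fi
```

Checking for the archive first keeps the error message readable if the download was saved under a different name.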

3.2. Setting up Hive Environment Variables

3.2.1. Editing .bashrc file

To set up the Hive environment, append the following lines to the end of the ~/.bashrc file.

export HADOOP_USER_CLASSPATH_FIRST=true

export HADOOP_HOME=/home/dataflair/hadoop-2.6.0-cdh5.5.1

export HIVE_HOME=/home/dataflair/hive-0.13.1-cdh5.3.2

export PATH=$PATH:$HIVE_HOME/bin

Note: Enter the correct path, name, and version for your own Hive installation. “/home/dataflair/hive-0.13.1-cdh5.3.2” is the path to my Hive directory and “hive-0.13.1-cdh5.3.2” is its name, so replace them with yours. Save the file after adding these lines.

To apply these settings in the current shell, run the following command:

$ source ~/.bashrc
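Before launching Hive, it can help to confirm that the variables took effect. The following is a minimal sketch of such a check; the helper name path_contains is hypothetical, and the script only inspects HIVE_HOME and PATH as set in the steps above.

```shell
# Hypothetical helper: succeeds if directory $1 is an entry in the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set; check ~/.bashrc"
elif [ ! -d "$HIVE_HOME" ]; then
    echo "HIVE_HOME points to a missing directory: $HIVE_HOME"
elif ! path_contains "$HIVE_HOME/bin" "$PATH"; then
    echo "$HIVE_HOME/bin is not on PATH"
else
    echo "Hive environment looks consistent: $HIVE_HOME"
fi
```

Wrapping the list in colons before matching avoids false positives on directories that merely share a prefix with the Hive bin directory.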

4. Launching HIVE

$ hive

The following output gets displayed:

Logging initialized using configuration in jar:file:/home/dataflair/HADOOP/hive-0.13.1-cdh5.3.2/lib/hive-common-0.13.1-cdh5.3.2.jar!/hive-log4j.properties

hive>

5. Exit from Hive:

hive> exit;

Congratulations! Hive is now successfully installed on your system, and you can start running commands.

Before using Hive, you should change its metastore layer. Follow this tutorial to change the Hive metastore from Derby to MySQL.

6. Hive Queries

Below are some basic Hive queries that you will need while using Hive.

6.1. Show Databases

Syntax:

show databases;

Usage:

show databases;

This query lists the databases present in your Hive installation. If you have just installed Hive and have not created any database yet, a database named “default” exists by default and appears in the output of this query.
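The same query can also be run from the shell without entering the hive> prompt, using the -e option of the hive command. The sketch below assumes the hive CLI is on your PATH and simply prints the command it would run if it is not; the variable name QUERY is an arbitrary choice.

```shell
# Run one HiveQL statement non-interactively, if the hive CLI is on PATH.
QUERY='show databases;'

if command -v hive >/dev/null 2>&1; then
    # -e executes the quoted statement and then exits
    hive -e "$QUERY"
else
    echo "hive not found on PATH; would run: hive -e '$QUERY'"
fi
```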

6.2. Create Database

Syntax:

create database database_name;

Usage:

create database test;

This creates a new database named “test”. You can verify it by running the “show databases;” query.

6.3. Use

The USE query switches the current session to the database you specify, so that subsequent queries run against it.

Syntax:

USE database_name;

Usage:

USE test;

6.4. Current Database

Syntax:

set hive.cli.print.current.db=true;

This setting makes the Hive prompt display the name of the database in which you are currently working.
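A setting entered at the prompt lasts only for that session. One possible way to keep it, sketched below, is to add the line to an initialization file such as ~/.hiverc, which the Hive CLI is documented to read at startup. The helper name enable_db_prompt is hypothetical, and the demo writes to a temporary file rather than your real ~/.hiverc.

```shell
# Hypothetical helper: append the prompt setting to an init file unless it is already there.
enable_db_prompt() {
    rc_file="$1"
    setting='set hive.cli.print.current.db=true;'
    touch "$rc_file" || return 1
    # -x matches whole lines only, -F treats the setting as a fixed string
    if ! grep -qxF "$setting" "$rc_file"; then
        printf '%s\n' "$setting" >> "$rc_file"
    fi
}

# Demo against a temporary file; for real use, pass "$HOME/.hiverc" instead.
demo_rc=$(mktemp)
enable_db_prompt "$demo_rc"
enable_db_prompt "$demo_rc"   # second call adds nothing
cat "$demo_rc"
```

Checking for the line before appending means the helper can be run repeatedly without duplicating the setting.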

6.5. DROP

The DROP query is used to delete a database.

Syntax:

DROP database database_name;

Usage:

DROP database test1;
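The database queries from sections 6.1 to 6.5 can be collected into one script and run in a single pass. The following is a sketch under a few assumptions: the file name db_lifecycle.hql is arbitrary, the IF NOT EXISTS and IF EXISTS clauses are added so the script can be rerun safely, and the hive CLI is only invoked if it is found on PATH.

```shell
# Write the database statements into one HiveQL script file.
HQL_FILE="${TMPDIR:-/tmp}/db_lifecycle.hql"   # assumed file name

cat > "$HQL_FILE" <<'EOF'
-- create the database only if it does not exist yet
CREATE DATABASE IF NOT EXISTS test;
SHOW DATABASES;
USE test;
-- switch back to the default database before dropping test
USE default;
DROP DATABASE IF EXISTS test;
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"    # -f runs every statement in the file
else
    echo "hive not found on PATH; script written to $HQL_FILE"
fi
```

Note that DROP DATABASE refuses to remove a database that still contains tables unless CASCADE is added, so the sketch drops the database while it is still empty.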

6.6. CREATE TABLE

This command is used to create a new table.

Syntax:

CREATE TABLE TABLE_NAME (Parameters)

COMMENT 'Employee details'

ROW FORMAT DELIMITED

FIELDS TERMINATED BY '\t'

LINES TERMINATED BY '\n'

STORED AS TEXTFILE;

Usage:

create table employee ( Name String comment 'Employee Name', Id int, MobileNumber String, Salary Float) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
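Long statements like this one are easier to maintain in a script file than at the prompt. The sketch below writes the same employee table definition to a file and runs it with hive -f when the CLI is available. The file name create_employee.hql, the added table comment, and the IF NOT EXISTS clause are assumptions added for illustration.

```shell
HQL_FILE="${TMPDIR:-/tmp}/create_employee.hql"   # assumed file name

# The quoted 'EOF' keeps \n and the single quotes exactly as written.
cat > "$HQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS employee (
  Name         STRING COMMENT 'Employee Name',
  Id           INT,
  MobileNumber STRING,
  Salary       FLOAT
)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found on PATH; script written to $HQL_FILE"
fi
```

Plain single quotes matter here: typographic quotes copied from a web page are not valid string delimiters in HiveQL.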

6.7. View tables

Syntax:

show tables;

This lists all the tables in the current database.

6.8. Alter Table

It is used to change the attributes of a table.

Syntax: Several attributes of a table can be changed, depending on what you need.

ALTER TABLE TableName RENAME TO new_name

ALTER TABLE TableName ADD COLUMNS (col_spec[, col_spec ...])

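ALTER TABLE TableName CHANGE column_name new_name new_type

ALTER TABLE TableName REPLACE COLUMNS (col_spec[, col_spec ...])

Note: Hive does not provide an ALTER TABLE ... DROP COLUMN statement. To remove a column, use REPLACE COLUMNS and list only the columns you want to keep.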

Usage:

ALTER TABLE employee RENAME TO demo1;

6.9. Describe table

Syntax:

desc TableName;

Usage:

desc employee;

This command describes the columns of the table, including their names and data types.

6.10. Load data

Syntax:

LOAD DATA LOCAL INPATH 'path_of_the_file' OVERWRITE INTO TABLE table_name;

Usage:

LOAD DATA LOCAL INPATH '/home/dataflair/Desktop/details.txt' OVERWRITE INTO TABLE employee;

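This command loads the data from the given file path into the selected Hive table. The LOCAL keyword reads the file from the local file system, and OVERWRITE replaces any data already in the table.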
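To try the load step end to end, you need a file whose fields match the table definition. The sketch below creates a small comma-separated file and checks its field count before loading. The rows are made-up sample values, the file location is an assumption (the tutorial itself uses a Desktop path), and the load only runs if the hive CLI is on PATH and the employee table already exists.

```shell
DATA_FILE="${TMPDIR:-/tmp}/details.txt"   # assumed location for the sample file

# Made-up sample rows in the column order Name,Id,MobileNumber,Salary
cat > "$DATA_FILE" <<'EOF'
Asha,1,5550100,42000.0
Ravi,2,5550101,38500.5
Mina,3,5550102,51000.0
EOF

# Every row must have exactly 4 comma-separated fields to match the table definition.
bad_rows=$(awk -F',' 'NF != 4' "$DATA_FILE" | wc -l | tr -d ' ')
echo "rows with a wrong field count: $bad_rows"

if [ "$bad_rows" -eq 0 ] && command -v hive >/dev/null 2>&1; then
    hive -e "LOAD DATA LOCAL INPATH '$DATA_FILE' OVERWRITE INTO TABLE employee;"
fi
```

Validating the field count first is worthwhile because Hive applies the schema when the data is read, so a malformed row typically surfaces later as NULL values rather than as a load error.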
