Hive Installation and Quick Start Guide
This Hive tutorial contains simple steps for installing and running Hive on Ubuntu. Hive is a data warehousing infrastructure built on top of Hadoop. This Hive quick start will help you set up and configure Hive and run several HiveQL queries to learn the basic concepts of Hive.
Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and ad-hoc analysis. To run Hive successfully, Java and Hadoop must therefore already be installed and working on your Linux OS. For the installation procedure of Java and Hadoop, refer to the Hadoop installation guide.
3. Hive Installation
To install Hive on your system, follow the steps below and execute them on your Linux OS:
3.1. Download Hive
In this tutorial we will use hive-0.13.1-cdh5.3.2 (you can also use a later version of Hive). Download Hive using the following link: http://apache.petsads.us/hive/hive-0.13.1-cdh5.3.2/ apache-hive-0.13.1-cdh5.3.2.tar.gz. The file is saved to your Downloads directory.
3.1.1. Untar the file
Move the downloaded archive to your home directory and extract (untar) it there with the tar command.
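The move and extract step can be sketched as follows. The archive name is taken from the download link above and the Downloads location is an assumption, so adjust both to match your system. The block does nothing if the archive is not found.

```shell
# Assumed archive name (from the download link) and location; adjust to your download
ARCHIVE_NAME="apache-hive-0.13.1-cdh5.3.2.tar.gz"
SRC="$HOME/Downloads/$ARCHIVE_NAME"

if [ -f "$SRC" ]; then
  mv "$SRC" "$HOME/"                          # move the archive to the home directory
  tar -xzf "$HOME/$ARCHIVE_NAME" -C "$HOME"   # extract it there
  ls -d "$HOME"/*hive*                        # list the archive and the extracted directory
else
  echo "Archive not found at $SRC"
fi
```

The name of the extracted directory depends on the archive you downloaded; note it, because the environment variables in the next step must point to it.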
3.2. Setting up Hive Environment Variables
3.2.1. Editing .bashrc file
To set up the Hive environment, append lines to the end of the ~/.bashrc file that set HIVE_HOME to your Hive directory and add its bin directory to PATH.
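A typical pair of lines looks like the sketch below. The path is the example path used in this tutorial, not a required location; replace it with the directory where you extracted Hive.

```shell
# Path of the extracted Hive directory (the tutorial's example path; change it to yours)
export HIVE_HOME=/home/dataflair/hive-0.13.1-cdh5.3.2
# Put the hive launcher script on the command search path
export PATH=$PATH:$HIVE_HOME/bin
```

Appending to PATH, rather than overwriting it, keeps the existing Java and Hadoop commands available.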
Note: Use the name, version, and path of your own Hive directory. In this tutorial, “/home/dataflair/hive-0.13.1-cdh5.3.2” is the path of the Hive directory and “hive-0.13.1-cdh5.3.2” is its name, so replace them with the correct path and name on your system. Save the file after adding the lines.
To apply the changes to the current terminal session, reload the file by running source ~/.bashrc.
4. Launching HIVE
To start the Hive command-line shell, type hive at the terminal prompt. Once Hive has started, the hive> prompt is displayed.
5. Exit from Hive:
To leave the Hive shell, type exit; or quit; at the hive> prompt.
Congratulations! Hive is now installed on your system, and you can start executing your commands.
Before using Hive for real work, you should change its metastore from the default Derby database to MySQL. Follow the tutorial on changing the Hive metastore from Derby to MySQL.
6. Hive Queries
Below are some basic Hive queries that you will need while using Hive.
6.1. Show Databases
The SHOW DATABASES query lists the databases present in your Hive installation. If you have just installed Hive and have not created any database, a database named “default” exists and appears in the result.
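As a small sketch, the query can be run from the terminal with the Hive CLI's -e option, which executes the quoted HiveQL and exits. The same statement can also be typed directly at the hive> prompt. The guard around the call is only there so the snippet is harmless on a machine without Hive.

```shell
# HiveQL statement that lists every database known to Hive
HQL='SHOW DATABASES;'

# Run it through the Hive CLI if it is installed; otherwise just print the statement
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```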
6.2. Create Database
The CREATE DATABASE query creates a new database, named “test” in this tutorial. You can confirm that it exists by running the “show databases;” query.
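A minimal sketch of this step, run through the Hive CLI's -e option. The IF NOT EXISTS clause is an addition for convenience, so that running the snippet twice does not raise an error; it is not required.

```shell
# Create the database "test" (skipped quietly if it already exists), then list all databases
HQL='CREATE DATABASE IF NOT EXISTS test; SHOW DATABASES;'

# Run it through the Hive CLI if it is installed; otherwise just print the statements
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```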
6.3. Use Database
The USE query switches the current session to the database you name, so that the statements that follow run against it.
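The sketch below assumes a database named “test” already exists. Because every hive -e call starts a fresh session in the “default” database, USE is paired here with a second statement in the same call; at the interactive hive> prompt, USE test; on its own is enough.

```shell
# Switch to the database "test", then list the tables it contains
HQL='USE test; SHOW TABLES;'

# Run it through the Hive CLI if it is installed; otherwise just print the statements
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```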
6.4. Current Database
This query returns the name of the database in which you are currently working.
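One way to do this is the built-in current_database() function, which should be available in Hive 0.13 and later; treat the version as an assumption and check your release. The sketch assumes a database named “test” exists and switches to it first so that the result is something other than “default”.

```shell
# Switch to "test", then ask Hive which database the session is using
HQL='USE test; SELECT current_database();'

# Run it through the Hive CLI if it is installed; otherwise just print the statements
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

In the interactive shell, an alternative is set hive.cli.print.current.db=true; which shows the current database name in the prompt.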
6.5. Drop Database
The DROP DATABASE query deletes a database.
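A short sketch, assuming the database to delete is the “test” database from this tutorial. IF EXISTS is added so the statement does not fail when the database is already gone.

```shell
# Delete the database "test" if it exists
HQL='DROP DATABASE IF EXISTS test;'

# Run it through the Hive CLI if it is installed; otherwise just print the statement
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

By default Hive refuses to drop a database that still contains tables. Appending CASCADE (DROP DATABASE IF EXISTS test CASCADE;) drops the tables as well, so use it with care.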
6.6. CREATE TABLE
The CREATE TABLE command creates a new table.
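As an illustration, the sketch below creates a hypothetical table named employee with three columns, stored as comma-delimited text. The table name, columns, and delimiter are assumptions chosen for the example, not part of the original tutorial.

```shell
# HiveQL for a hypothetical "employee" table backed by comma-separated text files
HQL=$(cat <<'EOF'
CREATE TABLE IF NOT EXISTS employee (
  id INT,
  name STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
EOF
)

# Run it through the Hive CLI if it is installed; otherwise just print the statement
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

Run this way, the table is created in the “default” database; put a USE statement in front of it to create the table elsewhere.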
6.7. View tables
The SHOW TABLES query lists all the tables in the current database.
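A brief sketch with two variants: the first lists the tables of the current database, and the second uses the IN clause to name a database explicitly. The database name “test” is the tutorial's example and is assumed to exist.

```shell
# List tables in the current database, then in the database "test"
HQL='SHOW TABLES; SHOW TABLES IN test;'

# Run it through the Hive CLI if it is installed; otherwise just print the statements
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```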
6.8. Alter Table
The ALTER TABLE command changes the attributes of an existing table. Several attributes can be changed, such as the table name, its columns, and its properties.
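The sketch below shows two common forms, assuming a hypothetical table named employee with a salary column already exists. The new column names are made up for the example.

```shell
# Two ALTER TABLE forms applied to a hypothetical "employee" table
HQL=$(cat <<'EOF'
-- add a new column at the end of the table
ALTER TABLE employee ADD COLUMNS (dept STRING);
-- rename the column "salary" to "monthly_salary", keeping its type
ALTER TABLE employee CHANGE salary monthly_salary DOUBLE;
EOF
)

# Run it through the Hive CLI if it is installed; otherwise just print the statements
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

A table can also be renamed, for example with ALTER TABLE employee RENAME TO staff; after which later statements must use the new name.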
6.9. Describe table
The DESCRIBE command lists the columns of a table together with their data types.
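For example, assuming a hypothetical table named employee exists in the current database, it can be described like this:

```shell
# Show the column names and data types of the hypothetical "employee" table
HQL='DESCRIBE employee;'

# Run it through the Hive CLI if it is installed; otherwise just print the statement
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

DESCRIBE FORMATTED employee; prints additional details such as the table's storage location and format.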
6.10. Load data
The LOAD DATA command loads data from a file path you specify into an existing Hive table.
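A minimal sketch, assuming a hypothetical table named employee and a hypothetical comma-separated file at /home/dataflair/employee.csv whose columns match the table. Both names are placeholders for your own table and file.

```shell
# Load a local CSV file into the hypothetical "employee" table
HQL="LOAD DATA LOCAL INPATH '/home/dataflair/employee.csv' INTO TABLE employee;"

# Run it through the Hive CLI if it is installed; otherwise just print the statement
if command -v hive >/dev/null 2>&1; then
  hive -e "$HQL"
else
  echo "$HQL"
fi
```

With the LOCAL keyword the path refers to the local file system and the file is copied into Hive's storage. Without LOCAL the path refers to HDFS and the file is moved. Adding OVERWRITE before INTO TABLE replaces the table's existing data instead of appending to it.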