Hello World with Apache Sentry

Apache Sentry

Sentry is an Apache project for role-based authorization in Hadoop. It works well with Apache Hive. In this post we will create a policy in Sentry using the Beeline (HiveServer2) shell.



Prerequisite: I am using a Cloudera VM with Sentry installed on it.

Hive authorization is done by creating policies in Sentry.
Only Sentry admins can create policies, so we need to create a Sentry admin group and add it to the Sentry admin list using Cloudera Manager (the setting ends up in sentry-site.xml). Let's create a user sentryAdmin whose group is also sentryAdmin. Run the command below on Linux.

useradd sentryAdmin
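
Sentry grants roles to groups rather than to individual users, so it is worth confirming that the new account really belongs to a sentryAdmin group before going further. Below is a minimal sketch, assuming groups are resolved from the local operating system on this VM. in_group is a hypothetical helper name, not part of Sentry or Hadoop.

```shell
# in_group GROUP "LIST": succeed if GROUP appears in the space-separated LIST.
# (hypothetical helper for illustration; not part of Sentry or Hadoop)
in_group() {
  target=$1
  for g in $2; do
    [ "$g" = "$target" ] && return 0
  done
  return 1
}

# Usage on the VM, after useradd:
#   in_group sentryAdmin "$(id -Gn sentryAdmin)" && echo "group is in place"
```

The comparison is an exact match on each group name, so a group such as sentryAdmins would not be mistaken for sentryAdmin.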

Now let's add this group to the Sentry admin list.

Go to Cloudera Manager > Sentry > Configuration.
Select Sentry (Service-Wide) from Scope and Main from Category.
Add sentryAdmin to Admin Groups (sentry.service.admin.group).
Restart the Sentry service.
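
After the restart, the setting can be checked from the shell. The sketch below assumes the generated sentry-site.xml keeps each <name> and its <value> on the same line or on adjacent lines. get_property is a hypothetical helper, and the location of sentry-site.xml depends on the installation, so the path is passed in as an argument.

```shell
# get_property FILE NAME: print the <value> of property NAME from a
# Hadoop-style XML config file (hypothetical helper, not a Sentry tool).
# Assumes <value> is on the same line as <name>NAME</name> or on the next one.
get_property() {
  grep -A1 "<name>$2</name>" "$1" |
    sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage (replace the path with wherever sentry-site.xml lives on your host):
#   get_property /path/to/sentry-site.xml sentry.service.admin.group
```

The value may be a comma-separated list of several groups. sentryAdmin only needs to be one of them.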



It's time to create a policy for a user. Let's say I have a database named department in Hive and I want to give read permission on it to the group employee1.
I will create the policy for employee1 while logged in as sentryAdmin.

Sentry policy creation is a three-step process:
  • Create a role
  • Assign the role to a group
  • Grant privileges to the role
Let's follow the steps below:

Create the user employee1. By default on Linux it will be assigned to a group of the same name, employee1.

useradd employee1

Log in to Beeline as employee1:

beeline
!connect jdbc:hive2://localhost:10000
username : employee1
password : <Keep it Blank>

Run show databases; to confirm that employee1 does not have permission to read anything at this moment.
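
The same check can be run without the interactive prompt. This is a sketch, assuming HiveServer2 listens on localhost:10000 as above and accepts an empty password. run_as and the BEELINE override are hypothetical conveniences, while -u, -n and -e are standard Beeline options.

```shell
# run_as USER QUERY: run one HiveQL statement through Beeline as USER.
# BEELINE can be overridden (e.g. for a dry run); it defaults to "beeline".
# run_as is a hypothetical wrapper, not something Beeline or Sentry provides.
run_as() {
  user=$1
  shift
  "${BEELINE:-beeline}" -u "jdbc:hive2://localhost:10000" -n "$user" -e "$*"
}

# Usage:
#   run_as employee1 'SHOW DATABASES;'
```

Wrapping the connection details once makes it easy to repeat the same query as a different user later and compare the results.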


Log in to Beeline as sentryAdmin:

beeline
!connect jdbc:hive2://localhost:10000
username : sentryAdmin
password : <Keep it Blank>

Create a role in Sentry.

CREATE ROLE Role1;


Assign this role to the group that should receive the privileges.

GRANT ROLE Role1 TO GROUP employee1;

Grant (or revoke) privileges on the required objects (database, table, column, etc.) to the role.


GRANT SELECT ON DATABASE department TO ROLE Role1;
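
The three statements above follow a fixed pattern, so they can be generated for any role, group and database. Here is a minimal sketch. sentry_policy_sql and the file name policy.sql are hypothetical, and the names are inserted as-is without quoting or validation.

```shell
# sentry_policy_sql ROLE GROUP DATABASE: print the three statements that
# create a role, assign it to a group, and grant it read access on a database.
# (hypothetical helper; arguments are not validated or escaped)
sentry_policy_sql() {
  role=$1 group=$2 db=$3
  printf 'CREATE ROLE %s;\n' "$role"
  printf 'GRANT ROLE %s TO GROUP %s;\n' "$role" "$group"
  printf 'GRANT SELECT ON DATABASE %s TO ROLE %s;\n' "$db" "$role"
}

# Usage, run as a Sentry admin:
#   sentry_policy_sql Role1 employee1 department > policy.sql
#   beeline -u jdbc:hive2://localhost:10000 -n sentryAdmin -f policy.sql
```

Keeping the statements in a file also leaves a record of which privileges were granted to which group.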

Now log in again as employee1 and run the show databases; command in Beeline. You should see the department database in the output.
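
For a scripted version of this final check, Beeline's output can be reduced to one database name per line and searched. This is a sketch, assuming the --outputformat=tsv2, --showHeader=false and --silent=true options are available in your Beeline version. has_database is a hypothetical helper.

```shell
# has_database NAME: read database names from stdin, one per line, and
# succeed only if NAME is listed exactly (hypothetical helper).
has_database() {
  grep -qxF -- "$1"
}

# Usage:
#   beeline -u jdbc:hive2://localhost:10000 -n employee1 \
#     --outputformat=tsv2 --showHeader=false --silent=true \
#     -e 'SHOW DATABASES;' | has_database department && echo "policy works"
```

To see the policy from the admin side, SHOW GRANT ROLE Role1; run as sentryAdmin should list the privileges attached to the role.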


Happy Coding....!!!!!
