HBase Secure Configuration

This article explains how to manually configure Kerberos for HBase.
Before following these steps, please complete the steps in Installing the MIT Kerberos 5 KDC, Configuring Kerberos for HDFS and YARN, and Zookeeper Secure Configuration.
This article refers to Pivotal HD Enterprise 2.0.1 <Stack and Tools Reference>, but the steps should be similar for other Apache Hadoop enterprise editions.

In this example, the HBase architecture is:
HBase Master: hdm
Region Servers: hdw1, hdw2, hdw3

1. Create the HBase Principals

For the HBase master and each region server host run:
kadmin.local: addprinc -randkey hbase/host_fqdn@REALM
where host_fqdn is the fully qualified domain name of the host running the service (master or region server).
eg:
kadmin.local: addprinc -randkey hbase/hdm.xxx.com@OPENKBINFO.COM
kadmin.local: addprinc -randkey hbase/hdw1.xxx.com@OPENKBINFO.COM
kadmin.local: addprinc -randkey hbase/hdw2.xxx.com@OPENKBINFO.COM
kadmin.local: addprinc -randkey hbase/hdw3.xxx.com@OPENKBINFO.COM
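To confirm the principals were created, you can list them from kadmin.local (a quick optional check; append @OPENKBINFO.COM to the pattern if your kadmin requires the realm):
kadmin.local: listprincs hbase/*
All four hbase/host_fqdn@OPENKBINFO.COM principals created above should appear in the output.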

2. Create the HBase Keytab Files

For the HBase master and each region server host run:
kadmin.local: ktadd -norandkey -k /etc/security/phd/keytab/hbase-hostid.service.keytab hbase/host_fqdn@REALM
eg:
ktadd -norandkey -k /etc/security/phd/keytab/hbase-hdm.service.keytab  hbase/hdm.xxx.com@OPENKBINFO.COM
ktadd -norandkey -k /etc/security/phd/keytab/hbase-hdw1.service.keytab hbase/hdw1.xxx.com@OPENKBINFO.COM
ktadd -norandkey -k /etc/security/phd/keytab/hbase-hdw2.service.keytab hbase/hdw2.xxx.com@OPENKBINFO.COM
ktadd -norandkey -k /etc/security/phd/keytab/hbase-hdw3.service.keytab hbase/hdw3.xxx.com@OPENKBINFO.COM
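Optionally, verify that each keytab contains the expected principal and key versions with klist before distributing it (shown here for the master keytab; adjust the file name per host):
klist -kt /etc/security/phd/keytab/hbase-hdm.service.keytab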

3. Distribute the HBase Keytab Files

For each host:
Move the appropriate keytab file for each host to that host's /etc/security/phd/keytab directory, then run:
chown hbase:hadoop hbase-hostid.service.keytab
chmod 400 hbase-hostid.service.keytab
ln -s hbase-hostid.service.keytab hbase.service.keytab
eg:
cd /etc/security/phd/keytab
scp hbase-hdw1*.keytab hdw1:/etc/security/phd/keytab/
scp hbase-hdw2*.keytab hdw2:/etc/security/phd/keytab/
scp hbase-hdw3*.keytab hdw3:/etc/security/phd/keytab/
scp hbase-hdm*.keytab hdm:/etc/security/phd/keytab/
massh ~/hostfile_all verbose "chown hbase:hadoop /etc/security/phd/keytab/hbase*.keytab"
massh ~/hostfile_all verbose "chmod 400 /etc/security/phd/keytab/hbase*.keytab"
massh ~/hostfile_all verbose "cd /etc/security/phd/keytab/; if ls hbase*.service.keytab &> /dev/null; then ln -s hbase*.service.keytab hbase.service.keytab ; fi"
massh ~/hostfile_all verbose "ls -altr /etc/security/phd/keytab/hbase*.keytab"

4. Edit the HBase Site XML

On the master and each region server host, add the following to /etc/gphd/hbase/conf/hbase-site.xml:
<property>
 <name>hbase.security.authentication</name>
 <value>kerberos</value>
</property>
 
<property>
 <name>hbase.security.authorization</name>
 <value>true</value>
</property>
 
<property>
 <name>hbase.coprocessor.region.classes</name>
 <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
 
<!-- HBase secure region server configuration -->
<property>
 <name>hbase.regionserver.kerberos.principal</name>
 <value>hbase/_HOST@OPENKBINFO.COM</value>
</property>
 
<property>
 <name>hbase.regionserver.keytab.file</name>
 <value>/etc/security/phd/keytab/hbase.service.keytab</value>
</property>
 
<!-- HBase secure master configuration -->
<property>
 <name>hbase.master.kerberos.principal</name>
 <value>hbase/_HOST@OPENKBINFO.COM</value>
</property>
 
<property>
 <name>hbase.master.keytab.file</name>
 <value>/etc/security/phd/keytab/hbase.service.keytab</value>
</property>
Copy to the other HBase hosts:
scp /etc/gphd/hbase/conf/hbase-site.xml hdw1:/etc/gphd/hbase/conf/hbase-site.xml
scp /etc/gphd/hbase/conf/hbase-site.xml hdw2:/etc/gphd/hbase/conf/hbase-site.xml
scp /etc/gphd/hbase/conf/hbase-site.xml hdw3:/etc/gphd/hbase/conf/hbase-site.xml
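If you want to confirm the key property landed on a host, one option is a quick xmllint query over the copied file (a sketch; assumes xmllint from libxml2 is installed, a simple grep works too). It should print "kerberos":
xmllint --xpath "string(//property[name='hbase.security.authentication']/value)" /etc/gphd/hbase/conf/hbase-site.xml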

5. Test HBase Start-Up

You can now test HBase start-up. Start the cluster services and check that the HBase Master and Regionservers start properly.
If they do not, look at the .log file in the /var/log/gphd/hbase/ directory for hints as to why.
Make sure HDFS came up properly.
As you fix issues, you can run the commands below to check that they are resolved.
/etc/init.d/hbase-master start
/etc/init.d/hbase-regionserver start
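Once the daemons are started, a quick way to confirm they are actually running on a given host is jps; HMaster should appear on hdm and HRegionServer on hdw1, hdw2 and hdw3:
jps | grep -E 'HMaster|HRegionServer'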

6. HBase with Secure Zookeeper Configuration

For secure HBase you should also run a secure Zookeeper. If you do so, you will need to execute the steps in this section. These steps must be done on the HBase master and all region servers.

6.1 Create a file /etc/gphd/hbase/conf/jaas.conf with the following content:

Client {
 com.sun.security.auth.module.Krb5LoginModule required
 useKeyTab=true
 useTicketCache=false
 keyTab="/etc/security/phd/keytab/hbase.service.keytab"
 principal="hbase/host_fqdn@REALM";
};
Important: Make sure to replace host_fqdn@REALM with the host_fqdn of the server and the correct REALM.
eg:
Client {
 com.sun.security.auth.module.Krb5LoginModule required
 useKeyTab=true
 useTicketCache=false
 keyTab="/etc/security/phd/keytab/hbase.service.keytab"
 principal="hbase/hdm.xxx.com@OPENKBINFO.COM";
};
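Because the principal differs per host, one way to avoid editing the file by hand on every node is a small loop that writes a host-specific jaas.conf over ssh. This is only a sketch; it assumes passwordless root ssh from the admin host and the xxx.com domain used in this example:
# write a per-host jaas.conf on the master and each region server
for h in hdm hdw1 hdw2 hdw3; do
  ssh $h "cat > /etc/gphd/hbase/conf/jaas.conf" <<EOF
Client {
 com.sun.security.auth.module.Krb5LoginModule required
 useKeyTab=true
 useTicketCache=false
 keyTab="/etc/security/phd/keytab/hbase.service.keytab"
 principal="hbase/$h.xxx.com@OPENKBINFO.COM";
};
EOF
done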

6.2 Add the following near the bottom of /etc/gphd/hbase/conf/hbase-env.sh

export HBASE_OPTS="$HBASE_OPTS -Djava.security.auth.login.config=/etc/gphd/hbase/conf/jaas.conf"
export HBASE_MANAGES_ZK=false
Copy to other hosts:
scp /etc/gphd/hbase/conf/hbase-env.sh hdw1:/etc/gphd/hbase/conf/hbase-env.sh
scp /etc/gphd/hbase/conf/hbase-env.sh hdw2:/etc/gphd/hbase/conf/hbase-env.sh
scp /etc/gphd/hbase/conf/hbase-env.sh hdw3:/etc/gphd/hbase/conf/hbase-env.sh
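To confirm the setting reached every host, you can grep for it with massh (reusing the ~/hostfile_all file from the keytab distribution step):
massh ~/hostfile_all verbose "grep java.security.auth.login.config /etc/gphd/hbase/conf/hbase-env.sh"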

6.3 Edit /etc/gphd/hbase/conf/hbase-site.xml and add the entries below.

(If they exist, then skip this step.)
<property>
 <name>hbase.zookeeper.quorum</name>
 <value>comma-separated-list-of-zookeeper-hosts</value>
</property>
 
<property>
 <name>hbase.cluster.distributed</name>
 <value>true</value>
</property>
eg:
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdm.xxx.com,hdw1.xxx.com,hdw3.xxx.com</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
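Optionally, you can check that this host can reach and authenticate to the ZooKeeper quorum with the zkcli wrapper bundled with HBase (a sketch; it should pick up the HBASE_OPTS JAAS setting from hbase-env.sh and authenticate via SASL). Listing the HBase znode should succeed:
hbase zkcli ls /hbase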

6.4 Edit /etc/gphd/zookeeper/conf/zoo.cfg and add the entries below.

kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true
Copy to the other hosts:
scp /etc/gphd/zookeeper/conf/zoo.cfg hdw1:/etc/gphd/zookeeper/conf/zoo.cfg
scp /etc/gphd/zookeeper/conf/zoo.cfg hdw2:/etc/gphd/zookeeper/conf/zoo.cfg
scp /etc/gphd/zookeeper/conf/zoo.cfg hdw3:/etc/gphd/zookeeper/conf/zoo.cfg

7. Restart Cluster

icm_client stop -l <Cluster Name>
icm_client start -l <Cluster Name>
You should see logs similar to the following from the HBase Master or Region Server:
INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hdw1.xxx.com:2181,hdm.xxx.com:2181,hdw3.xxx.com:2181 sessionTimeout=180000 watcher=master:60000
INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 362761@hdm
INFO org.apache.zookeeper.Login: successfully logged in.
INFO org.apache.zookeeper.Login: TGT refresh thread started.
INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hdm.xxx.com/192.168.192.101:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hdm.xxx.com/192.168.192.101:2181, initiating session
INFO org.apache.zookeeper.Login: TGT valid starting at:        Thu xxx xx 11:15:41 PDT 2014
INFO org.apache.zookeeper.Login: TGT expires:                  Fri xxx xx 11:15:41 PDT 2014
INFO org.apache.zookeeper.Login: TGT refresh sleeping until: Fri xxx xx 06:52:49 PDT 2014
INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hdm.xxx.com/192.168.192.101:2181, sessionid = 0x246914a4f9e0000, negotiated timeout = 40000
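Once the cluster is back up, a simple functional check is the status command in the HBase shell. Run it as a user holding a valid Kerberos ticket (kinit first):
echo "status" | hbase shell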
