Configure Apache Phoenix in CDH 5.4

Apache Phoenix is an open source, relational database layer on top of NoSQL stores such as Apache HBase. Phoenix provides a JDBC driver that hides the intricacies of the NoSQL store, enabling users to create, delete, and alter SQL tables, views, indexes, and sequences; upsert and delete rows singly and in bulk; and query data through SQL.
Installation:
The following steps configure Apache Phoenix in Cloudera Distribution for Hadoop (CDH) using Cloudera Manager:
1. Log in to Cloudera Manager, click Hosts, then Parcels.
2. Select Edit Settings.
3. Click the + sign next to an existing Remote Parcel Repository URL, add the URL http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.0/, and click Save Changes.
4. Select Hosts, then Parcels.
5. In the list of Parcel Names, CLABS_PHOENIX is now available. Select it and choose Download.
6. The first cluster is selected by default; to distribute to a different cluster, select it first. Find CLABS_PHOENIX in the list and click Distribute.
7. If you want to use secondary indexing, add the required properties to the hbase-site.xml advanced configuration snippet. Go to the HBase service, click Configuration, and search for HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml. Paste in the XML shown after this list, then save the changes.
8. Restart the HBase service.
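As referenced in step 7, the property Phoenix secondary indexing requires in hbase-site.xml is the indexed WAL edit codec. A minimal snippet, based on the upstream Phoenix documentation (check the Cloudera Labs release notes for your parcel version in case additional properties are needed), looks like this:

<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>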
Using Apache Phoenix Utilities:
Several command-line utilities for Apache Phoenix are installed into /usr/bin.
Prerequisites:
Before using the Phoenix utilities, set the JAVA_HOME environment variable in your terminal session, and ensure that the java executable is in your path. Adjust the following commands to your operating system’s configuration.
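On a Linux host where the JDK is installed under /usr/java/jdk1.7.0_67-cloudera (an assumed path; substitute the location of your own JDK), that looks like:

export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export PATH=$JAVA_HOME/bin:$PATH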
phoenix-sqlline.py
An interactive command-line interface for executing SQL. It takes a single argument: the ZooKeeper quorum of the corresponding HBase cluster.
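For example, assuming a ZooKeeper quorum reachable at zookeeper1.example.com on the default port 2181 (a placeholder; substitute one or more hosts from your own quorum), you can start the shell and then issue standard Phoenix SQL at the prompt:

phoenix-sqlline.py zookeeper1.example.com:2181

-- run at the sqlline prompt
CREATE TABLE test (id BIGINT NOT NULL PRIMARY KEY, name VARCHAR);
UPSERT INTO test VALUES (1, 'Phoenix');
SELECT * FROM test;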
phoenix-psql.py
A command-line interface to load CSV data or execute SQL scripts. It takes two arguments: the ZooKeeper quorum and the CSV or SQL file to process.
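For example, with the same placeholder quorum, and with create_table.sql and data.csv standing in for your own script and data file (both file names are hypothetical), the utility is run once per file:

phoenix-psql.py zookeeper1.example.com:2181 create_table.sql
phoenix-psql.py zookeeper1.example.com:2181 data.csv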
phoenix-performance
A command-line interface that creates a given number of rows and runs timed queries against the data. It takes two arguments: the ZooKeeper quorum and the number of rows to create.
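For example, to generate and query one million rows against the same placeholder quorum:

phoenix-performance zookeeper1.example.com:2181 1000000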
