How to use Drill to parse ResourceManager REST API results
Goal:
The ResourceManager REST API exposes detailed information about YARN applications, YARN cluster metrics, and more. This article gives a simple demo of how to use Drill to query the output of these REST APIs.
One example use case is to show the largest YARN applications which are currently running.
Env:
Drill 1.8
Hadoop 2.7.0
Solution:
1. Save the current output of the RM REST API as a JSON file on MFS (or HDFS).
    curl -v -X GET -H "Content-Type: application/json" http://s1.poc.com:8088/ws/v1/cluster/apps > /mapr/mysuper.cluster.com/tmp/restapi/data.json
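Before pointing Drill at the saved file, it can help to confirm that it has the shape the query relies on: a top-level `apps` object containing an `app` array. The following is only a minimal sketch in Python. The helper name `extract_apps` is hypothetical, and the handling of an empty cluster (where `apps` may come back as null) is an assumption rather than something this article verified.

```python
import json


def extract_apps(payload_text):
    """Return the list of application records from a saved
    /ws/v1/cluster/apps response (hypothetical helper)."""
    payload = json.loads(payload_text)
    # Assumption: "apps" may be null when no applications exist,
    # so fall back to an empty dict before looking up "app".
    apps = payload.get("apps") or {}
    return apps.get("app", [])


# Illustrative payload only; a real response carries many more fields.
sample = '{"apps": {"app": [{"id": "application_1475192050844_0003", "state": "RUNNING"}]}}'
print(len(extract_apps(sample)))
```

In practice the text passed to `extract_apps` would be the contents of the `data.json` file written by the curl command above.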
2. Use Drill to parse the JSON file and answer the questions a cluster admin wants to ask.
For example, to show the largest YARN applications which are currently running:
    with tmp as
    (
      select flatten(t.apps.app) as col from dfs.tmp.`restapi/data.json` t
    )
    select tmp.col.id, tmp.col.`user` as `user`, tmp.col.runningContainers as `runningContainers`, tmp.col.allocatedMB as `allocatedMB`, tmp.col.allocatedVCores as `allocatedVCores`
    from tmp
    where tmp.col.state = 'RUNNING'
    order by tmp.col.runningContainers desc;
    +---------------------------------+-------+--------------------+--------------+------------------+
    |             EXPR$0              | user  | runningContainers  | allocatedMB  | allocatedVCores  |
    +---------------------------------+-------+--------------------+--------------+------------------+
    | application_1475192050844_0003  | mapr  | 4                  | 16384        | 4                |
    | application_1475192050844_0004  | mapr  | 1                  | 2048         | 1                |
    +---------------------------------+-------+--------------------+--------------+------------------+
    2 rows selected (0.525 seconds)
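To make the logic of the Drill query easier to follow, or to cross-check its result outside Drill, the same three steps (flatten `apps.app`, keep only `RUNNING` applications, sort by `runningContainers` descending) can be sketched in plain Python. This is an illustration, not part of the original procedure. The function name `largest_running_apps` is hypothetical, the two `RUNNING` records reuse the values from the output above, and the `FINISHED` record is invented sample data.

```python
def largest_running_apps(payload):
    """Mimic the Drill query: flatten apps.app, filter on state,
    order by runningContainers descending (hypothetical helper)."""
    apps = (payload.get("apps") or {}).get("app", [])
    running = [a for a in apps if a.get("state") == "RUNNING"]
    running.sort(key=lambda a: a.get("runningContainers", 0), reverse=True)
    return [
        (a["id"], a["user"], a["runningContainers"],
         a["allocatedMB"], a["allocatedVCores"])
        for a in running
    ]


# Sample payload: the first two records mirror the query output above;
# the FINISHED record and its values are made up for illustration.
payload = {"apps": {"app": [
    {"id": "application_1475192050844_0004", "user": "mapr", "state": "RUNNING",
     "runningContainers": 1, "allocatedMB": 2048, "allocatedVCores": 1},
    {"id": "application_1475192050844_0003", "user": "mapr", "state": "RUNNING",
     "runningContainers": 4, "allocatedMB": 16384, "allocatedVCores": 4},
    {"id": "application_1475192050844_0001", "user": "mapr", "state": "FINISHED",
     "runningContainers": 0, "allocatedMB": 0, "allocatedVCores": 0},
]}}

for row in largest_running_apps(payload):
    print(row)
```

The Drill version remains the better fit when the file is large or lives on MFS/HDFS, since the query runs where the data is stored.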