Manage files in HDFS using WebHDFS REST APIs
Web services have become indispensable in today's application development for exchanging data between applications and web applications. Various application programming interfaces (APIs) are emerging to expose web services, and Representational State Transfer (REST), the style used by browsers, is a logical choice for building them.
Here I share my understanding of and experience with the WebHDFS REST API.
What is WebHDFS?
- Hadoop provides a native Java API to support file system operations such as creating, renaming or deleting files and directories, opening, reading or writing files, setting permissions, and so on.
- This is perfectly useful for applications running within the Hadoop cluster. But when an external application needs to exchange data or files with HDFS, for example to create directories and write files to them, or to read the content of a file stored on HDFS, a different kind of API is required.
- Hortonworks developed an additional API to support these requirements, based on standard REST functionality.
- The WebHDFS concept is based on HTTP operations like GET, PUT, POST and DELETE.
- Authentication can be based on the user.name query parameter (as part of the HTTP query string); if security is turned on, it relies on Kerberos.
- Not much configuration is needed to enable WebHDFS; we only have to add the property shown below to hdfs-site.xml.
- Regular operations like creating a directory, listing directories, opening a directory or file, and deleting a directory or file are straightforward.
- We just have to give the appropriate operator for op=<operation_type> in the WebHDFS URL, as illustrated below.
- Creating/uploading a file to HDFS is a little more involved, because the CREATE operation is a two-step exchange: the NameNode first answers with a redirect to a DataNode, and the file data is then sent to that DataNode. So let us see how to upload a file into HDFS using the WebHDFS REST API.
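For reference, the property in question is the standard dfs.webhdfs.enabled flag; a minimal hdfs-site.xml entry looks like this:

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

The general URL pattern for WebHDFS operations is

http://<namenode_host>:<http_port>/webhdfs/v1/<hdfs_path>?op=<operation_type>&user.name=<user>

where the port is the NameNode HTTP port (50070 by default on Hadoop 2.x). As a purely illustrative example, with a placeholder host name and user, listing a directory could look like:

curl -i "http://namenode.example.com:50070/webhdfs/v1/test1?op=LISTSTATUS&user.name=hdfs"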
The Jersey RESTful Web Services framework is an open source framework for developing RESTful Web Services in Java; it provides support for the JAX-RS APIs and serves as a JAX-RS reference implementation.
The stack I used here is:
- Java version 1.7
- Jersey version 1.16
- Hadoop version 2.6.0
- Jetty version 9.3.11
Step 1:
• Create a Java web project. It should be a Jersey REST application.
Step 2:
• Write a controller with a createFile()/uploadFile() method (a sketch is given below).
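A minimal sketch of such a controller is shown below, assuming Jersey's multipart support (jersey-multipart) is on the classpath. The class name and the response handling are my assumptions, not the author's original code; the file_name header and the file form field match the test inputs used later in this post.

import java.io.InputStream;

import javax.ws.rs.Consumes;
import javax.ws.rs.HeaderParam;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import com.sun.jersey.multipart.FormDataParam;

// Sketch of a Jersey 1.x resource class; the class name is hypothetical.
@Path("/work")
public class WebHDFSFileUploadController {

    private final WebHDFSFileUploadService service = new WebHDFSFileUploadService();

    @POST
    @Path("/createFile")
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    public Response createFile(@HeaderParam("file_name") String fileName,     // target HDFS path/name
                               @FormDataParam("file") InputStream fileStream) // file chosen in the REST console
            throws Exception {
        service.uploadToHdfs(fileStream, fileName);
        return Response.ok("File uploaded to HDFS as " + fileName).build();
    }
}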
Step 3:
• Write a service method to upload the file to HDFS using HttpURLConnection (a sketch is given below).
• WebHDFSFileUploadService.java
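A sketch of such a service is below. WebHDFS CREATE is a two-step exchange: a PUT to the NameNode with no data returns a 307 redirect whose Location header points at a DataNode, and the file bytes are then PUT to that DataNode URL. The NameNode address, the user name, and the method name uploadToHdfs (also used in the controller sketch above) are my assumptions.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHDFSFileUploadService {

    private static final String WEBHDFS_BASE =
            "http://namenode.example.com:50070/webhdfs/v1/"; // assumed NameNode HTTP address
    private static final String USER = "hdfs";               // assumed HDFS user

    public void uploadToHdfs(InputStream fileStream, String fileName) throws Exception {
        // Step 1: ask the NameNode where to write. WebHDFS answers with a 307 redirect
        // whose Location header points at a DataNode.
        URL createUrl = new URL(WEBHDFS_BASE + fileName
                + "?op=CREATE&overwrite=true&user.name=" + USER);
        HttpURLConnection nameNodeConn = (HttpURLConnection) createUrl.openConnection();
        nameNodeConn.setRequestMethod("PUT");
        nameNodeConn.setInstanceFollowRedirects(false); // we need the redirect URL ourselves
        nameNodeConn.connect();
        String dataNodeLocation = nameNodeConn.getHeaderField("Location");
        nameNodeConn.disconnect();

        // Step 2: stream the file bytes to the DataNode URL returned above.
        HttpURLConnection dataNodeConn =
                (HttpURLConnection) new URL(dataNodeLocation).openConnection();
        dataNodeConn.setRequestMethod("PUT");
        dataNodeConn.setDoOutput(true);
        dataNodeConn.setRequestProperty("Content-Type", "application/octet-stream");
        try (OutputStream out = dataNodeConn.getOutputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = fileStream.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        int responseCode = dataNodeConn.getResponseCode(); // 201 Created on success
        dataNodeConn.disconnect();
        if (responseCode != HttpURLConnection.HTTP_CREATED) {
            throw new IllegalStateException("WebHDFS upload failed, HTTP " + responseCode);
        }
    }
}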
Step 4:
Test your application
• Make a WAR file of your application and deploy it to the Jetty server.
• Open any REST console and provide the URL below:
  o http://<host>:<port>/work/createFile
  o Inputs:
    Header param: file_name // the uploaded file will be saved with this name
    Choose a file from the REST console
- Example: the input I gave here is:
- file_name : test1/files/sample-test.txt
- file : test-file.txt (selecting this file from the local directory)
- I have chosen the file test-file.txt from the local directory. This file will be saved to HDFS at the given path: /test1/files/sample-test.txt
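For reference, a roughly equivalent request from the command line might look like the following, assuming the controller accepts multipart form data as in the sketch above (host and port are placeholders):

curl -i -X POST \
     -H "file_name: test1/files/sample-test.txt" \
     -F "file=@test-file.txt" \
     "http://<host>:<port>/work/createFile"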
- Then check whether the file has been uploaded to the given path, using a Hadoop fs command.
- Log in to PuTTY and give the command:
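A listing along these lines, with the path taken from the example above, confirms it:

hadoop fs -ls /test1/files/

You will see the sample-test.txt file in the given path.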
• To view the content of the file:
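One way is hadoop fs -cat with the same path:

hadoop fs -cat /test1/files/sample-test.txt

It will display the file content on the screen.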
Thus, uploading a file to HDFS using the WebHDFS API from a Java Jersey application is successful. Here I used HttpURLConnection to upload my files to HDFS.