Overview

Configuration is provided for establishing connections with the Hadoop WebHDFS service. The configuration is used in the REST Connection Manager.
Setup
Press the icon to get more information about the connection parameters.
Obtain data
Use the REST Source component to get data from a service resource.
Insert data
Use the REST Destination component to insert data into a service resource.
Manage remote files and directories
Use the File Transfer Task to manage remote files and directories.
Quick Start
In this section, we will show you step by step how to create a connection to the Hadoop WebHDFS service using COZYROC's REST Connection Manager.
Bravo! You have learned how to authenticate to Hadoop WebHDFS.
In MIT-Kerberos scenarios, you need to use a delegation token.
Step 4. Refreshing a token is done with a simple command, which must also be executed from an authenticated and authorized machine. This can be done easily with the COZYROC PowerShell Task.
curl --negotiate -X PUT "http://[HadoopHost]:[NameNodePort]/webhdfs/v1/?op=RENEWDELEGATIONTOKEN&token=[Token]"
NOTE: Only the lifetime is extended; the token value remains the same.
Bravo! You have learned how to use a delegation token to authenticate to Hadoop WebHDFS.
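As a rough illustration of the renewal call, the sketch below builds the same PUT request in Python. The host name, port and token value are placeholders, and the SPNEGO negotiation that curl performs with --negotiate is not covered, so this only shows the shape of the request rather than a working Kerberos client.

```python
from urllib.parse import urlencode
from urllib.request import Request


def renew_token_request(host: str, port: int, token: str) -> Request:
    """Build (but do not send) a RENEWDELEGATIONTOKEN request."""
    query = urlencode({"op": "RENEWDELEGATIONTOKEN", "token": token})
    url = f"http://{host}:{port}/webhdfs/v1/?{query}"
    return Request(url, method="PUT")


# Placeholder values for illustration only.
request = renew_token_request("namenode.example.com", 50070, "abc123")
print(request.get_method(), request.full_url)
```

The request object is only constructed here; sending it would still require Kerberos credentials on the calling machine.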
Configuration
Base URL address: http://{HOST}:{PORT}/webhdfs/v1.
- Basic
This method uses parameters-based authentication.
The authentication has the following user-defined parameters:
- permission: Specify permission query string. Optional.
- username: Specify user.name query string. Optional.
- doas: Specify proxy user query string. Optional.
The following request parameters will be automatically processed during the authentication process:
- permission: {{=connection.user.permission}}
- user.name: {{=connection.user.username}}
- doas: {{=connection.user.doas}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#Authentication.
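To make the parameter mapping concrete, here is a minimal Python sketch of how the three optional values could end up in a request URL. It assumes that unset optional parameters are simply left out of the query string; how the connection manager treats empty values is not stated here. The function names, host and port are hypothetical.

```python
from urllib.parse import urlencode


def basic_auth_params(username=None, doas=None, permission=None) -> dict:
    """Map the user-defined parameters to query-string names, skipping unset ones."""
    candidates = {"permission": permission, "user.name": username, "doas": doas}
    return {name: value for name, value in candidates.items() if value}


def webhdfs_url(host: str, port: int, path: str, op: str, auth: dict) -> str:
    """Append the operation and authentication parameters to the base URL."""
    query = urlencode({"op": op, **auth})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"


# Placeholder host, port and user for illustration only.
url = webhdfs_url("namenode.example.com", 50070, "/tmp",
                  "GETCONTENTSUMMARY", basic_auth_params(username="hdfs"))
print(url)
```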
- Delegation token
This method uses parameters-based authentication.
The authentication has the following user-defined parameters:
- delegation: Required. Specify Kerberos delegation token.
- permission: Specify permission query string. Optional.
- username: Specify user.name query string. Optional.
- doas: Specify proxy user query string. Optional.
The following request parameters will be automatically processed during the authentication process:
- delegation: {{=connection.user.delegation}}
- permission: {{=connection.user.permission}}
- user.name: {{=connection.user.username}}
- doas: {{=connection.user.doas}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#Authentication.
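The difference from the Basic method is the required delegation value. A small Python sketch of that rule follows; the function name is hypothetical, and dropping unset optional values is an assumption rather than documented behaviour.

```python
def delegation_auth_params(delegation, username=None, doas=None, permission=None) -> dict:
    """Return query parameters for the Delegation token method.

    The delegation value is required; the other three are optional.
    """
    if not delegation:
        raise ValueError("a delegation token is required for this method")
    params = {"delegation": delegation}
    optional = {"permission": permission, "user.name": username, "doas": doas}
    # Assumption: optional values that are not set are omitted entirely.
    params.update({name: value for name, value in optional.items() if value})
    return params


print(delegation_auth_params("placeholder-token", doas="etl"))
```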
Based on resource template BaseWithPath.
- [Read] action
The result is extracted from: {{=[response.ContentSummary]}}.
The following request parameters will be automatically processed:
- op: GETCONTENTSUMMARY
Documentation: https://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/fs/FileSystem.html#getContentSummary(org.apache.hadoop.fs.Path).
Fields:
- directoryCount
Template: Long.
- fileCount
Template: Long.
- length
Template: Long.
- quota
Template: Long.
- spaceConsumed
Template: Long.
- spaceQuota
Template: Long.
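For orientation, the extraction described above can be sketched in Python: the response body is JSON with a ContentSummary object, and the six fields are read from it as integers. The sample numbers below are made up for illustration and the function name is hypothetical.

```python
import json

# Illustrative response body; the values are invented.
SAMPLE_BODY = json.dumps({
    "ContentSummary": {
        "directoryCount": 2,
        "fileCount": 1,
        "length": 24930,
        "quota": -1,
        "spaceConsumed": 24930,
        "spaceQuota": -1,
    }
})

FIELDS = ("directoryCount", "fileCount", "length",
          "quota", "spaceConsumed", "spaceQuota")


def extract_content_summary(body: str) -> dict:
    """Pull the ContentSummary object out of the response and keep the listed fields."""
    summary = json.loads(body)["ContentSummary"]
    return {name: int(summary[name]) for name in FIELDS}


print(extract_content_summary(SAMPLE_BODY))
```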
Based on resource template BaseWithPath.
- [Read] action
The result is extracted from: {{=[response.FileChecksum]}}.
The following request parameters will be automatically processed:
- op: GETFILECHECKSUM
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#GETFILECHECKSUM.
Fields:
- algorithm
Template: ShortText.
- bytes
Template: LongText.
- length
Template: Int4.
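One plausible use of these fields is comparing two files without downloading them. The Python sketch below reads the FileChecksum object from a response body and compares the algorithm and bytes values. The sample body is invented for illustration and both function names are hypothetical.

```python
import json

# Illustrative response body; the bytes value is a made-up hex string.
SAMPLE_BODY = json.dumps({
    "FileChecksum": {
        "algorithm": "MD5-of-1MD5-of-512CRC32",
        "bytes": "ab" * 28,
        "length": 28,
    }
})


def extract_file_checksum(body: str) -> dict:
    """Pull the FileChecksum object out of the response."""
    checksum = json.loads(body)["FileChecksum"]
    return {
        "algorithm": str(checksum["algorithm"]),
        "bytes": str(checksum["bytes"]),
        "length": int(checksum["length"]),
    }


def same_checksum(left: dict, right: dict) -> bool:
    """Two checksums are comparable only when the algorithm matches as well."""
    return (left["algorithm"], left["bytes"]) == (right["algorithm"], right["bytes"])


first = extract_file_checksum(SAMPLE_BODY)
print(same_checksum(first, first))
```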
Based on resource template Base.
- [Read] action
Endpoint URL address: /.
The result is extracted from: {{=[response]}}.
The following request parameters will be automatically processed:
- op: GETHOMEDIRECTORY
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#GETHOMEDIRECTORY.
Fields:
- Path
Template: ShortText.
Based on resource template Base.
- [Create] action
Endpoint URL address: {{=item.path}}.
The following request parameters will be automatically processed:
- op: SETPERMISSION
- permission: {{=item.permission}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#SETPERMISSION.
Fields:
- path
Template: ShortText.
- permission
Template: Int4.
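Because the permission field is an integer while WebHDFS reads the value as octal digits, a quick check before sending can catch values such as 689. The following Python sketch shows one way to do that; the function name and base URL are hypothetical, and the four-digit limit is an assumption, so check the linked documentation for the exact range your Hadoop version accepts.

```python
from urllib.parse import quote, urlencode


def set_permission_url(base_url: str, path: str, permission: int) -> str:
    """Build a SETPERMISSION URL, rejecting values that are not octal digits."""
    digits = str(permission)
    # Assumption: at most four octal digits; the accepted range may be narrower.
    if not (1 <= len(digits) <= 4 and all(d in "01234567" for d in digits)):
        raise ValueError(f"not an octal permission: {permission!r}")
    query = urlencode({"op": "SETPERMISSION", "permission": digits})
    return f"{base_url}{quote(path)}?{query}"


# Placeholder base URL and path for illustration only.
print(set_permission_url("http://namenode.example.com:50070/webhdfs/v1",
                         "/data/report.csv", 644))
```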
Based on resource template Base.
- [Create] action
Endpoint URL address: {{=item.path}}.
The following request parameters will be automatically processed:
- op: SETOWNER
- owner: {{=item.owner}}
- group: {{=item.group}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#SETOWNER.
Fields:
- path
Template: ShortText.
- owner
Template: ShortText.
- group
Template: ShortText.
Based on resource template Base.
- [Create] action
Endpoint URL address: {{=item.path}}.
The following request parameters will be automatically processed:
- op: SETREPLICATION
- replication: {{=item.replication}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#SETREPLICATION.
Fields:
- path
Template: ShortText.
- replication
Template: Int4.
Based on resource template Base.
- [Create] action
Endpoint URL address: {{=item.path}}.
The following request parameters will be automatically processed:
- op: SETTIMES
- modificationtime: {{=item.modificationtime}}
- accesstime: {{=item.accesstime}}
Documentation: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#SETTIMES.
Fields:
- path
Template: ShortText.
- modificationtime
Template: ShortText.
- accesstime
Template: ShortText.
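The two time fields are text, so the values have to be prepared as strings before they reach the request. The Python sketch below assumes the service expects whole milliseconds since the Unix epoch, as in WebHDFS file status responses, and that a time left unset is omitted from the query. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def set_times_query(modification=None, access=None) -> str:
    """Build the SETTIMES query string; unset times are left out (assumption)."""
    params = {"op": "SETTIMES"}
    if modification is not None:
        params["modificationtime"] = str(to_epoch_millis(modification))
    if access is not None:
        params["accesstime"] = str(to_epoch_millis(access))
    return urlencode(params)


print(set_times_query(modification=datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

Integer division by a timedelta is used instead of multiplying a float timestamp, which avoids rounding surprises for large values.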
- [Read] action
The following request parameters will be automatically processed:
- _includeUserParameters: {{=parameters}}
- [Create] action
The action uses the PUT method.
The following request parameters will be automatically processed:
- _includeUserParameters: {{=parameters}}
- [Delete] action
The action uses the PUT method.
The following request parameters will be automatically processed:
- _includeUserParameters: {{=parameters}}
- [Read] action
Endpoint URL address: {{=parameters.Path}}.
The action has the following user-defined parameters:
- Path: Required. The directory path.
- ShortText
Data type: DT_WSTR (length 250)
- LongText
Data type: DT_WSTR (length 1000)
- DateTime
Data type: DT_DBTIMESTAMP
- Date
Data type: DT_DBDATE
- Int4
Data type: DT_I4
- Long
Data type: DT_I8
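When preparing rows for a destination, it can help to check values against these templates before they reach the data flow. The Python sketch below encodes the six templates listed above and tests whether a value fits. The helper name is hypothetical, and the integer ranges follow the usual signed 4-byte and 8-byte limits for DT_I4 and DT_I8.

```python
from datetime import date

# Template name -> (SSIS data type, maximum length for text types).
TEMPLATES = {
    "ShortText": ("DT_WSTR", 250),
    "LongText": ("DT_WSTR", 1000),
    "DateTime": ("DT_DBTIMESTAMP", None),
    "Date": ("DT_DBDATE", None),
    "Int4": ("DT_I4", None),
    "Long": ("DT_I8", None),
}

INT_RANGES = {
    "DT_I4": (-2**31, 2**31 - 1),
    "DT_I8": (-2**63, 2**63 - 1),
}


def fits_template(template: str, value) -> bool:
    """Return True when the value fits the data type behind the template."""
    data_type, max_length = TEMPLATES[template]
    if data_type == "DT_WSTR":
        return isinstance(value, str) and len(value) <= max_length
    if data_type in INT_RANGES:
        low, high = INT_RANGES[data_type]
        return isinstance(value, int) and low <= value <= high
    # Remaining templates are the date types; datetime is a subclass of date.
    return isinstance(value, date)


print(fits_template("Int4", 2**31), fits_template("Long", 2**31))
```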
What's New
- New: Introduced connection.




