From eab52dfb35d1e876b69cf127ccf6cc07523ddf0b Mon Sep 17 00:00:00 2001
From: Xiaoyu Yao
Date: Fri, 26 Feb 2016 14:14:12 -0800
Subject: [PATCH] HDFS-9831. Document webhdfs retry configuration keys introduced by HDFS-5219/HDFS-5122. Contributed by Xiaobing Zhou.

---
 hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt        |  3 +
 .../src/main/resources/hdfs-default.xml            | 62 +++++++++++++++++++
 .../hadoop-hdfs/src/site/markdown/WebHDFS.md       | 18 ++++++
 3 files changed, 83 insertions(+)

diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index 104b46ddc3..0f1c45dbcf 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -1979,6 +1979,9 @@ Release 2.8.0 - UNRELEASED
     HDFS-9843. Document distcp options required for copying between encrypted locations. (Xiaoyu Yao via cnauroth)
+    HDFS-9831. Document webhdfs retry configuration keys introduced by
+    HDFS-5219/HDFS-5122. (Xiaobing Zhou via xyao)
+
   OPTIMIZATIONS
     HDFS-8026. Trace FSOutputSummer#writeChecksumChunks rather than
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
index 35fe3319b5..b4fb2e0806 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
@@ -2870,4 +2870,66 @@
     Keytab based login can be enabled with dfs.balancer.keytab.enabled.
   </description>
 </property>
+
+<property>
+  <name>dfs.http.client.retry.policy.enabled</name>
+  <value>false</value>
+  <description>
+    If "true", enable the retry policy of WebHDFS client.
+    If "false", retry policy is turned off.
+    Enabling the retry policy can be quite useful while using WebHDFS to
+    copy large files between clusters that could time out, or
+    copy files between HA clusters that could fail over during the copy.
+  </description>
+</property>
+
+<property>
+  <name>dfs.http.client.retry.policy.spec</name>
+  <value>10000,6,60000,10</value>
+  <description>
+    Specify a policy of multiple linear random retry for WebHDFS client,
+    e.g. given pairs of number of retries and sleep time (n0, t0), (n1, t1),
+    ..., the first n0 retries sleep t0 milliseconds on average,
+    the following n1 retries sleep t1 milliseconds on average, and so on.
+  </description>
+</property>
+
+<property>
+  <name>dfs.http.client.failover.max.attempts</name>
+  <value>15</value>
+  <description>
+    Specify the max number of failover attempts for WebHDFS client
+    in case of network exception.
+  </description>
+</property>
+
+<property>
+  <name>dfs.http.client.retry.max.attempts</name>
+  <value>10</value>
+  <description>
+    Specify the max number of retry attempts for WebHDFS client.
+    If the difference between retried attempts and failed-over attempts is
+    larger than the max number of retry attempts, there will be no more
+    retries.
+  </description>
+</property>
+
+<property>
+  <name>dfs.http.client.failover.sleep.base.millis</name>
+  <value>500</value>
+  <description>
+    Specify the base amount of time in milliseconds upon which the
+    exponentially increased sleep time between retries or failovers
+    is calculated for WebHDFS client.
+  </description>
+</property>
+
+<property>
+  <name>dfs.http.client.failover.sleep.max.millis</name>
+  <value>15000</value>
+  <description>
+    Specify the upper bound of sleep time in milliseconds between
+    retries or failovers for WebHDFS client.
+  </description>
+</property>
+
 </configuration>
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
index 473ad27c81..2d3d3611a6 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
@@ -24,6 +24,7 @@ WebHDFS REST API
     * [Authentication](#Authentication)
     * [Proxy Users](#Proxy_Users)
     * [Cross-Site Request Forgery Prevention](#Cross-Site_Request_Forgery_Prevention)
+    * [WebHDFS Retry Policy](#WebHDFS_Retry_Policy)
 * [File and Directory Operations](#File_and_Directory_Operations)
     * [Create and Write to a File](#Create_and_Write_to_a_File)
     * [Append to a File](#Append_to_a_File)
@@ -299,6 +300,23 @@ custom header in the request.
         curl -i -L -X PUT -H 'X-XSRF-HEADER: ""' 'http://:/webhdfs/v1/?op=CREATE'
+WebHDFS Retry Policy
+-------------------------------------
+
+WebHDFS supports an optional, configurable retry policy for resilient copying of
+large files that could time out, or of files between HA clusters that could fail over during the copy.
+
+The following properties control WebHDFS retry and failover policy.
+
+| Property | Description | Default Value |
+|:---- |:---- |:---- |
+| `dfs.http.client.retry.policy.enabled` | If "true", enable the retry policy of WebHDFS client. If "false", retry policy is turned off. | `false` |
+| `dfs.http.client.retry.policy.spec` | Specify a policy of multiple linear random retry for WebHDFS client, e.g. given pairs of number of retries and sleep time (n0, t0), (n1, t1), ..., the first n0 retries sleep t0 milliseconds on average, the following n1 retries sleep t1 milliseconds on average, and so on. | `10000,6,60000,10` |
+| `dfs.http.client.failover.max.attempts` | Specify the max number of failover attempts for WebHDFS client in case of network exception. | `15` |
+| `dfs.http.client.retry.max.attempts` | Specify the max number of retry attempts for WebHDFS client. If the difference between retried attempts and failed-over attempts is larger than the max number of retry attempts, there will be no more retries. | `10` |
+| `dfs.http.client.failover.sleep.base.millis` | Specify the base amount of time in milliseconds upon which the exponentially increased sleep time between retries or failovers is calculated for WebHDFS client. | `500` |
+| `dfs.http.client.failover.sleep.max.millis` | Specify the upper bound of sleep time in milliseconds between retries or failovers for WebHDFS client. | `15000` |
+
 File and Directory Operations
 -----------------------------
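
Reviewer note: the `dfs.http.client.retry.policy.spec` description talks about (retries, sleep time) pairs, while the default value `10000,6,60000,10` only makes sense if each pair is written sleep time first and retry count second. The sketch below is one way to sanity-check a spec value under that assumption. It is not Hadoop's parser; the function names `parse_retry_spec` and `average_total_wait_ms` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def parse_retry_spec(spec: str) -> List[Tuple[int, int]]:
    """Split a spec string into (sleep_ms, retries) pairs.

    Assumption: each pair is written sleep time first, retry count second,
    which is how the default "10000,6,60000,10" appears to be laid out.
    """
    fields = [int(part.strip()) for part in spec.split(",")]
    if len(fields) % 2 != 0:
        raise ValueError("spec must contain an even number of fields")
    pairs = list(zip(fields[0::2], fields[1::2]))
    for sleep_ms, retries in pairs:
        if sleep_ms <= 0 or retries <= 0:
            raise ValueError("sleep times and retry counts must be positive")
    return pairs


def average_total_wait_ms(pairs: List[Tuple[int, int]]) -> int:
    """Sum the average sleep over all retries; actual sleeps are randomized."""
    return sum(sleep_ms * retries for sleep_ms, retries in pairs)


if __name__ == "__main__":
    default_pairs = parse_retry_spec("10000,6,60000,10")
    print(default_pairs)
    print(average_total_wait_ms(default_pairs))
```

Read this way, the default amounts to 6 retries at roughly 10 seconds each followed by 10 retries at roughly 60 seconds each, so a client could spend about 11 minutes retrying before it gives up.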