diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
index fd77edf754..740317f9b2 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
@@ -278,6 +278,7 @@ Usage:
           [-blockpools ]
           [-idleiterations ]
           [-runDuringUpgrade]
+          [-asService]
 
 | COMMAND\_OPTION | Description |
 |:---- |:---- |
@@ -289,6 +290,7 @@ Usage:
 | `-blockpools` \ | The balancer will only run on blockpools included in this list. |
 | `-idleiterations` \ | Maximum number of idle iterations before exit. This overwrites the default idleiterations(5). |
 | `-runDuringUpgrade` | Whether to run the balancer during an ongoing HDFS upgrade. This is usually not desired since it will not affect used space on over-utilized machines. |
+| `-asService` | Run Balancer as a long-running service. |
 | `-h`\|`--help` | Display the tool usage and help information and exit. |
 
 Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. See [Balancer](./HdfsUserGuide.html#Balancer) for more details.
@@ -297,6 +299,8 @@ Note that the `blockpool` policy is more strict than the `datanode` policy.
 
 Besides the above command options, a pinning feature is introduced starting from 2.7.0 to prevent certain replicas from getting moved by balancer/mover. This pinning feature is disabled by default, and can be enabled by configuration property "dfs.datanode.block-pinning.enabled". When enabled, this feature only affects blocks that are written to favored nodes specified in the create() call. This feature is useful when we want to maintain the data locality, for applications such as HBase regionserver.
 
+To run Balancer as a long-running service, start it in daemon mode with the `-asService` parameter: run `hdfs --daemon start balancer -asService`, or use the `sbin/start-balancer.sh` script with the `-asService` parameter.
+
 ### `cacheadmin`
 
 Usage:
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsUserGuide.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsUserGuide.md
index 6f707f64d0..54a8056068 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsUserGuide.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsUserGuide.md
@@ -242,6 +242,28 @@ HDFS data might not always be be placed uniformly across the DataNode. One commo
 
 Due to multiple competing considerations, data might not be uniformly placed across the DataNodes. HDFS provides a tool for administrators that analyzes block placement and rebalanaces data across the DataNode. A brief administrator's guide for balancer is available at [HADOOP-1652](https://issues.apache.org/jira/browse/HADOOP-1652).
 
+Balancer supports two modes: it can run as a tool or as a long-running service.
+
+* In tool mode, it balances the cluster on a best-effort basis and exits under any of the following conditions:
+
+    * All clusters are balanced.
+
+    * No bytes are moved for too many iterations (default is 5).
+
+    * No blocks can be moved.
+
+    * A cluster upgrade is in progress.
+
+    * Other errors.
+
+* In service mode, Balancer runs as a long-running daemon. It works as follows:
+
+    * In each round, it tries to balance the cluster until it succeeds or returns on error.
+
+    * The interval between rounds is set by `dfs.balancer.service.interval`.
+
+    * When it encounters an unexpected exception, it retries several times before stopping the service. The number of retries is set by `dfs.balancer.service.retries.on.exception`.
+
 For command usage, see [balancer](./HDFSCommands.html#balancer).
 
 Rack Awareness
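
Review note: the two service-mode properties named in the HdfsUserGuide.md hunk could be illustrated with a short configuration fragment. The sketch below assumes they are set in `hdfs-site.xml`. The values `5m` and `5` are placeholders chosen for illustration, not values stated in this patch, so the accepted value format and the actual defaults should be checked against `hdfs-default.xml`.

```xml
<!-- Hypothetical hdfs-site.xml fragment for Balancer service mode. -->
<!-- Property names come from the patch; the values are illustrative assumptions. -->
<configuration>
  <property>
    <!-- Interval between two balancing rounds in service mode. -->
    <name>dfs.balancer.service.interval</name>
    <value>5m</value>
  </property>
  <property>
    <!-- Retries on unexpected exceptions before the service stops. -->
    <name>dfs.balancer.service.retries.on.exception</name>
    <value>5</value>
  </property>
</configuration>
```

With such a fragment in place, the service would be started with the command given in the HDFSCommands.md hunk, `hdfs --daemon start balancer -asService`.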