~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~   http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

  ---
  Hadoop Distributed File System-${project.version} - Federation
  ---
  ---
  ${maven.build.timestamp}

HDFS Federation

  \[ {{{./index.html}Go Back}} \]

%{toc|section=1|fromDepth=0}

  This guide provides an overview of the HDFS Federation feature and
  how to configure and manage the federated cluster.

* {Background}

[./federation-background.gif] HDFS Layers

  HDFS has two main layers:

  * <<Namespace>>

    * Consists of directories, files and blocks.

    * Supports all the namespace-related file system operations such as
      create, delete, modify and list files and directories.

  * <<Block Storage Service>> has two parts:

    * Block Management (which is done in the Namenode)

      * Provides datanode cluster membership by handling registrations and
        periodic heartbeats.

      * Processes block reports and maintains the location of blocks.

      * Supports block-related operations such as create, delete, modify and
        get block location.

      * Manages replica placement, replicates under-replicated blocks, and
        deletes over-replicated blocks.

    * Storage - is provided by datanodes by storing blocks on the local file
      system and allows read/write access.

  The prior HDFS architecture allows only a single namespace for the
  entire cluster. A single Namenode manages this namespace. HDFS
  Federation addresses this limitation of the prior architecture by adding
  support for multiple Namenodes/namespaces to the HDFS file system.

* {Multiple Namenodes/Namespaces}

  In order to scale the name service horizontally, federation uses multiple
  independent Namenodes/namespaces. The Namenodes are federated, that is, the
  Namenodes are independent and don’t require coordination with each other.
  The datanodes are used as common storage for blocks by all the Namenodes.
  Each datanode registers with all the Namenodes in the cluster. Datanodes
  send periodic heartbeats and block reports and handle commands from the
  Namenodes.

[./federation.gif] HDFS Federation Architecture

  <<Block Pool>>

  A Block Pool is a set of blocks that belong to a single namespace.
  Datanodes store blocks for all the block pools in the cluster.
  Each block pool is managed independently of other block pools. This allows
  a namespace to generate Block IDs for new blocks without the need for
  coordination with the other namespaces. The failure of a Namenode does not
  prevent the datanode from serving other Namenodes in the cluster.

  A Namespace and its block pool together are called a Namespace Volume.
  It is a self-contained unit of management. When a Namenode/namespace
  is deleted, the corresponding block pool at the datanodes is deleted.
  Each namespace volume is upgraded as a unit during cluster upgrade.

  <<ClusterID>>

  A new identifier <<ClusterID>> is added to identify all the nodes in
  the cluster. When a Namenode is formatted, this identifier is provided
  or auto generated. The same ID should be used when formatting the other
  Namenodes into the cluster.
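
  One way to check that Namenodes share a ClusterID is to compare the value
  each one recorded when it was formatted. The sketch below assumes the ID is
  stored as a <<<clusterID=>>> line in a <<<current/VERSION>>> file under the
  Namenode's name directory; the function name <<<read_cluster_id>>> and the
  sample values are hypothetical.

```shell
#!/bin/sh
# Print the clusterID value found in a VERSION-style properties file.
# Assumption: the file contains a line of the form "clusterID=<id>".
read_cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Demonstration on a throwaway file standing in for <name.dir>/current/VERSION.
demo_file="$(mktemp)"
printf 'namespaceID=12345\nclusterID=CID-example\n' > "$demo_file"
read_cluster_id "$demo_file"
rm -f "$demo_file"
```

  Running the function against the VERSION file of each Namenode and
  comparing the results would show whether they were formatted into the same
  cluster.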

** Key Benefits

  * Namespace Scalability - HDFS cluster storage scales horizontally but
    the namespace does not. Large deployments or deployments using a lot
    of small files benefit from scaling the namespace by adding more
    Namenodes to the cluster.

  * Performance - File system operation throughput is limited by a single
    Namenode in the prior architecture. Adding more Namenodes to the cluster
    scales the file system read/write operations throughput.

  * Isolation - A single Namenode offers no isolation in a multi-user
    environment. An experimental application can overload the Namenode
    and slow down production-critical applications. With multiple Namenodes,
    different categories of applications and users can be isolated to
    different namespaces.

* {Federation Configuration}

  Federation configuration is <<backward compatible>> and allows existing
  single-Namenode configurations to work without any change. The new
  configuration is designed such that all the nodes in the cluster have
  the same configuration, without the need for deploying different
  configurations based on the type of the node in the cluster.

  A new abstraction called <<<NameServiceID>>> is added with
  federation. The Namenode and its corresponding secondary/backup/checkpointer
  nodes belong to this. To support a single configuration file, the Namenode
  and secondary/backup/checkpointer configuration parameters are suffixed with
  <<<NameServiceID>>> and are added to the same configuration file.
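
  The suffixing rule can be illustrated with a small shell sketch that expands
  one base parameter name into its per-nameservice keys. The function name
  <<<expand_keys>>> is hypothetical; the key and the IDs <<<ns1>>> and
  <<<ns2>>> are only sample inputs.

```shell
#!/bin/sh
# Print "<base_key>.<NameServiceID>" for every ID in a comma-separated list.
expand_keys() {
  base_key="$1"
  nameservices="$2"
  old_ifs=$IFS
  IFS=,                      # split the list on commas only
  for ns in $nameservices; do
    printf '%s.%s\n' "$base_key" "$ns"
  done
  IFS=$old_ifs
}

# Sample input: the kind of value one might set for dfs.nameservices.
expand_keys dfs.namenode.rpc-address ns1,ns2
```

  Each printed key is the name under which the corresponding Namenode's value
  would appear in the shared configuration file.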

** Configuration:

  <<Step 1>>: Add the following parameter to your configuration:
  <<<dfs.nameservices>>>: Configure with a comma-separated list of
  NameServiceIDs. This will be used by Datanodes to determine all the
  Namenodes in the cluster.

  <<Step 2>>: For each Namenode and Secondary Namenode/BackupNode/Checkpointer
  add the following configuration suffixed with the corresponding
  <<<NameServiceID>>> into the common configuration file.

*---------------------+--------------------------------------------+
|| Daemon             || Configuration Parameter                   |
*---------------------+--------------------------------------------+
| Namenode            | <<<dfs.namenode.rpc-address>>>             |
|                     | <<<dfs.namenode.servicerpc-address>>>      |
|                     | <<<dfs.namenode.http-address>>>            |
|                     | <<<dfs.namenode.https-address>>>           |
|                     | <<<dfs.namenode.keytab.file>>>             |
|                     | <<<dfs.namenode.name.dir>>>                |
|                     | <<<dfs.namenode.edits.dir>>>               |
|                     | <<<dfs.namenode.checkpoint.dir>>>          |
|                     | <<<dfs.namenode.checkpoint.edits.dir>>>    |
*---------------------+--------------------------------------------+
| Secondary Namenode  | <<<dfs.namenode.secondary.http-address>>>  |
|                     | <<<dfs.secondary.namenode.keytab.file>>>   |
*---------------------+--------------------------------------------+
| BackupNode          | <<<dfs.namenode.backup.address>>>          |
|                     | <<<dfs.secondary.namenode.keytab.file>>>   |
*---------------------+--------------------------------------------+

  Here is an example configuration with two Namenodes:

----
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>nn-host1:rpc-port</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>nn-host1:http-port</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>snn-host1:http-port</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>nn-host2:rpc-port</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>nn-host2:http-port</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>snn-host2:http-port</value>
  </property>

  .... Other common configuration ...
</configuration>
----
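
  When a cluster has many nameservices, the repeated property blocks can be
  generated rather than typed by hand. The following is a minimal sketch, not
  a supported tool: the function name <<<emit_ns_properties>>> and the port
  numbers passed to it are illustrative assumptions.

```shell
#!/bin/sh
# Print the RPC and HTTP address <property> blocks for one nameservice.
# Arguments: NameServiceID, Namenode host, RPC port, HTTP port.
emit_ns_properties() {
  ns="$1"; host="$2"; rpc_port="$3"; http_port="$4"
  printf '  <property>\n    <name>dfs.namenode.rpc-address.%s</name>\n' "$ns"
  printf '    <value>%s:%s</value>\n  </property>\n' "$host" "$rpc_port"
  printf '  <property>\n    <name>dfs.namenode.http-address.%s</name>\n' "$ns"
  printf '    <value>%s:%s</value>\n  </property>\n' "$host" "$http_port"
}

# Sample invocation; the ports are placeholders, not recommended values.
emit_ns_properties ns1 nn-host1 8020 50070
emit_ns_properties ns2 nn-host2 8020 50070
```

  The output is meant to be pasted between the <<<configuration>>> tags
  alongside the <<<dfs.nameservices>>> property.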

** Formatting Namenodes

  <<Step 1>>: Format a Namenode using the following command:

----
> $HADOOP_PREFIX_HOME/bin/hdfs namenode -format [-clusterId <cluster_id>]
----
  Choose a unique cluster_id that does not conflict with other clusters in
  your environment. If it is not provided, then a unique ClusterID is
  auto generated.

  <<Step 2>>: Format each additional Namenode using the following command:

----
> $HADOOP_PREFIX_HOME/bin/hdfs namenode -format -clusterId <cluster_id>
----
  Note that the cluster_id in step 2 must be the same as the
  cluster_id in step 1. If they are different, the additional Namenodes
  will not be part of the federated cluster.
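
  Because every Namenode must be formatted with the same cluster_id, it can
  help to generate the commands from a single value. The sketch below only
  prints the commands (a dry run) so they can be reviewed first. The function
  name <<<print_format_commands>>>, the <<<HDFS_BIN>>> override, and the use
  of <<<ssh>>> to reach each host are assumptions for illustration.

```shell
#!/bin/sh
# Print one format command per Namenode host, all sharing one cluster ID.
# Nothing is executed; formatting destroys existing metadata, so review first.
print_format_commands() {
  cluster_id="$1"
  shift
  for host in "$@"; do
    printf 'ssh %s %s namenode -format -clusterId %s\n' \
      "$host" "${HDFS_BIN:-hdfs}" "$cluster_id"
  done
}

# Sample invocation with placeholder values.
print_format_commands CID-example nn-host1 nn-host2
```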

** Upgrading from older release to 0.23 and configuring federation

  Older releases supported a single Namenode. Here are the steps to enable
  federation:

  <<Step 1>>: Upgrade the cluster to the newer release. During the upgrade
  you can provide a ClusterID as follows:

----
> $HADOOP_PREFIX_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId <cluster_ID>
----
  If ClusterID is not provided, it is auto generated.

** Adding a new Namenode to an existing HDFS cluster

  Perform the following steps:

  * Add the configuration parameter <<<dfs.nameservices>>> to the
    configuration.

  * Update the configuration with the NameServiceID suffix. Configuration
    key names have changed since release 0.20. You must use the new
    configuration parameter names for federation.

  * Add the new Namenode-related configuration to the configuration files.

  * Propagate the configuration file to all the nodes in the cluster.

  * Start the new Namenode and its Secondary/Backup node.

  * Refresh the datanodes to pick up the newly added Namenode by running
    the following command:

----
> $HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNamenodes <datanode_host_name>:<datanode_rpc_port>
----

  * The above command must be run against all the datanodes in the cluster.
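
  Since the refresh has to reach every datanode, the per-host commands can be
  generated from a host list. This is a hedged sketch: the function name
  <<<print_refresh_commands>>>, the <<<HDFS_BIN>>> override, the host-file
  layout (one host per line) and the sample port are assumptions. It prints
  the commands instead of running them.

```shell
#!/bin/sh
# Print one "dfsadmin -refreshNamenodes" command per datanode in a hosts file.
# Blank lines and lines starting with '#' are skipped.
print_refresh_commands() {
  hosts_file="$1"
  rpc_port="$2"
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;
    esac
    printf '%s dfsadmin -refreshNamenodes %s:%s\n' \
      "${HDFS_BIN:-hdfs}" "$host" "$rpc_port"
  done < "$hosts_file"
}

# Demonstration with a temporary host list and a placeholder port.
demo_hosts="$(mktemp)"
printf 'dn-host1\n# spare node\ndn-host2\n' > "$demo_hosts"
print_refresh_commands "$demo_hosts" 50020
rm -f "$demo_hosts"
```

  Piping the reviewed output to a shell would then run the refresh against
  each listed datanode.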

* {Managing the cluster}

** Starting and stopping cluster

  To start the cluster run the following command:

----
> $HADOOP_PREFIX_HOME/bin/start-dfs.sh
----

  To stop the cluster run the following command:

----
> $HADOOP_PREFIX_HOME/bin/stop-dfs.sh
----

  These commands can be run from any node where the HDFS configuration is
  available. Each command uses the configuration to determine the Namenodes
  in the cluster and starts or stops the Namenode process on those nodes. The
  datanodes are started on the nodes specified in the <<<slaves>>> file. The
  scripts can be used as a reference for building your own scripts for
  starting and stopping the cluster.
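
  As a starting point for such a custom script, the sketch below prints a
  start command for each Namenode host it is given. It is an illustration
  under assumptions, not a copy of <<<start-dfs.sh>>>: the function name
  <<<print_namenode_starts>>>, the use of <<<ssh>>>, and the assumption that
  <<<hadoop-daemon.sh>>> is on the remote PATH are all hypothetical.

```shell
#!/bin/sh
# Print a Namenode start command for each host passed as an argument.
# Nothing is executed; the output is a dry run for review.
print_namenode_starts() {
  for host in "$@"; do
    printf 'ssh %s hadoop-daemon.sh --script hdfs start namenode\n' "$host"
  done
}

# Sample invocation with placeholder hosts.
print_namenode_starts nn-host1 nn-host2
```

  The host list would normally come from the HDFS configuration rather than
  being typed in; how it is obtained is left open here.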

** Balancer

  Balancer has been changed to work with multiple Namenodes in the cluster to
  balance the cluster. Balancer can be run using the command:

----
"$HADOOP_PREFIX"/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script "$bin"/hdfs start balancer [-policy <policy>]
----

  The policy can be:

  * <<<node>>> - this is the <default> policy. This balances the storage at
    the datanode level. This is similar to the balancing policy from prior
    releases.

  * <<<blockpool>>> - this balances the storage at the block pool level.
    Balancing at the block pool level balances storage at the datanode level
    also.

  Note that Balancer only balances the data and does not balance the namespace.
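
  A wrapper script may want to reject unknown policy names before starting
  the Balancer. The following sketch validates the two policy names listed
  above and prints the matching option; the function name
  <<<balancer_policy_args>>> is hypothetical.

```shell
#!/bin/sh
# Print the "-policy <name>" option for a valid policy, defaulting to "node".
# Returns non-zero and writes to stderr for any other name.
balancer_policy_args() {
  policy="${1:-node}"
  case "$policy" in
    node|blockpool)
      printf '%s %s\n' -policy "$policy"
      ;;
    *)
      echo "unknown balancer policy: $policy" >&2
      return 1
      ;;
  esac
}

# Sample invocations.
balancer_policy_args
balancer_policy_args blockpool
```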

** Decommissioning

  Decommissioning is similar to prior releases. The nodes that need to be
  decommissioned are added to the exclude file on all the Namenodes. Each
  Namenode decommissions its Block Pool. When all the Namenodes finish
  decommissioning a datanode, the datanode is considered to be decommissioned.
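
  Adding a node to the exclude file can be made repeatable so that running
  it twice does not create duplicate entries. This is a small sketch assuming
  the exclude file lists one host per line; the function name
  <<<add_to_exclude>>> and the host names are hypothetical.

```shell
#!/bin/sh
# Append a host to an exclude file unless an identical line already exists.
add_to_exclude() {
  exclude_file="$1"
  host="$2"
  touch "$exclude_file"
  # -x matches whole lines, -F treats the host name as a literal string.
  grep -qxF "$host" "$exclude_file" || printf '%s\n' "$host" >> "$exclude_file"
}

# Demonstration on a temporary file.
demo_exclude="$(mktemp)"
add_to_exclude "$demo_exclude" dn-host7
add_to_exclude "$demo_exclude" dn-host7
cat "$demo_exclude"
rm -f "$demo_exclude"
```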

  <<Step 1>>: To distribute an exclude file to all the Namenodes, use the
  following command:

----
"$HADOOP_PREFIX"/bin/distribute-exclude.sh <exclude_file>
----

  <<Step 2>>: Refresh all the Namenodes to pick up the new exclude file.

----
"$HADOOP_PREFIX"/bin/refresh-namenodes.sh
----

  The above command uses the HDFS configuration to determine the Namenodes
  configured in the cluster and refreshes all the Namenodes to pick up
  the new exclude file.

** Cluster Web Console

  Similar to the Namenode status web page, a Cluster Web Console is added in
  federation to monitor the federated cluster at
  <<<http://<any_nn_host:port>/dfsclusterhealth.jsp>>>.
  Any Namenode in the cluster can be used to access this web page.

  The web page provides the following information:

  * Cluster summary that shows number of files, number of blocks and
    total configured storage capacity, available and used storage information
    for the entire cluster.

  * List of Namenodes with a summary that includes number of files,
    blocks, missing blocks, number of live and dead data nodes for each
    Namenode. It also provides a link to conveniently access the Namenode
    web UI.

  * Decommissioning status of datanodes.
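
  For scripted checks, the console address can be built from any Namenode's
  host and HTTP port. The sketch below only constructs the URL described
  above; the function name <<<cluster_health_url>>> and the sample host and
  port are placeholders.

```shell
#!/bin/sh
# Build the Cluster Web Console URL for a given Namenode host and HTTP port.
cluster_health_url() {
  printf 'http://%s:%s/dfsclusterhealth.jsp\n' "$1" "$2"
}

# Sample invocation with placeholder values.
cluster_health_url nn-host1 50070
```

  The resulting URL can be opened in a browser or passed to an HTTP client of
  your choice.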