diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt
index bbc1df26de..9cb03ed19f 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt
@@ -83,3 +83,5 @@ HDFS-5535 subtasks:
Arpit Agarwal)
HDFS-5583. Make DN send an OOB Ack on shutdown before restarting. (kihwal)
+
+ HDFS-5778. Add rolling upgrade user document. (szetszwo)
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
new file mode 100644
index 0000000000..b7d5894aab
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
@@ -0,0 +1,270 @@
+
+
+
+ HDFS rolling upgrade allows upgrading individual HDFS daemons.
+ For example, the datanodes can be upgraded independently of the namenodes.
+ A namenode can be upgraded independently of the other namenodes.
+ The namenodes can be upgraded independently of the datanodes and journal nodes.
+
+ In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility.
+ These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime.
+ In order to upgrade an HDFS cluster without downtime, the cluster must be set up with HA.
+
+ In an HA cluster, there are two or more NameNodes (NNs), many DataNodes (DNs),
+ a few JournalNodes (JNs) and a few ZooKeeperNodes (ZKNs).
+ JNs are relatively stable and, in most cases, do not require an upgrade when upgrading HDFS.
+ In the rolling upgrade procedure described here,
+ only NNs and DNs are considered but JNs and ZKNs are not.
+ Upgrading JNs and ZKNs may incur cluster downtime.
+
+ Suppose there are two namenodes NN1 and NN2,
+ where NN1 and NN2 are respectively in active and standby states.
+ The following are the steps for upgrading an HA cluster:
+
+ In a federated cluster, there are multiple namespaces
+ and a pair of active and standby NNs for each namespace.
+ The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster
+ except that Step 1 and Step 4 are performed on each namespace
+ and Step 2 is performed on each pair of active and standby NNs, i.e.
+
+ For non-HA clusters,
+ it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes.
+ However, datanodes can still be upgraded in a rolling manner.
+
+ In a non-HA cluster, there is a NameNode (NN), a SecondaryNameNode (SNN)
+ and many DataNodes (DNs).
+ The procedure for upgrading a non-HA cluster is similar to upgrading an HA cluster
+ except that Step 2 "Upgrade Active and Standby NNs" is replaced by the following:
+
+ When the upgraded release is undesirable
+ or, in some unlikely case, the upgrade fails (due to bugs in the newer release),
+ administrators may choose to downgrade HDFS back to the pre-upgrade release
+ or roll back HDFS to the pre-upgrade release and the pre-upgrade state.
+ Both downgrade and rollback require cluster downtime and are not done in a rolling fashion.
+
+ Note that downgrade and rollback are possible only after a rolling upgrade is started and
+ before the upgrade is terminated.
+ An upgrade can be terminated by finalize, downgrade or rollback.
+ Therefore, it is impossible to run rollback after finalize or downgrade,
+ or to run downgrade after finalize.
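The termination rule above can be pictured as a tiny state model. This is only an illustrative sketch, not code from HDFS: the class `RollingUpgradeState` and its method names are hypothetical, and the model encodes nothing beyond the rule that finalize, downgrade and rollback each end the upgrade and are possible only while an upgrade is in progress.

```python
class RollingUpgradeState:
    """Toy model of the rolling upgrade lifecycle (hypothetical, not an HDFS API)."""

    # The three actions that terminate an upgrade, as listed in the text.
    TERMINATING_ACTIONS = ("finalize", "downgrade", "rollback")

    def __init__(self):
        self.in_progress = False

    def prepare(self):
        """Start a rolling upgrade."""
        self.in_progress = True

    def terminate(self, action):
        """End the upgrade with one of the terminating actions."""
        if action not in self.TERMINATING_ACTIONS:
            raise ValueError("unknown action: %s" % action)
        if not self.in_progress:
            # Covers rollback after finalize or downgrade,
            # and downgrade after finalize.
            raise RuntimeError("no rolling upgrade in progress; cannot %s" % action)
        self.in_progress = False
```

Once `terminate("finalize")` has succeeded, a later `terminate("rollback")` raises, which mirrors the statement that rollback is impossible after finalize.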
+
+ Downgrade restores the software back to the pre-upgrade release
+ and preserves the user data.
+ Suppose time T is the rolling upgrade start time and the upgrade is terminated by downgrade.
+ Then, the files created before or after T remain available in HDFS.
+ The files deleted before or after T remain deleted in HDFS.
+
+ A newer release is downgradable to the pre-upgrade release
+ only if both the namenode layout version and the datanode layout version
+ are not changed between these two releases.
+ Below are the steps for downgrade:
+
+ Rollback restores the software back to the pre-upgrade release
+ but also reverts the user data back to the pre-upgrade state.
+ Suppose time T is the rolling upgrade start time and the upgrade is terminated by rollback.
+ The files created before T remain available in HDFS but the files created after T become unavailable.
+ The files deleted before T remain deleted in HDFS but the files deleted after T are restored.
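The difference between downgrade and rollback can be made concrete with a small model of which files are visible afterwards. This is a toy sketch under simplifying assumptions: the function `visible_files` and its parameters are hypothetical, files are plain path strings, and a file is assumed to be created or deleted at most once after T.

```python
def visible_files(existing_at_t, created_after_t, deleted_after_t, termination):
    """Return the set of files visible once the upgrade is terminated.

    existing_at_t:   files present at time T (the rolling upgrade start).
    created_after_t: files created after T.
    deleted_after_t: files deleted after T.
    termination:     "downgrade" or "rollback".
    """
    existing_at_t = set(existing_at_t)
    if termination == "rollback":
        # Rollback reverts user data to the state at T:
        # later creations vanish and later deletions are undone.
        return existing_at_t
    if termination == "downgrade":
        # Downgrade preserves user data: later creations stay,
        # later deletions stay deleted.
        return (existing_at_t | set(created_after_t)) - set(deleted_after_t)
    raise ValueError("unknown termination: %s" % termination)
```

For example, with `/a` and `/b` present at T, `/c` created after T and `/b` deleted after T, downgrade leaves `/a` and `/c`, while rollback leaves `/a` and `/b`.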
+
+ Rollback from a newer release to the pre-upgrade release is always supported.
+ Below are the steps for rollback:
+
+ Execute a rolling upgrade action.
+ HDFS Rolling Upgrade
+ Upgrading Non-Federated Clusters
+
+
+
+
+
+ Run "hdfs dfsadmin -rollingUpgrade prepare"
+ to create an fsimage for rollback.
+ Run "hdfs dfsadmin -rollingUpgrade query"
+ to check the status of the rollback image.
+ Wait and re-run the command until the "Proceed with rolling upgrade" message is shown.
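The wait-and-re-run step could be scripted along the following lines. This is a minimal sketch rather than part of the patch: the helper names, the `run` and `sleep` hooks, the polling interval and the attempt limit are all illustrative assumptions. Only the query command and the "Proceed with rolling upgrade" message are taken from the text.

```python
import subprocess
import time

QUERY_CMD = ["hdfs", "dfsadmin", "-rollingUpgrade", "query"]
READY_MESSAGE = "Proceed with rolling upgrade"


def run_command(cmd):
    """Run a command and return its combined stdout/stderr as text."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout


def wait_for_rollback_image(run=run_command, interval_s=10.0,
                            max_attempts=60, sleep=time.sleep):
    """Re-run the query command until the ready message is shown.

    Returns the number of attempts used. Raises TimeoutError if the
    message does not appear within max_attempts (an assumed limit).
    """
    for attempt in range(1, max_attempts + 1):
        if READY_MESSAGE in run(QUERY_CMD):
            return attempt
        if attempt < max_attempts:
            sleep(interval_s)
    raise TimeoutError("rollback image not ready after %d attempts" % max_attempts)
```

Passing `run` and `sleep` as parameters keeps the loop testable without a live cluster.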
+
+
+
+
+ Run "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade"
+ to shut down one of the chosen datanodes.
+ Run "hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>"
+ to check and wait for the datanode to shut down.
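The shutdown-then-wait pair for a single datanode might be automated as sketched below. This is a hedged illustration, not a tested procedure: the helper names and hooks are hypothetical, and the sketch ASSUMES that "dfsadmin -getDatanodeInfo" exits with a non-zero status once the datanode no longer responds, which should be verified against the actual release before relying on it.

```python
import subprocess
import time


def run_exit_code(cmd):
    """Run a command, discard its output, and return its exit code."""
    return subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def shutdown_datanode_for_upgrade(datanode, run=run_exit_code, interval_s=5.0,
                                  max_checks=60, sleep=time.sleep):
    """Request an upgrade shutdown of `datanode` (HOST:IPC_PORT) and wait for it.

    Returns the number of getDatanodeInfo checks made before the datanode
    stopped responding.
    """
    if run(["hdfs", "dfsadmin", "-shutdownDatanode", datanode, "upgrade"]) != 0:
        raise RuntimeError("shutdown request failed for %s" % datanode)
    for check in range(1, max_checks + 1):
        # ASSUMPTION: a non-zero exit code means the datanode is down.
        if run(["hdfs", "dfsadmin", "-getDatanodeInfo", datanode]) != 0:
            return check
        sleep(interval_s)
    raise TimeoutError("%s still responding after %d checks" % (datanode, max_checks))
```

The polling loop is needed because, as the command reference in this document notes, the shutdown command does not wait for the shutdown to complete.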
+
+ Run "hdfs dfsadmin -rollingUpgrade finalize"
+ to finalize the rolling upgrade.
+
+ Upgrading Federated Clusters
+
+
+
+ Upgrading Non-HA Clusters
+
+
+
+
+
+
+
+
+ "-rollingUpgrade downgrade" option.
+
+
+
+
+ "-rollingUpgrade rollback" option.
+
+ dfsadmin -rollingUpgrade
+
+
+ query     Query the current rolling upgrade status.
+ prepare   Prepare a new rolling upgrade.
+ finalize  Finalize the current rolling upgrade.
+ dfsadmin -getDatanodeInfo
+ Get the information about the given datanode.
+ This command can be used for checking if a datanode is alive
+ like the Unix ping command.
+
+ dfsadmin -shutdownDatanode
+ Submit a shutdown request for the given datanode.
+ If the optional upgrade argument is specified,
+ clients accessing the datanode will be advised to wait for it to restart
+ and the fast start-up mode will be enabled.
+ When the restart does not happen in time, clients will time out and ignore the datanode.
+ In such a case, the fast start-up mode will also be disabled.
+
+ Note that the command does not wait for the datanode shutdown to complete.
+ The "dfsadmin -getDatanodeInfo"
+ command can be used for checking if the datanode shutdown is complete.
+
+ namenode -rollingUpgrade
+ Downgrade or roll back an ongoing rolling upgrade.
+
+ downgrade  Restores the namenode back to the pre-upgrade release and preserves the user data.
+ rollback   Restores the namenode back to the pre-upgrade release but also reverts the user data back to the pre-upgrade state.