HDFS-5492. Port HDFS-2069 (Incorrect default trash interval in the docs) to trunk. (Contributed by Akira Ajisaka)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1562683 13f79535-47bb-0310-9956-ffa450edef68
commit 0f3461e489
parent a3151a9f3d
@@ -301,6 +301,9 @@ Release 2.4.0 - UNRELEASED

  BUG FIXES

+    HDFS-5492. Port HDFS-2069 (Incorrect default trash interval in the
+    docs) to trunk. (Akira Ajisaka via Arpit Agarwal)
+
Release 2.3.0 - UNRELEASED

  INCOMPATIBLE CHANGES
@@ -17,11 +17,11 @@
---
${maven.build.timestamp}

-%{toc|section=1|fromDepth=0}
-
HDFS Architecture

-Introduction
+%{toc|section=1|fromDepth=0}
+
+* Introduction

The Hadoop Distributed File System (HDFS) is a distributed file system
designed to run on commodity hardware. It has many similarities with
@@ -35,9 +35,9 @@ Introduction
is part of the Apache Hadoop Core project. The project URL is
{{http://hadoop.apache.org/}}.

-Assumptions and Goals
+* Assumptions and Goals

-Hardware Failure
+** Hardware Failure

Hardware failure is the norm rather than the exception. An HDFS
instance may consist of hundreds or thousands of server machines, each
@@ -47,7 +47,7 @@ Hardware Failure
non-functional. Therefore, detection of faults and quick, automatic
recovery from them is a core architectural goal of HDFS.

-Streaming Data Access
+** Streaming Data Access

Applications that run on HDFS need streaming access to their data sets.
They are not general purpose applications that typically run on general
@@ -58,7 +58,7 @@ Streaming Data Access
targeted for HDFS. POSIX semantics in a few key areas has been traded
to increase data throughput rates.

-Large Data Sets
+** Large Data Sets

Applications that run on HDFS have large data sets. A typical file in
HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support
@@ -66,7 +66,7 @@ Large Data Sets
to hundreds of nodes in a single cluster. It should support tens of
millions of files in a single instance.

-Simple Coherency Model
+** Simple Coherency Model

HDFS applications need a write-once-read-many access model for files. A
file once created, written, and closed need not be changed. This
@@ -75,7 +75,7 @@ Simple Coherency Model
perfectly with this model. There is a plan to support appending-writes
to files in the future.

-“Moving Computation is Cheaper than Moving Data”
+** “Moving Computation is Cheaper than Moving Data”

A computation requested by an application is much more efficient if it
is executed near the data it operates on. This is especially true when
@@ -86,13 +86,13 @@ Simple Coherency Model
running. HDFS provides interfaces for applications to move themselves
closer to where the data is located.

-Portability Across Heterogeneous Hardware and Software Platforms
+** Portability Across Heterogeneous Hardware and Software Platforms

HDFS has been designed to be easily portable from one platform to
another. This facilitates widespread adoption of HDFS as a platform of
choice for a large set of applications.

-NameNode and DataNodes
+* NameNode and DataNodes

HDFS has a master/slave architecture. An HDFS cluster consists of a
single NameNode, a master server that manages the file system namespace
@@ -127,7 +127,7 @@ NameNode and DataNodes
repository for all HDFS metadata. The system is designed in such a way
that user data never flows through the NameNode.

-The File System Namespace
+* The File System Namespace

HDFS supports a traditional hierarchical file organization. A user or
an application can create directories and store files inside these
@@ -145,7 +145,7 @@ The File System Namespace
replication factor of that file. This information is stored by the
NameNode.

-Data Replication
+* Data Replication

HDFS is designed to reliably store very large files across machines in
a large cluster. It stores each file as a sequence of blocks; all
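As an aside on the Data Replication section touched above: the replication factor and block size are specified per file when it is created. Below is a minimal sketch of doing this through the public FileSystem Java API; it is not part of this change, and the class name, path, and values are illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicatedCreateSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();          // reads core-site.xml / hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);

    Path file = new Path("/user/example/data.bin");    // hypothetical path
    short replication = 3;                             // replicas requested for this file
    long blockSize = 128L * 1024 * 1024;               // 128 MB blocks for this file
    int bufferSize = conf.getInt("io.file.buffer.size", 4096);

    try (FSDataOutputStream out =
             fs.create(file, true, bufferSize, replication, blockSize)) {
      out.writeUTF("example payload");                 // data is split into blocks by HDFS
    }
  }
}
```

Under these assumptions the NameNode records the requested replication factor and block size as part of the file's metadata, as described in the surrounding text.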
@@ -164,7 +164,7 @@ Data Replication

[images/hdfsdatanodes.png] HDFS DataNodes

-Replica Placement: The First Baby Steps
+** Replica Placement: The First Baby Steps

The placement of replicas is critical to HDFS reliability and
performance. Optimizing replica placement distinguishes HDFS from most
@@ -210,7 +210,7 @@ Replica Placement: The First Baby Steps
The current, default replica placement policy described here is a work
in progress.

-Replica Selection
+** Replica Selection

To minimize global bandwidth consumption and read latency, HDFS tries
to satisfy a read request from a replica that is closest to the reader.
@@ -219,7 +219,7 @@ Replica Selection
cluster spans multiple data centers, then a replica that is resident in
the local data center is preferred over any remote replica.

-Safemode
+** Safemode

On startup, the NameNode enters a special state called Safemode.
Replication of data blocks does not occur when the NameNode is in the
@@ -234,7 +234,7 @@ Safemode
blocks (if any) that still have fewer than the specified number of
replicas. The NameNode then replicates these blocks to other DataNodes.

-The Persistence of File System Metadata
+* The Persistence of File System Metadata

The HDFS namespace is stored by the NameNode. The NameNode uses a
transaction log called the EditLog to persistently record every change
@@ -273,7 +273,7 @@ The Persistence of File System Metadata
each of these local files and sends this report to the NameNode: this
is the Blockreport.

-The Communication Protocols
+* The Communication Protocols

All HDFS communication protocols are layered on top of the TCP/IP
protocol. A client establishes a connection to a configurable TCP port
@@ -284,13 +284,13 @@ The Communication Protocols
RPCs. Instead, it only responds to RPC requests issued by DataNodes or
clients.

-Robustness
+* Robustness

The primary objective of HDFS is to store data reliably even in the
presence of failures. The three common types of failures are NameNode
failures, DataNode failures and network partitions.

-Data Disk Failure, Heartbeats and Re-Replication
+** Data Disk Failure, Heartbeats and Re-Replication

Each DataNode sends a Heartbeat message to the NameNode periodically. A
network partition can cause a subset of DataNodes to lose connectivity
@@ -306,7 +306,7 @@ Data Disk Failure, Heartbeats and Re-Replication
corrupted, a hard disk on a DataNode may fail, or the replication
factor of a file may be increased.

-Cluster Rebalancing
+** Cluster Rebalancing

The HDFS architecture is compatible with data rebalancing schemes. A
scheme might automatically move data from one DataNode to another if
@@ -316,7 +316,7 @@ Cluster Rebalancing
cluster. These types of data rebalancing schemes are not yet
implemented.

-Data Integrity
+** Data Integrity

It is possible that a block of data fetched from a DataNode arrives
corrupted. This corruption can occur because of faults in a storage
@@ -330,7 +330,7 @@ Data Integrity
to retrieve that block from another DataNode that has a replica of that
block.

-Metadata Disk Failure
+** Metadata Disk Failure

The FsImage and the EditLog are central data structures of HDFS. A
corruption of these files can cause the HDFS instance to be
@@ -350,16 +350,16 @@ Metadata Disk Failure
Currently, automatic restart and failover of the NameNode software to
another machine is not supported.

-Snapshots
+** Snapshots

Snapshots support storing a copy of data at a particular instant of
time. One usage of the snapshot feature may be to roll back a corrupted
HDFS instance to a previously known good point in time. HDFS does not
currently support snapshots but will in a future release.

-Data Organization
+* Data Organization

-Data Blocks
+** Data Blocks

HDFS is designed to support very large files. Applications that are
compatible with HDFS are those that deal with large data sets. These
@@ -370,7 +370,7 @@ Data Blocks
chunks, and if possible, each chunk will reside on a different
DataNode.

-Staging
+** Staging

A client request to create a file does not reach the NameNode
immediately. In fact, initially the HDFS client caches the file data
@@ -397,7 +397,7 @@ Staging
side caching to improve performance. A POSIX requirement has been
relaxed to achieve higher performance of data uploads.

-Replication Pipelining
+** Replication Pipelining

When a client is writing data to an HDFS file, its data is first
written to a local file as explained in the previous section. Suppose
@@ -406,7 +406,7 @@ Replication Pipelining
DataNodes from the NameNode. This list contains the DataNodes that will
host a replica of that block. The client then flushes the data block to
the first DataNode. The first DataNode starts receiving the data in
-small portions (4 KB), writes each portion to its local repository and
+small portions, writes each portion to its local repository and
transfers that portion to the second DataNode in the list. The second
DataNode, in turn starts receiving each portion of the data block,
writes that portion to its repository and then flushes that portion to
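To make the write path in this hunk concrete, here is a minimal client-side sketch; it is illustrative only and not part of this change, the path and record format are hypothetical, and only the public FileSystem/FSDataOutputStream API is assumed. The client simply writes to the stream; staging, block allocation, and the DataNode pipeline described above are handled by HDFS.

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PipelinedWriteSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/user/example/events.log");  // hypothetical path

    try (FSDataOutputStream out = fs.create(file, true)) {
      for (int i = 0; i < 1000; i++) {
        byte[] record = ("event-" + i + "\n").getBytes(StandardCharsets.UTF_8);
        out.write(record);  // the client only writes to the stream; HDFS pipelines
      }                     // the data to the DataNodes holding the replicas
      out.hflush();         // push buffered data out to the DataNode pipeline
    }                       // close() finalizes the last block with the NameNode
  }
}
```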
@@ -416,7 +416,7 @@ Replication Pipelining
the next one in the pipeline. Thus, the data is pipelined from one
DataNode to the next.

-Accessibility
+* Accessibility

HDFS can be accessed from applications in many different ways.
Natively, HDFS provides a
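For the programmatic access mentioned in the Accessibility section, a matching read sketch through the same Java API (again illustrative rather than part of this change; the path is hypothetical):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/user/example/events.log");  // hypothetical path

    try (FSDataInputStream in = fs.open(file);
         BufferedReader reader = new BufferedReader(
             new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);  // the read is served from a nearby replica
      }
    }
  }
}
```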
@@ -426,7 +426,7 @@ Accessibility
of an HDFS instance. Work is in progress to expose HDFS through the WebDAV
protocol.

-FS Shell
+** FS Shell

HDFS allows user data to be organized in the form of files and
directories. It provides a commandline interface called FS shell that
@@ -447,7 +447,7 @@ FS Shell
FS shell is targeted for applications that need a scripting language to
interact with the stored data.

-DFSAdmin
+** DFSAdmin

The DFSAdmin command set is used for administering an HDFS cluster.
These are commands that are used only by an HDFS administrator. Here
@@ -463,16 +463,16 @@ DFSAdmin
|Recommission or decommission DataNode(s) | <<<bin/hadoop dfsadmin -refreshNodes>>>
*---------+---------+

-Browser Interface
+** Browser Interface

A typical HDFS install configures a web server to expose the HDFS
namespace through a configurable TCP port. This allows a user to
navigate the HDFS namespace and view the contents of its files using a
web browser.

-Space Reclamation
+* Space Reclamation

-File Deletes and Undeletes
+** File Deletes and Undeletes

When a file is deleted by a user or an application, it is not
immediately removed from HDFS. Instead, HDFS first renames it to a file
@@ -490,12 +490,12 @@ File Deletes and Undeletes
file. The <<</trash>>> directory contains only the latest copy of the file
that was deleted. The <<</trash>>> directory is just like any other directory
with one special feature: HDFS applies specified policies to
-automatically delete files from this directory. The current default
-policy is to delete files from <<</trash>>> that are more than 6 hours old.
-In the future, this policy will be configurable through a well defined
-interface.
+automatically delete files from this directory. The current default trash
+interval is set to 0 (files are deleted without being stored in trash).
+This value is a configurable parameter, <<<fs.trash.interval>>>, stored in
+core-site.xml.

-Decrease Replication Factor
+** Decrease Replication Factor

When the replication factor of a file is reduced, the NameNode selects
excess replicas that can be deleted. The next Heartbeat transfers this
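For context on the corrected wording: <<<fs.trash.interval>>> is specified in minutes, and 0 disables trash entirely. A minimal sketch of how a client might enable and use trash programmatically follows; it is illustrative only and not part of this commit, it assumes the org.apache.hadoop.fs.Trash helper class, and the path is hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Normally set in core-site.xml. The value is in minutes: 0 (the default)
    // deletes files outright, a positive value keeps them in trash that long.
    conf.setLong("fs.trash.interval", 360);            // e.g. retain for 6 hours

    FileSystem fs = FileSystem.get(conf);
    // Note that FileSystem.delete() bypasses trash; the FS shell and the
    // Trash helper are the paths that move files into /trash.
    Trash trash = new Trash(fs, conf);
    trash.moveToTrash(new Path("/user/example/obsolete.txt"));  // hypothetical path
  }
}
```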
@@ -505,7 +505,7 @@ Decrease Replication Factor
of the setReplication API call and the appearance of free space in the
cluster.

-References
+* References

Hadoop {{{http://hadoop.apache.org/docs/current/api/}JavaDoc API}}.

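The setReplication call referred to in this hunk is part of the public FileSystem API; a minimal sketch of lowering a file's replication factor (illustrative, with a hypothetical path):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplicationSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Ask the NameNode to lower the replication factor of an existing file;
    // excess replicas are then removed lazily via DataNode Heartbeats.
    boolean accepted = fs.setReplication(new Path("/user/example/data.bin"), (short) 2);
    System.out.println("setReplication accepted: " + accepted);
  }
}
```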