73 lines
2.9 KiB
Plaintext
73 lines
2.9 KiB
Plaintext
|
<!---
|
|||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
|||
|
you may not use this file except in compliance with the License.
|
|||
|
You may obtain a copy of the License at
|
|||
|
|
|||
|
http://www.apache.org/licenses/LICENSE-2.0
|
|||
|
|
|||
|
Unless required by applicable law or agreed to in writing, software
|
|||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|||
|
See the License for the specific language governing permissions and
|
|||
|
limitations under the License. See accompanying LICENSE file.
|
|||
|
-->
|
|||
|
|
|||
|
Apache Hadoop ${project.version}
|
|||
|
================================
|
|||
|
|
|||
|
Apache Hadoop ${project.version} consists of significant
|
|||
|
improvements over the previous stable release (hadoop-1.x).
|
|||
|
|
|||
|
Here is a short overview of the improvments to both HDFS and MapReduce.
|
|||
|
|
|||
|
* HDFS Federation
|
|||
|
|
|||
|
In order to scale the name service horizontally, federation uses
|
|||
|
multiple independent Namenodes/Namespaces. The Namenodes are
|
|||
|
federated, that is, the Namenodes are independent and don't require
|
|||
|
coordination with each other. The datanodes are used as common storage
|
|||
|
for blocks by all the Namenodes. Each datanode registers with all the
|
|||
|
Namenodes in the cluster. Datanodes send periodic heartbeats and block
|
|||
|
reports and handles commands from the Namenodes.
|
|||
|
|
|||
|
More details are available in the
|
|||
|
[HDFS Federation](./hadoop-project-dist/hadoop-hdfs/Federation.html)
|
|||
|
document.
|
|||
|
|
|||
|
* MapReduce NextGen aka YARN aka MRv2
|
|||
|
|
|||
|
The new architecture introduced in hadoop-0.23, divides the two major
|
|||
|
functions of the JobTracker: resource management and job life-cycle
|
|||
|
management into separate components.
|
|||
|
|
|||
|
The new ResourceManager manages the global assignment of compute
|
|||
|
resources to applications and the per-application
|
|||
|
ApplicationMaster manages the application‚ scheduling and
|
|||
|
coordination.
|
|||
|
|
|||
|
An application is either a single job in the sense of classic
|
|||
|
MapReduce jobs or a DAG of such jobs.
|
|||
|
|
|||
|
The ResourceManager and per-machine NodeManager daemon, which
|
|||
|
manages the user processes on that machine, form the computation
|
|||
|
fabric.
|
|||
|
|
|||
|
The per-application ApplicationMaster is, in effect, a framework
|
|||
|
specific library and is tasked with negotiating resources from the
|
|||
|
ResourceManager and working with the NodeManager(s) to execute and
|
|||
|
monitor the tasks.
|
|||
|
|
|||
|
More details are available in the
|
|||
|
[YARN](./hadoop-yarn/hadoop-yarn-site/YARN.html) document.
|
|||
|
|
|||
|
Getting Started
|
|||
|
===============
|
|||
|
|
|||
|
The Hadoop documentation includes the information you need to get started using
|
|||
|
Hadoop. Begin with the
|
|||
|
[Single Node Setup](./hadoop-project-dist/hadoop-common/SingleCluster.html)
|
|||
|
which shows you how to set up a single-node Hadoop installation.
|
|||
|
Then move on to the
|
|||
|
[Cluster Setup](./hadoop-project-dist/hadoop-common/ClusterSetup.html)
|
|||
|
to learn how to set up a multi-node Hadoop installation.
|