From 7846636a5d452ff5be0dd37b8c07b38d80feb271 Mon Sep 17 00:00:00 2001 From: Xiaoyu Yao Date: Mon, 2 Oct 2017 10:52:05 -0700 Subject: [PATCH] HDFS-12551. Ozone: Documentation: Add Ozone overview documentation. Contributed by Anu Engineer. --- .../src/site/markdown/OzoneOverview.md | 88 +++++++++++++++++++ hadoop-project/src/site/site.xml | 2 + 2 files changed, 90 insertions(+) create mode 100644 hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/OzoneOverview.md diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/OzoneOverview.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/OzoneOverview.md new file mode 100644 index 0000000000..4dfd2498d9 --- /dev/null +++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/OzoneOverview.md @@ -0,0 +1,88 @@ + +Ozone Overview +============== + +
Ozone is an object store for Apache Hadoop. It aims to scale to billions of keys.
The following is a high-level overview of the core components of Ozone.

![Ozone Architecture Overview](images/ozoneoverview.png)

The main elements of Ozone are:

### Clients
Ozone ships with a set of ready-made clients. They are the
Ozone CLI and Corona.

 * [Ozone CLI](./OzoneCommandShell.html) is the command-line interface for Ozone, similar to the 'hdfs' command.

 * Corona is a load generation tool for Ozone.

### REST Handler
Ozone provides both an RPC (Remote Procedure Call) and a REST
(Representational State Transfer) style interface. This allows clients to be
written quickly in many languages. Ozone strives to maintain a similar
interface between REST and RPC. The REST handler offers the REST protocol
services of Ozone.

For most purposes, a client needs only a one-line change to switch from REST
to RPC or vice versa.
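
The sketch below gives a rough feel for what a call against the REST handler
can look like, using only JDK classes. The port, URL layout, and header names
are assumptions made for this example, not the authoritative protocol, so
treat everything here as illustrative.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/**
 * Sketch only: the endpoint, path, and headers below are assumptions for
 * illustration, not the documented Ozone REST protocol.
 */
public class OzoneRestSketch {
  public static void main(String[] args) throws IOException {
    // Assumed address of a datanode running the Ozone REST handler.
    URL url = new URL("http://datanode.example.com:9864/volume-one");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");                    // ask for a new volume
    conn.setRequestProperty("x-ozone-version", "v1"); // assumed header names
    conn.setRequestProperty("x-ozone-user", "bilbo");

    System.out.println("Response code: " + conn.getResponseCode());
    conn.disconnect();
  }
}
```

The same operation over RPC would go through the Ozone client described
below, which is where the one-line change between the two transports is meant
to live.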
 + +### Ozone File System +Ozone file system (TODO: Add documentation) is a Hadoop compatible file system. +This is the important user-visible component of ozone. +This allows Hadoop services and applications like Hive/Spark to run against +Ozone without any change. + +### Ozone Client +This is like DFSClient in HDFS. This acts as the standard client to talk to +Ozone. All other components that we have discussed so far rely on Ozone client +(TODO: Add Ozone client documentation).
 + +### Key Space Manager
Key Space Manager (KSM) manages Ozone's namespace.
All Ozone entities such as volumes, buckets, and keys are managed by KSM
(TODO: Add KSM documentation). In short, KSM is the metadata manager for Ozone.
KSM talks to the block manager (SCM) to get blocks and passes them on to the
Ozone client, which then writes data to those blocks.
KSM will eventually be replicated via Apache Ratis for high availability.
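
The division of labor among the client, KSM, and SCM may be easier to see as
code. The sketch below is purely conceptual: every type, method, and field
name in it is invented for this illustration and does not correspond to the
actual Ozone classes.

```java
/**
 * Conceptual sketch of the key-write flow described above. All names here
 * are placeholders, not real Ozone interfaces.
 */
public class KeyWritePathSketch {

  /** Placeholder for KSM: owns the namespace, never touches user data. */
  interface KeySpaceManager {
    BlockLocation allocateBlockForKey(String volume, String bucket, String key);
  }

  /** Placeholder for what SCM hands back: a block inside some container. */
  static class BlockLocation {
    final String containerId;
    final long localBlockId;

    BlockLocation(String containerId, long localBlockId) {
      this.containerId = containerId;
      this.localBlockId = localBlockId;
    }
  }

  /** What the Ozone client conceptually does when writing a key. */
  static void writeKey(KeySpaceManager ksm, byte[] data) {
    // 1. Ask KSM where the key's data should go. Internally, KSM asks SCM
    //    (the block manager) for a block and records the key-to-block mapping.
    BlockLocation block = ksm.allocateBlockForKey("vol1", "bucket1", "key1");

    // 2. The client would then stream the bytes to the datanodes hosting
    //    that block; here we only print where they would go.
    System.out.printf("writing %d bytes to block %d in container %s%n",
        data.length, block.localBlockId, block.containerId);
  }
}
```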
 + +### Storage Container Manager +Storage Container Manager (SCM) is the block and cluster manager for Ozone. +SCM along with data nodes offer a service called 'containers'. +A container is a group unrelated of blocks that are managed together +as a single entity. + +SCM offers the following abstractions.

![SCM Abstractions](images/scmservices.png)

#### Blocks
Blocks are like blocks in HDFS. They are replicated stores of data.

#### Containers
A container is a collection of blocks that are replicated and managed together.

#### Pipelines
SCM allows each container to choose its method of replication.
For example, a container might decide that it needs only one copy of a block
and choose a stand-alone pipeline, while another container might want a very
high level of reliability and pick a Ratis-based pipeline. In other words,
SCM allows different kinds of replication strategies to coexist.

#### Pools
A group of datanodes is called a pool. For scaling purposes,
we define a pool as a set of machines. This makes management of datanodes
easier.

#### Nodes
The datanodes where data is stored.
diff --git a/hadoop-project/src/site/site.xml b/hadoop-project/src/site/site.xml
index 9fe576cef9..40df7c5e85 100644
--- a/hadoop-project/src/site/site.xml
+++ b/hadoop-project/src/site/site.xml
@@ -111,6 +111,8 @@
+
+