---
title: Ozone File System
date: 2017-09-14
weight: 2
summary: The Hadoop-compatible file system allows any application that expects an HDFS-like interface to work against Ozone with zero changes. Frameworks like Apache Spark, YARN, and Hive work against Ozone without modification.
---
<!---
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

The Hadoop-compatible file system interface allows storage backends like Ozone
to be easily integrated into the Hadoop ecosystem. The Ozone file system is a
Hadoop-compatible file system.

## Setting up the Ozone file system

To create an Ozone file system, choose a bucket where the file system will live. This bucket serves as the backing store for OzoneFileSystem: all files and directories are stored as keys in it.

Please run the following commands to create a volume and bucket, if you don't have them already.

{{< highlight bash >}}
ozone sh volume create /volume
ozone sh bucket create /volume/bucket
{{< /highlight >}}

Once they are created, verify that the bucket exists with the _list volume_ or _list bucket_ commands.

Add the following entries to `core-site.xml`.

{{< highlight xml >}}
<property>
  <name>fs.o3fs.impl</name>
  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.o3fs.impl</name>
  <value>org.apache.hadoop.fs.ozone.OzFs</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>o3fs://bucket.volume</value>
</property>
{{< /highlight >}}

This makes the bucket the default file system for `hdfs dfs` commands and registers the `o3fs` file system type.

You also need to add the ozone-filesystem.jar file to the classpath:

{{< highlight bash >}}
export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-lib-current*.jar:$HADOOP_CLASSPATH
{{< /highlight >}}
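
In bash, the value of a variable assignment does not undergo pathname expansion, so the wildcard in the line above is stored literally in `HADOOP_CLASSPATH`. Whether it is expanded later depends on the scripts that consume the variable, which this page does not describe. The following is a minimal sketch, not part of the Ozone distribution, that resolves the wildcard to one concrete jar path before exporting it. The function name `resolve_ozonefs_jar` and the `OZONE_LIB_DIR` variable are hypothetical; the default directory is the one used above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the single "current" filesystem jar found in a
# directory, or fail if there is not exactly one match.
resolve_ozonefs_jar() {
  local lib_dir="$1"
  local matches=()
  local candidate
  for candidate in "$lib_dir"/hadoop-ozone-filesystem-lib-current*.jar; do
    # If nothing matches, the loop sees the unexpanded pattern; skip it.
    [ -e "$candidate" ] && matches+=("$candidate")
  done
  if [ "${#matches[@]}" -ne 1 ]; then
    echo "expected exactly one jar in $lib_dir, found ${#matches[@]}" >&2
    return 1
  fi
  printf '%s\n' "${matches[0]}"
}

# OZONE_LIB_DIR is an assumed override; it defaults to the path shown above.
if jar="$(resolve_ozonefs_jar "${OZONE_LIB_DIR:-/opt/ozone/share/ozonefs/lib}")"; then
  # Prepend the jar, adding a colon only when HADOOP_CLASSPATH is already set.
  export HADOOP_CLASSPATH="$jar${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
fi
```

Failing when zero or several jars match keeps a stale or duplicated jar from silently ending up on the classpath.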

Once the default file system has been set up, users can run commands like `ls`, `put`, and `mkdir`.
For example:

{{< highlight bash >}}
hdfs dfs -ls /
{{< /highlight >}}

or

{{< highlight bash >}}
hdfs dfs -mkdir /users
{{< /highlight >}}


The `put` command and others work the same way. In other words, programs like Hive, Spark, and DistCp will work against this file system.
Note that keys created or deleted in the bucket by means other than OzoneFileSystem are also reflected as directories and files in the Ozone file system.

Note: Bucket and volume names are not allowed to contain a period.
Moreover, the file system URI can take a fully qualified form, with the OM host and an optional port appended after the volume name.
For example, you can specify both host and port:

{{< highlight bash>}}
hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:5678/key
{{< /highlight >}}

When the port number is not specified, it is retrieved from the config key `ozone.om.address`
if defined; otherwise it falls back to the default port `9862`.
For example, suppose `ozone.om.address` is configured as follows in `ozone-site.xml`:

{{< highlight xml >}}
  <property>
    <name>ozone.om.address</name>
    <value>0.0.0.0:6789</value>
  </property>
{{< /highlight >}}

Suppose we then run the following command:

{{< highlight bash>}}
hdfs dfs -ls o3fs://bucket.volume.om-host.example.com/key
{{< /highlight >}}

The above command is essentially equivalent to:

{{< highlight bash>}}
hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:6789/key
{{< /highlight >}}

Note: Only the port number from the config is used in this case;
the host name in `ozone.om.address` is ignored.
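
The resolution rule described above can be sketched as a small bash function. This is an illustration of the documented behavior, not Ozone's actual implementation. The name `resolve_o3fs_authority` is hypothetical, and the sketch assumes the URI includes an OM host and that the caller passes in the value of `ozone.om.address` (or an empty string if it is not defined).

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Ozone.
# $1: o3fs URI of the form o3fs://bucket.volume.om-host[:port]/key
# $2: value of ozone.om.address, or empty if the key is not defined
resolve_o3fs_authority() {
  local uri="$1" om_address="${2:-}"
  local default_port=9862
  local authority bucket volume rest host port

  authority="${uri#o3fs://}"     # drop the scheme
  authority="${authority%%/*}"   # drop the key path

  # Bucket and volume names cannot contain a period, so the first two
  # dot-separated labels are the bucket and the volume.
  bucket="${authority%%.*}"
  rest="${authority#*.}"
  volume="${rest%%.*}"
  if [[ "$rest" == *.* ]]; then
    host="${rest#*.}"            # everything after the volume name
  else
    host=""                      # no OM host given in the URI
  fi

  if [[ "$host" == *:* ]]; then
    port="${host##*:}"           # explicit port in the URI wins
    host="${host%:*}"
  elif [[ "$om_address" == *:* ]]; then
    port="${om_address##*:}"     # only the port is taken from the config
  else
    port="$default_port"         # fall back to the default port
  fi

  printf 'bucket=%s volume=%s host=%s port=%s\n' "$bucket" "$volume" "$host" "$port"
}

resolve_o3fs_authority 'o3fs://bucket.volume.om-host.example.com/key' '0.0.0.0:6789'
# prints: bucket=bucket volume=volume host=om-host.example.com port=6789
```

Note that the `0.0.0.0` host from the config value never appears in the result, matching the note above.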


## Supporting older Hadoop versions (Legacy jar, BasicOzoneFileSystem)

There are two ozonefs jar files, both of which include all their dependencies:

 * share/ozone/lib/hadoop-ozone-filesystem-lib-current-VERSION.jar
 * share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-VERSION.jar

The first one contains all the dependencies required to use ozonefs with a
 compatible Hadoop version (Hadoop 3.2).

The second one contains all the dependencies in an internal, separate directory,
 and a special class loader is used to load all the classes from that location.

With this method, the hadoop-ozone-filesystem-lib-legacy.jar can be used with
 any older Hadoop version (e.g. Hadoop 3.1, Hadoop 2.7, or Spark with Hadoop 2.7).

As with the dependency jars, there are two OzoneFileSystem implementations.

For Hadoop 3.0 and newer, you can use `org.apache.hadoop.fs.ozone.OzoneFileSystem`,
 which is a full implementation of the Hadoop-compatible file system API.

For Hadoop 2.x you should use the Basic version: `org.apache.hadoop.fs.ozone.BasicOzoneFileSystem`.

This is the same implementation, but it doesn't include the features and dependencies that were added in
 Hadoop 3.0 (e.g. FS statistics, encryption zones).

### Summary

The following table summarizes which jar file and implementation should be used:

Hadoop version | Required jar            | OzoneFileSystem implementation
---------------|-------------------------|----------------------------------------------------
3.2            | filesystem-lib-current  | org.apache.hadoop.fs.ozone.OzoneFileSystem
3.1            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.OzoneFileSystem
2.9            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
2.7            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
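
The table can also be expressed as a lookup, which may be convenient in setup scripts. The following is a minimal sketch: the function name `ozonefs_for_hadoop` is hypothetical, and treating Hadoop 3.0 like 3.1 and every 2.x release like 2.7 and 2.9 is an assumption drawn from the text above rather than from the table rows. Versions not covered by this page are rejected.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the jar name prefix and the file system class
# for a given Hadoop version, following the table above.
ozonefs_for_hadoop() {
  local version="$1"
  local major minor jar impl
  major="${version%%.*}"     # "3.2.0" -> "3"
  minor="${version#*.}"      # "3.2.0" -> "2.0"
  minor="${minor%%.*}"       # "2.0"   -> "2"

  case "$major.$minor" in
    3.2)     jar=current; impl=OzoneFileSystem ;;
    3.0|3.1) jar=legacy;  impl=OzoneFileSystem ;;       # 3.0 is an assumption
    2.*)     jar=legacy;  impl=BasicOzoneFileSystem ;;  # all 2.x, per the text
    *)
      echo "Hadoop $version is not covered by the table" >&2
      return 1
      ;;
  esac

  printf '%s %s\n' "hadoop-ozone-filesystem-lib-$jar" "org.apache.hadoop.fs.ozone.$impl"
}

ozonefs_for_hadoop 2.7.3
# prints: hadoop-ozone-filesystem-lib-legacy org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
```

The second field of the output is the value to use for `fs.o3fs.impl` in `core-site.xml`.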