diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
index 7fcadb94c5..160aa46fcd 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
@@ -28,7 +28,7 @@ HADOOP_OPTIONAL_TOOLS in hadoop-env.sh has 'hadoop-aws' in the list.
 
 ### Features
 
-**NOTE: `s3:` is being phased out. Use `s3n:` or `s3a:` instead.**
+**NOTE: `s3:` has been phased out. Use `s3n:` or `s3a:` instead.**
 
 1. The second-generation, `s3n:` filesystem, making it easy to share
 data between hadoop and other applications via the S3 object store.
@@ -86,38 +86,6 @@ these instructions —and be aware that all issues related to S3 integration
 in EMR can only be addressed by Amazon themselves: please raise your issues
 with them.
 
-## S3
-
-The `s3://` filesystem is the original S3 store in the Hadoop codebase.
-It implements an inode-style filesystem atop S3, and was written to
-provide scaleability when S3 had significant limits on the size of blobs.
-It is incompatible with any other application's use of data in S3.
-
-It is now deprecated and will be removed in Hadoop 3. Please do not use,
-and migrate off data which is on it.
-
-### Dependencies
-
-* `jets3t` jar
-* `commons-codec` jar
-* `commons-logging` jar
-* `httpclient` jar
-* `httpcore` jar
-* `java-xmlbuilder` jar
-
-### Authentication properties
-
-fs.s3.awsAccessKeyId
-AWS access key ID
-
-
-
-fs.s3.awsSecretAccessKey
-AWS secret key
-
-
-
 ## S3N
 
 S3N was the first S3 Filesystem client which used "native" S3 objects, hence
@@ -171,16 +139,16 @@ it should be used wherever possible.
 ### Other properties
 
-fs.s3.buffer.dir
+fs.s3n.buffer.dir
 ${hadoop.tmp.dir}/s3
-Determines where on the local filesystem the s3:/s3n: filesystem
+Determines where on the local filesystem the s3n: filesystem
 should store files before sending them to S3
 (or after retrieving them from S3).
 
-fs.s3.maxRetries
+fs.s3n.maxRetries
 4
 The maximum number of retries for reading or writing files to S3,
 before we signal failure to the application.
@@ -188,7 +156,7 @@ it should be used wherever possible.
 
-fs.s3.sleepTimeSeconds
+fs.s3n.sleepTimeSeconds
 10
 The number of seconds to sleep between each S3 retry.
@@ -1011,7 +979,7 @@ includes `distcp`.
 
 ### `ClassNotFoundException: org.apache.hadoop.fs.s3a.S3AFileSystem`
 
-(or `org.apache.hadoop.fs.s3native.NativeS3FileSystem`, `org.apache.hadoop.fs.s3.S3FileSystem`).
+(or `org.apache.hadoop.fs.s3native.NativeS3FileSystem`).
 
 These are the Hadoop classes, found in the `hadoop-aws` JAR. An exception
 reporting one of these classes is missing means that this JAR is not on