From 4b6d795f28591df7761186bc5c8d31dcd1a5a99d Mon Sep 17 00:00:00 2001
From: Steve Loughran
Date: Fri, 9 Sep 2016 18:54:08 +0100
Subject: [PATCH] HADOOP-13540 improve section on troubleshooting s3a auth
 problems. Contributed by Steve Loughran

---
 .../site/markdown/tools/hadoop-aws/index.md | 101 ++++++++++++++----
 1 file changed, 83 insertions(+), 18 deletions(-)

diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
index 5100030195..7fcadb94c5 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
@@ -1023,7 +1023,7 @@ the classpath.
 
 This means that one or more of the `aws-*-sdk` JARs are missing. Add them.
 
-### Missing method in AWS class
+### Missing method in `com.amazonaws` class
 
 This can be triggered by incompatibilities between the AWS SDK on the
 classpath and the version which Hadoop was compiled with.
@@ -1047,23 +1047,84 @@ classpath. All Jackson JARs on the classpath *must* be of the same version.
 
 ### Authentication failure
 
-The general cause is: you have the wrong credentials —or somehow
+If Hadoop cannot authenticate with the S3 service endpoint,
+the client retries a number of times before eventually failing.
+When it finally gives up, it will report a message about signature mismatch:
+
+```
+com.amazonaws.services.s3.model.AmazonS3Exception:
+ The request signature we calculated does not match the signature you provided.
+ Check your key and signing method.
+ (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch,
+```
+
+The likely cause is that you either have the wrong credentials or somehow
 the credentials were not readable on the host attempting to read or write
 the S3 Bucket.
 
-There's not much that Hadoop can do for diagnostics here.
 Enabling debug logging for the package `org.apache.hadoop.fs.s3a`
-can help somewhat.
+can help provide more information.
 
-Most common: there's an error in the key or secret.
+The most common cause is that you have the wrong credentials for whichever
+authentication mechanism(s) are in use. However, there are a couple of system
+configuration problems (JVM version, system clock) which also need to be checked.
 
-Otherwise, try to use the AWS command line tools with the same credentials.
-If you set the environment variables, you can take advantage of S3A's support
-of environment-variable authentication by attempting to use the `hdfs fs` command
-to read or write data on S3. That is: comment out the `fs.s3a` secrets and rely on
-the environment variables.
+Most common: there's an error in the configuration properties.
 
-### Authentication failure when using URLs with embedded secrets
+
+1. Make sure that the name of the bucket is the correct one.
+That is: check the URL.
+
+1. Make sure the property names are correct. For S3A, they are
+`fs.s3a.access.key` and `fs.s3a.secret.key` —you cannot just copy the S3N
+properties and replace `s3n` with `s3a`.
+
+1. Make sure the properties are visible to the process attempting to
+talk to the object store. Placing them in `core-site.xml` is the standard
+mechanism; a minimal example follows this list.
+
+1. If using session authentication, the session may have expired.
+Generate a new session token and secret.
+
+1. If using environment variable-based authentication, make sure that the
+relevant variables are set in the environment in which the process is running.
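+
+For reference, a minimal `core-site.xml` entry might look like the following;
+the values here are placeholders rather than real credentials:
+
+```
+<property>
+  <name>fs.s3a.access.key</name>
+  <value>YOUR_ACCESS_KEY_ID</value>
+</property>
+<property>
+  <name>fs.s3a.secret.key</name>
+  <value>YOUR_SECRET_ACCESS_KEY</value>
+</property>
+```
+
+Keep files containing real secrets out of source control and world-readable
+locations.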
+
+The standard first step is: try to access the bucket from the command line,
+using the same credentials, through a command such as:
+
+    hadoop fs -ls s3a://my-bucket/
+
+Note the trailing "/" here; without that the shell thinks you are trying to list
+your home directory under the bucket, which will only exist if explicitly created.
+
+Attempting to list a bucket using inline credentials is a
+means of verifying that the key and secret can access a bucket:
+
+    hadoop fs -ls s3a://key:secret@my-bucket/
+
+Do escape any `+` or `/` symbols in the secret, as discussed below. Never share
+the URL or any logs generated using it, and never use such an inline
+authentication mechanism in production.
+
+Finally, if you set the environment variables, you can take advantage of S3A's
+support of environment-variable authentication by attempting the same `ls` operation.
+That is: unset the `fs.s3a` secrets and rely on the environment variables.
+
+#### Authentication failure due to clock skew
+
+Each request to S3 is signed with a timestamp, so as to
+defend against replay attacks. If the system clock is too far behind *or ahead*
+of Amazon's, requests will be rejected.
+
+This can surface as a situation where
+read requests are allowed, but operations which write to the bucket are denied.
+
+Check the system clock.
+
+#### Authentication failure when using URLs with embedded secrets
 
 If using the (strongly discouraged) mechanism of including the
 AWS Key and secret in a URL, then both "+" and "/" symbols need
@@ -1076,23 +1137,25 @@ encoding problems are not uncommon.
 | `+` | `%2B` |
 | `/` | `%2F` |
 
-That is, a URL for `bucket` with AWS ID `user1` and secret `a+b/c` would
+As an example, a URL for `bucket` with AWS ID `user1` and secret `a+b/c` would
 be represented as
 
 ```
-s3a://user1:a%2Bb%2Fc@bucket
+s3a://user1:a%2Bb%2Fc@bucket/
 ```
 
 This technique is only needed when placing secrets in the URL. Again,
 this is something users are strongly advised against using.
 
-### Authentication failures running on Java 8u60+
+#### Authentication failure when running on Java 8u60+
 
 A change in the Java 8 JVM broke some of the `toString()` string generation
 of Joda Time 2.8.0, which stopped the Amazon S3 client from being able
 to generate authentication headers suitable for validation by S3.
 
-Fix: make sure that the version of Joda Time is 2.8.1 or later.
+**Fix**: Make sure that the version of Joda Time is 2.8.1 or later, or
+use a more recent version of Java 8.
+
 
 ### "Bad Request" exception when working with AWS S3 Frankfurt, Seoul, or other "V4" endpoint
 
@@ -1291,10 +1354,12 @@ expense of sequential read performance and bandwidth.
 
 The slow performance of `rename()` surfaces during the commit phase of work,
 including
-* The MapReduce FileOutputCommitter.
-* DistCp's rename after copy operation.
+* The MapReduce `FileOutputCommitter`.
+* DistCp's rename-after-copy operation.
+* The `hadoop fs -rm` command renaming the file under `.Trash` rather than
+deleting it. Use `-skipTrash` to eliminate that step.
 
-Both these operations can be significantly slower when S3 is the destination
+These operations can be significantly slower when S3 is the destination
 compared to HDFS or other "real" filesystem.
 
 *Improving S3 load-balancing behavior*