81edbebdd8
Tune AWS v2 SDK changes based on testing with third party stores including GCS. Contains HADOOP-18889. S3A v2 SDK error translations and troubleshooting docs * Changes needed to work with multiple third party stores * New third_party_stores document on how to bind to and test third party stores, including google gcs (which works!) * Troubleshooting docs mostly updated for v2 SDK Exception translation/resilience * New AWSUnsupportedFeatureException for unsupported/unavailable errors * Handle 501 method unimplemented as one of these * Error codes > 500 mapped to the AWSStatus500Exception if no explicit handler. * Precondition errors handled a bit better * GCS throttle exception also recognized. * GCS raises 404 on a delete of a file which doesn't exist: swallow it. * Error translation uses reflection to create IOE of the right type. All IOEs at the bottom of an AWS stack chain are regenerated. then a new exception of that specific type is created, with the top level ex its cause. This is done to retain the whole stack chain. * Reduce the number of retries within the AWS SDK * And those of s3a code. * S3ARetryPolicy explicitly declare SocketException as connectivity failure but subclasses BindException * SocketTimeoutException also considered connectivity * Log at debug whenever retry policies looked up * Reorder exceptions to alphabetical order, with commentary * Review use of the Invoke.retry() method The reduction in retries is because its clear when you try to create a bucket which doesn't resolve that the time for even an UnknownHostException to eventually fail over 90s, which then hit the s3a retry code. - Reducing the SDK retries means these escalate to our code better. - Cutting back on our own retries makes it a bit more responsive for most real deployments. - maybeTranslateNetworkException() and s3a retry policy means that unknown host exception is recognised and fails fast. Contributed by Steve Loughran |
||
---|---|---|
.github | ||
.yetus | ||
dev-support | ||
hadoop-assemblies | ||
hadoop-build-tools | ||
hadoop-client-modules | ||
hadoop-cloud-storage-project | ||
hadoop-common-project | ||
hadoop-dist | ||
hadoop-hdfs-project | ||
hadoop-mapreduce-project | ||
hadoop-maven-plugins | ||
hadoop-minicluster | ||
hadoop-project | ||
hadoop-project-dist | ||
hadoop-tools | ||
hadoop-yarn-project | ||
licenses | ||
licenses-binary | ||
.asf.yaml | ||
.gitattributes | ||
.gitignore | ||
BUILDING.txt | ||
LICENSE-binary | ||
LICENSE.txt | ||
NOTICE-binary | ||
NOTICE.txt | ||
pom.xml | ||
README.txt | ||
start-build-env.sh |
For the latest information about Hadoop, please visit our website at: http://hadoop.apache.org/ and our wiki, at: https://cwiki.apache.org/confluence/display/HADOOP/