81edbebdd8
Tune AWS v2 SDK changes based on testing with third party stores including GCS. Contains HADOOP-18889. S3A v2 SDK error translations and troubleshooting docs * Changes needed to work with multiple third party stores * New third_party_stores document on how to bind to and test third party stores, including Google GCS (which works!) * Troubleshooting docs mostly updated for v2 SDK. Exception translation/resilience: * New AWSUnsupportedFeatureException for unsupported/unavailable errors * Handle 501 method unimplemented as one of these * Error codes > 500 are mapped to the AWSStatus500Exception if there is no explicit handler * Precondition errors handled a bit better * GCS throttle exception also recognized * GCS raises 404 on a delete of a file which doesn't exist: swallow it * Error translation uses reflection to create an IOE of the right type. All IOEs at the bottom of an AWS stack chain are regenerated: a new exception of that specific type is created, with the top-level exception as its cause. This is done to retain the whole stack chain. * Reduce the number of retries within the AWS SDK * And those of s3a code * S3ARetryPolicy explicitly declares SocketException as connectivity failure but subclasses BindException * SocketTimeoutException also considered connectivity * Log at debug whenever retry policies are looked up * Reorder exceptions to alphabetical order, with commentary * Review use of the Invoke.retry() method. The reduction in retries is because it's clear, when you try to create a bucket which doesn't resolve, that the time for even an UnknownHostException to eventually fail was over 90s, after which it hit the s3a retry code. - Reducing the SDK retries means these escalate to our code better. - Cutting back on our own retries makes it a bit more responsive for most real deployments. - maybeTranslateNetworkException() and the s3a retry policy mean that an unknown host exception is recognised and fails fast. Contributed by Steve Loughran |
||
---|---|---|
hadoop-aliyun | ||
hadoop-archive-logs | ||
hadoop-archives | ||
hadoop-aws | ||
hadoop-azure | ||
hadoop-azure-datalake | ||
hadoop-benchmark | ||
hadoop-datajoin | ||
hadoop-distcp | ||
hadoop-dynamometer | ||
hadoop-extras | ||
hadoop-federation-balance | ||
hadoop-fs2img | ||
hadoop-gridmix | ||
hadoop-kafka | ||
hadoop-openstack | ||
hadoop-pipes | ||
hadoop-resourceestimator | ||
hadoop-rumen | ||
hadoop-sls | ||
hadoop-streaming | ||
hadoop-tools-dist | ||
pom.xml |