36198b5edf
Differentiate from "EOF out of range/end of GET" from "EOF channel problems" through two different subclasses of EOFException and input streams to always retry on http channel errors; out of range GET requests are not retried. Currently an EOFException is always treated as a fail-fast call in read() This allows for all existing external code catching EOFException to handle both, but S3AInputStream to cleanly differentiate range errors (map to -1) from channel errors (retry) - HttpChannelEOFException is subclass of EOFException, so all code which catches EOFException is still happy. retry policy: connectivityFailure - RangeNotSatisfiableEOFException is the subclass of EOFException raised on 416 GET range errors. retry policy: fail - Method ErrorTranslation.maybeExtractChannelException() to create this from shaded/unshaded NoHttpResponseException, using string match to avoid classpath problems. - And do this for SdkClientExceptions with OpenSSL error code WFOPENSSL0035. We believe this is the OpenSSL equivalent. - ErrorTranslation.maybeExtractIOException() to perform this translation as appropriate. S3AInputStream.reopen() code retries on EOF, except on RangeNotSatisfiableEOFException, which is converted to a -1 response to the caller as is done historically. S3AInputStream knows to handle these with read(): HttpChannelEOFException: stream aborting close then retry lazySeek(): Map RangeNotSatisfiableEOFException to -1, but do not map any other EOFException class raised. This means that * out of range reads map to -1 * channel problems in reopen are retried * channel problems in read() abort the failed http connection so it isn't recycled Tests for this using/abusing mocking. Testing through actually raising 416 exceptions and verifying that readFully(), char read() and vector reads are all good. There is no attempt to recover within a readFully(); there's a boolean constant switch to turn this on, but if anyone does it a test will spin forever as the inner PositionedReadable.read(position, buffer, len) downgrades all EOF exceptions to -1. A new method would need to be added which controls whether to downgrade/rethrow exceptions. What does that mean? Possibly reduced resilience to non-retried failures on the inner stream, even though more channel exceptions are retried on. Contributed by Steve Loughran |
||
---|---|---|
.github | ||
.yetus | ||
dev-support | ||
hadoop-assemblies | ||
hadoop-build-tools | ||
hadoop-client-modules | ||
hadoop-cloud-storage-project | ||
hadoop-common-project | ||
hadoop-dist | ||
hadoop-hdfs-project | ||
hadoop-mapreduce-project | ||
hadoop-maven-plugins | ||
hadoop-minicluster | ||
hadoop-project | ||
hadoop-project-dist | ||
hadoop-tools | ||
hadoop-yarn-project | ||
licenses | ||
licenses-binary | ||
.asf.yaml | ||
.gitattributes | ||
.gitignore | ||
BUILDING.txt | ||
LICENSE-binary | ||
LICENSE.txt | ||
NOTICE-binary | ||
NOTICE.txt | ||
pom.xml | ||
README.txt | ||
start-build-env.sh |
For the latest information about Hadoop, please visit our website at: http://hadoop.apache.org/ and our wiki, at: https://cwiki.apache.org/confluence/display/HADOOP/