Steve Loughran 36198b5edf
HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions (#6425)
Differentiate "EOF out of range/end of GET" from
"EOF channel problems" through
two different subclasses of EOFException, and have input streams always
retry on HTTP channel errors; out-of-range GET requests are not retried.
Before this change, an EOFException was always treated as a fail-fast call in read().

This allows all existing external code catching EOFException to handle
both, while S3AInputStream can cleanly differentiate range errors (mapped to -1)
from channel errors (retried).

- HttpChannelEOFException is a subclass of EOFException, so all code
  which catches EOFException is still happy.
  retry policy: connectivityFailure
- RangeNotSatisfiableEOFException is the subclass of EOFException
  raised on 416 GET range errors.
  retry policy: fail
- Method ErrorTranslation.maybeExtractChannelException() creates an
  HttpChannelEOFException from a shaded/unshaded NoHttpResponseException,
  using a string match on the class name to avoid classpath problems.
- The same is done for SdkClientExceptions with OpenSSL error code WFOPENSSL0035.
  We believe this is the OpenSSL equivalent.
- ErrorTranslation.maybeExtractIOException() to perform this translation as
  appropriate.
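
A minimal sketch of the classification described in the list above. This is
an illustration, not the code in the patch: the class ChannelErrorSketch, the
two constants and the simplified method signature are assumptions, and the
real ErrorTranslation methods may take different arguments.

```java
import java.io.EOFException;

public class ChannelErrorSketch {

  /** EOF caused by a broken HTTP channel; retry policy: connectivityFailure. */
  static class HttpChannelEOFException extends EOFException {
    HttpChannelEOFException(String message, Throwable cause) {
      super(message);
      initCause(cause);
    }
  }

  /** EOF raised on a 416 GET range error; retry policy: fail. */
  static class RangeNotSatisfiableEOFException extends EOFException {
    RangeNotSatisfiableEOFException(String message, Throwable cause) {
      super(message);
      initCause(cause);
    }
  }

  // Hypothetical constants. Matching on the class name as a string means the
  // shaded and unshaded exception classes both match, and neither has to be
  // on the classpath.
  static final String NO_HTTP_RESPONSE = "NoHttpResponseException";
  static final String OPENSSL_ERROR = "WFOPENSSL0035";

  /**
   * Walk the cause chain looking for a channel failure.
   * Returns a new HttpChannelEOFException if one is found, else null.
   */
  static HttpChannelEOFException maybeExtractChannelException(
      String path, Throwable thrown) {
    int depth = 0;
    // The depth limit guards against a cyclic cause chain.
    for (Throwable t = thrown; t != null && depth < 32; t = t.getCause()) {
      depth++;
      String className = t.getClass().getName();
      String message = t.getMessage();
      boolean noResponse = className.endsWith(NO_HTTP_RESPONSE);
      boolean openSslError = message != null && message.contains(OPENSSL_ERROR);
      if (noResponse || openSslError) {
        return new HttpChannelEOFException(path + ": " + t, thrown);
      }
    }
    return null;
  }
}
```

Because both classes extend EOFException, a caller with an existing
catch (EOFException e) clause sees no change in behaviour; only code that
wants to tell the two cases apart needs to catch the subclasses.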

S3AInputStream.reopen() code retries on EOF, except on
RangeNotSatisfiableEOFException, which is converted to a -1 response
to the caller, as has been done historically.

S3AInputStream handles these as follows:
 read(): on HttpChannelEOFException, abort the stream when closing it, then retry.
 lazySeek(): map RangeNotSatisfiableEOFException to -1, but do not map
  any other EOFException class raised.

This means that
* out of range reads map to -1
* channel problems in reopen are retried
* channel problems in read() abort the failed http connection so it
  isn't recycled
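
The three outcomes above can be sketched as follows. This is a simplified
model under stated assumptions, not the S3AInputStream implementation: the
Source interface, the ReadRecoverySketch class and the fixed attempt limit
are all hypothetical, and the real stream delegates retries to its retry
policy rather than a counter.

```java
import java.io.EOFException;
import java.io.IOException;

public class ReadRecoverySketch {

  static class HttpChannelEOFException extends EOFException {
    HttpChannelEOFException(String message) { super(message); }
  }

  static class RangeNotSatisfiableEOFException extends EOFException {
    RangeNotSatisfiableEOFException(String message) { super(message); }
  }

  /** Hypothetical stand-in for the HTTP object stream being wrapped. */
  interface Source {
    void open(long pos) throws IOException; // may raise either EOF subclass
    int read() throws IOException;          // may raise HttpChannelEOFException
    void abort();                           // drop the connection, do not pool it
  }

  private final Source source;
  private final int maxAttempts;
  private long pos;
  private boolean opened;

  ReadRecoverySketch(Source source, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    this.source = source;
    this.maxAttempts = maxAttempts;
  }

  /** Open the source if needed; returns false when the range is out of bounds. */
  private boolean lazySeek() throws IOException {
    if (opened) {
      return true;
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        source.open(pos);
        opened = true;
        return true;
      } catch (RangeNotSatisfiableEOFException e) {
        return false;   // out of range: the caller maps this to -1
      } catch (HttpChannelEOFException e) {
        last = e;       // channel problem in reopen: retry
      }
    }
    throw last;
  }

  public int read() throws IOException {
    for (int attempt = 1; ; attempt++) {
      if (!lazySeek()) {
        return -1;
      }
      try {
        int b = source.read();
        if (b >= 0) {
          pos++;
        }
        return b;
      } catch (HttpChannelEOFException e) {
        // Abort so the failed HTTP connection is not recycled, then retry.
        source.abort();
        opened = false;
        if (attempt >= maxAttempts) {
          throw e;
        }
      }
    }
  }
}
```

Any other EOFException raised by the source is not caught here and
propagates to the caller unchanged.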

Tests for this use (and abuse) mocking.

There is also testing through actually raising 416 exceptions and verifying that
readFully(), char read() and vector reads are all good.

There is no attempt to recover within a readFully(); there's
a boolean constant switch to turn this on, but if anyone enables
it, a test will spin forever, as the inner PositionedReadable.read(position, buffer, len)
downgrades all EOF exceptions to -1.
A new method would need to be added to control whether exceptions are
downgraded or rethrown.

What does that mean? Possibly reduced resilience to non-retried failures
on the inner stream, even though more channel exceptions are retried on.

Contributed by Steve Loughran
2024-01-16 14:14:03 +00:00
.github HADOOP-18823. Add Labeler Github Action. (#5874). Contributed by Ayush Saxena. 2023-07-25 03:04:49 +05:30
.yetus Add .yetus/excludes.txt (#4984) 2022-10-11 09:23:34 -07:00
dev-support HADOOP-19034. Fix Download Maven Url Not Found. (#6438). Contributed by Shilun Fan. 2024-01-14 18:30:40 +08:00
hadoop-assemblies HDFS-15346. FedBalance tool implementation. Contributed by Jinglun. 2020-06-18 13:33:25 +08:00
hadoop-build-tools HADOOP-17968 Migrate checkstyle module illegalimport to maven enforcer banned-illegal-imports (#3584) 2021-10-28 15:57:15 +09:00
hadoop-client-modules HADOOP-18916. Exclude all module-info classes from uber jars (#6131) 2023-10-13 20:01:44 +01:00
hadoop-cloud-storage-project HADOOP-18890. Remove use of okhttp in runtime code (#6057) 2023-09-19 12:38:36 +01:00
hadoop-common-project HADOOP-19040. mvn site commands fails due to MetricsSystem And MetricsSystemImpl changes. (#6450) Contributed by Shilun Fan. 2024-01-16 22:11:16 +08:00
hadoop-dist HADOOP-18718. Fix several maven build warnings (#5592). Contributed by Dongjoon Hyun. 2023-06-11 11:38:13 +05:30
hadoop-hdfs-project HDFS-17291. DataNode metric bytesWritten is not totally accurate in some situations. (#6360). Contributed by farmmamba. 2024-01-13 20:45:00 +08:00
hadoop-mapreduce-project MAPREDUCE-7468. [Addendum] Fix TestMapReduceChildJVM unit tests. (#6451) 2024-01-15 14:24:56 +01:00
hadoop-maven-plugins HADOOP-19011. Possible ConcurrentModificationException if Exec command fails (#6353) 2023-12-14 19:46:19 +01:00
hadoop-minicluster HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-project HADOOP-18540. Upgrade Bouncy Castle to 1.70 (#5166) 2024-01-01 19:04:06 +00:00
hadoop-project-dist HADOOP-18751. Fix incorrect output path in javadoc build phase (#5688) 2023-06-26 15:52:17 -07:00
hadoop-tools HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions (#6425) 2024-01-16 14:14:03 +00:00
hadoop-yarn-project YARN-11638. [GPG] GPG Support CLI. (#6396) Contributed by Shilun Fan. 2024-01-16 21:49:51 +08:00
licenses HADOOP-17144. Update Hadoop's lz4 to v1.9.2. Contributed by Hemanth Boyina. 2020-10-18 18:37:46 +05:30
licenses-binary HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
.asf.yaml HADOOP-18630. Add gh-pages in asf.yaml to deploy the current trunk doc (#5393). Contributed by Simhadri Govindappa. 2023-02-14 18:13:29 +05:30
.gitattributes HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin. 2016-09-14 11:14:31 +09:00
.gitignore HADOOP-18963. Fix typos in .gitignore (#6243) 2023-11-04 05:12:39 +05:30
BUILDING.txt HADOOP-18487. Protobuf 2.5 removal part 2: stop exporting protobuf-2.5 (#6185) 2023-11-06 17:52:05 +00:00
LICENSE-binary HADOOP-18540. Upgrade Bouncy Castle to 1.70 (#5166) 2024-01-01 19:04:06 +00:00
LICENSE.txt YARN-11356. Upgrade DataTables to 1.11.5 to fix CVEs. Contributed by Bence Kosztolnik. 2022-10-26 22:29:01 +02:00
NOTICE-binary HADOOP-18890. Remove use of okhttp in runtime code (#6057) 2023-09-19 12:38:36 +01:00
NOTICE.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
pom.xml HADOOP-18957. Use StandardCharsets.UTF_8 (#6231). Contributed by PJ Fanning. 2023-11-20 23:44:48 +05:30
README.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
start-build-env.sh HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817) 2021-12-23 18:13:18 +09:00

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/