hadoop/hadoop-common-project/hadoop-common/src/main
Steve Loughran 87fb977777
HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604
Clarifies behaviour of VectorIO methods with contract tests as well as
specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads were broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
  toString() output logs this.
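
The precondition range checks listed above might look roughly like the following sketch. It is not the Hadoop implementation: the class, the `Range` type and the `validateAndSort` method are hypothetical names, and the specific rules (no negative offsets or lengths, no overlaps, no range extending past a known file length) are assumptions about what consistent validation could mean.

```java
import java.io.EOFException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch; none of these names come from the Hadoop codebase.
class RangeValidationSketch {

  /** Hypothetical stand-in for a file range: an offset plus a length. */
  public static final class Range {
    public final long offset;
    public final int length;

    public Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /**
   * Return a copy of the input sorted by offset, rejecting invalid ranges.
   * A negative fileLength means "length unknown", so the EOF check is skipped.
   */
  public static List<Range> validateAndSort(List<Range> input, long fileLength)
      throws EOFException {
    Objects.requireNonNull(input, "null range list");
    List<Range> sorted = new ArrayList<>(input);
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
    Range prev = null;
    for (Range r : sorted) {
      if (r.offset < 0) {
        throw new IllegalArgumentException("negative offset: " + r.offset);
      }
      if (r.length < 0) {
        throw new IllegalArgumentException("negative length: " + r.length);
      }
      // Assumed rule: ranges must not overlap once sorted.
      if (prev != null && r.offset < prev.offset + prev.length) {
        throw new IllegalArgumentException("overlapping ranges at offset " + r.offset);
      }
      // Assumed rule: a range may not extend past the end of the file.
      if (fileLength >= 0 && r.offset + r.length > fileLength) {
        throw new EOFException("range at offset " + r.offset + " extends past EOF");
      }
      prev = r;
    }
    return sorted;
  }
}
```

Validating up front means every implementation can reject a bad request before any IO is started, rather than failing partway through an asynchronous read.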

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting only calls completeExceptionally() on those ranges
  which had not yet read their data in.
* Improved logging.
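
As an illustration of "maps read() returning -1 to failure", a read loop can treat a premature end of stream as an error instead of returning a short result. This is a minimal sketch over a plain `java.io.InputStream`; the class and method names are hypothetical, and the S3A code itself may be structured differently.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; not the S3A implementation.
class ReadFullySketch {

  /**
   * Read exactly `length` bytes into dest, starting at `offset`.
   * InputStream.read() returns -1 at end of stream; here that is
   * converted into an EOFException so callers never see a short read.
   */
  public static void readFully(InputStream in, byte[] dest, int offset, int length)
      throws IOException {
    int done = 0;
    while (done < length) {
      int n = in.read(dest, offset + done, length - done);
      if (n < 0) {
        throw new EOFException("EOF after " + done + " of " + length + " bytes");
      }
      done += n;
    }
  }
}
```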

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.

+ AbstractSTestS3AHugeFiles enhancements.
+ ADDENDUM: test fix in ITestS3AContractVectoredRead

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback
implementation
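
The commit text does not say what was wrong with off-heap reads. One common way such a bug arises, assumed here purely for illustration, is code that calls `ByteBuffer.array()` on a direct buffer, which has no backing array. The sketch below handles both cases; all names in it are hypothetical and it is not the actual fix.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch; not the fallback implementation in Hadoop.
class BufferReadSketch {

  /** Read `length` bytes from the stream into the buffer at its current position. */
  public static void readInto(InputStream in, ByteBuffer buffer, int length)
      throws IOException {
    if (length > buffer.remaining()) {
      throw new IllegalArgumentException("buffer too small for " + length + " bytes");
    }
    if (buffer.hasArray()) {
      // Heap buffer: read straight into the backing array.
      int base = buffer.arrayOffset() + buffer.position();
      readFully(in, buffer.array(), base, length);
      buffer.position(buffer.position() + length);
    } else {
      // Direct (off-heap) buffer: array() would throw, so copy via a temporary array.
      byte[] tmp = new byte[length];
      readFully(in, tmp, 0, length);
      buffer.put(tmp);
    }
  }

  /** Read exactly `length` bytes, failing with EOFException on a short stream. */
  private static void readFully(InputStream in, byte[] dest, int offset, int length)
      throws IOException {
    int done = 0;
    while (done < length) {
      int n = in.read(dest, offset + done, length - done);
      if (n < 0) {
        throw new EOFException("EOF after " + done + " of " + length + " bytes");
      }
      done += n;
    }
  }
}
```

Either branch leaves the buffer position advanced by `length`, so callers can `flip()` the buffer the same way regardless of how it was allocated.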

Contributed by Steve Loughran

Change-Id: Ia4ed71864c595f175c275aad83a2ff5741693432
2024-04-03 13:17:52 +01:00
arm-java/org/apache/hadoop/ipc/protobuf HADOOP-17046. Support downstreams' existing Hadoop-rpc implementations using non-shaded protobuf classes (#2026) 2020-06-12 23:16:33 +05:30
bin HADOOP-18779. Improve hadoop-function.sh#status script. (#5762) 2023-07-03 08:46:57 -07:00
conf HADOOP-18836. Some properties are missing from hadoop-policy.xml (#5922) 2023-08-07 20:03:23 +08:00
java/org/apache/hadoop HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604 2024-04-03 13:17:52 +01:00
native HADOOP-14451. Deadlock in NativeIO (#6632) 2024-03-18 10:53:21 +05:30
proto HDFS-16669: Enhance client protocol to propagate last seen state IDs for multiple nameservices. 2022-08-23 11:12:50 -07:00
resources HADOOP-19097. S3A: Set fs.s3a.connection.establish.timeout to 30s (#6601) 2024-03-05 10:10:27 +00:00
webapps/static
winutils HADOOP-17931. Fix typos in usage message in winutils.exe (#3490) 2021-09-27 13:41:55 -07:00
xsl