Steve Loughran 87fb977777
HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604
Clarifies behaviour of VectorIO methods with contract tests as well as
specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads were broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
  toString() output logs this.
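
The precondition range checks listed above can be illustrated with a minimal, JDK-only sketch. It is not the Hadoop code: `Range`, `RangeChecks` and `validateAndSort` are hypothetical names, and the exception types chosen for each failure are assumptions made for illustration.

```java
import java.io.EOFException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a file range; not Hadoop's FileRange. */
final class Range {
  final long offset;
  final int length;

  Range(long offset, int length) {
    this.offset = offset;
    this.length = length;
  }
}

/** Hypothetical helper showing one way to validate a list of ranges. */
final class RangeChecks {
  private RangeChecks() {
  }

  /**
   * Returns a copy of the ranges sorted by offset after checking that
   * none is null, negative, overlapping, or past the end of the file.
   * The exception types used here are an assumption of this sketch.
   */
  static List<Range> validateAndSort(List<Range> ranges, long fileLength)
      throws EOFException {
    if (ranges == null) {
      throw new NullPointerException("null range list");
    }
    List<Range> sorted = new ArrayList<>(ranges);
    for (Range r : sorted) {
      if (r == null) {
        throw new NullPointerException("null range");
      }
      if (r.offset < 0 || r.length < 0) {
        throw new IllegalArgumentException("negative offset or length");
      }
    }
    sorted.sort(Comparator.comparingLong(x -> x.offset));
    Range prev = null;
    for (Range r : sorted) {
      // Once sorted, a range overlaps if it starts before its predecessor ends.
      if (prev != null && r.offset < prev.offset + prev.length) {
        throw new IllegalArgumentException("overlapping ranges");
      }
      if (r.offset + r.length > fileLength) {
        throw new EOFException("range extends beyond end of file");
      }
      prev = r;
    }
    return sorted;
  }
}
```

Sorting a copy leaves the caller's list untouched, and sorting first means the overlap check only has to compare each range with its predecessor.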

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting only calls completeExceptionally() on those ranges
  which had not yet read data in.
* Improved logging.
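
As an illustration of mapping a read() result of -1 to a failure, here is a minimal, JDK-only sketch rather than the S3A code. `ReadHelper` and `readFully` are hypothetical names. The sketch copies through a temporary byte array so that it also works for direct (off-heap) buffers, which have no backing array.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical helper; not part of Hadoop. */
final class ReadHelper {
  private ReadHelper() {
  }

  /**
   * Reads exactly {@code length} bytes from the stream into the buffer.
   * A premature end of stream is reported as an EOFException instead of
   * being silently treated as a short read.
   */
  static void readFully(InputStream in, ByteBuffer dest, int length)
      throws IOException {
    // Temporary heap array: dest.array() is unavailable on direct buffers.
    byte[] chunk = new byte[Math.min(length, 8192)];
    int remaining = length;
    while (remaining > 0) {
      int n = in.read(chunk, 0, Math.min(chunk.length, remaining));
      if (n < 0) {
        throw new EOFException("end of stream with " + remaining
            + " bytes still to read");
      }
      dest.put(chunk, 0, n);
      remaining -= n;
    }
  }
}
```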

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.
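
The locking pattern described for readVectored() (a synchronized invocation that only schedules work, with the retrievals themselves running unsynchronized) can be sketched with plain JDK classes. This is not the S3A implementation: `SketchStream`, its in-memory byte array and the `sequentialStreamOpen` flag are hypothetical stand-ins for a real stream and its open connection.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical in-memory stream used only to show the locking pattern. */
final class SketchStream {
  private final byte[] data;
  // Stand-in for "a long-lived sequential connection is currently open".
  private boolean sequentialStreamOpen = true;

  SketchStream(byte[] data) {
    this.data = data.clone();
  }

  /**
   * Synchronized only while the reads are being scheduled; each read
   * then runs asynchronously without holding the stream's lock.
   */
  synchronized List<CompletableFuture<ByteBuffer>> readVectored(
      int[] offsets, int[] lengths) {
    // Mirror "closes input stream on invocation": drop the sequential stream.
    sequentialStreamOpen = false;
    List<CompletableFuture<ByteBuffer>> results = new ArrayList<>();
    for (int i = 0; i < offsets.length; i++) {
      final int off = offsets[i];
      final int len = lengths[i];
      // An invalid range fails inside the task, so only that range's
      // future completes exceptionally.
      results.add(CompletableFuture.supplyAsync(
          () -> ByteBuffer.wrap(data, off, len).slice()));
    }
    return results;
  }

  synchronized boolean isSequentialStreamOpen() {
    return sequentialStreamOpen;
  }
}
```

A caller would wait on each returned future, for example with `join()`, and a failure in one range would not affect the futures of the others.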

+ AbstractSTestS3AHugeFiles enhancements.
+ ADDENDUM: test fix in ITestS3AContractVectoredRead

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback
implementation

Contributed by Steve Loughran

Change-Id: Ia4ed71864c595f175c275aad83a2ff5741693432
2024-04-03 13:17:52 +01:00
.github HADOOP-18823. Add Labeler Github Action. (#5874). Contributed by Ayush Saxena. 2023-07-25 03:04:49 +05:30
.yetus Add .yetus/excludes.txt (#4984) 2022-10-11 09:23:34 -07:00
dev-support HADOOP-19127. Do not run unit tests on Windows pre-commit CI (#6672) 2024-03-25 09:16:03 -07:00
hadoop-assemblies HADOOP-18088. Replace log4j 1.x with reload4j. (#4052) 2024-02-13 16:33:51 +00:00
hadoop-build-tools Preparing for 3.5.0 development (#6411) 2024-01-19 15:05:22 +08:00
hadoop-client-modules HADOOP-19024. Use bouncycastle jdk18 1.77 (#6410). Contributed 2024-03-30 19:58:12 +05:30
hadoop-cloud-storage-project HADOOP-19024. Use bouncycastle jdk18 1.77 (#6410). Contributed 2024-03-30 19:58:12 +05:30
hadoop-common-project HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604 2024-04-03 13:17:52 +01:00
hadoop-dist Preparing for 3.5.0 development (#6411) 2024-01-19 15:05:22 +08:00
hadoop-hdfs-project HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604 2024-04-03 13:17:52 +01:00
hadoop-mapreduce-project HADOOP-19077. Remove use of javax.ws.rs.core.HttpHeaders (#6554). Contributed by PJ Fanning 2024-04-01 12:43:39 +05:30
hadoop-maven-plugins HADOOP-19041. Use StandardCharsets in more places (#6449) 2024-03-28 23:17:18 -04:00
hadoop-minicluster Preparing for 3.5.0 development (#6411) 2024-01-19 15:05:22 +08:00
hadoop-project HADOOP-19123. Update to commons-configuration2 2.10.1 due to CVE (#6661). Contributed by PJ Fanning 2024-04-03 01:20:00 +05:30
hadoop-project-dist HADOOP-19112. Hadoop 3.4.0 release wrap-up. (#6640) Contributed by Shilun Fan. 2024-03-19 20:08:03 +08:00
hadoop-tools HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604 2024-04-03 13:17:52 +01:00
hadoop-yarn-project YARN-11663. [Federation] Add Cache Entity Nums Limit. (#6662) Contributed by Shilun Fan. 2024-04-02 07:47:59 +08:00
licenses HADOOP-17144. Update Hadoop's lz4 to v1.9.2. Contributed by Hemanth Boyina. 2020-10-18 18:37:46 +05:30
licenses-binary HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
.asf.yaml HADOOP-18630. Add gh-pages in asf.yaml to deploy the current trunk doc (#5393). Contributed by Simhadri Govindappa. 2023-02-14 18:13:29 +05:30
.gitattributes HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin. 2016-09-14 11:14:31 +09:00
.gitignore HADOOP-18963. Fix typos in .gitignore (#6243) 2023-11-04 05:12:39 +05:30
BUILDING.txt YARN-11657. Remove protobuf-2.5 from hadoop-yarn-api module (#6575) (#6580) 2024-03-05 11:01:14 +00:00
LICENSE-binary HADOOP-19123. Update to commons-configuration2 2.10.1 due to CVE (#6661). Contributed by PJ Fanning 2024-04-03 01:20:00 +05:30
LICENSE.txt YARN-11356. Upgrade DataTables to 1.11.5 to fix CVEs. Contributed by Bence Kosztolnik. 2022-10-26 22:29:01 +02:00
NOTICE-binary HADOOP-19046. S3A: update AWS V2 SDK to 2.23.5; v1 to 1.12.599 (#6467) 2024-01-21 19:00:34 +00:00
NOTICE.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
pom.xml Preparing for 3.5.0 development (#6411) 2024-01-19 15:05:22 +08:00
README.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
start-build-env.sh HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817) 2021-12-23 18:13:18 +09:00

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/