hadoop

Steve Loughran e1842b2a74 HADOOP-18103. Add a high-performance vectored read API. (#4476)	2022-06-22 18:19:23 +01:00

This feature adds methods for ranged vectored read operations in PositionedReadable. All streams that implement that interface support the new API.

The default implementation reads each range in the vector sequentially. However, specific implementations may provide higher-performance versions. This is done in two places:

* Local FileSystem/Checksum FileSystem
* The S3A client

The S3A client first coalesces adjacent and "nearby" ranges together, then fetches each range in separate HTTP GET requests, executed in parallel. As such, it delivers significant speedups to applications reading separate blocks of data from the same file, columnar data format libraries in particular.

This is the merge commit of the feature branch; the work is in:

HADOOP-11867. Add a high-performance vectored read API.
HADOOP-18104. S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads.
HADOOP-18107. Adding scale test for vectored reads for large file
HADOOP-18105. Implement buffer pooling with weak references.
HADOOP-18106. Handle memory fragmentation in S3A Vectored IO.

Contributed By: Owen O'Malley and Mukund Thakur

.github	HADOOP-17799. Improve the GitHub pull request template (#3277)	2021-08-14 21:16:15 +09:00
dev-support	HADOOP-11867. Add a high-performance vectored read API. (#3904)	2022-06-22 17:29:32 +01:00
hadoop-assemblies	HDFS-15346. FedBalance tool implementation. Contributed by Jinglun.	2020-06-18 13:33:25 +08:00
hadoop-build-tools	HADOOP-17968 Migrate checkstyle module illegalimport to maven enforcer banned-illegal-imports (#3584)	2021-10-28 15:57:15 +09:00
hadoop-client-modules	HDFS-16453. Upgrade okhttp from 2.7.5 to 4.9.3 (#4229)	2022-05-21 02:53:14 +09:00
hadoop-cloud-storage-project	HADOOP-18159. Bump cos_api-bundle to 5.6.69 to update public-suffix-list.txt (#4444)	2022-06-15 20:03:26 +01:00
hadoop-common-project	HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445)	2022-06-22 17:29:32 +01:00
hadoop-dist	Preparing for 3.4.0 development	2020-03-29 23:24:25 +05:30
hadoop-hdfs-project	HDFS-16616. remove use of org.apache.hadoop.util.Sets (#4400)	2022-06-22 10:17:36 +05:30
hadoop-mapreduce-project	MAPREDUCE-7391. TestLocalDistributedCacheManager failing after HADOOP-16202 (#4472)	2022-06-22 12:52:41 +01:00
hadoop-maven-plugins	HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000)	2022-03-08 17:27:04 +09:00
hadoop-minicluster	HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000)	2022-03-08 17:27:04 +09:00
hadoop-project	HADOOP-11867. Add a high-performance vectored read API. (#3904)	2022-06-22 17:29:32 +01:00
hadoop-project-dist	HADOOP-18198. Release 3.3.3: release notes and jdiff files.	2022-05-17 19:00:54 +01:00
hadoop-tools	HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445)	2022-06-22 17:29:32 +01:00
hadoop-yarn-project	YARN-11188. Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured. Contributed by Szilard Nemeth.	2022-06-22 14:40:20 +02:00
licenses	HADOOP-17144. Update Hadoop's lz4 to v1.9.2. Contributed by Hemanth Boyina.	2020-10-18 18:37:46 +05:30
licenses-binary	HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796 )	2020-01-09 16:24:58 +09:00
.asf.yaml	HADOOP-17234. Add .asf.yaml to allow Github to Jira integration. (#2253). Contributed by Ayush Saxena.	2020-08-28 17:22:46 +05:30
.gitattributes	HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin.	2016-09-14 11:14:31 +09:00
.gitignore	YARN-10407. Add phantomjsdriver.log to gitignore. (#2244)	2020-09-01 10:44:55 +09:00
BUILDING.txt	Update BUILDING.txt (#3811)	2021-12-22 13:08:14 +08:00
LICENSE-binary	HDFS-16453. Upgrade okhttp from 2.7.5 to 4.9.3 (#4229)	2022-05-21 02:53:14 +09:00
LICENSE.txt	HADOOP-18044. Hadoop - Upgrade to jQuery 3.6.0 (#3791)	2022-01-12 11:40:32 +08:00
NOTICE-binary	HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)	2022-01-18 10:31:28 +00:00
NOTICE.txt	HADOOP-15958. Revisiting LICENSE and NOTICE files.	2019-08-27 13:47:12 +09:00
pom.xml	HADOOP-11867. Add a high-performance vectored read API. (#3904)	2022-06-22 17:29:32 +01:00
README.txt	HADOOP-15958. Revisiting LICENSE and NOTICE files.	2019-08-27 13:47:12 +09:00
start-build-env.sh	HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817)	2021-12-23 18:13:18 +09:00

README.txt

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/