hadoop/hadoop-tools
Steve Loughran 936e9e15d0
MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519)
This modifies the manifest committer so that the list of files
to rename is passed between stages as a file of serialized
Writable entries on the local filesystem, not as an in-memory list.

The map of directories to create is still passed in memory.
This map is built across all tasks, so even if many tasks
create files, as long as they all write into the same set of
directories the memory needed is O(directories), independent
of the task count.
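
The approach described above can be illustrated with a minimal, self-contained
sketch. It is not the committer's code: the class RenameSpillSketch, the
FileEntry type, and the methods spillTask and replay are hypothetical names
invented for illustration, and plain java.io data streams stand in for Hadoop's
Writable serialization. It only shows the general pattern: rename entries are
appended to a local file and streamed back one at a time, while parent
directories are de-duplicated in an in-memory set.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public class RenameSpillSketch {

  /** Hypothetical stand-in for one "file to rename" record. */
  static final class FileEntry {
    final String source;
    final String dest;

    FileEntry(String source, String dest) {
      this.source = source;
      this.dest = dest;
    }
  }

  /** Append one task's entries to the spill file and record their parent directories. */
  static void spillTask(DataOutputStream out, List<FileEntry> entries, Set<String> dirs)
      throws IOException {
    for (FileEntry e : entries) {
      out.writeUTF(e.source);
      out.writeUTF(e.dest);
      int slash = e.dest.lastIndexOf('/');
      // The set grows with the number of distinct directories, not with tasks or files.
      dirs.add(slash > 0 ? e.dest.substring(0, slash) : "/");
    }
  }

  /** Stream entries back one at a time, so only a single entry is on the heap at once. */
  static long replay(Path spill, Consumer<FileEntry> action) throws IOException {
    long count = 0;
    try (BufferedInputStream buf = new BufferedInputStream(Files.newInputStream(spill));
         DataInputStream in = new DataInputStream(buf)) {
      while (true) {
        buf.mark(1);
        if (buf.read() < 0) {
          break; // clean end of file, at a record boundary
        }
        buf.reset();
        action.accept(new FileEntry(in.readUTF(), in.readUTF()));
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    Path spill = Files.createTempFile("rename-entries", ".bin");
    Set<String> dirs = new HashSet<>();
    try (DataOutputStream out = new DataOutputStream(
        new BufferedOutputStream(Files.newOutputStream(spill)))) {
      // Three simulated tasks, all writing into the same two directories.
      for (int task = 0; task < 3; task++) {
        spillTask(out, Arrays.asList(
            new FileEntry("/job/tmp/t" + task + "/a", "/out/year=2023/a-" + task),
            new FileEntry("/job/tmp/t" + task + "/b", "/out/year=2022/b-" + task)), dirs);
      }
    }
    long renamed = replay(spill, e -> System.out.println(e.source + " -> " + e.dest));
    System.out.println("entries replayed: " + renamed + ", directories held in memory: " + dirs.size());
    Files.delete(spill);
  }
}
```

In this sketch the heap cost of the rename list is bounded by one entry plus
stream buffers, whereas the directory set still scales with the number of
distinct destination directories, which mirrors the trade-off the commit
message describes.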

The _SUCCESS file reports heap size through gauges.
These should give a warning if a job is running into memory problems.
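
As a rough illustration of what sampling heap size into gauges can look like,
here is a small sketch using only the standard JVM management API. The class
HeapGaugeSketch, the method sampleHeapGauges, and the gauge names are
assumptions made for this example; they are not the names the committer
records in _SUCCESS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.Map;
import java.util.TreeMap;

public class HeapGaugeSketch {

  /** Sample current JVM heap figures into a map of gauge name to value in bytes. */
  static Map<String, Long> sampleHeapGauges(String stage) {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    Map<String, Long> gauges = new TreeMap<>();
    // Gauge names are hypothetical, chosen only for this sketch.
    gauges.put(stage + ".heap.used.bytes", heap.getUsed());
    gauges.put(stage + ".heap.committed.bytes", heap.getCommitted());
    gauges.put(stage + ".heap.max.bytes", heap.getMax()); // -1 if the maximum is undefined
    return gauges;
  }

  public static void main(String[] args) {
    sampleHeapGauges("commit").forEach((name, value) -> System.out.println(name + " = " + value));
  }
}
```

Comparing such samples across stages, or across jobs of different sizes, is one
way to spot heap use that grows with the number of files committed.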

Contributed by Steve Loughran
2023-06-12 13:43:43 +01:00
hadoop-aliyun HADOOP-18641. Cloud connector dependency and LICENSE fixup. (#5429) 2023-02-28 14:05:13 +00:00
hadoop-archive-logs HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-archives HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-aws MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519) 2023-06-12 13:43:43 +01:00
hadoop-azure MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519) 2023-06-12 13:43:43 +01:00
hadoop-azure-datalake HADOOP-18641. Cloud connector dependency and LICENSE fixup. (#5429) 2023-02-28 14:05:13 +00:00
hadoop-benchmark HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076) 2022-11-08 13:35:42 +00:00
hadoop-datajoin HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-distcp HADOOP-18582. Addendum: Skip unnecessary cleanup logic in DistCp. (#5409) 2023-02-22 19:32:05 +00:00
hadoop-dynamometer HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-extras HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-fs2img HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-gridmix HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-kafka HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-openstack HADOOP-18442. Remove openstack support (#4855) 2022-10-07 12:03:08 +01:00
hadoop-pipes HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-resourceestimator HADOOP-15983. Use jersey-json that is built to use jackson2 (#3988) 2022-10-20 17:37:56 +01:00
hadoop-rumen HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940) 2022-10-07 10:47:55 +01:00
hadoop-sls HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-streaming MAPREDUCE-7371. DistributedCache alternative APIs should not use DistributedCache APIs internally (#3855) 2022-06-22 13:13:05 +01:00
hadoop-tools-dist HADOOP-18442. Remove openstack support (#4855) 2022-10-07 12:03:08 +01:00
pom.xml HADOOP-11867. Add a high-performance vectored read API. (#3904) 2022-06-23 17:09:16 -05:00