hadoop/hadoop-tools/hadoop-aws
Steve Loughran 6574f27fa3
HADOOP-16570. S3A committers encounter scale issues.
Contributed by Steve Loughran.

This addresses two scale issues which have surfaced in large-scale benchmarks
of the S3A Committers.

* Thread pools were not cleaned up.
  They are now cleaned up, with tests to verify this.

* OOM on job commit for jobs with many thousands of tasks,
  each generating tens of (very large) files.

Instead of loading all pending commits into memory as a single list, only the
list of .pendingset files to load is passed around. Each .pendingset file is
loaded and processed in isolation, and reloaded if necessary for any
abort/rollback operation.

The parallel commit/abort/revert operations now work at the .pendingset level,
rather than that of individual pending commit files. The existing parallelized
Tasks API is still used to commit those files, but with a null thread pool, so
as to serialize the operations.
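
The approach described above can be sketched roughly as follows. This is a
minimal illustration only, not the S3A committer code: the class
PendingsetCommitSketch, the methods loadPendingset() and commitAll(), and the
in-memory map standing in for the object store are all hypothetical names
introduced for this example. It assumes that only the list of .pendingset
paths is held for the whole job, that work is parallelized per .pendingset
file, that the uploads inside one file are handled serially, and that the
thread pool is always shut down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PendingsetCommitSketch {

  // Hypothetical stand-in for reading one .pendingset file from the store.
  static List<String> loadPendingset(Map<String, List<String>> store, String path) {
    return store.get(path);
  }

  // Commits every upload listed in the given .pendingset files and
  // returns the number of uploads processed.
  static int commitAll(Map<String, List<String>> store,
                       List<String> pendingsetPaths,
                       int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    AtomicInteger committed = new AtomicInteger();
    try {
      List<Future<?>> pending = new ArrayList<>();
      // Parallelism is at the .pendingset level: one task per file.
      for (String path : pendingsetPaths) {
        pending.add(pool.submit(() -> {
          // Loaded inside the task, so the contents can be collected
          // as soon as this task finishes.
          List<String> uploads = loadPendingset(store, path);
          // Uploads within one file are processed serially.
          for (String upload : uploads) {
            committed.incrementAndGet(); // placeholder for completing one upload
          }
        }));
      }
      for (Future<?> f : pending) {
        f.get(); // wait for completion and surface any failure
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      // Always release the pool's threads, on success or failure.
      pool.shutdownNow();
    }
    return committed.get();
  }

  public static void main(String[] args) {
    Map<String, List<String>> store = new HashMap<>();
    store.put("task_0001.pendingset", Arrays.asList("part-0000", "part-0001"));
    store.put("task_0002.pendingset", Arrays.asList("part-0002"));
    int n = commitAll(store, new ArrayList<>(store.keySet()), 2);
    System.out.println("committed uploads: " + n);
  }
}
```

With this shape, peak memory is bounded by the number of .pendingset files
being processed concurrently rather than by the total number of pending
commits in the job.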

Change-Id: I5c8240cd31800eaa83d112358770ca0eb2bca797
2019-10-04 18:54:22 +01:00
dev-support HADOOP-15229. Add FileSystem builder-based openFile() API to match createFile(); 2019-02-05 11:51:02 +00:00
src HADOOP-16570. S3A committers encounter scale issues. 2019-10-04 18:54:22 +01:00
pom.xml HADOOP-16207 Improved S3A MR tests. 2019-10-04 14:12:31 +01:00