hadoop/hadoop-tools/hadoop-aws
Steve Loughran 7bb09f1010
HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" (#5689)
This 
1. changes the default value of fs.s3a.directory.marker.retention
   to "keep"
2. no longer prints a message when an S3A FS instance is
   instantiated with any option other than delete.

Switching to marker retention improves performance
on any S3 bucket, as there are no needless marker DELETE requests,
leading to a reduction in write IOPS and in any delays waiting
for the DELETE call to finish.

There are *very* significant improvements on versioned buckets,
where tombstone markers slow down LIST operations: the more
tombstones there are, the worse query planning gets.

Having versioning enabled on production stores is the foundation
of any data protection strategy, so this has tangible benefits
in production.

It is *not* compatible with older Hadoop releases; specifically:
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0
Incompatible releases have no problems reading data in stores
where markers are retained, but can get confused when deleting
or renaming directories.

If you are still using older versions to write data, and cannot
yet upgrade, switch the option back to "delete".
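
The compatibility list above can be turned into a small check. The
following is a minimal sketch, not part of Hadoop: the class
MarkerCompat and its methods are hypothetical names, and it assumes
that releases not listed above are compatible, while anything older
than branch 2 is not. Given the oldest Hadoop version that still
writes to a bucket, it suggests a value for
fs.s3a.directory.marker.retention.

```java
public final class MarkerCompat {
    private MarkerCompat() {}

    /**
     * Returns true if the given Hadoop release is one of those listed
     * in the commit message as incompatible with retained markers.
     * Expects a "major.minor.patch" string, optionally followed by a
     * qualifier such as "-SNAPSHOT".
     */
    public static boolean isIncompatible(String version) {
        // Drop any qualifier such as "-SNAPSHOT" before parsing.
        String core = version.trim().split("-", 2)[0];
        String[] parts = core.split("\\.");
        if (parts.length < 3) {
            throw new IllegalArgumentException(
                "expected major.minor.patch: " + version);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);

        if (major < 2) {
            return true;   // assumption: pre-branch-2 releases are not marker-aware
        }
        if (major == 2) {
            // Hadoop branch 2 < 2.10.2
            return minor < 10 || (minor == 10 && patch < 2);
        }
        if (major == 3) {
            switch (minor) {
                case 0:
                case 1:
                    return true;        // any 3.0.x or 3.1.x release
                case 2:
                    return patch <= 1;  // 3.2.0 and 3.2.1
                case 3:
                    return patch == 0;  // 3.3.0
                default:
                    return false;       // assumption: later 3.x releases are fine
            }
        }
        return false;      // assumption: later major versions are fine
    }

    /** Suggests "delete" if the oldest writer is incompatible, else "keep". */
    public static String recommendedRetention(String oldestWriterVersion) {
        return isIncompatible(oldestWriterVersion) ? "delete" : "keep";
    }

    public static void main(String[] args) {
        String[] versions = {"2.10.1", "2.10.2", "3.1.4", "3.2.1", "3.2.2", "3.3.0", "3.3.1"};
        for (String v : versions) {
            System.out.println(v + " -> " + recommendedRetention(v));
        }
    }
}
```

Only clients that delete or rename directories matter here;
incompatible releases that only read data are unaffected.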

Contributed by Steve Loughran
2023-06-08 12:12:29 +01:00
dev-support HADOOP-18466. Limit the findbugs suppression IS2_INCONSISTENT_SYNC to S3AFileSystem field (#4926) 2022-09-26 18:56:58 +01:00
src HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" (#5689) 2023-06-08 12:12:29 +01:00
pom.xml HADOOP-18399. S3A Prefetch - SingleFilePerBlockCache to use LocalDirAllocator (#5054) 2023-04-18 16:37:48 +01:00