Steve Loughran 7bb09f1010
HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" (#5689)
This 
1. changes the default value of fs.s3a.directory.marker.retention
   to "keep"
2. no longer prints a message when an S3A FS instance is
   instantiated with any option other than delete.

Switching to marker retention improves performance
on any S3 bucket as there are no needless marker DELETE requests,
leading to a reduction in write IOPS and in any delays waiting
for the DELETE call to finish.

There are *very* significant improvements on versioned buckets,
where tombstone markers slow down LIST operations: the more
tombstones there are, the worse query planning gets.

Having versioning enabled on production stores is the foundation
of any data protection strategy, so this has tangible benefits
in production.

It is *not* compatible with older Hadoop releases; specifically:
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0
Incompatible releases have no problems reading data in stores
where markers are retained, but can get confused when deleting
or renaming directories.

If you are still using older versions to write data, and cannot
yet upgrade, switch the option back to "delete".
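
A minimal sketch of that override, assuming the cluster's S3A settings
live in core-site.xml (the file placement is an assumption; the property
name and value are the ones named above):

```xml
<!-- Restore the previous behaviour for clients that must stay
     compatible with the older releases listed above. -->
<property>
  <name>fs.s3a.directory.marker.retention</name>
  <value>delete</value>
</property>
```

Setting the value to "keep" instead matches the new default.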

Contributed by Steve Loughran
2023-06-08 12:12:29 +01:00
.github HADOOP-18524. Addendum: Deploy Hadoop trunk version website. (#5389). Contributed by Ayush Saxena. 2023-02-14 11:05:41 +05:30
.yetus Add .yetus/excludes.txt (#4984) 2022-10-11 09:23:34 -07:00
dev-support HADOOP-18746. Install Python 3 for Windows 10 docker image (#5679) 2023-05-21 21:10:04 +05:30
hadoop-assemblies HDFS-15346. FedBalance tool implementation. Contributed by Jinglun. 2020-06-18 13:33:25 +08:00
hadoop-build-tools HADOOP-17968 Migrate checkstyle module illegalimport to maven enforcer banned-illegal-imports (#3584) 2021-10-28 15:57:15 +09:00
hadoop-client-modules HADOOP-18676. jettison dependency override in hadoop-common lib (#5513) 2023-03-27 09:59:02 +02:00
hadoop-cloud-storage-project HADOOP-18442. Remove openstack support (#4855) 2022-10-06 11:49:38 +01:00
hadoop-common-project HADOOP-18740. S3A prefetch cache blocks should be accessed by RW locks (#5675) 2023-06-07 14:05:52 +01:00
hadoop-dist MAPREDUCE-7386. Maven parallel builds (skipping tests) fail (#4415) 2022-11-04 11:50:43 +00:00
hadoop-hdfs-project HDFS-17003. Erasure Coding: invalidate wrong block after reporting bad blocks from datanode (#5643). Contributed by hfutatzhanghb. 2023-06-08 18:06:51 +08:00
hadoop-mapreduce-project Revert "HADOOP-18207. Introduce hadoop-logging module (#5503)" 2023-06-05 09:34:40 +05:30
hadoop-maven-plugins HADOOP-18441. Remove hadoop custom ServicesResourceTransformer (#4850). Contributed by PJ Fanning. 2022-09-07 17:11:12 +05:30
hadoop-minicluster HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-project Revert "HADOOP-18207. Introduce hadoop-logging module (#5503)" 2023-06-05 09:34:40 +05:30
hadoop-project-dist HADOOP-18470. Hadoop 3.3.5 release wrap-up (#5558) 2023-04-18 10:12:07 +01:00
hadoop-tools HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" (#5689) 2023-06-08 12:12:29 +01:00
hadoop-yarn-project YARN-11502. Refactor AMRMProxy#FederationInterceptor#registerApplicationMaster. (#5705) 2023-06-05 15:54:41 -07:00
licenses HADOOP-17144. Update Hadoop's lz4 to v1.9.2. Contributed by Hemanth Boyina. 2020-10-18 18:37:46 +05:30
licenses-binary HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
.asf.yaml HADOOP-18630. Add gh-pages in asf.yaml to deploy the current trunk doc (#5393). Contributed by Simhadri Govindappa. 2023-02-14 18:13:29 +05:30
.gitattributes HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin. 2016-09-14 11:14:31 +09:00
.gitignore YARN-10407. Add phantomjsdriver.log to gitignore. (#2244) 2020-09-01 10:44:55 +09:00
BUILDING.txt HADOOP-18506. Update build instructions for Windows using VS2019 (#5066) 2022-10-24 09:28:29 -07:00
LICENSE-binary HADOOP-18359. Update commons-cli from 1.2 to 1.5. (#5095). Contributed by Shilun Fan. 2023-05-10 01:42:12 +05:30
LICENSE.txt YARN-11356. Upgrade DataTables to 1.11.5 to fix CVEs. Contributed by Bence Kosztolnik. 2022-10-26 22:29:01 +02:00
NOTICE-binary HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864) 2022-01-18 10:31:28 +00:00
NOTICE.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
pom.xml HADOOP-18590. Publish SBOM artifacts (#5555). Contributed by Dongjoon Hyun. 2023-04-15 21:35:43 +05:30
README.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
start-build-env.sh HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817) 2021-12-23 18:13:18 +09:00

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/