hadoop/hadoop-tools
Steve Loughran b1ea32f91c
HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead (#5103)

Adds a new config option to turn off readahead.
* also allows it to be passed in through openFile(),
* extends ITestAbfsReadWriteAndSeek to use the option, including one
  replicated test which shows that turning it off is slower.

Important: this does not address the critical data corruption issue
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

What it does do is provide a way to completely bypass the ReadBufferManager.
To mitigate the problem, either set fs.azure.enable.readahead to false,
or set "fs.azure.readaheadqueue.depth" to 0. The latter still goes near the
(broken) ReadBufferManager code, but doesn't trigger the bug.

For safe reading of files through the ABFS connector, readahead MUST be disabled
or the follow-up fix for HADOOP-18521 must be applied.
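
As an illustration of the two mitigations named above, a configuration fragment might look like the sketch below. The property names and values are the ones given in this commit message; placing them in core-site.xml is an assumption about how the cluster is configured, and only one of the two options should be needed.

```xml
<!-- Option 1: turn readahead off, bypassing the ReadBufferManager entirely. -->
<property>
  <name>fs.azure.enable.readahead</name>
  <value>false</value>
</property>

<!-- Option 2 (alternative): keep the option enabled but set the queue depth to 0.
     Per the commit message, this still goes near the ReadBufferManager code
     but does not trigger the bug. -->
<property>
  <name>fs.azure.readaheadqueue.depth</name>
  <value>0</value>
</property>
```

Option 1 is the setting this commit introduces, so it is only available on builds that include HADOOP-18517.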

Contributed by Steve Loughran
2022-11-08 13:41:31 +00:00
..
hadoop-aliyun HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion (#4502) 2022-07-06 14:31:07 +08:00
hadoop-archive-logs HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-archives HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-aws HADOOP-18233. Possible race condition with TemporaryAWSCredentialsProvider (#5024) 2022-10-31 17:50:49 +00:00
hadoop-azure HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead (#5103) 2022-11-08 13:41:31 +00:00
hadoop-azure-datalake HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-benchmark HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076) 2022-11-08 13:35:42 +00:00
hadoop-datajoin HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-distcp HADOOP-18442. Remove openstack support (#4855) 2022-10-07 12:03:08 +01:00
hadoop-dynamometer HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-extras HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-fs2img HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-gridmix HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-kafka HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-openstack HADOOP-18442. Remove openstack support (#4855) 2022-10-07 12:03:08 +01:00
hadoop-pipes HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-resourceestimator HADOOP-15983. Use jersey-json that is built to use jackson2 ((#3988) 2022-10-20 17:37:56 +01:00
hadoop-rumen HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940) 2022-10-07 10:47:55 +01:00
hadoop-sls HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482) 2022-06-22 13:09:50 +01:00
hadoop-streaming MAPREDUCE-7371. DistributedCache alternative APIs should not use DistributedCache APIs internally (#3855) 2022-06-22 13:13:05 +01:00
hadoop-tools-dist HADOOP-18442. Remove openstack support (#4855) 2022-10-07 12:03:08 +01:00
pom.xml HADOOP-11867. Add a high-performance vectored read API. (#3904) 2022-06-23 17:09:16 -05:00