60 Commits

Author SHA1 Message Date
Anmol Asrani
6306f5b2bc
HADOOP-18146: ABFS: Added changes for expect hundred continue header #4039
This change lets the client react pre-emptively to server load without getting to 503 and the exponential backoff
which follows. This stops performance suffering so much as capacity limits are approached for an account.

Contributed by Anmol Asrani
2023-03-28 16:32:01 +01:00
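
Illustration for HADOOP-18146: with the standard HTTP "Expect: 100-continue" mechanism, a client sends the request headers first and holds back the body until the server replies with an interim 100 response or a final status, so an overloaded server can refuse the request before any payload is uploaded. The following is a minimal sketch using the JDK's java.net.http API (Java 11+), which is not necessarily the HTTP client the ABFS driver uses. The class name, the URI and the choice of PUT are placeholders, and nothing is sent over the network.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ExpectContinueSketch {
    // Hypothetical endpoint, used only to build the request object.
    static final URI EXAMPLE_URI =
            URI.create("https://example.invalid/container/file");

    /** Builds (but does not send) an upload request that asks for 100-continue. */
    static HttpRequest buildUpload(byte[] payload) {
        return HttpRequest.newBuilder(EXAMPLE_URI)
                // Adds "Expect: 100-continue" so the body is only sent
                // once the server signals it is willing to accept it.
                .expectContinue(true)
                .PUT(HttpRequest.BodyPublishers.ofByteArray(payload))
                .build();
    }
}
```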
Anmol Asrani
1cc8cb68f2
HADOOP-18457. ABFS: Support account level throttling (#5034)
This allows ABFS request throttling to be shared across all
ABFS connections talking to containers belonging to the same ABFS storage
account, as that is the level at which IO throttling is applied.

The feature is enabled or disabled with the configuration option
"fs.azure.account.throttling.enabled".
The default is "true".

Contributed by Anmol Asrani
2022-11-30 13:14:11 +00:00
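
Illustration for HADOOP-18457: one way to share throttling state per storage account is a registry keyed by account name, so every connection to the same account sees the same counters. This is a minimal sketch under that assumption, not the driver's implementation. AccountThrottleRegistry, ThrottleState and the boolean parameter (standing in for "fs.azure.account.throttling.enabled") are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class AccountThrottleRegistry {
    /** Hypothetical throttling state; a real one would hold backoff metrics. */
    static final class ThrottleState {
        final AtomicLong throttledResponses = new AtomicLong();
    }

    private static final ConcurrentMap<String, ThrottleState> BY_ACCOUNT =
            new ConcurrentHashMap<>();

    /**
     * Returns the shared state for an account when account-level throttling
     * is enabled, otherwise a fresh state private to the caller.
     */
    static ThrottleState forAccount(String accountName, boolean accountThrottlingEnabled) {
        if (!accountThrottlingEnabled) {
            return new ThrottleState();
        }
        // computeIfAbsent is atomic, so concurrent callers get one instance.
        return BY_ACCOUNT.computeIfAbsent(accountName, name -> new ThrottleState());
    }
}
```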
Steve Loughran
8ccc586af6
HADOOP-17409. Remove s3guard from S3A module (#3534)
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different Hadoop releases:
it must be turned off completely.
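
The initialization check described above can be pictured as follows. This is a minimal sketch, not the S3A code: the store names "null", "local" and "dynamodb", the class name and the exception type are placeholders chosen for illustration.

```java
import java.util.Set;

final class MetastoreCheckSketch {
    // Placeholder names for the two stores that are still tolerated.
    private static final Set<String> TOLERATED = Set.of("null", "local");

    /** Refuses to continue when any other metastore is configured. */
    static void checkNoS3Guard(String configuredMetastore) {
        if (configuredMetastore == null || configuredMetastore.isEmpty()) {
            return; // nothing configured: nothing to reject
        }
        if (!TOLERATED.contains(configuredMetastore)) {
            // Assumed exception type; the message above only says "an exception".
            throw new IllegalStateException(
                    "S3Guard is no longer supported, remove metastore setting: "
                            + configuredMetastore);
        }
    }
}
```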

The "hadoop s3guard" command has been retained, but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is a major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
2022-01-18 18:04:48 +00:00
Anoop Sam John
913d06ad4d
HADOOP-17770. WASB: Support disabling buffered reads in positional reads (#3233)
2021-10-22 11:45:42 +05:30
sumangala-patki
dd30db78e7
HADOOP-17290. ABFS: Add Identifiers to Client Request Header (#2520)
Contributed by Sumangala Patki.

(cherry picked from commit 35570e414a)
2021-09-21 16:45:51 +01:00
Brian Loss
37e0828e76
HADOOP-17811: ABFS ExponentialRetryPolicy doesn't pick up configuration values (#3221)
Contributed by Brian Loss.

Change-Id: I5f24196d1d02de91336c3679abaf8d55cfaed746
2021-08-02 11:37:33 +01:00
sumangala-patki
aa6a9cac72
HADOOP-17596. ABFS: Change default Readahead Queue Depth from num(processors) to const (#3106)
* HADOOP-17596. ABFS: Change default Readahead Queue Depth from num(processors) to const (#2795).
Contributed by Sumangala Patki.

(cherry picked from commit 76d92eb2a2)
2021-07-10 15:09:59 +05:30
billierinaldi
8170a7bb60
HADOOP-16948. Support infinite lease dirs (#1925).
Contributed by Billie Rinaldi.
(cherry picked from commit c1fde4fe94)
2021-04-20 14:36:54 -04:00
sumangala-patki
cdaa64458d
HADOOP-17191. ABFS: Run the tests with various combinations of configurations and publish consolidated results (#2597)
Contributed by Bilahari T H and Sumangala Patki
2021-03-10 18:25:41 +00:00
Anoop Sam John
5857b781a3
HADOOP-17038 Support disabling buffered reads in ABFS positional reads. (#2646)
- Contributed by @anoopsjohn

Change-Id: Ibd11cc9d7aed0c2cc831a01e07d0a1595f7026fb
2021-02-22 11:46:35 +00:00
Sumangala
5f312a0d85
HADOOP-17422: ABFS: Set default ListMaxResults to max server limit (#2535)
Contributed by Sumangala Patki

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
Tests run: 462, Failures: 0, Errors: 0, Skipped: 24
Tests run: 208, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
Tests run: 462, Failures: 0, Errors: 0, Skipped: 70
Tests run: 208, Failures: 0, Errors: 0, Skipped: 141

(cherry picked from commit a35fc3871b)
2021-01-22 10:48:04 +00:00
Sneha Vijayarajan
a44890eb63
HADOOP-17296. ABFS: Force reads to be always of buffer size.
Contributed by Sneha Vijayarajan.

(cherry picked from commit 142941b96e)
2021-01-22 10:48:04 +00:00
bilaharith
f208da286c
HADOOP-17166. ABFS: configure output stream thread pool (#2179)
Adds the options to control the size of the per-output-stream threadpool
when writing data through the abfs connector

* fs.azure.write.max.concurrent.requests
* fs.azure.write.max.requests.to.queue

Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
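
Illustration for HADOOP-17166: the two options can be read as the size of a worker pool and the length of its pending queue. This is a minimal sketch with the JDK's ThreadPoolExecutor, assuming that "fs.azure.write.max.concurrent.requests" maps to the pool size and "fs.azure.write.max.requests.to.queue" to the queue capacity. The class name, the keep-alive time and the caller-runs overflow policy are illustrative choices, not the driver's.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class WritePoolSketch {
    /**
     * @param maxConcurrentRequests assumed value of fs.azure.write.max.concurrent.requests
     * @param maxRequestsToQueue    assumed value of fs.azure.write.max.requests.to.queue
     */
    static ThreadPoolExecutor newPool(int maxConcurrentRequests, int maxRequestsToQueue) {
        return new ThreadPoolExecutor(
                maxConcurrentRequests,            // core threads
                maxConcurrentRequests,            // maximum threads
                10L, TimeUnit.SECONDS,            // idle keep-alive (illustrative)
                new ArrayBlockingQueue<>(maxRequestsToQueue),
                // When the queue is full the submitting thread runs the task,
                // which slows the writer down instead of dropping data.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```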
bilaharith
d80dfad900
HADOOP-17183. ABFS: Enabling checkaccess on ABFS (#2331)
Contributed by Bilahari T H

Change-Id: If4224697deed733d6db44145994cdd85547c27d1
2020-10-01 21:29:48 +01:00
Sneha Vijayarajan
eed06b46eb
HADOOP-17015. ABFS: Handling Rename and Delete idempotency
Contributed by Sneha Vijayarajan.
2020-07-25 13:08:01 +00:00
bilaharith
1ae72d2438
HADOOP-17092. ABFS: Making AzureADAuthenticator.getToken() throw HttpException
- Contributed by Bilahari T H

Change-Id: Id9576d9509faaf057bf419ccb1879ac0cef7a07b
2020-07-22 18:26:36 +01:00
bilaharith
d639c11986
HADOOP-17004. Fixing a formatting issue
Contributed by Bilahari T H.
2020-06-19 19:11:06 +00:00
bilaharith
11307f3be9
HADOOP-17004. ABFS: Improve the ABFS driver documentation
Contributed by Bilahari T H.
2020-06-19 19:10:22 +00:00
Thomas Marquardt
af98f32f7d
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
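
The reuse rule above can be sketched as a small cache that keeps a token until the current time comes within the renew period of its expiry. This is a minimal sketch, not the stream code itself: CachedSasToken, Token and the Supplier-based provider are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to check.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

final class CachedSasToken {
    /** Hypothetical token holder: the SAS string plus its expiry time. */
    static final class Token {
        final String value;
        final Instant expiry;

        Token(String value, Instant expiry) {
            this.value = value;
            this.expiry = expiry;
        }
    }

    private final Supplier<Token> provider;
    private final Duration renewPeriod; // e.g. 120 seconds
    private Token current;

    CachedSasToken(Supplier<Token> provider, Duration renewPeriod) {
        this.provider = provider;
        this.renewPeriod = renewPeriod;
    }

    /** Returns the cached token, fetching a new one when expiry is near. */
    synchronized String get(Instant now) {
        boolean nearExpiry = current != null
                && !now.isBefore(current.expiry.minus(renewPeriod));
        if (current == null || nearExpiry) {
            current = provider.get();
        }
        return current.value;
    }
}
```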

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.

Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match.  Internally the code sometimes used
"//" instead of "/" for the root path.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash on paths.
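
The two path fixes above (qualifying every incoming path, and using a single canonical form for the root) can be illustrated with plain strings. This is a simplified sketch, not Hadoop's Path.makeQualified, which also handles the URI scheme and authority; PathQualifier and its method are hypothetical.

```java
import java.util.Objects;

final class PathQualifier {
    /**
     * Rejects null, resolves relative paths against the working directory,
     * and collapses repeated slashes so the root is always "/" and never "//".
     */
    static String qualify(String workingDir, String path) {
        Objects.requireNonNull(path, "path must not be null");
        String absolute = path.startsWith("/") ? path : workingDir + "/" + path;
        String collapsed = absolute.replaceAll("/{2,}", "/");
        // Drop a trailing slash, except when the path is the root itself.
        if (collapsed.length() > 1 && collapsed.endsWith("/")) {
            collapsed = collapsed.substring(0, collapsed.length() - 1);
        }
        return collapsed;
    }
}
```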

To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-06-19 19:00:46 +00:00
bilaharith
76ee7e5494
HADOOP-17002. ABFS: Adding config to determine if the account is HNS enabled or not
Contributed by Bilahari T H.
2020-06-19 18:57:47 +00:00
Sneha Vijayarajan
32fb174da2
HADOOP-16857. ABFS: Stop CustomTokenProvider retry logic from depending on AbfsRestOp retry policy
Contributed by Sneha Vijayarajan
2020-04-23 14:37:25 +01:00
Steve Loughran
745a6c1e69
Revert "HADOOP-16818. ABFS: Combine append+flush calls for blockblob & appendblob"
This reverts commit 3612317038.

Change-Id: Ie0d36f25de0b55a937894f4d9963c495bae0576a
2020-03-26 15:24:37 +00:00
ishaniahuja
3612317038
HADOOP-16818. ABFS: Combine append+flush calls for blockblob & appendblob
Contributed by Ishani Ahuja.
2020-03-20 10:27:41 +00:00
Sneha Vijayarajan
791270a2e5
HADOOP-16730: ABFS: Support for Shared Access Signatures (SAS).
Contributed by Sneha Vijayarajan.
2020-02-27 18:27:22 +00:00
Karthick Narendran
978c487672
HADOOP-16826. ABFS: update abfs.md to include config keys for identity transformation
Contributed by Karthick Narendran
2020-01-23 20:35:57 -08:00
bilaharith
9e69628f55
HADOOP-16455. ABFS: Implement FileSystem.access() method.
Contributed by Bilahari T H.
2019-11-27 15:56:38 +00:00
Jeetesh Mangwani
b033c681e4
HADOOP-16612. Track Azure Blob File System client-perceived latency
Contributed by Jeetesh Mangwani.

This adds the ability to track the end-to-end performance of ADLS Gen 2 REST APIs by measuring latency in the Hadoop ABFS driver.
The latency information is sent back to the ADLS Gen 2 REST API endpoints in the subsequent requests.
2019-11-19 09:00:24 -08:00
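
Illustration for HADOOP-16612: the idea of measuring latency on the client and reporting it with a later request can be sketched as a tracker that times one call and keeps the result until the next request picks it up. This is a minimal sketch. LatencyTracker, the "operation=Nms" report format and the way the report would be attached to a request are assumptions, not the driver's actual format.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class LatencyTracker {
    // Latest measurement waiting to be attached to the next outgoing request.
    private final AtomicReference<String> pending = new AtomicReference<>();

    /** Runs the call, records how long it took, and returns its result. */
    <T> T timed(String operation, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long millis = (System.nanoTime() - start) / 1_000_000;
            pending.set(operation + "=" + millis + "ms");
        }
    }

    /** Hands the pending report to the next request, at most once. */
    Optional<String> takePendingReport() {
        return Optional.ofNullable(pending.getAndSet(null));
    }
}
```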
Andras Bokor
96c4520f89
HADOOP-16710. Testing_azure.md documentation is misleading.
Contributed by Andras Bokor.

Change-Id: Icf07a53145936953629c7dace2e9648b7b21588d
2019-11-17 17:04:29 +00:00
Sneha Vijayarajan
c0edc848a8
HADOOP-16548: Disable Flush() over config
2019-09-28 20:39:42 -07:00
Steve Loughran
65f60e56b0
HADOOP-16068. ABFS Authentication and Delegation Token plugins to optionally be bound to specific URI of the store.
Contributed by Steve Loughran.
2019-02-28 14:22:32 +00:00
Da Zhou
1f1655028e
HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth.
Contributed by Da Zhou and Junhua Gu.
2019-02-07 21:58:21 +00:00
Steve Loughran
668817a6ce
Revert "HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth."
(accidentally mixed in two patches)

This reverts commit fa8cd1bf28.
2019-02-07 21:57:22 +00:00
Da Zhou
fa8cd1bf28
HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth.
Contributed by Da Zhou and Junhua Gu.
2019-02-05 19:23:15 +00:00
Sean Mackrory
8e831ba458
HADOOP-15773. Fixing checkstyle and other issues raised by Yetus.
2018-09-19 16:56:33 -06:00
Thomas Marquardt
e5593cbd83
HADOOP-15694. ABFS: Allow OAuth credentials to not be tied to accounts.
Contributed by Sean Mackrory.
2018-09-17 19:54:01 +00:00
Thomas Marquardt
81dc4a995c
HADOOP-15663. ABFS: Simplify configuration.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt
ce03a93f78
HADOOP-15446. ABFS: tune imports & javadocs; stabilise tests.
Contributed by Steve Loughran and Da Zhou.
2018-09-17 19:54:01 +00:00
Yiqun Lin
1312f9ae4c
HADOOP-15391. Add missing css file in hadoop-aws, hadoop-aliyun, hadoop-azure and hadoop-azure-datalake modules.
2018-04-18 16:04:00 +08:00
Steve Loughran
572cdb5463
HADOOP-14899. Restrict Access to setPermission operation when authorization is enabled in WASB
Contributed by Kannapiran Srinivasan.
2017-10-06 17:43:38 +01:00
Steve Loughran
2d2d97fa7d
HADOOP-14553. Add (parallelized) integration tests to hadoop-azure
Contributed by Steve Loughran
2017-09-15 17:03:01 +01:00
Steve Loughran
13eda50003
HADOOP-14520. WASB: Block compaction for Azure Block Blobs.
Contributed by Georgi Chalakov
2017-09-07 18:35:03 +01:00
Steve Loughran
021974f4cb
HADOOP-14802. Add support for using container saskeys for all accesses.
Contributed by Sivaguru Sankaridurg
2017-08-29 19:02:43 +01:00
Jitendra Pandey
f2921e51f0
HADOOP-14518. Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
Contributed by Georgi Chalakov.
2017-07-24 23:01:01 -07:00
Jitendra Pandey
2843c688bc
HADOOP-14642. wasb: add support for caching Authorization and SASKeys.
Contributed by Sivaguru Sankaridurg.
2017-07-19 00:13:06 -07:00
Steve Loughran
7d272ea124
HADOOP-14581. Restrict setOwner to list of users when security is enabled in wasb.
Contributed by Varada Hemeswari

(cherry picked from commit 1e69e5260351effc8077d1bdc397cec57cf1ff1b)
2017-07-12 10:37:39 +01:00
Mingliang Liu
38996fdcf0
HADOOP-14443. Azure: Support retry and client side failover for authorization, SASKey and delegation token generation.
Contributed by Santhosh G Nayak
2017-06-30 16:53:48 -07:00
Mingliang Liu
536f057158
HADOOP-14491. Azure has messed doc structure.
Contributed by Mingliang Liu
2017-06-06 11:09:28 -07:00
Mingliang Liu
ece33208b8
HADOOP-14460. Azure: update doc for live and contract tests.
Contributed by Mingliang Liu
2017-06-01 11:52:11 -07:00
Mingliang Liu
686823529b
HADOOP-13930. Azure: Add Authorization support to WASB.
Contributed by Sivaguru Sankaridurg and Dushyanth
2017-03-06 17:16:36 -08:00
Mingliang Liu
52d7d5aa1a
Revert "HADOOP-13930. Azure: Add Authorization support to WASB. Contributed by Sivaguru Sankaridurg and Dushyanth"
This reverts commit 6b7cd62b8c.
2017-03-06 17:10:11 -08:00