hadoop/hadoop-tools
Thomas Marquardt b214bbd2d9
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator, but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume that an OAuth client ID is configured, so that they
are skipped when it is not.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
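
The re-use rule above can be sketched roughly as follows.  This is an illustration only, not the code in
AbfsInputStream/AbfsOutputStream: the class SasTokenCache, its nested Token type and the Supplier-based provider
are hypothetical names chosen for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical sketch of token re-use; not the actual ABFS implementation. */
class SasTokenCache {
    /** A SAS token string paired with its expiry time (hypothetical holder type). */
    static final class Token {
        final String value;
        final Instant expiry;

        Token(String value, Instant expiry) {
            this.value = value;
            this.expiry = expiry;
        }
    }

    private final Duration renewPeriod;      // e.g. 120 seconds, the default described above
    private final Supplier<Token> provider;  // stands in for a call to a SAS token provider
    private Token current;

    SasTokenCache(Duration renewPeriod, Supplier<Token> provider) {
        this.renewPeriod = renewPeriod;
        this.provider = provider;
    }

    /** Returns the cached token unless its expiry is within renewPeriod of now. */
    synchronized String get(Instant now) {
        boolean reusable = current != null
            && now.plus(renewPeriod).isBefore(current.expiry);
        if (!reusable) {
            current = provider.get();  // request a fresh SAS token
        }
        return current.value;
    }
}
```

With a renew period of 120 seconds and a token that expires 600 seconds from now, calls made during the first
480 seconds re-use the cached token; a call at or after that point requests a new one.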

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API, which requires read permission,
while the getFileStatus call only requires execute permission.  The ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter, which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.
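
As a rough illustration of what qualifying a path achieves, the sketch below rejects null and resolves a relative
path against a working directory.  It uses only java.net.URI and is not Hadoop's makeQualified; the PathQualifier
class, the qualify method and the account and container names in the usage note are hypothetical, and the sketch
assumes simple paths without spaces or colons.

```java
import java.net.URI;
import java.util.Objects;

/** Hypothetical sketch of path qualification; not Hadoop's Path.makeQualified. */
class PathQualifier {
    /**
     * Rejects null paths and resolves relative paths against workingDir,
     * returning an absolute URI on the given file system.
     */
    static URI qualify(URI fsUri, String workingDir, String path) {
        Objects.requireNonNull(path, "path must not be null");
        // Ensure the working directory ends with "/" so resolve() appends to it.
        String base = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        return fsUri.resolve(base).resolve(path).normalize();
    }
}
```

For example, with a file system URI of abfs://container@account.dfs.core.windows.net/ and a working directory of
/user/test, the relative path data/file.txt resolves to the path /user/test/data/file.txt, while an absolute path
such as /abs/x is kept as-is.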

Canonicalized the path used internally for the root path so that the root path can be used with SAS tokens, which
require that the path in the URL and the path in the SAS token match.  Internally the code was sometimes using
"//" instead of "/" for the root path.  Related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash on paths.

To run the ITestAzureBlobFileSystemDelegationSAS tests, follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-05-12 18:35:38 +00:00
..
hadoop-aliyun Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-archive-logs Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-archives Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-aws HADOOP-17025. Fix invalid metastore configuration in S3GuardTool tests. (#1994) 2020-05-07 12:00:47 +09:00
hadoop-azure HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger 2020-05-12 18:35:38 +00:00
hadoop-azure-datalake Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-datajoin Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-distcp HADOOP-16932. distcp copy calls getFileStatus() needlessly and can fail against S3 (#1936) 2020-04-07 17:55:55 +01:00
hadoop-dynamometer Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-extras Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-fs2img Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-gridmix HADOOP-17011. Tolerate leading and trailing spaces in fs.defaultFS. Contributed by Ctest 2020-04-30 14:15:28 -07:00
hadoop-kafka Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-openstack Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-pipes Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-resourceestimator Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-rumen Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-sls YARN-9301. Too many InvalidStateTransitionException with SLS. Contributed by Bilwa S T. 2020-05-12 08:24:34 -07:00
hadoop-streaming Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-tools-dist Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
pom.xml Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30