hadoop/hadoop-tools
Thomas Marquardt af98f32f7d
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we also need a DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator, but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume that an OAuth client ID is configured, so that they
are skipped when it is not.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
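
As a rough illustration of the re-use rule above (a sketch only, not the code in this patch), a stream could cache
its token and ask for a new one once the remaining lifetime is no longer greater than the renew period.  The
CachedSasToken class, its Token type and the Supplier-based provider are hypothetical names invented for this example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not the actual ABFS implementation.
final class CachedSasToken {
    /** A SAS token string paired with its expiry time. */
    static final class Token {
        final String value;
        final Instant expiry;

        Token(String value, Instant expiry) {
            this.value = value;
            this.expiry = expiry;
        }
    }

    private final Duration renewPeriod;      // e.g. 120 seconds by default
    private final Supplier<Token> provider;  // stands in for a SASTokenProvider call
    private Token current;

    CachedSasToken(Duration renewPeriod, Supplier<Token> provider) {
        this.renewPeriod = renewPeriod;
        this.provider = provider;
    }

    /** Returns the cached token, requesting a new one if expiry is within the renew period. */
    synchronized String get(Instant now) {
        boolean expiringSoon = current != null
            && !now.plus(renewPeriod).isBefore(current.expiry);
        if (current == null || expiringSoon) {
            current = provider.get();
        }
        return current.value;
    }
}
```

Passing the current time in as a parameter keeps the sketch easy to test; a real stream would more likely read the
clock itself.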

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API, which requires read permission,
while the getFileStatus call only requires execute permission.  The ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter, which is set to false for getFileStatus and true for getXAttr.
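
A minimal sketch of how such a flag might select between the two calls, assuming that Get Status is requested by
adding an "action=getStatus" query parameter to the Get Properties request.  The PathStatusQuery class and its
queryParams method are hypothetical and are not the AbfsClient code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the actual AbfsClient code.
final class PathStatusQuery {
    /**
     * Builds the query parameters for a path status request.
     * Assumption: omitting "action" means Get Properties (needs read permission),
     * while "action=getStatus" means Get Status (needs only execute permission).
     */
    static Map<String, String> queryParams(boolean includeProperties) {
        Map<String, String> params = new LinkedHashMap<>();
        if (!includeProperties) {
            params.put("action", "getStatus");
        }
        return params;
    }
}
```

Under this assumption, getFileStatus would pass false and getXAttr would pass true.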

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.
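
The effect of that validation can be sketched with plain strings, assuming a known working directory.  The
PathQualifier class below is a hypothetical stand-in and does not use the Hadoop Path or makeQualified API.

```java
import java.util.Objects;

// Hypothetical stand-in for illustration; the real code calls makeQualified on a Hadoop Path.
final class PathQualifier {
    /** Rejects null paths and resolves relative paths against the working directory. */
    static String qualify(String workingDir, String path) {
        Objects.requireNonNull(path, "path must not be null");
        if (path.startsWith("/")) {
            return path;  // already absolute
        }
        return workingDir.endsWith("/") ? workingDir + path : workingDir + "/" + path;
    }
}
```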

Canonicalized the path used for the root path internally so that the root path can be used with SAS tokens, which
requires that the path in the URL and the path in the SAS token match.  Internally, the code was sometimes using
"//" instead of "/" for the root path.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash on paths.
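
To show why the "//" form matters, here is a small sketch in which the path in the URL and the path in the SAS token
are compared as strings.  The RootPath class and its collapse-repeated-slashes rule are assumptions made for this
example, not the approach taken in AzureBlobFileSystemStore.

```java
// Hypothetical helper for illustration; not the actual AzureBlobFileSystemStore code.
final class RootPath {
    /** Collapses runs of '/' so that the root is always "/" and never "//". */
    static String canonicalize(String path) {
        String collapsed = path.replaceAll("/{2,}", "/");
        return collapsed.isEmpty() ? "/" : collapsed;
    }

    /** A SAS check of this kind passes only when both paths are identical. */
    static boolean matchesSasPath(String urlPath, String sasPath) {
        return urlPath.equals(sasPath);
    }
}
```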

To run the ITestAzureBlobFileSystemDelegationSAS tests, follow the instructions in testing_azure.md under the
heading "To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-06-19 19:00:46 +00:00
hadoop-aliyun Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-archive-logs Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-archives Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-aws HADOOP-17050. S3A to support additional token issuers 2020-06-09 14:43:02 +01:00
hadoop-azure HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger 2020-06-19 19:00:46 +00:00
hadoop-azure-datalake Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-datajoin Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-distcp Revert "HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by default)." 2020-05-14 19:20:34 +01:00
hadoop-dynamometer Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-extras Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-fs2img Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-gridmix Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-kafka Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-openstack Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-pipes Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-resourceestimator Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-rumen Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-sls Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-streaming Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-tools-dist Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
pom.xml Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00