Thomas Marquardt b214bbd2d9
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses ServiceSASGenerator and DelegationSASGenreator.  The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenrator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with a includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.

Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match.  Internally the code was using
"//" instead of "/" for the root path, sometimes.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a preceding forward / to paths.

To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-05-12 18:35:38 +00:00

78 lines
3.2 KiB
XML

<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<FindBugsFilter>
<!-- This reference equality check is an intentional light weight
check to avoid re-validating the token when re-used. -->
<Match>
<Class name="org.apache.hadoop.fs.azurebfs.utils.CachedSASToken" />
<Method name="update" />
<Bug pattern="ES_COMPARING_PARAMETER_STRING_WITH_EQ" />
</Match>
<!-- This is intentional. The unsynchronized field access is safe
and only synchronized access is used when using the sasToken
for authorization. -->
<Match>
<Class name="org.apache.hadoop.fs.azurebfs.utils.CachedSASToken" />
<Field name="sasToken" />
<Bug pattern="IS2_INCONSISTENT_SYNC" />
</Match>
<!-- It is okay to skip up to end of file. No need to check return value. -->
<Match>
<Class name="org.apache.hadoop.fs.azure.AzureNativeFileSystemStore" />
<Method name="retrieve" />
<Bug pattern="SR_NOT_CHECKED" />
<Priority value="2" />
</Match>
<!-- Returning fully loaded array to iterate through is a convenience
and helps performance. -->
<Match>
<Class name="org.apache.hadoop.fs.azure.NativeAzureFileSystem$FolderRenamePending" />
<Method name="getFiles" />
<Bug pattern="EI_EXPOSE_REP" />
<Priority value="2" />
</Match>
<!-- Need to start keep-alive thread for SelfRenewingLease in constructor. -->
<Match>
<Class name="org.apache.hadoop.fs.azure.SelfRenewingLease" />
<Bug pattern="SC_START_IN_CTOR" />
<Priority value="2" />
</Match>
<!-- Using a key set iterator is fine because this is not a performance-critical
method. -->
<Match>
<Class name="org.apache.hadoop.fs.azure.PageBlobOutputStream" />
<Method name="logAllStackTraces" />
<Bug pattern="WMI_WRONG_MAP_ITERATOR" />
<Priority value="2" />
</Match>
<!-- FileMetadata is used internally for storing metadata but also
subclasses FileStatus to reduce allocations when listing a large number
of files. When it is returned to an external caller as a FileStatus, the
extra metadata is no longer useful and we want the equals and hashCode
methods of FileStatus to be used. -->
<Match>
<Class name="org.apache.hadoop.fs.azure.FileMetadata" />
<Bug pattern="EQ_DOESNT_OVERRIDE_EQUALS" />
</Match>
</FindBugsFilter>