24084 Commits

Author SHA1 Message Date
Mehakmeet Singh
0d88ed2794
HADOOP-17129. Validating storage keys in ABFS correctly (#2141)
Contributed by Mehakmeet Singh

Change-Id: I8016ee2f9ffbc86ea867f4a3d960b134e507d099
2020-07-16 18:11:52 +01:00
Ahmed Hussein
9e7266df6c HADOOP-17099. Replace Guava Predicate with Java8+ Predicate
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 1f71c4ae71)
2020-07-15 11:40:13 -05:00
Erik Krogen
67e01ed2ca HADOOP-17127. Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime. Contributed by Jim Brennan.
(cherry picked from 317fe4584a)
2020-07-15 08:26:38 -07:00
Ahmed Hussein
5969922305 HADOOP-17101. Replace Guava Function with Java8+ Function
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 98fcffe93f)
2020-07-15 09:57:36 -05:00
Shashikant Banerjee
292e22578a HDFS-15313. Ensure inodes in active filesystem are not deleted during snapshot delete. Contributed by Shashikant Banerjee.
(cherry picked from commit 82343790ee)
2020-07-15 13:13:27 +01:00
Mukund Thakur
8b601ad7e6 HADOOP-17022. Tune S3AFileSystem.listFiles() API.
Contributed by Mukund Thakur.

Change-Id: I17f5cfdcd25670ce3ddb62c13378c7e2dc06ba52
2020-07-14 15:28:27 +01:00
Anoop Sam John
cac2fc1f58 HADOOP-16998. WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException (#2073)
Contributed by Anoop Sam John.
2020-07-14 14:08:46 +01:00
Eric Badger
41bcef9486 YARN-10348. Allow RM to always cancel tokens after app completes. Contributed by Jim Brennan

(cherry picked from commit 48f90115b5)
2020-07-13 23:12:18 +00:00
Eric E Payne
7044a007b3 YARN-10297. TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 0427100b75)
2020-07-13 19:02:40 +00:00
jimmy-zuber-amzn
79fc58def3
HADOOP-17105. S3AFS - Do not attempt to resolve symlinks in globStatus (#2113)
Contributed by Jimmy Zuber.

Change-Id: I2f247c2d2ab4f38214073e55f5cfbaa15aeaeb11
2020-07-13 19:09:50 +01:00
Steve Loughran
a51d72f0c6 HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext.
Contributed by Steve Loughran.

Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3
2020-07-13 13:32:04 +01:00
He Xiaoqiao
e7e7a6d503 HDFS-14498 LeaseManager can loop forever on the file for which create has failed. Contributed by Stephen O'Donnell.
2020-07-13 14:31:50 +08:00
Siyao Meng
358934059f HDFS-15462. Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml (#2131)
(cherry picked from commit 0e694b20b9)
2020-07-09 16:30:58 -07:00
Uma Maheswara Rao G
f85ce2570e HDFS-15394. Add all available fs.viewfs.overload.scheme.target.<scheme>.impl classes in core-default.xml by default. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 3ca15292c5)
2020-07-09 16:26:04 -07:00
Brahma Reddy Battula
7b175739a9 YARN-10341. Yarn Service Container Completed event doesn't get processed. Contributed by Bilwa S T.
(cherry picked from commit dfe60392c9)
2020-07-09 12:36:21 +05:30
Akira Ajisaka
0aa2d7d506
YARN-10344. Sync netty versions in hadoop-yarn-csi. (#2126)
(cherry picked from commit 10d218934c)
2020-07-09 15:07:44 +09:00
Masatake Iwasaki
7522b44067 HADOOP-17120. Fix failure of docker image creation due to pip2 install error. (#2130)
(cherry picked from commit 53cfe6090419649dde18e3bf0d6ff757f76d716f)
2020-07-09 03:19:18 +00:00
Sebastian Nagel
f9619b0b97
HADOOP-17117 Fix typos in hadoop-aws documentation (#2127)
(cherry picked from commit 5b1ed2113b)
2020-07-09 00:04:46 +09:00
Shanyu Zhao
10c9df1d0a HDFS-15451. Do not discard non-initial block report for provided storage. (#2119). Contributed by Shanyu Zhao.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2020-07-07 11:44:42 +08:00
Madhusoodan Pataki
0789ae5b78 HADOOP-17081. MetricsSystem doesn't start the sink adapters on restart (#2089)
Contributed by Madhusoodan P
2020-07-06 16:26:48 +01:00
Akira Ajisaka
20df70a895
HADOOP-17111. Replace Guava Optional with Java8+ Optional. Contributed by Ahmed Hussein.
(cherry picked from commit 639acb6d89)
2020-07-06 16:09:37 +09:00
Ayush Saxena
b8d9218189 HDFS-15446. CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with error java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/path. Contributed by Stephen O'Donnell.
2020-07-04 12:25:29 +05:30
bilaharith
19fb204011
HADOOP-17086. ABFS: Making the ListStatus response ignore unknown properties. (#2101)
Contributed by Bilahari T H.

Change-Id: I82e4683fba8481aef2abab7a6a99e5752f6fffa9
2020-07-03 19:02:21 +01:00
Szilard Nemeth
439c51425e YARN-10330. Add missing test scenarios to TestUserGroupMappingPlacementRule and TestAppNameMappingPlacementRule. Contributed by Peter Bacsko
2020-07-01 17:42:45 +02:00
Szilard Nemeth
cfb2084cba YARN-10325. Document max-parallel-apps for Capacity Scheduler. Contributed by Peter Bacsko
2020-07-01 13:40:36 +02:00
Szilard Nemeth
d88a6eebf2 YARN-10318. ApplicationHistory Web UI incorrect column indexing. Contributed by Andras Gyori
2020-07-01 13:31:35 +02:00
Akira Ajisaka
d572abb818 HADOOP-17090. Increase precommit job timeout from 5 hours to 20 hours. (#2111). Contributed by Akira Ajisaka.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2020-07-01 15:16:27 +05:30
Abhishek Das
047fb3493a HADOOP-17032. Fix getContentSummary in ViewFileSystem to handle multiple children mountpoints pointing to different filesystems (#2060). Contributed by Abhishek Das.
2020-07-01 13:01:01 +05:30
Steve Loughran
7de1ac0547
HADOOP-16798. S3A Committer thread pool shutdown problems. (#1963)
Contributed by Steve Loughran.

Fixes a condition which can cause job commit to fail if a task was
aborted < 60s before the job commit commenced: the task abort
will shut down the thread pool with a hard exit after 60s; the
job commit POST requests would be scheduled through the same pool,
and so would be interrupted and fail. At present the access is synchronized,
but presumably the executor shutdown code is calling wait() and releasing
locks.
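
The hazard described above can be illustrated with a minimal sketch. This is not the S3A committer code: the class and method names are hypothetical, a 10 second sleep stands in for a commit request, and the 60s grace period is skipped. It only shows that work running on a shared java.util.concurrent pool is interrupted when another caller hard-stops that pool, and that later submissions are rejected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical illustration only; not taken from the hadoop-aws sources.
final class SharedPoolShutdownSketch {

    /** Starts a slow task on a pool, hard-stops the pool, and reports how the task ended. */
    static String runAndAbort() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        Future<String> commit = pool.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(10_000); // stands in for a slow commit request
                return "completed";
            } catch (InterruptedException e) {
                return "interrupted";
            }
        });
        try {
            started.await();      // make sure the task is really running
            pool.shutdownNow();   // the "hard exit": interrupts running workers
            return commit.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Shows that a pool which has been shut down refuses new work. */
    static boolean rejectsAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdownNow();
        try {
            pool.submit(() -> "late");
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndAbort());          // prints "interrupted"
        System.out.println(rejectsAfterShutdown()); // prints "true"
    }
}
```

One way to avoid this class of failure is to give each phase its own pool, or to make sure no abort-triggered shutdown can still be pending when commit work is scheduled.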

Task abort is triggered from the AM when task attempts succeed but
there are still active speculative task attempts running. Thus it
only surfaces when speculation is enabled and the final tasks are
speculating, which, given they are the stragglers, is not unheard of.

Note: this problem has never been seen in production; it has surfaced
in the hadoop-aws tests on a heavily overloaded desktop.

Change-Id: I3b433356d01fcc50d88b4353dbca018484984bc8
2020-06-30 10:52:56 +01:00
Szilard Nemeth
8b482744e9 YARN-10277. CapacityScheduler test TestUserGroupMappingPlacementRule should build proper hierarchy. Contributed by Szilard Nemeth
2020-06-30 11:32:59 +02:00
Akira Ajisaka
aa283fc2c2
YARN-10331. Upgrade node.js to 10.21.0. (#2106)
(cherry picked from commit cd188ea9f0)
2020-06-30 16:54:59 +09:00
Eric E Payne
d7696453a0 YARN-9903: Support reservations continue looking for Node Labels. Contributed by Jim Brennan (Jim_Brennan).
(cherry picked from commit 74fc13cf91)
2020-06-29 18:59:52 +00:00
Akira Ajisaka
3e9422d1c7
HDFS-15421. IBR leak causes standby NN to be stuck in safe mode.
(cherry picked from commit c71ce7ac33)
2020-06-28 16:04:47 +09:00
Virajith Jalaparti
ea97fe250c HDFS-15436. Default mount table name used by ViewFileSystem should be configurable (#2100)
* HDFS-15436. Default mount table name used by ViewFileSystem should be configurable

* Replace Constants.CONFIG_VIEWFS_DEFAULT_MOUNT_TABLE use in tests

* Address Uma's comments on PR#2100

* Sort lists in test to match without concern to order

* Address comments, fix checkstyle and fix failing tests

* Fix checkstyle

(cherry picked from commit bed0a3a374)
2020-06-27 16:22:50 -07:00
Uma Maheswara Rao G
81e33d22a0 HDFS-15429. mkdirs should work when parent dir is an internalDir and fallback configured. Contributed by Uma Maheswara Rao G.
(cherry picked from commit d5e1bb6155)
2020-06-27 15:42:36 -07:00
Uma Maheswara Rao G
29a8ee4be6 HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 7c02d1889b)
2020-06-27 15:42:14 -07:00
Uma Maheswara Rao G
5f67c3f3ca HDFS-15418. ViewFileSystemOverloadScheme should represent mount links as non symlinks. Contributed by Uma Maheswara Rao G.
(cherry picked from commit b27810aa60)
2020-06-27 15:41:48 -07:00
Uma Maheswara Rao G
3cddd0be29 HADOOP-17060. Clarify listStatus and getFileStatus behaviors inconsistent in the case of ViewFs implementation for isDirectory. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 93b121a971)
2020-06-27 15:39:38 -07:00
Ayush Saxena
7b29019eea HDFS-15396. Fix TestViewFileSystemOverloadSchemeHdfsFileSystemContract#testListStatusRootDir. Contributed by Ayush Saxena.
(cherry picked from commit a8610c15c4)
2020-06-27 15:39:08 -07:00
Abhishek Das
c3bef4906c HADOOP-17029. Return correct permission and owner for listing on internal directories in ViewFs. Contributed by Abhishek Das.
(cherry picked from commit e7dd02768b)
2020-06-27 15:38:09 -07:00
Szilard Nemeth
fa41e38450 YARN-10279. Avoid unnecessary QueueMappingEntity creations. Contributed by Marton Hudaky
(cherry picked from commit 6a8fd73b27)
2020-06-25 17:28:48 +02:00
Thomas Marquardt
ee192c4826
HADOOP-17089: WASB: Update azure-storage-java SDK
Contributed by Thomas Marquardt

DETAILS: WASB depends on the Azure Storage Java SDK. There is a concurrency
bug in the Azure Storage Java SDK that can cause the results of a list blobs
operation to appear empty. This causes the Filesystem listStatus and similar
APIs to return empty results. This has been seen in Spark workloads when jobs
use more than one executor core.

See Azure/azure-storage-java#546 for details on the bug in the Azure Storage SDK.

TESTS: A new test was added to validate the fix. All tests are passing:

wasb:
mvn -T 1C -Dparallel-tests=wasb -Dscale -DtestsThreadCount=8 clean verify
Tests run: 248, Failures: 0, Errors: 0, Skipped: 11
Tests run: 651, Failures: 0, Errors: 0, Skipped: 65

abfs:
mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 64, Failures: 0, Errors: 0, Skipped: 0
Tests run: 437, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-25 05:43:32 +00:00
Szilard Nemeth
480919e42d YARN-10316. FS-CS converter: convert maxAppsDefault, maxRunningApps settings. Contributed by Peter Bacsko
2020-06-23 16:25:33 +02:00
Szilard Nemeth
8f1b70e367 YARN-9930. Support max running app logic for CapacityScheduler. Contributed by Peter Bacsko
2020-06-22 12:00:06 +02:00
Masatake Iwasaki
56d72adbdd MAPREDUCE-7281. Fix NoClassDefFoundError on 'mapred minicluster'. (#2077)
(cherry picked from commit 8fd0fdf889)
2020-06-20 21:39:57 +09:00
Thomas Marquardt
63d236c019
HADOOP-17076: ABFS: Delegation SAS Generator Updates
Contributed by Thomas Marquardt.

DETAILS:
1) The authentication version in the service has been updated from Dec19 to Feb20, so the client needs to be updated.
2) Add support and test cases for getXAttr and setXAttr.
3) Update DelegationSASGenerator and related classes to use Duration instead of int for time periods.
4) Clean up the DelegationSASGenerator switch/case statement that maps operations to permissions.
5) Clean up SASGenerator classes to use String.equals instead of ==.
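
Item 5 matters because == on Java strings compares object references, not characters. The sketch below is illustrative only: the class, the method names, and the "read" operation name are hypothetical and do not come from the SASGenerator classes.

```java
// Hypothetical illustration of == versus String.equals; not the SASGenerator code.
final class StringCompareSketch {

    /** Compares references: true only if both sides are the same object. */
    static boolean isReadByIdentity(String operation) {
        return operation == "read";
    }

    /** Compares contents, and is null-safe because the literal is the receiver. */
    static boolean isReadByEquals(String operation) {
        return "read".equals(operation);
    }

    public static void main(String[] args) {
        // A string built at run time is a different object from the literal "read".
        String operation = new String("read");
        System.out.println(isReadByIdentity(operation)); // prints "false"
        System.out.println(isReadByEquals(operation));   // prints "true"
    }
}
```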

TESTS:
Added tests for getXAttr and setXAttr.

All tests are passing against my account in eastus2euap:

 $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
 Tests run: 76, Failures: 0, Errors: 0, Skipped: 0
 Tests run: 441, Failures: 0, Errors: 0, Skipped: 33
 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-19 19:19:31 +00:00
bilaharith
d639c11986
HADOOP-17004. Fixing a formatting issue
Contributed by Bilahari T H.
2020-06-19 19:11:06 +00:00
bilaharith
11307f3be9
HADOOP-17004. ABFS: Improve the ABFS driver documentation
Contributed by Bilahari T H.
2020-06-19 19:10:22 +00:00
Thomas Marquardt
af98f32f7d
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
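
The renewal rule above can be sketched as follows. This is a minimal sketch, not the AbfsInputStream/AbfsOutputStream code: the class and method names are hypothetical, and treating a token with exactly 120 seconds left as due for renewal is an assumption about the boundary.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of "re-use until expiry is within the renew period".
final class SasTokenRenewalSketch {

    /** Default renew period from the commit message: 120 seconds. */
    static final Duration DEFAULT_RENEW_PERIOD = Duration.ofSeconds(120);

    /** Returns true once the time left before expiry is at or below the renew period. */
    static boolean needsRenewal(Instant now, Instant expiry, Duration renewPeriod) {
        return !now.plus(renewPeriod).isBefore(expiry);
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2020-06-19T19:00:00Z");
        Instant expiry = issued.plusSeconds(600);

        // 600 s left: keep using the cached token.
        System.out.println(needsRenewal(issued, expiry, DEFAULT_RENEW_PERIOD)); // prints "false"

        // 100 s left: request a new token.
        System.out.println(needsRenewal(issued.plusSeconds(500), expiry, DEFAULT_RENEW_PERIOD)); // prints "true"
    }
}
```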

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  The ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.

Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match.  Internally the code was sometimes using
"//" instead of "/" for the root path.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash on paths.
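
As a rough illustration of why a single canonical form matters, the sketch below collapses repeated slashes so that "//" and "/" compare equal. It is an assumption about what such a canonicalization could look like, with a hypothetical class and method name; it is not the AzureBlobFileSystemStore implementation.

```java
// Hypothetical illustration of path canonicalization; not the ABFS code.
final class RootPathSketch {

    /** Collapses any run of '/' into one '/', and maps the empty path to the root. */
    static String canonicalize(String path) {
        String collapsed = path.replaceAll("/{2,}", "/");
        return collapsed.isEmpty() ? "/" : collapsed;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("//"));     // prints "/"
        System.out.println(canonicalize("/a//b"));  // prints "/a/b"
    }
}
```

With one canonical form, the path placed in the request URL and the path signed into the token are produced by the same function and so cannot drift apart.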

To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-06-19 19:00:46 +00:00
Mehakmeet Singh
a2f44344c3
HADOOP-17018. Intermittent failing of ITestAbfsStreamStatistics in ABFS (#1990)
Contributed by: Mehakmeet Singh

In some cases, the ABFS prefetch thread runs in the background, returns some bytes from the buffer, and adds an extra readOp. This makes the readOps value vary between runs, so values of 2 or 3 are seen in different setups and the test fails intermittently.
2020-06-19 19:00:04 +00:00