1415 Commits

Author SHA1 Message Date
Steve Loughran
9960c01a25
HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280)
This changes directory tree deletion so that only files are incrementally deleted
from S3Guard after the objects are deleted; the directories are left alone
until metadataStore.deleteSubtree(path) is invoked.

This avoids directory tombstones being added above files/child directories,
which stop the treewalk and delete phase from working.

Also:

* Callback to delete objects splits files and dirs so that
any problems deleting the dirs don't trigger S3Guard updates
* New statistic to measure the number of objects deleted, alongside request count.
* Callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers
  to clarify behavior.
* Test enhancements to replicate the failure and verify the fix
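
The split between files and directory markers described in the first bullet can be pictured with the sketch below. It is not the S3A code: the class `DeleteSplitSketch` and its methods are hypothetical names invented for illustration, and the only thing assumed about S3A is the convention that a directory marker is an object whose key ends in `/`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the S3A implementation.
class DeleteSplitSketch {

    /** Holds the two groups of keys queued for deletion. */
    static final class Split {
        final List<String> files = new ArrayList<>();
        final List<String> dirMarkers = new ArrayList<>();
    }

    /** Assumption: a directory marker is a key that ends in "/". */
    static boolean isDirMarker(String key) {
        return key.endsWith("/");
    }

    /** Partition keys so each group can be deleted and tracked separately. */
    static Split splitKeys(List<String> keys) {
        Split split = new Split();
        for (String key : keys) {
            if (isDirMarker(key)) {
                split.dirMarkers.add(key);
            } else {
                split.files.add(key);
            }
        }
        return split;
    }

    public static void main(String[] args) {
        Split s = splitKeys(Arrays.asList(
            "data/", "data/part-0000", "data/sub/", "data/sub/part-0001"));
        System.out.println("files=" + s.files);
        System.out.println("dirMarkers=" + s.dirMarkers);
    }
}
```

With the groups kept apart, a failure while deleting the marker group need not affect the bookkeeping done for the file group.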

Contributed by Steve Loughran
2020-09-10 17:03:52 +01:00
bilaharith
85119267be
HADOOP-17166. ABFS: configure output stream thread pool (#2179)
Adds options to control the size of the per-output-stream thread pool
when writing data through the ABFS connector:

* fs.azure.write.max.concurrent.requests
* fs.azure.write.max.requests.to.queue
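
How a cap on concurrent requests and a cap on queued requests interact can be sketched with a plain `java.util.concurrent.ThreadPoolExecutor`. This shows the general pattern only and is not the ABFS connector's code: the class name `BoundedWritePoolSketch`, the values 2 and 4, and the caller-runs rejection policy are all assumptions chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the ABFS implementation.
class BoundedWritePoolSketch {

    /**
     * Build a pool with at most maxConcurrent worker threads and at most
     * maxQueued tasks waiting. When both are full, the submitting thread
     * runs the task itself (an assumed policy for this sketch).
     */
    static ThreadPoolExecutor newPool(int maxConcurrent, int maxQueued) {
        return new ThreadPoolExecutor(
            maxConcurrent, maxConcurrent,
            10L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(maxQueued),
            new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(2, 4);
        for (int i = 0; i < 10; i++) {
            final int block = i;
            pool.execute(() -> System.out.println(
                "uploading block " + block + " on "
                    + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```

Lower limits bound the memory and connections a single stream can consume; higher limits allow more uploads in flight at once.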

Contributed by Bilahari T H
2020-09-09 16:41:36 +01:00
Mehakmeet Singh
0d855159f0
HADOOP-17229. No updation of bytes received counter value after response failure occurs in ABFS (#2264)
Contributed by Mehakmeet Singh
2020-09-08 10:14:23 +01:00
Mehakmeet Singh
84ed6adccc
HADOOP-17158. Test timeout for ITestAbfsInputStreamStatistics#testReadAheadCounters (#2272)
Contributed by: Mehakmeet Singh.
2020-09-08 10:11:06 +01:00
Steve Loughran
5346cc3263
HADOOP-17227. S3A Marker Tool tuning (#2254)
Contributed by Steve Loughran.
2020-09-04 14:58:03 +01:00
Mukund Thakur
139a43e98e
HADOOP-17167 ITestS3AEncryptionWithDefaultS3Settings failing (#2187)
Now skips ITestS3AEncryptionWithDefaultS3Settings.testEncryptionOverRename
when server side encryption is not set to sse:kms

Contributed by Mukund Thakur
2020-09-03 19:35:24 +01:00
Mehakmeet Singh
d1c60a53f6
HADOOP-17194. Adding Context class for AbfsClient in ABFS (#2216)
Contributed by Mehakmeet Singh.
2020-08-27 11:27:00 +01:00
Mukund Thakur
cc641534dc
HADOOP-17074. S3A Listing to be fully asynchronous. (#2207)
Contributed by Mukund Thakur.
2020-08-25 11:29:43 +01:00
bilaharith
64f36b9543
HADOOP-16915. ABFS: Ignoring the test ITestAzureBlobFileSystemRandomRead.testRandomReadPerformance
- Contributed by Bilahari T H
2020-08-24 12:00:55 -07:00
swamirishi
872c2909bd
HADOOP-17122: Preserving Directory Attributes in DistCp with Atomic Copy (#2133)
Contributed by Swaminathan Balachandran
2020-08-22 18:48:21 +01:00
Sneha Vijayarajan
b367942fe4
Upgrade store REST API version to 2019-12-12
- Contributed by Sneha Vijayarajan
2020-08-17 10:17:18 -07:00
Steve Loughran
5092ea62ec HADOOP-13230. S3A to optionally retain directory markers.
This adds an option to disable "empty directory" marker deletion,
to avoid throttling and other scale problems.

This feature is *not* backwards compatible.
Consult the documentation and use with care.

Contributed by Steve Loughran.

Change-Id: I69a61e7584dc36e485d5e39ff25b1e3e559a1958
2020-08-15 12:51:08 +01:00
Mukund Thakur
4a400d3193
HADOOP-17192. ITestS3AHugeFilesSSECDiskBlock failing (#2221)
Contributed by Mukund Thakur
2020-08-13 14:21:49 +01:00
Ayush Saxena
975b6024dd HDFS-15514. Remove useless dfs.webhdfs.enabled. Contributed by Fei Hui. 2020-08-07 22:19:17 +05:30
bilaharith
a2610e21ed
HADOOP-17183. ABFS: Enabling checkaccess on ABFS
- Contributed by Bilahari T H
2020-08-06 14:52:02 -07:00
bilaharith
3f73facd7b
HADOOP-17149. ABFS: Fixing the testcase ITestGetNameSpaceEnabled
- Contributed by Bilahari T H
2020-08-05 10:01:04 -07:00
bilaharith
c566cabd62
HADOOP-17163. ABFS: Adding debug log for rename failures
- Contributed by Bilahari T H
2020-08-05 09:38:13 -07:00
Mukund Thakur
ac697571a1
HADOOP-17186. Fixing javadoc in ListingOperationCallbacks (#2196) 2020-08-05 20:40:49 +09:00
Mukund Thakur
8fd4f5490f
HADOOP-17131. Refactor S3A Listing code for better isolation. (#2148)
Contributed by Mukund Thakur.
2020-08-04 16:00:02 +01:00
Akira Ajisaka
c40cbc57fa
HADOOP-17091. [JDK11] Fix Javadoc errors (#2098) 2020-08-03 10:46:51 +09:00
bilaharith
a7fda2e38f
HADOOP-17137. ABFS: Makes the test cases in ITestAbfsNetworkStatistics agnostic
- Contributed by Bilahari T H
2020-07-31 12:27:57 -07:00
Mehakmeet Singh
48a7c5b6ba
HADOOP-17113. Adding ReadAhead Counters in ABFS (#2154)
Contributed by Mehakmeet Singh
2020-07-22 18:22:30 +01:00
Masatake Iwasaki
1b29c9bfee
HADOOP-17138. Fix spotbugs warnings surfaced after upgrade to 4.0.6. (#2155) 2020-07-22 13:40:20 +09:00
Sneha Vijayarajan
d23cc9d85d
HADOOP-17132. ABFS: Fix Rename and Delete Idempotency check trigger
- Contributed by Sneha Vijayarajan
2020-07-21 09:22:38 -07:00
bilaharith
b4b23ef0d1
HADOOP-17092. ABFS: Making AzureADAuthenticator.getToken() throw HttpException
- Contributed by Bilahari T H
2020-07-21 09:18:54 -07:00
Mukund Thakur
bb459d4dd6
HADOOP-17136. ITestS3ADirectoryPerformance.testListOperations failing (#2153)
A regression caused by HADOOP-17022: the reduction in LIST calls broke an assertion.

Contributed by Mukund Thakur
2020-07-20 16:58:50 +01:00
Steve Loughran
9f407bcc88
HADOOP-17107. hadoop-azure parallel tests not working on recent JDKs (#2118)
Contributed by Steve Loughran.
2020-07-20 10:51:26 +01:00
bilaharith
99655167f3
HADOOP-16682. ABFS: Removing unnecessary toString() invocations
- Contributed by Bilahari T H
2020-07-18 10:00:18 -07:00
Ayush Saxena
6bcb24d269 HADOOP-17100. Replace Guava Supplier with Java8+ Supplier in Hadoop. Contributed by Ahmed Hussein. 2020-07-18 14:33:43 +05:30
Mehakmeet Singh
4083fd57b5
HADOOP-17129. Validating storage keys in ABFS correctly (#2141)
Contributed by Mehakmeet Singh
2020-07-16 17:29:37 +01:00
Mukund Thakur
4647a60430
HADOOP-17022. Tune S3AFileSystem.listFiles() API.
Contributed by Mukund Thakur.

Change-Id: I17f5cfdcd25670ce3ddb62c13378c7e2dc06ba52
2020-07-14 15:27:35 +01:00
Anoop Sam John
380e0f4506
HADOOP-16998. WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException (#2073)
Contributed by Anoop Sam John.
2020-07-14 14:07:27 +01:00
jimmy-zuber-amzn
806d84b79c
HADOOP-17105. S3AFS - Do not attempt to resolve symlinks in globStatus (#2113)
Contributed by Jimmy Zuber.
2020-07-13 19:07:48 +01:00
Steve Loughran
b9fa5e0182
HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext.
Contributed by Steve Loughran.

Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3
2020-07-13 13:30:02 +01:00
Sebastian Nagel
5b1ed2113b
HADOOP-17117 Fix typos in hadoop-aws documentation (#2127) 2020-07-09 00:03:15 +09:00
ishaniahuja
d20109c171
HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver
- Contributed by Ishani Ahuja
2020-07-04 13:25:14 -07:00
bilaharith
e0cededfbd
HADOOP-17086. ABFS: Making the ListStatus response ignore unknown properties. (#2101)
Contributed by Bilahari T H.
2020-07-03 19:00:22 +01:00
Mehakmeet Singh
3b5c9a90c0
HADOOP-16961. ABFS: Adding metrics to AbfsInputStream (#2076)
Contributed by Mehakmeet Singh.
2020-07-03 11:41:35 +01:00
Yiqun Lin
ff8bb67200 HDFS-15374. Add documentation for fedbalance tool. Contributed by Jinglun. 2020-07-01 14:18:18 +08:00
Yiqun Lin
de2cb86260 HDFS-15410. Add separated config file hdfs-fedbalance-default.xml for fedbalance tool. Contributed by Jinglun. 2020-07-01 14:06:27 +08:00
Steve Loughran
4249c04d45
HADOOP-16798. S3A Committer thread pool shutdown problems. (#1963)
Contributed by Steve Loughran.

Fixes a condition which can cause job commit to fail if a task was
aborted < 60s before the job commit commenced: the task abort
will shut down the thread pool with a hard exit after 60s; the
job commit POST requests would be scheduled through the same pool,
and so would be interrupted and fail. At present the access is synchronized,
but presumably the executor shutdown code is calling wait() and releasing
locks.

Task abort is triggered from the AM when task attempts succeed but
there are still active speculative task attempts running. Thus it
only surfaces when speculation is enabled and the final tasks are
speculating, which, given they are the stragglers, is not unheard of.
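
The failure mode described above, where work running on a shared pool is interrupted when another party shuts the pool down, can be reproduced in miniature with the standard `java.util.concurrent` API. This is a sketch under stated assumptions, not the committer code: the class and method names are hypothetical, a sleeping task stands in for a commit request, and `shutdownNow()` stands in for the hard exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the S3A committer implementation.
class SharedPoolShutdownSketch {

    /** Returns true if the stand-in commit task was interrupted. */
    static boolean commitInterruptedByAbort() {
        ExecutorService shared = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        try {
            // Stand-in for a job commit request running on the shared pool.
            Future<Boolean> commit = shared.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(10_000);
                    return false;          // finished without interruption
                } catch (InterruptedException e) {
                    return true;           // interrupted by the shutdown
                }
            });
            started.await();
            // Stand-in for task abort forcing the pool down.
            shared.shutdownNow();
            return commit.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("commit interrupted: " + commitInterruptedByAbort());
    }
}
```

Giving each operation its own pool, or never force-stopping a pool that other work still depends on, avoids this class of interference.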

Note: this problem has never been seen in production; it has surfaced
in the hadoop-aws tests on a heavily overloaded desktop
2020-06-30 10:44:51 +01:00
Thomas Marquardt
4b5b54c73f
HADOOP-17089: WASB: Update azure-storage-java SDK
Contributed by Thomas Marquardt

DETAILS: WASB depends on the Azure Storage Java SDK. There is a concurrency
bug in the Azure Storage Java SDK that can cause the results of a list blobs
operation to appear empty. This causes the Filesystem listStatus and similar
APIs to return empty results. This has been seen in Spark workloads when jobs
use more than one executor core.

See Azure/azure-storage-java#546 for details on the bug in the Azure Storage SDK.

TESTS: A new test was added to validate the fix. All tests are passing:

wasb:
mvn -T 1C -Dparallel-tests=wasb -Dscale -DtestsThreadCount=8 clean verify
Tests run: 248, Failures: 0, Errors: 0, Skipped: 11
Tests run: 651, Failures: 0, Errors: 0, Skipped: 65

abfs:
mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 64, Failures: 0, Errors: 0, Skipped: 0
Tests run: 437, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-25 02:32:42 +00:00
Akira Ajisaka
201d734af3
HDFS-15428. Javadocs fails for hadoop-federation-balance. Contributed by Xieming Li. 2020-06-22 19:43:19 +09:00
Mehakmeet Singh
3472c3efc0
HADOOP-17065. Add Network Counters to ABFS (#2056)
Contributed by Mehakmeet Singh.
2020-06-19 14:03:49 +01:00
Yiqun Lin
9cbd76cc77 HDFS-15346. FedBalance tool implementation. Contributed by Jinglun. 2020-06-18 13:33:25 +08:00
Thomas Marquardt
caf3995ac2
HADOOP-17076: ABFS: Delegation SAS Generator Updates
Contributed by Thomas Marquardt.

DETAILS:
1) The authentication version in the service has been updated from Dec19 to Feb20, so the client needs to be updated.
2) Add support and test cases for getXattr and setXAttr.
3) Update DelegationSASGenerator and related to use Duration instead of int for time periods.
4) Cleanup DelegationSASGenerator switch/case statement that maps operations to permissions.
5) Cleanup SASGenerator classes to use String.equals instead of ==.
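
Item 5 matters because `==` on Java strings compares object identity rather than contents. The short sketch below demonstrates the difference; the class and method names are hypothetical and nothing here is taken from the SASGenerator classes.

```java
// Hypothetical illustration only; not code from the SASGenerator classes.
class StringCompareSketch {

    /** Compares contents: the safe way to compare strings. */
    static boolean sameByEquals(String a, String b) {
        return a.equals(b);
    }

    /** Compares references: true only if both names point at one object. */
    static boolean sameByIdentity(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "read";
        String built = new String("read"); // distinct object, same contents
        System.out.println("equals:   " + sameByEquals(literal, built));
        System.out.println("identity: " + sameByIdentity(literal, built));
    }
}
```

A string that arrives at runtime, for example one parsed from a request, is generally not the same object as a literal with the same contents, so an identity check can fail silently.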

TESTS:
Added tests for getXAttr and setXAttr.

All tests are passing against my account in eastus2euap:

 $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
 Tests run: 76, Failures: 0, Errors: 0, Skipped: 0
 Tests run: 441, Failures: 0, Errors: 0, Skipped: 33
 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-18 02:07:08 +00:00
Steve Loughran
ac5d899d40
HADOOP-17050 S3A to support additional token issuers
Contributed by Steve Loughran.

S3A delegation token providers will be asked for any additional
token issuers; an array can be returned, and
each one will be asked for tokens when DelegationTokenIssuer collects
all the tokens for a filesystem.
2020-06-09 14:39:06 +01:00
Steve Loughran
40d63e02f0
HADOOP-16568. S3A FullCredentialsTokenBinding fails if local credentials are unset. (#1441)
Contributed by Steve Loughran.

Move the loading to deployUnbonded (where they are required) and add a safety check when a new DT is requested
2020-06-03 17:07:00 +01:00
Mehakmeet Singh
7f486f0258
HADOOP-17016. Adding Common Counters in ABFS (#1991).
Contributed by: Mehakmeet Singh.

Change-Id: Ib84e7a42f28e064df4c6204fcce33e573360bf42
2020-06-02 18:31:35 +01:00
Karthik Amarnath
b2200a33a6
HDFS-15168: ABFS enhancement to translate AAD to Linux identities. (#1978) 2020-05-28 19:00:23 -07:00