hadoop

Author	SHA1	Message	Date
Sneha Vijayarajan	d5b4d04b0d	HADOOP-17301. ABFS: read-ahead error reporting breaks buffer management (#2369 ) Fixes read-ahead buffer management issues introduced by HADOOP-16852, "ABFS: Send error back to client for Read Ahead request failure". Contributed by Sneha Vijayarajan	2020-10-14 22:29:13 +00:00
Sneha Vijayarajan	da5db6a5a6	HADOOP-17279: ABFS: testNegativeScenariosForCreateOverwriteDisabled fails for non-HNS account. Contributed by Sneha Vijayarajan Testing: namespace.enabled=false auth.type=SharedKey $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 246 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=SharedKey $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 33 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 74 Tests run: 207, Failures: 0, Errors: 0, Skipped: 140	2020-10-14 22:29:13 +00:00
Sneha Vijayarajan	d166420302	HADOOP-17215: Support for conditional overwrite. Contributed by Sneha Vijayarajan DETAILS: This change adds config key "fs.azure.enable.conditional.create.overwrite" with a default of true. When enabled, if create(path, overwrite: true) is invoked and the file exists, the ABFS driver will first obtain its etag and then attempt to overwrite the file on the condition that the etag matches. The purpose of this is to mitigate the non-idempotency of this method. Specifically, in the event of a network error or similar, the client will retry and this can result in the file being created more than once which may result in data loss. In essence this is like a poor man's file handle, and will be addressed more thoroughly in the future when support for lease is added to ABFS. TEST RESULTS: namespace.enabled=true auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 42 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 74 Tests run: 207, Failures: 0, Errors: 0, Skipped: 140	2020-10-14 22:29:13 +00:00
bilaharith	f208da286c	HADOOP-17166. ABFS: configure output stream thread pool (#2179 ) Adds the options to control the size of the per-output-stream threadpool when writing data through the abfs connector * fs.azure.write.max.concurrent.requests * fs.azure.write.max.requests.to.queue Contributed by Bilahari T H	2020-10-14 22:29:13 +00:00
bilaharith	cc7350302f	HADOOP-16915. ABFS: Ignoring the test ITestAzureBlobFileSystemRandomRead.testRandomReadPerformance - Contributed by Bilahari T H	2020-10-14 22:29:13 +00:00
Sneha Vijayarajan	4072323de4	Upgrade store REST API version to 2019-12-12 - Contributed by Sneha Vijayarajan	2020-10-14 22:29:13 +00:00
bilaharith	e481d0108a	HADOOP-17149. ABFS: Fixing the testcase ITestGetNameSpaceEnabled - Contributed by Bilahari T H	2020-10-14 22:29:13 +00:00
bilaharith	f73c90f0b0	HADOOP-17163. ABFS: Adding debug log for rename failures - Contributed by Bilahari T H	2020-10-14 22:29:13 +00:00
bilaharith	fbf151ef6f	HADOOP-17137. ABFS: Makes the test cases in ITestAbfsNetworkStatistics agnostic - Contributed by Bilahari T H	2020-10-14 22:29:13 +00:00
Dongjoon Hyun	5032f8abba	HADOOP-17258. Magic S3Guard Committer to overwrite existing pendingSet file on task commit (#2371 ) Contributed by Dongjoon Hyun and Steve Loughran Change-Id: Ibaf8082e60eff5298ff4e6513edc386c5bae0274	2020-10-12 13:42:08 +01:00
Steve Loughran	963793dd48	HADOOP-17293. S3A to always probe S3 in S3A getFileStatus on non-auth paths This reverts changes in HADOOP-13230 to use S3Guard TTL in choosing when to issue a HEAD request; fixing tests to compensate. New org.apache.hadoop.fs.s3a.performance.OperationCost cost, S3GUARD_NONAUTH_FILE_STATUS_PROBE for use in cost tests. Contributed by Steve Loughran. Change-Id: I418d55d2d2562a48b2a14ec7dee369db49b4e29e	2020-10-08 15:38:32 +01:00
Mukund Thakur	475dba1ddf	HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem (#2354 ) Contains HADOOP-17300: FileSystem.DirListingIterator.next() call should return NoSuchElementException Contributed by Mukund Thakur Change-Id: I4e7e5c6e295525db9e2de6f416f32bbb81e146d3	2020-10-07 14:00:23 +01:00
bilaharith	d80dfad900	HADOOP-17183. ABFS: Enabling checkaccess on ABFS (#2331 ) Contributed by Bilahari TH Change-Id: If4224697deed733d6db44145994cdd85547c27d1	2020-10-01 21:29:48 +01:00
Mukund Thakur	7e642ec5a3	HADOOP-17023. Tune S3AFileSystem.listStatus() (#2257 ) S3AFileSystem.listStatus() is optimized for invocations where the path supplied is a non-empty directory. The number of S3 requests is significantly reduced, saving time, money, and reducing the risk of S3 throttling. Contributed by Mukund Thakur. Change-Id: I7cc5f87aa16a4819e245e0fbd2aad226bd500f3f	2020-09-21 17:30:15 +01:00
Steve Loughran	aa80bcb1ec	Revert "HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280 )" This reverts commit `0c82eb0324`. Change-Id: I6bd100d9de19660b0f28ee0ab16faf747d6d9f05	2020-09-11 18:07:05 +01:00
Steve Loughran	0c82eb0324	HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280 ) This changes directory tree deletion so that only files are incrementally deleted from S3Guard after the objects are deleted; the directories are left alone until metadataStore.deleteSubtree(path) is invoked. This avoids directory tombstones being added above files/child directories, which stop the treewalk and delete phase from working. Also: * Callback to delete objects splits files and dirs so that any problems deleting the dirs doesn't trigger s3guard updates * New statistic to measure #of objects deleted, alongside request count. * Callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers to clarify behavior. * Test enhancements to replicate the failure and verify the fix Contributed by Steve Loughran Change-Id: I0e6ea2c35e487267033b1664228c8837279a35c7	2020-09-10 17:29:33 +01:00
Mehakmeet Singh	ccceec8af0	HADOOP-17158. Test timeout for ITestAbfsInputStreamStatistics#testReadAheadCounters (#2272 ) Contributed by: Mehakmeet Singh. Change-Id: I7ebfa5cd1b5d25f7a750f0c645d7d93c81e89240	2020-09-08 14:02:28 +01:00
Mehakmeet Singh	28f1ded9fe	HADOOP-17113. Adding ReadAhead Counters in ABFS (#2154 ) Contributed by Mehakmeet Singh Change-Id: I6bbd8165385a9267ed64831bb1efa18b6554feb1	2020-09-08 14:02:02 +01:00
Mehakmeet Singh	7970710418	HADOOP-17229. No update of bytes received counter value after response failure occurs in ABFS (#2264 ) Contributed by Mehakmeet Singh Change-Id: Ia9ad1b87a460b10d27486bd00ee67c3cedd2b5b5	2020-09-08 13:26:24 +01:00
Mukund Thakur	5236c96ead	HADOOP-17167 ITestS3AEncryptionWithDefaultS3Settings failing (#2187 ) Now skips ITestS3AEncryptionWithDefaultS3Settings.testEncryptionOverRename when server side encryption is not set to sse:kms Contributed by Mukund Thakur Change-Id: Ifd83d353e9c7c6f7e1195a2c2f138d85cf876bb1	2020-09-04 15:00:30 +01:00
Steve Loughran	38354006f8	HADOOP-17227. S3A Marker Tool tuning (#2254 ) Contributed by Steve Loughran.	2020-09-04 14:58:54 +01:00
Mehakmeet Singh	f6e1ed4f6b	HADOOP-17194. Adding Context class for AbfsClient in ABFS (#2216 ) Contributed by Mehakmeet Singh. Change-Id: I120c9a068d758d8e5d071c878a3b7fbeb95e4de6	2020-08-27 11:28:37 +01:00
Mukund Thakur	0840c0c1f3	HADOOP-17074. S3A Listing to be fully asynchronous. (#2207 ) Contributed by Mukund Thakur. Change-Id: I1b0574a0c9ebc0805f285dd5280a00e5add081f1	2020-08-25 11:30:42 +01:00
swamirishi	ba4f7fb332	HADOOP-17122: Preserving Directory Attributes in DistCp with Atomic Copy (#2133 ) Contributed by Swaminathan Balachandran Change-Id: I86f956dd4ab0b278d923fe7b70037e6b929a8aa1	2020-08-22 18:51:10 +01:00
Steve Loughran	49f8ae965e	HADOOP-13230. S3A to optionally retain directory markers. This adds an option to disable "empty directory" marker deletion, so avoid throttling and other scale problems. This feature is not backwards compatible. Consult the documentation and use with care. Contributed by Steve Loughran. Change-Id: I69a61e7584dc36e485d5e39ff25b1e3e559a1958	2020-08-15 20:19:49 +01:00
Mukund Thakur	571737f4ac	HADOOP-17192. ITestS3AHugeFilesSSECDiskBlock failing (#2221 ) Contributed by Mukund Thakur	2020-08-13 14:33:27 +01:00
Ayush Saxena	2943e6650f	HDFS-15514. Remove useless dfs.webhdfs.enabled. Contributed by Fei Hui.	2020-08-07 22:20:42 +05:30
Mukund Thakur	251d2d1fa5	HADOOP-17131. Refactor S3A Listing code for better isolation. (#2148 ) Contributed by Mukund Thakur. Change-Id: I79160b236a92fdd67565a4b4974f1862e600c210	2020-08-04 17:13:06 +01:00
Sneha Vijayarajan	18ca80331c	HADOOP-17132. ABFS: Fix Rename and Delete Idempotency check trigger - Contributed by Sneha Vijayarajan	2020-07-25 13:13:18 +00:00
ishaniahuja	f24e2ec487	HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver - Contributed by Ishani Ahuja	2020-07-25 13:12:32 +00:00
Mehakmeet Singh	7c9b459786	HADOOP-16961. ABFS: Adding metrics to AbfsInputStream (#2076 ) Contributed by Mehakmeet Singh.	2020-07-25 13:12:09 +00:00
Mehakmeet Singh	bbd3278d09	HADOOP-17065. Add Network Counters to ABFS (#2056 ) Contributed by Mehakmeet Singh.	2020-07-25 13:11:34 +00:00
Karthik Amarnath	8b7e77443d	HDFS-15168: ABFS enhancement to translate AAD to Linux identities. (#1978 )	2020-07-25 13:10:39 +00:00
Sneha Vijayarajan	903935da0f	HADOOP-17053. ABFS: Fix Account-specific OAuth config setting parsing Contributed by Sneha Vijayarajan	2020-07-25 13:10:30 +00:00
Sneha Vijayarajan	869a68b81e	HADOOP-16852: Report read-ahead error back Contributed by Sneha Vijayarajan	2020-07-25 13:10:19 +00:00
Sneha Vijayarajan	27b20f9689	HADOOP-17054. ABFS: Fix test AbfsClient authentication instance Contributed by Sneha Vijayarajan	2020-07-25 13:09:26 +00:00
Sneha Vijayarajan	eed06b46eb	HADOOP-17015. ABFS: Handling Rename and Delete idempotency Contributed by Sneha Vijayarajan.	2020-07-25 13:08:01 +00:00
bilaharith	1ae72d2438	HADOOP-17092. ABFS: Making AzureADAuthenticator.getToken() throw HttpException - Contributed by Bilahari T H Change-Id: Id9576d9509faaf057bf419ccb1879ac0cef7a07b	2020-07-22 18:26:36 +01:00
Ayush Saxena	e3b8d4eb05	HADOOP-17100. Replace Guava Supplier with Java8+ Supplier in Hadoop. Contributed by Ahmed Hussein.	2020-07-22 18:21:14 +05:30
Steve Loughran	5aa9396a58	HADOOP-17107. hadoop-azure parallel tests not working on recent JDKs (#2118 ) Contributed by Steve Loughran. Change-Id: I972264aed36f384b7ae23e214326ef7870261cf5	2020-07-20 10:54:22 +01:00
bilaharith	e01852181a	HADOOP-16682. ABFS: Removing unnecessary toString() invocations - Contributed by Bilahari T H Change-Id: Id55495b44d81533d1d3654de2553c709f505f7eb	2020-07-20 10:53:59 +01:00
Mehakmeet Singh	0d88ed2794	HADOOP-17129. Validating storage keys in ABFS correctly (#2141 ) Contributed by Mehakmeet Singh Change-Id: I8016ee2f9ffbc86ea867f4a3d960b134e507d099	2020-07-16 18:11:52 +01:00
Mukund Thakur	8b601ad7e6	HADOOP-17022. Tune S3AFileSystem.listFiles() API. Contributed by Mukund Thakur. Change-Id: I17f5cfdcd25670ce3ddb62c13378c7e2dc06ba52	2020-07-14 15:28:27 +01:00
Anoop Sam John	cac2fc1f58	HADOOP-16998. WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException (#2073 ) Contributed by Anoop Sam John.	2020-07-14 14:08:46 +01:00
jimmy-zuber-amzn	79fc58def3	HADOOP-17105. S3AFS - Do not attempt to resolve symlinks in globStatus (#2113 ) Contributed by Jimmy Zuber. Change-Id: I2f247c2d2ab4f38214073e55f5cfbaa15aeaeb11	2020-07-13 19:09:50 +01:00
Steve Loughran	a51d72f0c6	HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext. Contributed by Steve Loughran. Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3	2020-07-13 13:32:04 +01:00
Sebastian Nagel	f9619b0b97	HADOOP-17117 Fix typos in hadoop-aws documentation (#2127 ) (cherry picked from commit `5b1ed2113b`)	2020-07-09 00:04:46 +09:00
bilaharith	19fb204011	HADOOP-17086. ABFS: Making the ListStatus response ignore unknown properties. (#2101 ) Contributed by Bilahari T H. Change-Id: I82e4683fba8481aef2abab7a6a99e5752f6fffa9	2020-07-03 19:02:21 +01:00
Steve Loughran	7de1ac0547	HADOOP-16798. S3A Committer thread pool shutdown problems. (#1963 ) Contributed by Steve Loughran. Fixes a condition which can cause job commit to fail if a task was aborted < 60s before the job commit commenced: the task abort will shut down the thread pool with a hard exit after 60s; the job commit POST requests would be scheduled through the same pool, so be interrupted and fail. At present the access is synchronized, but presumably the executor shutdown code is calling wait() and releasing locks. Task abort is triggered from the AM when task attempts succeed but there are still active speculative task attempts running. Thus it only surfaces when speculation is enabled and the final tasks are speculating, which, given they are the stragglers, is not unheard of. Note: this problem has never been seen in production; it has surfaced in the hadoop-aws tests on a heavily overloaded desktop Change-Id: I3b433356d01fcc50d88b4353dbca018484984bc8	2020-06-30 10:52:56 +01:00
Thomas Marquardt	ee192c4826	HADOOP-17089: WASB: Update azure-storage-java SDK Contributed by Thomas Marquardt DETAILS: WASB depends on the Azure Storage Java SDK. There is a concurrency bug in the Azure Storage Java SDK that can cause the results of a list blobs operation to appear empty. This causes the Filesystem listStatus and similar APIs to return empty results. This has been seen in Spark work loads when jobs use more than one executor core. See Azure/azure-storage-java#546 for details on the bug in the Azure Storage SDK. TESTS: A new test was added to validate the fix. All tests are passing: wasb: mvn -T 1C -Dparallel-tests=wasb -Dscale -DtestsThreadCount=8 clean verify Tests run: 248, Failures: 0, Errors: 0, Skipped: 11 Tests run: 651, Failures: 0, Errors: 0, Skipped: 65 abfs: mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 64, Failures: 0, Errors: 0, Skipped: 0 Tests run: 437, Failures: 0, Errors: 0, Skipped: 33 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24	2020-06-25 05:43:32 +00:00

1409 Commits