hadoop

Author	SHA1	Message	Date
Anoop Sam John	177d906a67	HADOOP-17770 WASB : Support disabling buffered reads in positional reads (#3149 )	2021-07-13 10:37:12 +05:30
litao	fef53aacc9	HDFS-16122. Fix DistCpContext#toString() (#3191 ). Contributed by tomscut. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2021-07-10 13:55:11 +05:30
Mukund Thakur	93ad7c32f4	HADOOP-17250 Lot of short reads can be merged with readahead. (#3110 ) Introducing fs.azure.readahead.range parameter which can be set by the user. Data will be populated in buffer for random reads as well which leads to fewer remote calls. This patch also changes the seek implementation to perform a lazy seek. The actual seek is done when a read is initiated and data is not present in the buffer else data is returned from the buffer thus reducing the number of remote storage calls. Contributed By: Mukund Thakur	2021-07-05 15:49:13 +05:30
sumangala-patki	35570e414a	HADOOP-17290. ABFS: Add Identifiers to Client Request Header (#2520 ) Contributed by Sumangala Patki.	2021-07-02 19:13:20 +05:30
Mehakmeet Singh	ea259f236c	HADOOP-17774. S3A bytesRead FS statistic showing twice the correct value (#3144 ) Contributed by: Mehakmeet Singh	2021-07-02 14:03:16 +01:00
Masatake Iwasaki	3788fe52da	HDFS-13916. Distcp SnapshotDiff to support WebHDFS. Contributed by Xun REN. Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>	2021-06-26 21:04:56 +00:00
Zamil Majdy	ed5d10ee48	HADOOP-17764. S3AInputStream read does not re-open the input stream on the second read retry attempt (#3109 ) Contributed by Zamil Majdy.	2021-06-25 20:01:48 +01:00
Steve Loughran	5b7f68ac76	HADOOP-17771. S3AFS creation fails "Unable to find a region via the region provider chain." (#3133 ) This addresses the regression in Hadoop 3.3.1 where if no S3 endpoint is set in fs.s3a.endpoint, S3A filesystem creation may fail on non-EC2 deployments, depending on the local host environment setup. * If fs.s3a.endpoint is empty/null, and fs.s3a.endpoint.region is null, the region is set to "us-east-1". * If fs.s3a.endpoint.region is explicitly set to "" then the client falls back to the SDK region resolution chain; this works on EC2 * Details in troubleshooting.md, including a workaround for Hadoop-3.3.1+ * Also contains some minor restructuring of troubleshooting.md Contributed by Steve Loughran.	2021-06-24 16:37:27 +01:00
Takanobu Asanuma	9e7c7ad129	HADOOP-17760. Delete hadoop.ssl.enabled and dfs.https.enable from docs and core-default.xml (#3099 ) Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>	2021-06-17 09:58:47 +09:00
snehavarma	35e4c31fff	HADOOP-17714 ABFS: testBlobBackCompatibility, testRandomRead & WasbAbfsCompatibility tests fail when triggered with default configs (#3035 )	2021-06-13 23:52:29 +05:30
Anoop Sam John	5970c632d4	HADOOP-17645 Fix test failures in org.apache.hadoop.fs.azure.ITestOutputStreamSemantics. (#2926 )	2021-06-13 23:07:10 +05:30
Petre Bogdan Stolojan	de9ca9f155	HADOOP-17547 Magic committer to downgrade abort in cleanup if list uploads fails with access denied (#3051 ) Contributed by Bogdan Stolojan	2021-06-12 17:45:12 +01:00
Anoop Sam John	2cf952baf4	HADOOP-17643 WASB : Make metadata checks case insensitive (#2972 )	2021-06-12 15:25:03 +05:30
Viraj Jasani	4ef27a596f	HADOOP-17753. Keep restrict-imports-enforcer-rule for Guava Lists in top level hadoop-main pom (#3087 )	2021-06-11 12:15:52 +09:00
snehavarma	4c039fafeb	HADOOP-17715 ABFS: Append blob tests with non HNS accounts fail (#3028 )	2021-06-09 10:54:10 +05:30
Viraj Jasani	00d372b663	HADOOP-17725. Improve error message for token providers in ABFS (#3041 ) Contributed by Viraj Jasani.	2021-06-08 22:03:03 +01:00
Akira Ajisaka	57a3613e5d	HDFS-16050. Some dynamometer tests fail. (#3079 ) Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2021-06-07 14:37:30 +09:00
Viraj Jasani	f4b24c68e7	HADOOP-17743. Replace Guava Lists usage by Hadoop's own Lists in hadoop-common, hadoop-tools and cloud-storage projects (#3072 )	2021-06-07 13:24:09 +09:00
sumangala-patki	76d92eb2a2	HADOOP-17596. ABFS: Change default Readahead Queue Depth from num(processors) to const (#2795 ) . Contributed by Sumangala Patki.	2021-06-03 14:26:15 +05:30
Akira Ajisaka	9983ab8a99	HDFS-16046. TestBalancerProcedureScheduler and TestDistCpProcedure timeouts. (#3060 ) Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>	2021-05-29 23:04:48 +09:00
zhengchenyu	d5ad181684	MAPREDUCE-7287. Distcp will delete exists file , If we use "-delete and -update" options and distcp file. (#2852 ) Contributed by zhengchenyu	2021-05-28 20:21:37 +01:00
Viraj Jasani	986d0a4f1d	HADOOP-17732. Keep restrict-imports-enforcer-rule for Guava Sets in hadoop-main pom (#3049 ) Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2021-05-26 17:14:31 +09:00
Steve Loughran	832a3c6a89	HADOOP-17511. Add audit/telemetry logging to S3A connector (#2807 ) The S3A connector supports "an auditor", a plugin which is invoked at the start of every filesystem API call, and whose issued "audit span" provides a context for all REST operations against the S3 object store. The standard auditor sets the HTTP Referrer header on the requests with information about the API call, such as process ID, operation name, path, and even job ID. If the S3 bucket is configured to log requests, this information will be preserved there and so can be used to analyze and troubleshoot storage IO. Contributed by Steve Loughran.	2021-05-25 10:25:41 +01:00
Mehakmeet Singh	5f400032b6	HADOOP-17705. S3A to add Config to set AWS region (#3020 ) The option `fs.s3a.endpoint.region` can be used to explicitly set the AWS region of a bucket. This is needed when using AWS Private Link, as the region cannot be automatically determined. Contributed by Mehakmeet Singh	2021-05-24 13:08:45 +01:00
Mehakmeet Singh	c665ab02ed	HADOOP-17670. S3AFS and ABFS to log IOStats at DEBUG mode or optionally at INFO level in close() (#2963 ) When the S3A and ABFS filesystems are closed, their IOStatistics are logged at debug in the log: org.apache.hadoop.fs.statistics.IOStatisticsLogging Set `fs.iostatistics.logging.level` to `info` for the statistics to be logged at info. (also: `warn` or `error` for even higher log levels). Contributed by: Mehakmeet Singh	2021-05-24 13:02:11 +01:00
Viraj Jasani	e4062ad027	HADOOP-17115. Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools (#2985 ) Signed-off-by: Sean Busbey <busbey@apache.org>	2021-05-20 10:47:04 -05:00
Takanobu Asanuma	207210263a	HADOOP-17375. Fix the error of TestDynamometerInfra. (#2471 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-05-07 13:52:17 +09:00
Steve Loughran	68425eb469	HADOOP-16742. NullPointerException in S3A MultiObjectDeleteSupport Contributed by Tor Arvid Lund. Change-Id: Iadfe9b2f355cf373031075bfbe681705a2c65bdc	2021-05-04 11:23:01 +01:00
bilaharith	f54e7646cf	HADOOP-17536. ABFS: Supporting customer provided encryption key (#2707 ) Contributed by bilahari t h	2021-04-27 11:15:52 +01:00
Steve Loughran	88a550bc3a	HADOOP-17112. S3A committers can't handle whitespace in paths. (#2953 ) Contributed by Krzysztof Adamski.	2021-04-25 18:33:55 +01:00
Steve Loughran	027c8fb257	HADOOP-17597. Optionally downgrade on S3A Syncable calls (#2801 ) Followup to HADOOP-13327, which changed S3A output stream hsync/hflush calls to raise an exception. Adds a new option fs.s3a.downgrade.syncable.exceptions When true, calls to Syncable hsync/hflush on S3A output streams will log once at warn (for entire process life, not just the stream), then increment IOStats with the relevant operation counter With the downgrade option false (default) * IOStats are incremented * The UnsupportedOperationException current raised includes a link to the JIRA. Contributed by Steve Loughran.	2021-04-23 18:44:41 +01:00
Ayush Saxena	6800b21e3b	HADOOP-17620. DistCp: Use Iterator for listing target directory as well. (#2861 ). Contributed by Ayush Saxena. Signed-off-by: Vinayakumar B <vinayakumarb@apache.org>	2021-04-23 22:48:15 +05:30
Mehakmeet Singh	6085f09db5	HADOOP-17471. ABFS to collect IOStatistics (#2731 ) The ABFS Filesystem and its input and output streams now implement the IOStatisticSource interface and provide IOStatistics on their interactions with Azure Storage. This includes the min/max/mean durations of all REST API calls. Contributed by Mehakmeet Singh <mehakmeet.singh@cloudera.com>	2021-04-23 10:28:31 +01:00
Steve Loughran	5221322b96	HADOOP-17535. ABFS: ITestAzureBlobFileSystemCheckAccess test failure if no oauth key. (#2920 ) Contributed by Steve Loughran.	2021-04-21 16:06:06 +01:00
Steve Loughran	2dd1e04010	HADOOP-17641. ITestWasbUriAndConfiguration failing. (#2937 ) This moves the mock account name --which is required to never exist-- from "mockAccount" to an account name containing a static UUID. Contributed by Steve Loughran.	2021-04-20 15:32:01 +01:00
billierinaldi	c1fde4fe94	HADOOP-16948. Support infinite lease dirs. (#1925 ) * HADOOP-16948. Support single writer dirs. * HADOOP-16948. Fix findbugs and checkstyle problems. * HADOOP-16948. Fix remaining checkstyle problems. * HADOOP-16948. Add DurationInfo, retry policy for acquiring lease, and javadocs * HADOOP-16948. Convert ABFS client to use an executor for lease ops * HADOOP-16948. Fix ABFS lease test for non-HNS * HADOOP-16948. Fix checkstyle and javadoc * HADOOP-16948. Address review comments * HADOOP-16948. Use daemon threads for ABFS lease ops * HADOOP-16948. Make lease duration configurable * HADOOP-16948. Add error messages to test assertions * HADOOP-16948. Remove extra isSingleWriterKey call * HADOOP-16948. Use only infinite lease duration due to cost of renewal ops * HADOOP-16948. Remove acquire/renew/release lease methods * HADOOP-16948. Rename single writer dirs to infinite lease dirs * HADOOP-16948. Fix checkstyle * HADOOP-16948. Wait for acquire lease future * HADOOP-16948. Add unit test for acquire lease failure	2021-04-12 19:47:59 -04:00
sumangala-patki	6f640abbaf	HADOOP-17576. ABFS: Disable throttling update for auth failures (#2761 ) Contributed by Sumangala Patki	2021-04-09 09:31:23 +05:30
Viraj Jasani	3f2682b92b	HADOOP-17622. Avoid usage of deprecated IOUtils#cleanup API. (#2862 ) Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2021-04-06 13:39:10 +09:00
Steve Loughran	85d3bba555	HADOOP-17476. ITestAssumeRole.testAssumeRoleBadInnerAuth failure. (#2777 ) Contributed by Steve Loughran.	2021-03-24 16:47:55 +00:00
Steve Loughran	04880f076d	HADOOP-13551. AWS metrics wire-up (#2778 ) Moves to the builder API for AWS S3 client creation, and offers a similar style of API to the S3A FileSystem and tests, hiding the details of which options are client, which are in AWS Conf, and doing the wiring up of S3A statistics interfaces to the AWS SDK internals. S3A Statistics, including IOStatistics, should now count throttling events handled in the AWS SDK itself. This patch restores endpoint determination by probes to US-East-1 if the client isn't configured with fs.s3a.endpoint. Explicitly setting the endpoint will save the cost of these probe HTTP requests. Contributed by Steve Loughran.	2021-03-24 13:32:54 +00:00
Ayush Saxena	03cfc85279	HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2732 ). Contributed by Ayush Saxena. Signed-off-by: Steve Loughran <stevel@apache.org>	2021-03-24 02:36:26 +05:30
Jack Jiang	d8ec8ab965	HADOOP-17599. Remove NULL checks before instanceof (#2804 )	2021-03-23 08:46:11 -07:00
Ayush Saxena	4781761dc2	HADOOP-17594. DistCp: Expose the JobId for applications executing through run method (#2786 ). Contributed by Ayush Saxena. Signed-off-by: Mingliang Liu <liuml07@apache.org> Signed-off-by: Steve Loughran <stevel@apache.org>	2021-03-19 14:19:49 +05:30
Chao Sun	9b2f812996	HADOOP-17532. Yarn Job execution get failed when LZ4 Compression Codec is used. Contributed by Bhavik Patel.	2021-03-14 21:15:08 -07:00
sumangala-patki	fe633d4739	HADOOP-17548. ABFS: Toggle Store Mkdirs request overwrite parameter (#2729 ) Contributed by Sumangala Patki.	2021-03-14 13:35:02 +05:30
Steve Loughran	bcd9c67082	HADOOP-16721. Improve S3A rename resilience (#2742 ) The S3A connector's rename() operation now raises FileNotFoundException if the source doesn't exist; a FileAlreadyExistsException if the destination exists and is unsuitable for the source file/directory. When renaming to a path which does not exist, the connector no longer checks for the destination parent directory existing -instead it simply verifies that there is no file immediately above the destination path. This is needed to avoid race conditions with delete() and rename() calls working on adjacent subdirectories. Contributed by Steve Loughran.	2021-03-11 12:47:39 +00:00
Akira Ajisaka	23b343aed1	HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2753 ) Removed findbugs from the hadoop build images and added spotbugs instead. Upgraded SpotBugs to 4.2.2 and spotbugs-maven-plugin to 4.2.0. Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>	2021-03-11 10:56:07 +09:00
Pierrick Hymbert	ebfba0b6fa	[HADOOP-17567] typo in MagicCommitTracker (#2749 ) Contributed by Pierrick Hymbert	2021-03-10 15:39:55 +00:00
Chao Sun	176bd88890	HADOOP-16080. hadoop-aws does not work with hadoop-client-api. (#2522 ) Contributed by Chao Sun. (Cherry-picked via PR #2575)	2021-03-09 20:01:29 +00:00
Peter Bacsko	c3aa413ee3	YARN-10679. Better logging of uncaught exceptions throughout SLS. Contributed by Szilard Nemeth.	2021-03-09 14:02:12 +01:00
Peter Bacsko	099f58f8f4	YARN-10681. Fix assertion failure message in BaseSLSRunnerTest. Contributed by Szilard Nemeth.	2021-03-09 13:22:48 +01:00
Peter Bacsko	7f522c92fa	YARN-10677. Logger of SLSFairScheduler is provided with the wrong class. Contributed by Szilard Nemeth.	2021-03-09 12:53:32 +01:00
Peter Bacsko	ea90cd3556	YARN-10678. Try blocks without catch blocks in SLS scheduler classes can swallow other exceptions. Contributed by Szilard Nemeth.	2021-03-09 12:03:53 +01:00
Ahmed Hussein	e04bcb3a06	MAPREDUCE-7320. organize test directories for ClusterMapReduceTestCase (#2722 ). Contributed by Ahmed Hussein	2021-02-26 13:42:33 -06:00
sumangala-patki	7f64030314	HADOOP-17537. ABFS: Correct assertion reversed in HADOOP-13327 Contributed by Sumangala Patki.	2021-02-22 11:45:58 +00:00
Akira Ajisaka	9a298d180d	Revert "HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2454 )" This reverts commit `4cf3531583`.	2021-02-19 11:09:10 +09:00
Akira Ajisaka	4cf3531583	HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2454 ) Use spotbugs instead of findbugs. Removed findbugs from the hadoop build images, and added spotbugs in the images instead. Reviewed-by: Masatake Iwasaki <iwasakims@apache.org> Reviewed-by: Inigo Goiri <inigoiri@apache.org> Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>	2021-02-17 10:38:20 +09:00
Anoop Sam John	1bb4101b59	HADOOP-17038 Support disabling buffered reads in ABFS positional reads. (#2646 ) - Contributed by @anoopsjohn	2021-02-16 22:27:52 +05:30
Steve Loughran	78905d7e3f	HADOOP-16906. Abortable (#2684 ) Adds an Abortable.abort() interface for streams to enable output streams to be terminated; this is implemented by the S3A connector's output stream. It allows for commit protocols to be implemented which commit/abort work by writing to the final destination and using the abort() call to cancel any write which is not intended to be committed. Consult the specification document for information about the interface and its use. Contributed by Jungtaek Lim and Steve Loughran.	2021-02-11 17:37:20 +00:00
Steve Loughran	798df6d699	HADOOP-13327 Output Stream Specification. (#2587 ) This defines what output streams and especially those which implement Syncable are meant to do, and documents where implementations (HDFS; S3) don't. With tests. The file:// FileSystem now supports Syncable if an application calls FileSystem.setWriteChecksum(false) before creating a file -checksumming and Syncable.hsync() are incompatible. Contributed by Steve Loughran.	2021-02-10 10:28:59 +00:00
bilaharith	5f34271bb1	HADOOP-17475. ABFS : add high performance listStatusIterator (#2548 ) The ABFS connector now implements listStatusIterator() with asynchronous prefetching of the next page(s) of results. For listing large directories this can provide tangible speedups. If for any reason this needs to be disabled, set fs.azure.enable.abfslistiterator to false. Contributed by Bilahari T H.	2021-02-04 13:36:19 +00:00
Steve Loughran	26b9d480e8	HADOOP-17337. S3A NetworkBinding has a runtime dependency on shaded httpclient. (#2599 ) Contributed by Steve Loughran.	2021-02-03 14:29:56 +00:00
Steve Loughran	f37bf65199	HADOOP-15710. ABFS checkException to map 403 to AccessDeniedException. (#2648 ) When 403 is returned from an ABFS HTTP call, an AccessDeniedException is raised. The exception text is unchanged, for any application string matching on the getMessage() contents. Contributed by Steve Loughran.	2021-02-02 18:13:41 +00:00
Steve Loughran	0bb52a42e5	HADOOP-17483. Magic committer is enabled by default. (#2656 ) * core-default.xml updated so that fs.s3a.committer.magic.enabled = true * CommitConstants updated to match * All tests which previously enabled the magic committer now rely on default settings. This helps make sure it is enabled. * Docs cover the switch, mention it's enabled and explain why you may want to disable it. Note: this doesn't switch to using the committer -it just enables the path rewriting magic which it depends on. Contributed by Steve Loughran.	2021-01-27 19:04:22 +00:00
Steve Loughran	28cc912a5c	HADOOP-17493. Revert name of DELEGATION_TOKENS_ISSUED constant/statistic (#2649 ) Follow-on to HADOOP-16830/HADOOP-17271. Contributed by Steve Loughran.	2021-01-27 16:39:29 +00:00
Steve Loughran	80c7404b51	HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark (#2530 ) This needs SPARK-33739 in the matching spark branch in order to work Contributed by Steve Loughran.	2021-01-26 19:30:51 +00:00
Ayush Saxena	e40f99f6d5	HDFS-15767. RBF: Router federation rename of directory. Contributed by Jinglun.	2021-01-26 14:25:27 +05:30
Steve Loughran	06a5d3437f	HADOOP-17480. Document that AWS S3 is consistent and that S3Guard is not needed (#2636 ) Contributed by Steve Loughran.	2021-01-25 13:21:34 +00:00
Maksim Bober	e2f8503ebd	HADOOP-17484. Typo in hadop-aws index.md (#2634 ) Contributed by Maksim Bober.	2021-01-21 17:30:58 +00:00
Steve Loughran	68bc721841	HADOOP-17433. Skipping network I/O in S3A getFileStatus(/) breaks ITestAssumeRole. (#2600 ) Contributed by Steve Loughran.	2021-01-19 17:19:27 +00:00
Szilard Nemeth	6cd540e964	YARN-7200. SLS generates a realtimetrack.json file but that file is missing the closing ']'. Contributed by Agshin Kazimli	2021-01-15 22:32:30 +01:00
Steve Loughran	724edb0354	HADOOP-17451. IOStatistics test failures in S3A code. (#2594 ) Caused by HADOOP-16830 and HADOOP-17271. Fixes tests which fail intermittently based on configs and in the case of the HugeFile tests, bulk runs with existing FS instances meant statistic probes sometimes ended up probing those of a previous FS. Contributed by Steve Loughran. Change-Id: I65ba3f44444e59d298df25ac5c8dc5a8781dfb7d	2021-01-12 17:30:32 +00:00
Steve Loughran	05c9c2ed02	Revert "HADOOP-17451. IOStatistics test failures in S3A code. (#2594 )" This reverts commit `d3014e01f3`. (fixing commit text before it is frozen)	2021-01-12 17:29:59 +00:00
Steve Loughran	d3014e01f3	HADOOP-17451. IOStatistics test failures in S3A code. (#2594 ) Caused by HADOOP-16380 and HADOOP-17271. Fixes tests which fail intermittently based on configs and in the case of the HugeFile tests, bulk runs with existing FS instances meant statistic probes sometimes ended up probing those of a previous FS. Contributed by Steve Loughran.	2021-01-12 17:25:14 +00:00
Mehakmeet Singh	0a6ddfa145	HADOOP-17272. ABFS Streams to support IOStatistics API (#2604 ) Contributed by Mehakmeet Singh.	2021-01-12 15:48:09 +00:00
bilaharith	612330661b	HADOOP-17459. ADLS Gen1: Fixes for rename contract tests #2607 Contributed by Bilaharith	2021-01-12 14:00:48 +00:00
Sneha Vijayarajan	b612c310c2	HADOOP-17404. ABFS: Small write - Merge append and flush - Contributed by Sneha Vijayarajan	2021-01-06 10:43:37 -08:00
bilaharith	d21c1c6576	HADOOP-17444. ADLS Gen1: Update adls SDK to 2.3.9 (#2551 ) Contributed by bilaharith	2021-01-06 14:32:13 +00:00
Gabor Bota	42eb9ff68e	HADOOP-17454. [s3a] Disable bucket existence check - set fs.s3a.bucket.probe to 0 (#2593 ) Also fixes HADOOP-16995. ITestS3AConfiguration proxy tests failures when bucket probes == 0 The improvement should include the fix, because the test would fail by default otherwise. Change-Id: I9a7e4b5e6d4391ebba096c15e84461c038a2ec59	2021-01-05 15:43:01 +01:00
Ayush Saxena	77299ae992	HDFS-15748. RBF: Move the router related part from hadoop-federation-balance module to hadoop-hdfs-rbf. Contributed by Jinglun.	2021-01-05 00:05:03 +05:30
bilaharith	1448add08f	HADOOP-17347. ABFS: Read optimizations - Contributed by Bilahari T H	2021-01-02 10:37:10 -08:00
Sneha Vijayarajan	5ca1ea89b3	HADOOP-17407. ABFS: Fix NPE on delete idempotency flow - Contributed by Sneha Vijayarajan	2021-01-02 10:22:10 -08:00
Steve Loughran	617af28e80	HADOOP-17271. S3A connector to support IOStatistics. (#2580 ) S3A connector to support the IOStatistics API of HADOOP-16830, This is a major rework of the S3A Statistics collection to * Embrace the IOStatistics APIs * Move from direct references of S3AInstrumention statistics collectors to interface/implementation classes in new packages. * Ubiquitous support of IOStatistics, including: S3AFileSystem, input and output streams, RemoteIterator instances provided in list calls. * Adoption of new statistic names from hadoop-common Regarding statistic collection, as well as all existing statistics, the connector now records min/max/mean durations of HTTP GET and HEAD requests, and those of LIST operations. Contributed by Steve Loughran.	2020-12-31 21:55:39 +00:00
Sumangala	a35fc3871b	HADOOP-17422: ABFS: Set default ListMaxResults to max server limit (#2535 ) Contributed by Sumangala Patki TEST RESULTS: namespace.enabled=true auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 90, Failures: 0, Errors: 0, Skipped: 0 Tests run: 462, Failures: 0, Errors: 0, Skipped: 24 Tests run: 208, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 90, Failures: 0, Errors: 0, Skipped: 0 Tests run: 462, Failures: 0, Errors: 0, Skipped: 70 Tests run: 208, Failures: 0, Errors: 0, Skipped: 141	2020-12-21 06:40:36 +00:00
yzhangal	3d2193cd64	HADOOP-17338. Intermittent S3AInputStream failures: Premature end of Content-Length delimited message body etc (#2497 ) Yongjun Zhang <yongjunzhang@pinterest.com>	2020-12-18 19:08:10 +00:00
bilaharith	4c033bafa0	HADOOP-17191. ABFS: Run the tests with various combinations of configurations and publish a consolidated results - Contributed by Bilahari T H	2020-12-16 10:34:59 -08:00
Sneha Vijayarajan	5bf977e6b1	Hadoop-17413. Release elastic byte buffer pool at close - Contributed by Sneha Vijayarajan	2020-12-14 20:45:37 -08:00
Ankit Kumar	aaf9e3d320	YARN-10491. Fix deprecation warnings in SLSWebApp.java (#2519 ) Signed-off-by: Akira Ajisaka <ajisaka@apache.org>	2020-12-09 10:52:31 +09:00
Thomas Marquardt	717b835068	HADOOP-17397: ABFS: SAS Test updates for version and permission update DETAILS: The previous commit for HADOOP-17397 was not the correct fix. DelegationSASGenerator.getDelegationSAS should return sp=p for the set-permission and set-acl operations. The tests have also been updated as follows: 1. When saoid and suoid are not specified, skoid must have an RBAC role assignment which grants Microsoft.Storage/storageAccounts/blobServices/containers/blobs/modifyPermissions/action and sp=p to set permissions or set ACL. 2. When saoid or suoid is specified, same as 1) but furthermore the saoid or suoid must be an owner of the file or directory in order for the operation to succeed. 3. When saoid or suoid is specified, the ownership check is bypassed by also including 'o' (ownership) in the SAS permission (for example, sp=op). Note that 'o' grants the saoid or suoid the ability to change the file or directory owner to themself, and they can also change the owning group. Generally speaking, if a trusted authorizer would like to give a user the ability to change the permissions or ACL, then that user should be the file or directory owner. TEST RESULTS: namespace.enabled=true auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 90, Failures: 0, Errors: 0, Skipped: 0 Tests run: 462, Failures: 0, Errors: 0, Skipped: 24 Tests run: 208, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 90, Failures: 0, Errors: 0, Skipped: 0 Tests run: 462, Failures: 0, Errors: 0, Skipped: 70 Tests run: 208, Failures: 0, Errors: 0, Skipped: 141	2020-12-03 13:11:17 +00:00
Sneha Vijayarajan	142941b96e	HADOOP-17296. ABFS: Force reads to be always of buffer size. Contributed by Sneha Vijayarajan.	2020-11-27 14:22:34 +00:00
Mukund Thakur	03b4e98971	HADOOP-17398. Skipping network I/O in S3A getFileStatus(/) breaks some tests (#2493 ) Follow-on to HADOOP-17323. Contributed by Mukund Thakur.	2020-11-26 20:25:32 +00:00
Steve Loughran	67dc0928c1	HADOOP-17385. ITestS3ADeleteCost.testDirMarkersFileCreation failure (#2473 ). Contributed by Steve Loughran The addition of deprecated S3A configuration options in HADOOP-17318 triggered a reload of default (xml resource) configurations, which breaks tests which fail if there's a per-bucket setting inconsistent with test setup. Creating an S3AFS instance before creating the Configuration() instance for test runs gets that reload out the way before test setup takes place. Along with the fix, extra changes in the failing test suite to fail fast when marker policy isn't as expected, and to log FS state better. Rather than create and discard an instance, add a new static method to S3AFS and invoke it in test setup. This forces the load Change-Id: Id52b1c46912c6fedd2ae270e2b1eb2222a360329	2020-11-26 13:50:33 +01:00
Sneha Vijayarajan	cf43a7eaae	HADOOP-17397. ABFS: SAS Test updates for version and permission update (#2492 ) Contributed by Sneha Vijayarajan.	2020-11-26 10:21:01 +00:00
Sneha Vijayarajan	009ce4f02a	HADOOP-17396. ABFS: testRenameFileOverExistingFile fails (#2491 ) Contributed by Sneha Vijayarajan.	2020-11-26 10:11:25 +00:00
Steve Loughran	ac7045b75f	HADOOP-17313. FileSystem.get to support slow-to-instantiate FS clients. (#2396 ) This adds a semaphore to throttle the number of FileSystem instances which can be created simultaneously, set in "fs.creation.parallel.count". This is designed to reduce the impact of many threads in an application calling FileSystem.get() on a filesystem which takes time to instantiate -for example to an object where HTTPS connections are set up during initialization. Many threads trying to do this may create spurious delays by conflicting for access to synchronized blocks, when simply limiting the parallelism diminishes the conflict, so speeds up all threads trying to access the store. The default value, 64, is larger than is likely to deliver any speedup -but it does mean that there should be no adverse effects from the change. If a service appears to be blocking on all threads initializing connections to abfs, s3a or store, try a smaller (possibly significantly smaller) value. Contributed by Steve Loughran.	2020-11-25 14:31:02 +00:00
bilaharith	3193d8c793	HADOOP-17311. ABFS: Logs should redact SAS signature (#2422 ) Contributed by bilaharith.	2020-11-25 14:22:10 +00:00
Mukund Thakur	5fee95076b	HADOOP-17323. S3A getFileStatus("/") to skip IO (#2479 ) Contributed by Mukund Thakur.	2020-11-24 11:06:56 +00:00
Steve Loughran	9b4faf2b51	HADOOP-17332. S3A MarkerTool -min and -max are inverted. (#2425 ) This patch * fixes the inversion * adds a precondition check * if the commands are supplied inverted, swaps them with a warning. This is to stop breaking any tests written to cope with the existing behavior. Contributed by Steve Loughran	2020-11-23 20:49:42 +00:00
Steve Loughran	07b7d07388	HADOOP-17325. WASB Test Failures Contributed by Ayush Saxena and Steve Loughran Change-Id: I4bb76815bc1d11d1804dc67bafde68b6a995b974	2020-11-23 17:22:13 +00:00
Steve Loughran	fb79be932c	HADOOP-17343. Upgrade AWS SDK to 1.11.901 (#2468 ) Contributed by Steve Loughran.	2020-11-23 14:08:12 +00:00
Jungtaek Lim	f3c629c27e	HADOOP-17388. AbstractS3ATokenIdentifier to issue date in UTC. (#2477 ) Followup to HADOOP-17379. Contributed by Jungtaek Lim.	2020-11-20 10:38:42 +00:00
Ahmed Hussein	07050339e0	HADOOP-17367. Add InetAddress api to ProxyUsers.authorize (#2449 ). Contributed by Daryn Sharp and Ahmed Hussein	2020-11-19 14:37:14 -06:00
Steve Loughran	ce7827c82a	HADOOP-17318. Support concurrent S3A commit jobs with same app attempt ID. (#2399 ) See also [SPARK-33402]: Jobs launched in same second have duplicate MapReduce JobIDs Contributed by Steve Loughran. Change-Id: Iae65333cddc84692997aae5d902ad8765b45772a	2020-11-18 13:34:51 +00:00
Steve Loughran	e3c08f285a	HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2310 ) This fixes the S3Guard/Directory Marker Retention integration so that when fs.s3a.directory.marker.retention=keep, failures during multipart delete are handled correctly, as are incremental deletes during directory tree operations. In both cases, when a directory marker with children is deleted from S3, the directory entry in S3Guard is not deleted, because it is still critical to representing the structure of the store. Contributed by Steve Loughran. Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f	2020-11-18 12:18:11 +00:00
Jungtaek Lim	a7b923c80c	HADOOP-17379. AbstractS3ATokenIdentifier to set issue date == now. (#2466 ) Unless you explicitly set it, the issue date of a delegation token identifier is 0, which confuses spark renewal (SPARK-33440). This patch makes sure that all S3A DT identifiers have the current time as issue date, fixing the problem as far as S3A tokens are concerned. Contributed by Jungtaek Lim.	2020-11-17 14:43:29 +00:00
Doroszlai, Attila	dd85a90da6	HADOOP-17376. ITestS3AContractRename failing against stricter tests. (#2462 ) Contributed by Attila Doroszlai.	2020-11-16 11:24:00 +00:00
jianghuazhu	375900049c	HDFS-15608.Reset the DistCp#CLEANUP variable definition. (#2351 ). Contributed by JiangHua Zhu. Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>	2020-11-10 13:02:29 -08:00
Eric E Payne	0461a07c01	YARN-10475: Scale RM-NM heartbeat interval based on node utilization. Contributed by Jim Brennan (Jim_Brennan).	2020-11-02 16:55:06 +00:00
Yiqun Lin	15a5f53673	HDFS-15640. Add diff threshold to FedBalance. Contributed by Jinglun.	2020-10-27 10:41:10 +08:00
Anoop Sam John	7bdf165f62	HADOOP-17308. WASB PageBlobOutputStream.flush succeeds even when flush to storage fails (#2392 ) Contributed by Anoop Sam John.	2020-10-23 10:51:19 +01:00
Mukund Thakur	7f8ef76c48	HADOOP-17305. Fix ITestCustomSigner to work with s3 compatible endpoints (#2395 ) Contributed by Mukund Thakur	2020-10-21 13:01:13 +01:00
Aryan Gupta	d60d5fe43d	HADOOP-17302. Upgrade to jQuery 3.5.1 in hadoop-sls. (#2379 ) Co-authored-by: Aryan Gupta	2020-10-19 18:18:46 +05:30
Ayush Saxena	1e3a6efcef	HADOOP-17288. Use shaded guava from thirdparty. (#2342 ). Contributed by Ayush Saxena.	2020-10-17 12:01:18 +05:30
Adam Antal	bd8cf7fd4c	YARN-10448. SLS should set default user to handle SYNTH format. Contributed by zhuqi	2020-10-13 17:54:15 +02:00
Sneha Vijayarajan	c4fff74cc5	HADOOP-17301. ABFS: read-ahead error reporting breaks buffer management (#2369 ) Fixes read-ahead buffer management issues introduced by HADOOP-16852, "ABFS: Send error back to client for Read Ahead request failure". Contributed by Sneha Vijayarajan	2020-10-13 16:30:34 +01:00
Dongjoon Hyun	b92f72758b	HADOOP-17258. Magic S3Guard Committer to overwrite existing pendingSet file on task commit (#2371 ) Contributed by Dongjoon Hyun and Steve Loughran Change-Id: Ibaf8082e60eff5298ff4e6513edc386c5bae0274	2020-10-12 13:39:15 +01:00
Steve Loughran	f83e07a20f	HADOOP-17293. S3A to always probe S3 in S3A getFileStatus on non-auth paths This reverts changes in HADOOP-13230 to use S3Guard TTL in choosing when to issue a HEAD request; fixing tests to compensate. New org.apache.hadoop.fs.s3a.performance.OperationCost cost, S3GUARD_NONAUTH_FILE_STATUS_PROBE for use in cost tests. Contributed by Steve Loughran. Change-Id: I418d55d2d2562a48b2a14ec7dee369db49b4e29e	2020-10-08 15:35:57 +01:00
Mukund Thakur	82522d60fb	HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem (#2354 ) Contains HADOOP-17300: FileSystem.DirListingIterator.next() call should return NoSuchElementException Contributed by Mukund Thakur	2020-10-07 13:59:06 +01:00
Ikko Ashimine	4347a5c955	HADOOP-17294. Fix typos existance to existence (#2357 )	2020-10-06 10:10:44 +09:00
Arpit Agarwal	18fa4397e6	MAPREDUCE-7298. Distcp doesn't close the job after the job is completed. Contributed by Aasha Medhi. Change-Id: I63d249bbb18ccedaeee9f10123a78e32f9e54ed2	2020-10-02 08:29:55 -07:00
bilaharith	51598d8b1b	HADOOP-17183. ABFS: Enabling checkaccess on ABFS (#2331 ) Contributed by Bilahari TH	2020-10-01 21:29:05 +01:00
Sneha Vijayarajan	c3a90dd918	HADOOP-17279: ABFS: testNegativeScenariosForCreateOverwriteDisabled fails for non-HNS account. Contributed by Sneha Vijayarajan Testing: namespace.enabled=false auth.type=SharedKey $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 246 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=SharedKey $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 33 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 74 Tests run: 207, Failures: 0, Errors: 0, Skipped: 140	2020-09-23 15:59:00 +00:00
Steve Loughran	7fae4133e0	HADOOP-17261. s3a rename() needs s3:deleteObjectVersion permission (#2303 ) Contributed by Steve Loughran.	2020-09-22 17:22:04 +01:00
Mukund Thakur	83c7c2b4c4	HADOOP-17023. Tune S3AFileSystem.listStatus() (#2257 ) S3AFileSystem.listStatus() is optimized for invocations where the path supplied is a non-empty directory. The number of S3 requests is significantly reduced, saving time, money, and reducing the risk of S3 throttling. Contributed by Mukund Thakur.	2020-09-21 17:20:16 +01:00
Sneha Vijayarajan	e31a636e92	HADOOP-17215: Support for conditional overwrite. Contributed by Sneha Vijayarajan DETAILS: This change adds config key "fs.azure.enable.conditional.create.overwrite" with a default of true. When enabled, if create(path, overwrite: true) is invoked and the file exists, the ABFS driver will first obtain its etag and then attempt to overwrite the file on the condition that the etag matches. The purpose of this is to mitigate the non-idempotency of this method. Specifically, in the event of a network error or similar, the client will retry and this can result in the file being created more than once which may result in data loss. In essence, this is like a poor man's file handle, and will be addressed more thoroughly in the future when support for lease is added to ABFS. TEST RESULTS: namespace.enabled=true auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 42 Tests run: 207, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 87, Failures: 0, Errors: 0, Skipped: 0 Tests run: 457, Failures: 0, Errors: 0, Skipped: 74 Tests run: 207, Failures: 0, Errors: 0, Skipped: 140	2020-09-19 01:28:44 +00:00
ThomasMarquardt	0dc54d0247	HADOOP-17203: Revert HADOOP-17183. ABFS: Enabling checkaccess on ABFS This reverts commit `a2610e21ed`.	2020-09-18 17:52:11 -07:00
Steve Loughran	958cab804e	Revert "HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280 )" This reverts commit `9960c01a25`. Change-Id: I820534c3292f2a343693d835f625488c325fb5d6	2020-09-11 18:07:49 +01:00
Steve Loughran	9960c01a25	HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280 ) This changes directory tree deletion so that only files are incrementally deleted from S3Guard after the objects are deleted; the directories are left alone until metadataStore.deleteSubtree(path) is invoked. This avoids directory tombstones being added above files/child directories, which stop the treewalk and delete phase from working. Also: * Callback to delete objects splits files and dirs so that any problems deleting the dirs don't trigger s3guard updates * New statistic to measure # of objects deleted, alongside request count. * Callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers to clarify behavior. * Test enhancements to replicate the failure and verify the fix Contributed by Steve Loughran	2020-09-10 17:03:52 +01:00
bilaharith	85119267be	HADOOP-17166. ABFS: configure output stream thread pool (#2179 ) Adds the options to control the size of the per-output-stream threadpool when writing data through the abfs connector * fs.azure.write.max.concurrent.requests * fs.azure.write.max.requests.to.queue Contributed by Bilahari T H	2020-09-09 16:41:36 +01:00
Mehakmeet Singh	0d855159f0	HADOOP-17229. No updation of bytes received counter value after response failure occurs in ABFS (#2264 ) Contributed by Mehakmeet Singh	2020-09-08 10:14:23 +01:00
Mehakmeet Singh	84ed6adccc	HADOOP-17158. Test timeout for ITestAbfsInputStreamStatistics#testReadAheadCounters (#2272 ) Contributed by: Mehakmeet Singh.	2020-09-08 10:11:06 +01:00
Steve Loughran	5346cc3263	HADOOP-17227. S3A Marker Tool tuning (#2254 ) Contributed by Steve Loughran.	2020-09-04 14:58:03 +01:00
Mukund Thakur	139a43e98e	HADOOP-17167 ITestS3AEncryptionWithDefaultS3Settings failing (#2187 ) Now skips ITestS3AEncryptionWithDefaultS3Settings.testEncryptionOverRename when server side encryption is not set to sse:kms Contributed by Mukund Thakur	2020-09-03 19:35:24 +01:00
Mehakmeet Singh	d1c60a53f6	HADOOP-17194. Adding Context class for AbfsClient in ABFS (#2216 ) Contributed by Mehakmeet Singh.	2020-08-27 11:27:00 +01:00
Mukund Thakur	cc641534dc	HADOOP-17074. S3A Listing to be fully asynchronous. (#2207 ) Contributed by Mukund Thakur.	2020-08-25 11:29:43 +01:00
bilaharith	64f36b9543	HADOOP-16915. ABFS: Ignoring the test ITestAzureBlobFileSystemRandomRead.testRandomReadPerformance - Contributed by Bilahari T H	2020-08-24 12:00:55 -07:00
swamirishi	872c2909bd	HADOOP-17122: Preserving Directory Attributes in DistCp with Atomic Copy (#2133 ) Contributed by Swaminathan Balachandran	2020-08-22 18:48:21 +01:00
Sneha Vijayarajan	b367942fe4	Upgrade store REST API version to 2019-12-12 - Contributed by Sneha Vijayarajan	2020-08-17 10:17:18 -07:00
Steve Loughran	5092ea62ec	HADOOP-13230. S3A to optionally retain directory markers. This adds an option to disable "empty directory" marker deletion, so avoid throttling and other scale problems. This feature is not backwards compatible. Consult the documentation and use with care. Contributed by Steve Loughran. Change-Id: I69a61e7584dc36e485d5e39ff25b1e3e559a1958	2020-08-15 12:51:08 +01:00
Mukund Thakur	4a400d3193	HADOOP-17192. ITestS3AHugeFilesSSECDiskBlock failing (#2221 ) Contributed by Mukund Thakur	2020-08-13 14:21:49 +01:00
Ayush Saxena	975b6024dd	HDFS-15514. Remove useless dfs.webhdfs.enabled. Contributed by Fei Hui.	2020-08-07 22:19:17 +05:30
bilaharith	a2610e21ed	HADOOP-17183. ABFS: Enabling checkaccess on ABFS - Contributed by Bilahari T H	2020-08-06 14:52:02 -07:00
bilaharith	3f73facd7b	HADOOP-17149. ABFS: Fixing the testcase ITestGetNameSpaceEnabled - Contributed by Bilahari T H	2020-08-05 10:01:04 -07:00
bilaharith	c566cabd62	HADOOP-17163. ABFS: Adding debug log for rename failures - Contributed by Bilahari T H	2020-08-05 09:38:13 -07:00
Mukund Thakur	ac697571a1	HADOOP-17186. Fixing javadoc in ListingOperationCallbacks (#2196 )	2020-08-05 20:40:49 +09:00
Mukund Thakur	8fd4f5490f	HADOOP-17131. Refactor S3A Listing code for better isolation. (#2148 ) Contributed by Mukund Thakur.	2020-08-04 16:00:02 +01:00
Akira Ajisaka	c40cbc57fa	HADOOP-17091. [JDK11] Fix Javadoc errors (#2098 )	2020-08-03 10:46:51 +09:00
bilaharith	a7fda2e38f	HADOOP-17137. ABFS: Makes the test cases in ITestAbfsNetworkStatistics agnostic - Contributed by Bilahari T H	2020-07-31 12:27:57 -07:00
Mehakmeet Singh	48a7c5b6ba	HADOOP-17113. Adding ReadAhead Counters in ABFS (#2154 ) Contributed by Mehakmeet Singh	2020-07-22 18:22:30 +01:00
Masatake Iwasaki	1b29c9bfee	HADOOP-17138. Fix spotbugs warnings surfaced after upgrade to 4.0.6. (#2155 )	2020-07-22 13:40:20 +09:00
Sneha Vijayarajan	d23cc9d85d	Hadoop 17132. ABFS: Fix Rename and Delete Idempotency check trigger - Contributed by Sneha Vijayarajan	2020-07-21 09:22:38 -07:00
bilaharith	b4b23ef0d1	HADOOP-17092. ABFS: Making AzureADAuthenticator.getToken() throw HttpException - Contributed by Bilahari T H	2020-07-21 09:18:54 -07:00
Mukund Thakur	bb459d4dd6	HADOOP-17136. ITestS3ADirectoryPerformance.testListOperations failing (#2153 ) A regression caused by HADOOP-17022: the reduction in LIST calls broke an assertion. Contributed by Mukund Thakur	2020-07-20 16:58:50 +01:00
Steve Loughran	9f407bcc88	HADOOP-17107. hadoop-azure parallel tests not working on recent JDKs (#2118 ) Contributed by Steve Loughran.	2020-07-20 10:51:26 +01:00
bilaharith	99655167f3	HADOOP-16682. ABFS: Removing unnecessary toString() invocations - Contributed by Bilahari T H	2020-07-18 10:00:18 -07:00
Ayush Saxena	6bcb24d269	HADOOP-17100. Replace Guava Supplier with Java8+ Supplier in Hadoop. Contributed by Ahmed Hussein.	2020-07-18 14:33:43 +05:30
Mehakmeet Singh	4083fd57b5	HADOOP-17129. Validating storage keys in ABFS correctly (#2141 ) Contributed by Mehakmeet Singh	2020-07-16 17:29:37 +01:00
Mukund Thakur	4647a60430	HADOOP-17022. Tune S3AFileSystem.listFiles() API. Contributed by Mukund Thakur. Change-Id: I17f5cfdcd25670ce3ddb62c13378c7e2dc06ba52	2020-07-14 15:27:35 +01:00
Anoop Sam John	380e0f4506	HADOOP-16998. WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException (#2073 ) Contributed by Anoop Sam John.	2020-07-14 14:07:27 +01:00
jimmy-zuber-amzn	806d84b79c	HADOOP-17105. S3AFS - Do not attempt to resolve symlinks in globStatus (#2113 ) Contributed by Jimmy Zuber.	2020-07-13 19:07:48 +01:00
Steve Loughran	b9fa5e0182	HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext. Contributed by Steve Loughran. Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3	2020-07-13 13:30:02 +01:00
Sebastian Nagel	5b1ed2113b	HADOOP-17117 Fix typos in hadoop-aws documentation (#2127 )	2020-07-09 00:03:15 +09:00
ishaniahuja	d20109c171	HADOOP-17058. ABFS: Support for AppendBlob in Hadoop ABFS Driver - Contributed by Ishani Ahuja	2020-07-04 13:25:14 -07:00
bilaharith	e0cededfbd	HADOOP-17086. ABFS: Making the ListStatus response ignore unknown properties. (#2101 ) Contributed by Bilahari T H.	2020-07-03 19:00:22 +01:00
Mehakmeet Singh	3b5c9a90c0	HADOOP-16961. ABFS: Adding metrics to AbfsInputStream (#2076 ) Contributed by Mehakmeet Singh.	2020-07-03 11:41:35 +01:00
Yiqun Lin	ff8bb67200	HDFS-15374. Add documentation for fedbalance tool. Contributed by Jinglun.	2020-07-01 14:18:18 +08:00
Yiqun Lin	de2cb86260	HDFS-15410. Add separated config file hdfs-fedbalance-default.xml for fedbalance tool. Contributed by Jinglun.	2020-07-01 14:06:27 +08:00
Steve Loughran	4249c04d45	HADOOP-16798. S3A Committer thread pool shutdown problems. (#1963 ) Contributed by Steve Loughran. Fixes a condition which can cause job commit to fail if a task was aborted < 60s before the job commit commenced: the task abort will shut down the thread pool with a hard exit after 60s; the job commit POST requests would be scheduled through the same pool, so be interrupted and fail. At present the access is synchronized, but presumably the executor shutdown code is calling wait() and releasing locks. Task abort is triggered from the AM when task attempts succeed but there are still active speculative task attempts running. Thus it only surfaces when speculation is enabled and the final tasks are speculating, which, given they are the stragglers, is not unheard of. Note: this problem has never been seen in production; it has surfaced in the hadoop-aws tests on a heavily overloaded desktop	2020-06-30 10:44:51 +01:00
Thomas Marquardt	4b5b54c73f	HADOOP-17089: WASB: Update azure-storage-java SDK Contributed by Thomas Marquardt DETAILS: WASB depends on the Azure Storage Java SDK. There is a concurrency bug in the Azure Storage Java SDK that can cause the results of a list blobs operation to appear empty. This causes the Filesystem listStatus and similar APIs to return empty results. This has been seen in Spark work loads when jobs use more than one executor core. See Azure/azure-storage-java#546 for details on the bug in the Azure Storage SDK. TESTS: A new test was added to validate the fix. All tests are passing: wasb: mvn -T 1C -Dparallel-tests=wasb -Dscale -DtestsThreadCount=8 clean verify Tests run: 248, Failures: 0, Errors: 0, Skipped: 11 Tests run: 651, Failures: 0, Errors: 0, Skipped: 65 abfs: mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 64, Failures: 0, Errors: 0, Skipped: 0 Tests run: 437, Failures: 0, Errors: 0, Skipped: 33 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24	2020-06-25 02:32:42 +00:00
Akira Ajisaka	201d734af3	HDFS-15428. Javadocs fails for hadoop-federation-balance. Contributed by Xieming Li.	2020-06-22 19:43:19 +09:00
Mehakmeet Singh	3472c3efc0	HADOOP-17065. Add Network Counters to ABFS (#2056 ) Contributed by Mehakmeet Singh.	2020-06-19 14:03:49 +01:00
Yiqun Lin	9cbd76cc77	HDFS-15346. FedBalance tool implementation. Contributed by Jinglun.	2020-06-18 13:33:25 +08:00
Thomas Marquardt	caf3995ac2	HADOOP-17076: ABFS: Delegation SAS Generator Updates Contributed by Thomas Marquardt. DETAILS: 1) The authentication version in the service has been updated from Dec19 to Feb20, so need to update the client. 2) Add support and test cases for getXattr and setXAttr. 3) Update DelegationSASGenerator and related to use Duration instead of int for time periods. 4) Cleanup DelegationSASGenerator switch/case statement that maps operations to permissions. 5) Cleanup SASGenerator classes to use String.equals instead of ==. TESTS: Added tests for getXAttr and setXAttr. All tests are passing against my account in eastus2euap: $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 76, Failures: 0, Errors: 0, Skipped: 0 Tests run: 441, Failures: 0, Errors: 0, Skipped: 33 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24	2020-06-18 02:07:08 +00:00
Steve Loughran	ac5d899d40	HADOOP-17050 S3A to support additional token issuers Contributed by Steve Loughran. S3A delegation token providers will be asked for any additional token issuers, an array can be returned, each one will be asked for tokens when DelegationTokenIssuer collects all the tokens for a filesystem.	2020-06-09 14:39:06 +01:00
Steve Loughran	40d63e02f0	HADOOP-16568. S3A FullCredentialsTokenBinding fails if local credentials are unset. (#1441 ) Contributed by Steve Loughran. Move the loading to deployUnbonded (where they are required) and add a safety check when a new DT is requested	2020-06-03 17:07:00 +01:00
Mehakmeet Singh	7f486f0258	HADOOP-17016. Adding Common Counters in ABFS (#1991 ). Contributed by: Mehakmeet Singh. Change-Id: Ib84e7a42f28e064df4c6204fcce33e573360bf42	2020-06-02 18:31:35 +01:00
Karthik Amarnath	b2200a33a6	HDFS-15168: ABFS enhancement to translate AAD to Linux identities. (#1978 )	2020-05-28 19:00:23 -07:00
Sneha Vijayarajan	4c5cd751e3	HADOOP-17053. ABFS: Fix Account-specific OAuth config setting parsing Contributed by Sneha Vijayarajan	2020-05-27 13:56:09 -07:00
Sneha Vijayarajan	53b993e604	HADOOP-16852: Report read-ahead error back Contributed by Sneha Vijayarajan	2020-05-27 13:51:42 -07:00
Sneha Vijayarajan	37b1b4799d	HADOOP-17054. ABFS: Fix test AbfsClient authentication instance Contributed by Sneha Vijayarajan	2020-05-26 15:26:28 -07:00
Masatake Iwasaki	9685314633	HADOOP-17040. Fix intermittent failure of ITestBlockingThreadPoolExecutorService. (#2020 )	2020-05-22 18:50:19 +09:00
bilaharith	d2f7133c62	HADOOP-17004. Fixing a formatting issue Contributed by Bilahari T H.	2020-05-20 11:51:48 -07:00
Mukund Thakur	29b19cd592	HADOOP-16900. Very large files can be truncated when written through the S3A FileSystem. Contributed by Mukund Thakur and Steve Loughran. This patch ensures that writes to S3A fail when more than 10,000 blocks are written. That upper bound still exists. To write massive files, make sure that the value of fs.s3a.multipart.size is set to a size which is large enough to upload the files in fewer than 10,000 blocks. Change-Id: Icec604e2a357ffd38d7ae7bc3f887ff55f2d721a	2020-05-20 13:42:25 +01:00
Masatake Iwasaki	0b7799bf6e	HADOOP-16586. ITestS3GuardFsck, others fails when run using a local metastore. (#1950 )	2020-05-20 08:47:04 +09:00
Sneha Vijayarajan	8f78aeb250	Hadoop-17015. ABFS: Handling Rename and Delete idempotency Contributed by Sneha Vijayarajan.	2020-05-19 12:30:07 -07:00
bilaharith	bdbd59cfa0	HADOOP-17004. ABFS: Improve the ABFS driver documentation Contributed by Bilahari T H.	2020-05-18 20:45:54 -07:00
Steve Loughran	d08b9e94e3	Revert "HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by default)." This reverts commit `44350fdf49`. It is related to the rollback of HADOOP-8143. Change-Id: If48e3dd670c920ada702dc36461ff398fe9d35cc	2020-05-14 19:04:36 +01:00
Steve Loughran	4486220bb2	Revert "HADOOP-8143. Change distcp to have -pb on by default." This reverts commit `dd65eea74b`. Change-Id: I74180cf59d5bbad8c9f66cb331535addcbea863e	2020-05-14 19:03:56 +01:00
Ayush Saxena	c757cb61eb	HADOOP-14254. Add a Distcp option to preserve Erasure Coding attributes. Contributed by Ayush Saxena.	2020-05-14 00:31:20 +05:30
Thomas Marquardt	b214bbd2d9	HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger Contributed by Thomas Marquardt. DETAILS: Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator. I separated SASGenerator into a base class and two subclasses ServiceSASGenerator and DelegationSASGenerator. The code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new. The DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used by an authorization service such as Apache Ranger. Adding this to the tests helps us lock in this behavior. Added a MockDelegationSASTokenProvider for testing User Delegation SAS. Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that is not configured. To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds. After this a new SAS will be requested. The default period of 120 seconds can be changed using the configuration setting "fs.azure.sas.token.renew.period.for.streams". The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these operations must be provided tokens with appropriate SAS parameters to succeed. Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator. The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission while the getFileStatus call only requires execute permission. ADLS Gen2 Get Status API is supposed to be used for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties parameter which is set to false for getFileStatus and true for getXAttr. Added SASTokenProvider support for delete recursive. Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified. This is necessary to avoid passing null paths and to convert relative paths into absolute paths. Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires that the path in the URL and the path in the SAS token match. Internally the code was using "//" instead of "/" for the root path, sometimes. Also related to this, the AzureBlobFileSystemStore.getRelativePath API was updated so that we no longer remove and then add back a preceding forward / to paths. To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading "To run Delegation SAS test cases". You also need to set "fs.azure.enable.check.access" to true. TEST RESULTS: namespace.enabled=true auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 63, Failures: 0, Errors: 0, Skipped: 0 Tests run: 432, Failures: 0, Errors: 0, Skipped: 41 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=false auth.type=SharedKey ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 63, Failures: 0, Errors: 0, Skipped: 0 Tests run: 432, Failures: 0, Errors: 0, Skipped: 244 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=SharedKey sas.token.provider.type=MockDelegationSASTokenProvider enable.check.access=true ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 63, Failures: 0, Errors: 0, Skipped: 0 Tests run: 432, Failures: 0, Errors: 0, Skipped: 33 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24 namespace.enabled=true auth.type=OAuth ------------------- $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify Tests run: 63, Failures: 0, Errors: 0, Skipped: 0 Tests run: 432, Failures: 0, Errors: 1, Skipped: 74 Tests run: 206, Failures: 0, Errors: 0, Skipped: 140	2020-05-12 18:35:38 +00:00
Inigo Goiri	96bbc3bc97	YARN-9301. Too many InvalidStateTransitionException with SLS. Contributed by Bilwa S T.	2020-05-12 08:24:34 -07:00
Inigo Goiri	9cbd0cd2a9	YARN-9301. Too many InvalidStateTransitionException with SLS. Contributed by Bilwa S T.	2020-05-12 08:20:03 -07:00
Mehakmeet Singh	192cad9ee2	HADOOP-17018. Intermittent failing of ITestAbfsStreamStatistics in ABFS (#1990 ) Contributed by: Mehakmeet Singh In some cases, ABFS-prefetch thread runs in the background which returns some bytes from the buffer and gives an extra readOp. Thus, making readOps values arbitrary and giving intermittent failures in some cases. Hence, readOps values of 2 or 3 are seen in different setups.	2020-05-07 12:15:28 +01:00
Masatake Iwasaki	99840aaba6	HADOOP-17025. Fix invalid metastore configuration in S3GuardTool tests. (#1994 )	2020-05-07 12:00:47 +09:00
Mingliang Liu	263c76b678	HADOOP-17011. Tolerate leading and trailing spaces in fs.defaultFS. Contributed by Ctest Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2020-04-30 14:15:28 -07:00
bilaharith	30ef8d0f1a	HADOOP-17002. ABFS: Adding config to determine if the account is HNS enabled or not Contributed by Bilahari T H.	2020-04-23 17:46:18 -07:00
Mehakmeet Singh	459eb2ad6d	HADOOP-16914 Adding Output Stream Counters in ABFS (#1899 ) Contributed by Mehakmeet Singh.	2020-04-23 13:35:39 +01:00
Sneha Vijayarajan	3d69383c26	Hadoop 16857. ABFS: Stop CustomTokenProvider retry logic to depend on AbfsRestOp retry policy Contributed by Sneha Vijayarajan	2020-04-21 21:39:48 -07:00
bilaharith	264e49c8f2	HADOOP-16922. ABFS: Change User-Agent header (#1938 ) Contributed by Bilahari T H.	2020-04-21 17:37:40 +01:00
Mukund Thakur	8031c66295	HADOOP-16965. Refactor abfs stream configuration. (#1956 ) Contributed by Mukund Thakur.	2020-04-21 17:27:29 +01:00

1692 Commits