hadoop

Author	SHA1	Message	Date
Steve Vaughan	4138661010	HDFS-16625. Check assumption about PMDK availability (#4788) Co-authored-by: Steve Vaughan Jr <s_vaughan@apple.com>	2022-08-23 19:35:59 +09:00
Steve Vaughan	a73efb2d55	HDFS-16687. RouterFsckServlet replicates code from DfsServlet base class (#4681) (#4790)	2022-08-22 20:26:03 -07:00
Steve Vaughan	1120cc8485	HDFS-4043. Namenode Kerberos Login does not use proper hostname for host qualified hdfs principal name (#4785) Use the existing DomainNameResolver to leverage the pluggable resolution framework. This provides a means to perform a reverse lookup if needed. Update default implementation of DNSDomainNameResolver to protect against returning the IP address as a string from a cached value. Co-authored-by: Steve Vaughan Jr <s_vaughan@apple.com>	2022-08-23 05:34:33 +08:00
jianghuazhu	2123859d60	HDFS-16729. RBF: fix some unreasonably annotated docs. (#4745) Reviewed-by: Inigo Goiri <inigoiri@apache.org> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `7f176d080c`)	2022-08-21 07:31:01 +09:00
Steve Vaughan	cfc11d2e5f	HADOOP-18365. Update the remote address when a change is detected (#4692) (#4768) Back port to branch-3.3, to avoid reconnecting to the old address after detecting that the address has been updated. * Use a stable hashCode to allow safe IP addr changes * Add test that updated address is used Once the address has been updated, it will be used in future calls. Test verifies that a second request succeeds and that it uses the existing updated address instead of having to re-resolve. Co-authored-by: Steve Vaughan Jr <s_vaughan@apple.com>	2022-08-19 18:56:02 -07:00
Viraj Jasani	51ddd02395	HADOOP-18403. Fix FileSystem leak in ITestS3AAWSCredentialsProvider (#4737) Contributed By: Viraj Jasani	2022-08-18 17:45:44 -05:00
Ashutosh Gupta	a5d5d0708a	HADOOP-18385. ITestS3ACannedACLs failure; fixed by adding in a span (#4736) Contributed by Ashutosh Gupta	2022-08-18 16:55:46 +01:00
Viraj Jasani	e8a28dc0d7	HADOOP-18371. S3A FS init to log at debug when fs.s3a.create.storage.class is unset (#4730) Contributed By: Viraj Jasani	2022-08-16 12:45:59 -05:00
Ashutosh Gupta	3b3bd89084	YARN-11248. Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING (#4721) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `f02ff1afe2`)	2022-08-16 19:07:42 +09:00
kevins-29	eff292bd5f	HADOOP-18383. Codecs with @DoNotPool annotation are not closed causing memory leak (#4739)	2022-08-15 10:14:02 -07:00
Steve Loughran	97763619c9	HADOOP-18402. S3A committer NPE in spark job abort (#4735) JobID.toString() and TaskID.toString() to only be called when the IDs are not null. This doesn't surface in MapReduce, but Spark SQL can trigger in job abort, where it may invoke abortJob() with an incomplete TaskContext. This patch MUST be applied to branches containing HADOOP-17833. "Improve Magic Committer Performance." Contributed by Steve Loughran.	2022-08-15 11:32:06 +01:00
Viraj Jasani	6b7c1329b2	HADOOP-18397. Shutdown AWSSecurityTokenService when its resources are no longer in use (#4722) Contributed by Viraj Jasani.	2022-08-12 15:19:51 +01:00
Mukund Thakur	93c4704b33	HADOOP-18392. Propagate vectored s3a input stream stats to file system stats. (#4704) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-08-11 15:24:25 -05:00
Mukund Thakur	09c8084191	HADOOP-18355. Update previous index properly while validating overlapping ranges. (#4647) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-08-11 15:24:08 -05:00
Mukund Thakur	147a466c6d	HADOOP-18227. Add input stream IOStats for vectored IO api in S3A. (#4636) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-08-11 15:23:57 -05:00
huaxiangsun	1b9135e3b5	HADOOP-18340. deleteOnExit does not work with S3AFileSystem (#4608) Contributed by Huaxiang Sun	2022-08-11 20:25:41 +01:00
Yubi Lee	a0e2ab2974	HADOOP-18398. Prevent AvroRecord*.class from being included non-test jar (#4727) Contributed by Yubi Lee.	2022-08-11 20:16:52 +01:00
Viraj Jasani	0455769531	HADOOP-18373. IOStatisticsContext tuning (#4705) The name of the option to enable/disable thread level statistics is "fs.iostatistics.thread.level.enabled"; There is also an enabled() probe in IOStatisticsContext which can be used to see if the thread level statistics is active. Contributed by Viraj Jasani	2022-08-08 14:37:39 +01:00
Ashutosh Gupta	29ea8ceb49	HADOOP-18390. Fix out of sync import for HADOOP-18321 (#4694) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `bd0f9a46e1`)	2022-08-07 16:06:09 +09:00
Ashutosh Gupta	3c339a11ec	HADOOP-18321.Fix when to read an additional record from a BZip2 text file split (#4521) * HADOOP-18321.Fix when to read an additional record from a BZip2 text file split Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> and Reviewed by Akira Ajisaka. (cherry picked from commit `a432925f74`)	2022-08-06 21:53:48 +09:00
ahmarsuhail	351a9f732b	HADOOP-18366. ITestS3Select.testSelectSeekFullLandsat is timing out. (#4702) Reduces size of data read to 1 MB Contributed by Ahmar Suhail	2022-08-05 14:13:35 +01:00
Steve Loughran	9c5228cf6b	HADOOP-18305. Release Hadoop 3.3.4: upstream changelog and jdiff files Add the r3.3.4 changelog, release notes and jdiff xml files. Change-Id: I98b0fed54da3b810c3f23fe5b12e673937916257	2022-08-05 14:02:28 +01:00
xuzq	e024d1a3f8	HDFS-16712. Fix incorrect placeholder in DataNode.java (#4672). Contributed by ZanderXu. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-08-03 13:02:42 +05:30
ahmarsuhail	4e842a7ff3	HADOOP-18368. Fixes ITestCustomSigner for access point names with '-' (#4634) Contributed By: Ahmar Suhail <ahmarsu@amazon.co.uk>	2022-08-01 15:27:28 -05:00
Steve Loughran	7aebacef77	HADOOP-18344. Upgrade AWS SDK to 1.12.262 (#4637) Fixes CVE-2018-7489 in shaded jackson. +Add more commands in testing.md to the CLI tests needed when qualifying a release Contributed by Steve Loughran	2022-07-28 11:39:40 +01:00
ahmarsuhail	c653c58637	HADOOP-18372. ILoadTestS3ABulkDeleteThrottling failing. (#4642) Contributed by Ahmar Suhail	2022-07-27 18:04:39 +01:00
Mehakmeet Singh	363f8138d2	HADOOP-17461. Collect thread-level IOStatistics. (#4352) This adds a thread-level collector of IOStatistics, IOStatisticsContext, which can be: * Retrieved for a thread and cached for access from other threads. * reset() to record new statistics. * Queried for live statistics through the IOStatisticsSource.getIOStatistics() method. * Queries for a statistics aggregator for use in instrumented classes. * Asked to create a serializable copy in snapshot() The goal is to make it possible for applications with multiple threads performing different work items simultaneously to be able to collect statistics on the individual threads, and so generate aggregate reports on the total work performed for a specific job, query or similar unit of work. Some changes in IOStatistics-gathering classes are needed for this feature * Caching the active context's aggregator in the object's constructor * Updating it in close() Slightly more work is needed in multithreaded code, such as the S3A committers, which collect statistics across all threads used in task and job commit operations. Currently the IOStatisticsContext-aware classes are: * The S3A input stream, output stream and list iterators. * RawLocalFileSystem's input and output streams. * The S3A committers. * The TaskPool class in hadoop-common, which propagates the active context into scheduled worker threads. Collection of statistics in the IOStatisticsContext is disabled process-wide by default until the feature is considered stable. To enable the collection, set the option fs.thread.level.iostatistics.enabled to "true" in core-site.xml; Contributed by Mehakmeet Singh and Steve Loughran	2022-07-27 11:23:06 +01:00
Wei-Chiu Chuang	0c12873487	HADOOP-18079. Upgrade Netty to 4.1.77. (#3977) (#4592) Upgrade netty to address CVE-2019-20444, CVE-2019-20445 CVE-2022-24823 Contributed by Wei-Chiu Chuang (cherry picked from commit `a55ace7bc0`)	2022-07-27 03:10:20 +08:00
skysiders	1d2a60f623	MAPREDUCE-7372 MapReduce set permission too late in copyJar method (#4026). Contributed by Zhang Dongsheng. Reviewed-by: Steve Loughran <stevel@apache.org> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `9fe96238d2`)	2022-07-25 18:39:48 +00:00
ashutoshpant	fffb0bd6db	HADOOP-18330. S3AFileSystem removes Path when calling createS3Client (#4572) Adds a new parameter object in s3ClientCreationParameters that holds the full s3a path URI Contributed by Ashutosh Pant	2022-07-25 10:14:36 +01:00
PJ Fanning	36cb8a6a2b	HADOOP-18354. Upgrade reload4j to 1.22.2 due to XXE vulnerability (#4607). Contributed by PJ Fanning. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-07-24 16:01:47 +05:30
Ayush Saxena	df4e59318f	HDFS-15839. RBF: Cannot get method setBalancerBandwidth on Router Client. Contributed by Yang Yun. Only Prod Changes: Test already cherry-picked as part of HDFS-16310 via (`496657c63f`)	2022-07-23 23:48:02 +05:30
Masatake Iwasaki	0ff544951a	Make upstream aware of 3.2.4 release. (cherry picked from commit 817b8fdd384a5c28f83bc257c389ceedd38070c5)	2022-07-22 04:09:00 +00:00
Masatake Iwasaki	ff13f9ee8b	Make upstream aware of 3.2.4 release. (cherry picked from commit e1637a57dfd41385dbce5de90620c48a45abb263)	2022-07-22 02:31:34 +00:00
PJ Fanning	6733ba56b8	HADOOP-18332. Remove rs-api dependency by downgrading jackson to 2.12.7. (#4552) This downgrades jackson from the version switched to in HADOOP-18033 (2.13.0), to Jackson 2.12.7. This removes the dependency on javax.ws.rs-api, so avoiding runtime problems with applications using jersey-core v1 and/or jsr311-api. The 2.12.7 release still contains the fix for CVE-2020-36518. Contributed by PJ Fanning	2022-07-16 18:18:52 +01:00
lmccay	f015c7f1a7	HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP. (#4503) Partial/Incomplete groups list can be returned in LDAP groups lookup. Backported in #4550; minor tuning of parameters needed. Contributed by larry mccay	2022-07-14 13:54:15 +01:00
Ashutosh Gupta	3a6d865a0f	HADOOP-18336. Tag FSDataInputStream.getWrappedStream() @Public/@Stable (#4555) Contributed by: Ashutosh Gupta	2022-07-13 12:57:28 +01:00
HerCath	e0abdd8d33	HADOOP-18217. ExitUtil synchronized blocks reduced. #4255 Reduce the ExitUtil synchronized block scopes so System.exit and Runtime.halt calls aren't within their boundaries, so ExitUtil wrappers do not block each other. Enlarged catches to all Throwables (not just Exceptions). Contributed by Remi Catherinot	2022-07-13 12:37:32 +01:00
hchaverr	f4e8a4f36c	HDFS-16591. Setup JaasConfiguration in ZKCuratorManager when SASL is enabled Fixes #4447 Signed-off-by: Owen O'Malley <oomalley@linkedin.com>	2022-07-11 12:55:18 -07:00
Szilard Nemeth	ebe3ed5cd3	YARN-10997. Revisit allocation and reservation logging. Contributed by Andras Gyori (cherry picked from commit `7cb887e6c2`)	2022-07-07 21:04:15 +00:00
Jinhu Wu	690337bca9	HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion (#4502) HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion. Contributed by wujinhu. (cherry picked from commit `3ec4b932c1`)	2022-07-06 14:31:07 +08:00
Mehakmeet Singh	90b1e737d3	HADOOP-18242. ABFS Rename Failure when tracking metadata is in an incomplete state (#4517) ABFS rename fails intermittently when the Storage-blob tracking metadata is in an incomplete state. This surfaces as the error code 404 and an error message of "RenameDestinationParentPathNotFound" To mitigate this issue, when a request fails with this response. the ABFS client issues a HEAD call on the source file and then retries the rename operation again ABFS filesystem statistics track when this occurs with new counters rename_recovery metadata_incomplete_rename_failures rename_path_attempts This is very rare occurrence and appears to be triggered under certain heavy load conditions, just as with HADOOP-18163. Contributed by Mehakmeet Singh.	2022-07-02 01:49:14 +05:30
Mukund Thakur	7eb1c908a0	HADOOP-18322. Yetus build failure in branch-3.3. caused by HADOOP-18103	2022-06-30 15:05:38 -05:00
Masatake Iwasaki	75c739c458	Revert "HADOOP-17196. Fix C/C++ standard warnings (#2208)" This reverts commit `b4a105a209`.	2022-06-30 00:57:52 +00:00
Mukund Thakur	c517b086f2	HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445) part of HADOOP-18103. Handling memory fragmentation in S3A vectored IO implementation by allocating smaller user range requested size buffers and directly filling them from the remote S3 stream and skipping undesired data in between ranges. This patch also adds aborting active vectored reads when stream is closed or unbuffer() is called. Contributed By: Mukund Thakur Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java	2022-06-23 17:34:29 -05:00
Mukund Thakur	bfb7d020d1	HADOOP-18105 Implement buffer pooling with weak references (#4263) part of HADOOP-18103. Required for vectored IO feature. None of current buffer pool implementation is complete. ElasticByteBufferPool doesn't use weak references and could lead to memory leak errors and DirectBufferPool doesn't support caller preferences of direct and heap buffers and has only fixed length buffer implementation. Contributed By: Mukund Thakur	2022-06-23 17:11:13 -05:00
Mukund Thakur	bb5a17b177	HADOOP-18107 Adding scale test for vectored reads for large file (#4273) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-06-23 17:11:09 -05:00
Mukund Thakur	9f03f87963	HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur	2022-06-23 17:11:04 -05:00
Mukund Thakur	5c348c41ab	HADOOP-11867. Add a high-performance vectored read API. (#3904) part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java pom.xml	2022-06-23 17:09:16 -05:00
Viraj Jasani	4ba463069b	HADOOP-18288. Total requests and total requests per sec served by RPC servers (#4485) Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-23 17:30:01 +08:00


25161 Commits