hadoop

Author	SHA1	Message	Date
ahmarsuhail	c653c58637	HADOOP-18372. ILoadTestS3ABulkDeleteThrottling failing. (#4642 ) Contributed by Ahmar Suhail	2022-07-27 18:04:39 +01:00
Mehakmeet Singh	363f8138d2	HADOOP-17461. Collect thread-level IOStatistics. (#4352 ) This adds a thread-level collector of IOStatistics, IOStatisticsContext, which can be: * Retrieved for a thread and cached for access from other threads. * reset() to record new statistics. * Queried for live statistics through the IOStatisticsSource.getIOStatistics() method. * Queries for a statistics aggregator for use in instrumented classes. * Asked to create a serializable copy in snapshot() The goal is to make it possible for applications with multiple threads performing different work items simultaneously to be able to collect statistics on the individual threads, and so generate aggregate reports on the total work performed for a specific job, query or similar unit of work. Some changes in IOStatistics-gathering classes are needed for this feature * Caching the active context's aggregator in the object's constructor * Updating it in close() Slightly more work is needed in multithreaded code, such as the S3A committers, which collect statistics across all threads used in task and job commit operations. Currently the IOStatisticsContext-aware classes are: * The S3A input stream, output stream and list iterators. * RawLocalFileSystem's input and output streams. * The S3A committers. * The TaskPool class in hadoop-common, which propagates the active context into scheduled worker threads. Collection of statistics in the IOStatisticsContext is disabled process-wide by default until the feature is considered stable. To enable the collection, set the option fs.thread.level.iostatistics.enabled to "true" in core-site.xml; Contributed by Mehakmeet Singh and Steve Loughran	2022-07-27 11:23:06 +01:00
Wei-Chiu Chuang	0c12873487	HADOOP-18079. Upgrade Netty to 4.1.77. (#3977 ) (#4592 ) Upgrade netty to address CVE-2019-20444, CVE-2019-20445 CVE-2022-24823 Contributed by Wei-Chiu Chuang (cherry picked from commit `a55ace7bc0`)	2022-07-27 03:10:20 +08:00
skysiders	1d2a60f623	MAPREDUCE-7372 MapReduce set permission too late in copyJar method (#4026 ). Contributed by Zhang Dongsheng. Reviewed-by: Steve Loughran <stevel@apache.org> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `9fe96238d2`)	2022-07-25 18:39:48 +00:00
ashutoshpant	fffb0bd6db	HADOOP-18330. S3AFileSystem removes Path when calling createS3Client (#4572 ) Adds a new parameter object in s3ClientCreationParameters that holds the full s3a path URI Contributed by Ashutosh Pant	2022-07-25 10:14:36 +01:00
PJ Fanning	36cb8a6a2b	HADOOP-18354. Upgrade reload4j to 1.22.2 due to XXE vulnerability (#4607 ). Contributed by PJ Fanning. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-07-24 16:01:47 +05:30
Ayush Saxena	df4e59318f	HDFS-15839. RBF: Cannot get method setBalancerBandwidth on Router Client. Contributed by Yang Yun. Only Prod Changes: Test already cherry-picked as part of HDFS-16310 via (`496657c63f`)	2022-07-23 23:48:02 +05:30
Masatake Iwasaki	0ff544951a	Make upstream aware of 3.2.4 release. (cherry picked from commit 817b8fdd384a5c28f83bc257c389ceedd38070c5)	2022-07-22 04:09:00 +00:00
Masatake Iwasaki	ff13f9ee8b	Make upstream aware of 3.2.4 release. (cherry picked from commit e1637a57dfd41385dbce5de90620c48a45abb263)	2022-07-22 02:31:34 +00:00
PJ Fanning	6733ba56b8	HADOOP-18332. Remove rs-api dependency by downgrading jackson to 2.12.7. (#4552 ) This downgrades jackson from the version switched to in HADOOP-18033 (2.13.0), to Jackson 2.12.7. This removes the dependency on javax.ws.rs-api, so avoiding runtime problems with applications using jersey-core v1 and/or jsr311-api. The 2.12.7 release still contains the fix for CVE-2020-36518. Contributed by PJ Fanning	2022-07-16 18:18:52 +01:00
lmccay	f015c7f1a7	HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP. (#4503 ) Partial/Incomplete groups list can be returned in LDAP groups lookup. Backported in #4550; minor tuning of parameters needed. Contributed by larry mccay	2022-07-14 13:54:15 +01:00
Ashutosh Gupta	3a6d865a0f	HADOOP-18336. Tag FSDataInputStream.getWrappedStream() @Public/@Stable (#4555 ) Contributed by: Ashutosh Gupta	2022-07-13 12:57:28 +01:00
HerCath	e0abdd8d33	HADOOP-18217. ExitUtil synchronized blocks reduced. #4255 Reduce the ExitUtil synchronized block scopes so System.exit and Runtime.halt calls aren't within their boundaries, so ExitUtil wrappers do not block each other. Enlarged catches to all Throwables (not just Exceptions). Contributed by Remi Catherinot	2022-07-13 12:37:32 +01:00
hchaverr	f4e8a4f36c	HDFS-16591. Setup JaasConfiguration in ZKCuratorManager when SASL is enabled Fixes #4447 Signed-off-by: Owen O'Malley <oomalley@linkedin.com>	2022-07-11 12:55:18 -07:00
Szilard Nemeth	ebe3ed5cd3	YARN-10997. Revisit allocation and reservation logging. Contributed by Andras Gyori (cherry picked from commit `7cb887e6c2`)	2022-07-07 21:04:15 +00:00
Jinhu Wu	690337bca9	HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion (#4502 ) HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion. Contributed by wujinhu. (cherry picked from commit `3ec4b932c1`)	2022-07-06 14:31:07 +08:00
Mehakmeet Singh	90b1e737d3	HADOOP-18242. ABFS Rename Failure when tracking metadata is in an incomplete state (#4517 ) ABFS rename fails intermittently when the Storage-blob tracking metadata is in an incomplete state. This surfaces as the error code 404 and an error message of "RenameDestinationParentPathNotFound". To mitigate this issue, when a request fails with this response, the ABFS client issues a HEAD call on the source file and then retries the rename operation. ABFS filesystem statistics track when this occurs with the new counters rename_recovery, metadata_incomplete_rename_failures and rename_path_attempts. This is a very rare occurrence and appears to be triggered under certain heavy load conditions, just as with HADOOP-18163. Contributed by Mehakmeet Singh.	2022-07-02 01:49:14 +05:30
Mukund Thakur	7eb1c908a0	HADOOP-18322. Yetus build failure in branch-3.3. caused by HADOOP-18103	2022-06-30 15:05:38 -05:00
Masatake Iwasaki	75c739c458	Revert "HADOOP-17196. Fix C/C++ standard warnings (#2208 )" This reverts commit `b4a105a209`.	2022-06-30 00:57:52 +00:00
Mukund Thakur	c517b086f2	HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445 ) part of HADOOP-18103. Handling memory fragmentation in S3A vectored IO implementation by allocating smaller user range requested size buffers and directly filling them from the remote S3 stream and skipping undesired data in between ranges. This patch also adds aborting active vectored reads when stream is closed or unbuffer() is called. Contributed By: Mukund Thakur Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java	2022-06-23 17:34:29 -05:00
Mukund Thakur	bfb7d020d1	HADOOP-18105 Implement buffer pooling with weak references (#4263 ) part of HADOOP-18103. Required for vectored IO feature. None of current buffer pool implementation is complete. ElasticByteBufferPool doesn't use weak references and could lead to memory leak errors and DirectBufferPool doesn't support caller preferences of direct and heap buffers and has only fixed length buffer implementation. Contributed By: Mukund Thakur	2022-06-23 17:11:13 -05:00
Mukund Thakur	bb5a17b177	HADOOP-18107 Adding scale test for vectored reads for large file (#4273 ) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-06-23 17:11:09 -05:00
Mukund Thakur	9f03f87963	HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads (#3964 ) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties define how the ranges will be merged. To completely disable merging set fs.s3a.vectored.read.max.merged.size to 0. Contributed By: Mukund Thakur	2022-06-23 17:11:04 -05:00
Mukund Thakur	5c348c41ab	HADOOP-11867. Add a high-performance vectored read API. (#3904 ) part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers, especially in object store implementations. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranges are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java pom.xml	2022-06-23 17:09:16 -05:00
Viraj Jasani	4ba463069b	HADOOP-18288. Total requests and total requests per sec served by RPC servers (#4485 ) Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-23 17:30:01 +08:00
Igor Dvorzhak	d41e0a9cc3	HADOOP-18300. Upgrade Gson dependency to version 2.9.0 (#4454 ) Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `77d1b194c7`)	2022-06-22 23:42:59 +00:00
Benjamin Teke	838b63d836	YARN-10974. Queue filter in CS UI v1 does not work as expected. Contributed by Chengbing Liu.	2022-06-22 18:20:09 +02:00
Steve Loughran	fb4e8172a0	MAPREDUCE-7391. TestLocalDistributedCacheManager failing after HADOOP-16202 (#4472 ) Fixing a mockito-based test which broke when HADOOP-16202 changed the methods being invoked. Contributed by Steve Loughran	2022-06-22 13:13:24 +01:00
Viraj Jasani	53a530aa88	MAPREDUCE-7371. DistributedCache alternative APIs should not use DistributedCache APIs internally (#3855 ) Contributed by Viraj Jasani	2022-06-22 13:13:05 +01:00
Steve Loughran	9ca4ac0af0	HADOOP-18305. Preparing for 3.3.4 release: branch-3.3 version => 3.3.9 (#4482 ) Updating the hadoop version of branch-3.3 to 3.3.9-SNAPSHOT pending agreement on what number its future release should take. Using 3.3.9-SNAPSHOT puts space in for other incremental releases, while avoiding creating JIRA release ordering and autocompletion confusion the way adding a 3.3.10 or higher version would do. Contributed by Steve Loughran	2022-06-22 13:09:50 +01:00
André Fonseca	49342cffdb	HADOOP-18159. Bump cos_api-bundle to 5.6.69 to update public-suffix-list.txt (#4444 ) Bump cos_api-bundle to 5.6.69 All copies of httpclient, including shaded ones in libraries used by the s3a, gs and cos cloud connectors, turn out to load their TLD list from the same resource mozilla/public-suffix-list.txt Updating the hadoop-cos dependency ensures that its version of public-suffix-list.txt is up to date -and so the s3a connector able to talk to s3 resources if the cos-api-bundle JAR is where the resource is loaded from. Contributed by André Fonseca	2022-06-21 20:17:08 +01:00
Steve Loughran	aeb2a2f860	HADOOP-17833. Improve Magic Committer performance (#3289 ) (#4470 ) Speed up the magic committer with key changes being * Writes under __magic always retain directory markers * File creation under __magic skips all overwrite checks, including the LIST call intended to stop files being created over dirs. * mkdirs under __magic probes the path for existence but does not look any further. Extra parallelism in task and job commit directory scanning. Use of createFile and openFile with parameters which allow for HEAD checks to be skipped. The committer can write the summary _SUCCESS file to the path `fs.s3a.committer.summary.report.directory`, which can be in a different file system/bucket if desired, using the job id as the filename. Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance` Application code can set the createFile() option fs.s3a.create.performance to true to disable the same safety checks when writing under magic directories. Use with care. The createFile option prefix `fs.s3a.create.header.` can be used to add custom headers to S3 objects when created. Contributed by Steve Loughran.	2022-06-21 10:49:37 +01:00
Viraj Jasani	7561dbd134	HDFS-16637. TestHDFSCLI#testAll consistently failing (#4466 ). Contributed by Viraj Jasani. Reviewed-by: Tao Li <tomscut@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-06-21 13:44:30 +05:30
Ashutosh Gupta	4f860f8ac2	MAPREDUCE-7369. Fixed MapReduce tasks timing out when spends more time on MultipleOutputs#close (#4247 ) Contributed by Ravuri Sushma sree. Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> (cherry picked from commit `36c4be819f`) Conflicts: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java	2022-06-20 08:02:58 +00:00
slfan1989	43f4a0e92d	MAPREDUCE-7387. Fix TestJHSSecurity#testDelegationToken AssertionError due to HDFS-16563 (#4428 ). Contributed by fanshilun. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-06-20 12:16:33 +05:30
KevinWikant	33ab84f2e2	HDFS-16064. Determine when to invalidate corrupt replicas based on number of usable replicas (#4410 ) Co-authored-by: Kevin Wikant <wikak@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `cfceaebde6`)	2022-06-20 11:24:45 +09:00
zhengchenyu	d7de378b22	YARN-11172. Fix TestClientRMTokens#testDelegationToken introduced by HDFS-16563. (#4408 ) Regression caused by HDFS-16563; the hdfs exception text was changed, but because it was a YARN test doing the check, Yetus didn't notice. Contributed by zhengchenyu	2022-06-17 19:51:56 +01:00
jianghuazhu	18a5e843bc	HDFS-16581. Print node status when executing printTopology. (#4321 ) Reviewed-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-16 19:20:34 +08:00
xuzq	ee3ee98ee5	HDFS-16623. Avoid IllegalArgumentException in LifelineSender (#4409 ) * HDFS-16623. Avoid IllegalArgumentException in LifelineSender Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com> (cherry picked from commit `af5003a473`)	2022-06-10 19:02:47 +00:00
Steve Loughran	de9f994338	YARN-11173. remove redeclaration of os-maven-plugin.version from yarn-csi (#4417 ) This is a followup to HADOOP-18275 and its upgrade of os-maven-plugin.version When that change is merged in, this MUST follow it. Contributed by Steve Loughran Change-Id: I61d087041561eeb8c9c42b5b7d8f0bb63f296b15	2022-06-10 17:03:25 +01:00
Ashutosh Gupta	bdef321d52	HDFS-16576. Remove unused imports in HDFS project (#4389 ) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Reviewed-by: Tao Li <tomscut@apache.org> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `6e11c94170`) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/AbstractFuture.java	2022-06-09 22:42:04 +09:00
slfan1989	a2f8a9e5d8	HDFS-16624. Fix flaky unit test TestDFSAdmin#testAllDatanodesReconfig (#4412 ) Reviewed-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-09 09:59:34 +08:00
monthonk	7ec988d264	HADOOP-12020. Add s3a storage class option fs.s3a.create.storage.class (#3877 ) Adds a new option fs.s3a.create.storage.class which can be used to set the storage class for files created in AWS S3. Consult the documentation for details and instructions on how to disable the relevant tests when testing against third-party stores. Contributed by Monthon Klongklaew Change-Id: I8cdebadf294a89fde08d98729ad96f251d58411c	2022-06-08 20:02:07 +01:00
Viraj Jasani	516a2a8e44	HDFS-16618. sync_file_range error should include more volume/file info (#4402 ) Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-07 16:56:07 +08:00
Viraj Jasani	132fbbe228	HDFS-16595. Slow peer metrics - add median, mad and upper latency limits (#4357 ) (#4405 ) Reviewed-by: Tao Li <tomscut@apache.org> Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>	2022-06-07 06:41:16 +08:00
Steve Loughran	03c2941d4b	HADOOP-18275. Update os-maven-plugin to 1.7.0 (#4397 ) Contributed by Steve Loughran Change-Id: Ic4d442a37299dc8098b0bca3cc51beca6f058283	2022-06-06 13:20:00 +01:00
Renukaprasad C	0c15daa77a	HDFS-16563. Namenode WebUI prints sensitive information on Token expiry (#4241 ) Contributed by Renukaprasad C Change-Id: I5cd2cec1dd79917f810207821b3bdf4fe1a5d24c	2022-06-06 11:08:57 +01:00
Samrat	7223a337f6	HDFS-16608. Fix the link in TestClientProtocolForPipelineRecovery (#4379 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `7f08ed0d1d`)	2022-06-06 18:02:44 +09:00
Stephen O'Donnell	7d6b133af3	HDFS-16610. Make fsck read timeout configurable (#4384 ) (cherry picked from commit `34a973a90e`) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml	2022-06-01 20:54:56 +01:00
Ashutosh Gupta	de4c975710	HADOOP-18238. Fix reentrancy check in SFTPFileSystem.close() (#4330 ) Contributed by Ashutosh Gupta Change-Id: I2742675add74259a93b3762a80c7ab5ee6d08c37	2022-05-30 17:34:45 +01:00
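
The range merging mentioned in the HADOOP-18104 and HADOOP-11867 entries above (nearby ranges are coalesced into one larger read, governed by a minimum seek size and a maximum merged size) can be illustrated with a small standalone sketch. This is not the S3A implementation: the class RangeMergeSketch, the Range type and the merge() method are hypothetical names invented for illustration, and the exact merge rule (merge when the gap is below minSeek and the combined length does not exceed maxMerged) is an assumption based only on the commit descriptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of merging nearby read ranges; not Hadoop code. */
public class RangeMergeSketch {

    /** A byte range described by a starting offset and a length. */
    public static final class Range {
        public final long offset;
        public final long length;

        public Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        long end() {
            return offset + length;
        }
    }

    /**
     * Merge ranges whose gap is smaller than minSeek, as long as the
     * combined range is no longer than maxMerged. With maxMerged == 0
     * no two ranges are ever combined, so merging is effectively disabled.
     */
    public static List<Range> merge(List<Range> input, long minSeek, long maxMerged) {
        // Work on a sorted copy so the caller's list is left untouched.
        List<Range> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingLong((Range r) -> r.offset));

        List<Range> merged = new ArrayList<>();
        Range current = null;
        for (Range next : sorted) {
            if (current == null) {
                current = next;
                continue;
            }
            long gap = next.offset - current.end();
            long combinedLength = Math.max(current.end(), next.end()) - current.offset;
            if (gap < minSeek && combinedLength <= maxMerged) {
                // Close enough and small enough: grow the current range.
                current = new Range(current.offset, combinedLength);
            } else {
                // Too far apart or too large: emit and start a new range.
                merged.add(current);
                current = next;
            }
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Range> ranges = new ArrayList<>();
        ranges.add(new Range(0, 100));
        ranges.add(new Range(150, 100));
        ranges.add(new Range(10_000, 100));
        for (Range r : merge(ranges, 100, 1000)) {
            System.out.println(r.offset + " +" + r.length);
        }
    }
}
```

In this sketch the first two ranges are 50 bytes apart, so with a minimum seek of 100 they are read as a single range and the caller would slice the result; the third range is far enough away to be fetched separately.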


25086 Commits