Commit Graph

27335 Commits

Author SHA1 Message Date
Cheng Pan
281e2d288d
Revert "HADOOP-16822. Provide source artifacts for hadoop-client-api. Contributed by Karel Kolman." (#6458)
This reverts commit 2c4ab72a60.

Justification: this was making debugging through IDEs worse, rather than better.
2024-04-10 12:03:59 +01:00
Gautham B A
f7bb4f1595
HADOOP-18135. Produce Windows binaries of Hadoop (#6673)
This PR enables one to create the Hadoop
release tarball on Windows, complete with
the native binaries (including winutils.exe).
This PR contains the following changes -

* Prevents splitting during array element
  expansion - this is needed since we need
  to pass the arguments correctly to maven.
* Install Python 3.11.8 and pip to the
  Windows docker image for building
  Hadoop.
* pom file changes to get maven to invoke
  the releasedocmaker script through
  bash.exe on Windows.
2024-04-09 22:15:05 +05:30
Yang Jiandan
3f8af73913
YARN-11670. Add CallerContext in NodeManager (#6688) 2024-04-08 22:50:41 -04:00
slfan1989
8c378d1ea1
YARN-11444. Improve YARN md documentation format. (#6711) Contributed by Shilun Fan.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-04-07 20:50:46 +08:00
ConfX
73e6931ed0
HDFS-17449. Fix ill-formed decommission host name and port pair triggers IndexOutOfBound error (#6691). Contributed by ConfX
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-04-06 13:38:09 +05:30
slfan1989
a1ae35e691
HADOOP-19135. Remove Jcache 1.0-alpha. (#6695) Contributed by Shilun Fan.
Reviewed-by: Steve Loughran <stevel@cloudera.com>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-04-05 22:09:15 +08:00
Anuj Modi
6ed73896f6
HADOOP-18656. [ABFS] Add Support for Paginated Delete for Large Directories in HNS Account (#6409)
Contributed by Anuj Modi
2024-04-04 19:48:25 +01:00
Dongjoon Hyun
d7157b4aa9
HADOOP-19141. Vector IO: Update default values consistently (#6702)
Contributed by Dongjoon Hyun
2024-04-04 10:56:40 +01:00
Steve Loughran
62182b1f74
HADOOP-19098. Vector IO: test failure followup (#6701)
Revert changes in ITestDelegatedMRJob which came in with HADOOP-19098

Contributed by Steve Loughran
2024-04-03 14:40:41 -05:00
PJ Fanning
eede5b1315
HADOOP-19114. Upgrade to commons-compress 1.26.1 due to CVEs. (#6636)
This addresses two CVEs triggered by malformed archives

Important: Denial of Service CVE-2024-25710
Moderate: Denial of Service CVE-2024-26308

Contributed by PJ Fanning
2024-04-03 19:32:15 +01:00
Steve Loughran
87fb977777
HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604
Clarifies behaviour of VectorIO methods with contract tests as well as
specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads was broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
   toString() output logs this.

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting to only completeExceptionally() those ranges
  which had not yet read data in.
* Improved logging.

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.

+ AbstractSTestS3AHugeFiles enhancements.
+ ADDENDUM: test fix in ITestS3AContractVectoredRead

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback
implementation

Contributed by Steve Loughran

Change-Id: Ia4ed71864c595f175c275aad83a2ff5741693432
2024-04-03 13:17:52 +01:00
Steve Loughran
b4f9d8e6fa
Revert "HADOOP-19098. Vector IO: Specify and validate ranges consistently."
This reverts commit ba7faf90c8.
2024-04-03 13:15:05 +01:00
PJ Fanning
1357bb162d
HADOOP-19123. Update to commons-configuration2 2.10.1 due to CVE (#6661). Contributed by PJ Fanning
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-04-03 01:20:00 +05:30
Steve Loughran
ba7faf90c8
HADOOP-19098. Vector IO: Specify and validate ranges consistently.
Clarifies behaviour of VectorIO methods with contract tests as well as specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads was broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
   toString() output logs this.

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting to only completeExceptionally() those ranges
  which had not yet read data in.
* Improved logging.  

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.

+ AbstractSTestS3AHugeFiles enhancements.

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback implementation

Contributed by Steve Loughran
2024-04-02 20:16:38 +01:00
Lei313
36c22400b2
HDFS-17408:Reduce the number of quota calculations in FSDirRenameOp (#6653). Contributed by lei w.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-04-02 10:40:28 +08:00
slfan1989
5f3eb446f7
YARN-11663. [Federation] Add Cache Entity Nums Limit. (#6662) Contributed by Shilun Fan.
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-04-02 07:47:59 +08:00
PJ Fanning
f7d1ec2d9e
HADOOP-19077. Remove use of javax.ws.rs.core.HttpHeaders (#6554). Contributed by PJ Fanning
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-04-01 12:43:39 +05:30
PJ Fanning
59976f1be2
HDFS-17450. Add explicit dependency on httpclient jar (#6130). Contributed by PJ Fanning
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-30 23:24:15 +05:30
huhaiyang
4807815e1c
HDFS-17448. Enhance the stability of the unit test TestDiskBalancerCommand (#6690). Contributed by Haiyang Hu
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-30 22:51:05 +05:30
PJ Fanning
06db6289cb
HADOOP-19024. Use bouncycastle jdk18 1.77 (#6410). Contributed 2024-03-30 19:58:12 +05:30
ConfX
aabf31f6fa
HDFS-17103. Fix file system cleanup in TestNameEditsConfigs (#6071). Contributed by ConfX.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-30 15:04:42 +05:30
PJ Fanning
97c5a6efba
HADOOP-19041. Use StandardCharsets in more places (#6449) 2024-03-28 23:17:18 -04:00
slfan1989
347521c95d
HADOOP-19124. Update org.ehcache from 3.3.1 to 3.8.2. (#6665) 2024-03-28 21:56:12 -04:00
Junfan Zhang
f1f2abe641
YARN-11668. Fix RM crash for potential concurrent modification exception when updating node attributes (#6681) Contributed by Junfan Zhang.
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-29 09:45:16 +08:00
ConfX
f3f6340746
HDFS-17443. add null check for fileSys and cluster before shutting down (#6683) 2024-03-28 11:09:50 -04:00
xiaojunxiang
8528d5783d
HDFS-17216. Distcp: When handle the small files, the bandwidth parameter will be invalid, fix this bug. (#6138) 2024-03-28 10:31:06 -04:00
PJ Fanning
5bfca65692
HADOOP-19115. Upgrade to nimbus-jose-jwt 9.37.2 due to CVE-2023-52428. (#6637)
Contributed by PJ Fanning
2024-03-27 10:30:55 +00:00
Alex
50370769cd
HDFS-17429. Fixing wrong log file name in datatransfer Sender.java (#6670) Contributed by Zhongkun Wu.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-27 06:59:15 +08:00
Syed Shameerur Rahman
032796a0fb
HADOOP-19047: S3A: Support in-memory tracking of Magic Commit data (#6468)
If the option fs.s3a.committer.magic.track.commits.in.memory.enabled
is set to true, then rather than save data about in-progress uploads
to S3, this information is cached in memory.

If the number of files being committed is low, this will save network IO
in both the generation of .pending and marker files, and in the scanning
of task attempt directory trees during task commit.

Contributed by Syed Shameerur Rahman
2024-03-26 15:29:35 +00:00
Viraj Jasani
9fe371aa15
HADOOP-18980. Invalid inputs for getTrimmedStringCollectionSplitByEquals (ADDENDUM) (#6546)
This is a followup to #6406:
HADOOP-18980. S3A credential provider remapping: make extensible

It adds extra validation of key-value pairs in a configuration
option, with tests.

Contributed by Viraj Jasani
2024-03-26 11:18:03 +00:00
Zilong Zhu
37f9ccdc86
HDFS-17368. HA: Standby should exit safemode when resources are available. (#6518). Contributed by Zilong Zhu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-26 17:35:55 +08:00
Takanobu Asanuma
8bc4939ee2
HDFS-17441. Fix junit dependency by adding missing library in hadoop-hdfs-rbf. (#6669)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-26 09:40:28 +09:00
Takanobu Asanuma
c4f7a3625b
HDFS-17435. Fix TestRouterRpc failed (#6650)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-26 09:38:35 +09:00
Gautham B A
76489e579b
HADOOP-19127. Do not run unit tests on Windows pre-commit CI (#6672) 2024-03-25 09:16:03 -07:00
Gautham B A
44a249e32a
Exclude files from Apache RAT (#6671) 2024-03-25 09:13:57 -07:00
PJ Fanning
7653f968e5
HADOOP-19116. Update to zookeeper client 3.8.4 due to CVE-2024-23944. (#6638)
Updated ZK client dependency to 3.8.4 to address  CVE-2024-23944.

Contributed by PJ Fanning
2024-03-25 15:10:56 +00:00
Anuj Modi
c4fa1b65fb
HADOOP-19089: [ABFS] Reverting Back Support of setXAttr() and getXAttr() on root path (#6592)
This reverts most of
HADOOP-18869: [ABFS] Fix behavior of a File System APIs on root path (#6003).

Calling getXAttr("/") or setXAttr("/") on an abfs container will fail with

`Operation failed: "The request URI is invalid.", HTTP 400 Bad Request`

 
This change is to ensure:
* Consistency across ADLS clients
* Consistency across authentication mechanisms.

Contributed by Anuj Modi
2024-03-25 14:13:24 +00:00
Alex
5c7e40f910
HADOOP-19111. Removing redundant debug message about client info (#6666). Contributed by Zhongkun Wu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-25 11:44:33 +08:00
yu liang
55dca911cc
HADOOP-19052.Hadoop use Shell command to get the count of the hard link which takes a lot of time (#6587) Contributed by liangyu.
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-24 10:14:27 +08:00
LiuGuH
a60b5e2de3
MAPREDUCE-7469. NNBench createControlFiles should use thread pool to improve performance. (#6463) Contributed by liuguanghua.
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-23 06:15:56 +08:00
huhaiyang
8cd4704e0a
HDFS-17430. RecoveringBlock will skip no live replicas when get block recovery command. (#6635) 2024-03-22 09:43:12 -04:00
XiaobaoWu
062836c020
HDFS-17436. Supplement log information for AccessControlException (#6651) 2024-03-22 11:21:03 +08:00
XiaobaoWu
a375ef8cfa
YARN-11626. Optimize ResourceManager's operations on Zookeeper metadata (#6616)
Co-authored-by: wuxiaobao <xbaowu@163.com>
2024-03-21 03:12:14 -04:00
Takanobu Asanuma
adab3a22aa
HDFS-17432. Fix junit dependency to enable JUnit4 tests to run in hadoop-hdfs-rbf (#6639)
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
2024-03-21 14:53:02 +09:00
hfutatzhanghb
2eb7246ea7
HDFS-17433. metrics sumOfActorCommandQueueLength should only record valid commands. (#6644) 2024-03-20 23:41:35 -04:00
huhaiyang
77c600d769
HDFS-17426. Remove Invalid FileSystemECReadStats logic in DFSInputStream (#6628) 2024-03-21 10:33:41 +08:00
Peter Szucs
a957cd5049
YARN-5305. Allow log aggregation to discard expired delegation tokens (#6625) 2024-03-20 15:33:10 +01:00
huhaiyang
12a26d8b19
HDFS-17431. Fix log format for BlockRecoveryWorker#recoverBlocks (#6643) 2024-03-19 23:22:45 -04:00
Adnan Hemani
8b2058a4e7
HADOOP-19050. S3A: Support S3 Access Grants (#6544)
This adds support for Amazon S3 Access Grants to the S3A connector.

For more information, see:
* https://aws.amazon.com/s3/features/access-grants/
* https://github.com/aws/aws-s3-accessgrants-plugin-java-v2/

Contributed by Adnan Hemani
2024-03-19 17:49:51 +00:00
Steve Loughran
705fb8323b
HADOOP-19119. Spotbugs: possible NPE in org.apache.hadoop.crypto.key.kms.ValueQueue.getSize() (#6642)
Spotbugs is mistaken here as it doesn't observer the read/write locks used
to manage exclusive access to the maps.

* cache the value between checks
* tag as @VisibleForTesting

Contributed by Steve Loughran
2024-03-19 17:18:07 +00:00