Commit Graph

10128 Commits

Author SHA1 Message Date
Mukund Thakur
47be1ab3b6
HADOOP-18679. Add API for bulk/paged delete of files (#6726)
Applications can create a BulkDelete instance from a
BulkDeleteSource; the BulkDelete interface provides
the pageSize(): the maximum number of entries which can be
deleted, and a bulkDelete(Collection paths)
method which can take a collection up to pageSize() long.

This is optimized for object stores with bulk delete APIs;
the S3A connector will offer the page size of
fs.s3a.bulk.delete.page.size unless bulk delete has
been disabled.

Even with a page size of 1, the S3A implementation is
more efficient than delete(path)
as there are no safety checks for the path being a directory
or probes for the need to recreate directories.

The interface BulkDeleteSource is implemented by
all FileSystem implementations, with a page size
of 1 and mapped to delete(pathToDelete, false).
This means that callers do not need to have special
case handling for object stores versus classic filesystems.

To aid use through reflection APIs, the class
org.apache.hadoop.io.wrappedio.WrappedIO
has been created with "reflection friendly" methods.

Contributed by Mukund Thakur and Steve Loughran
2024-05-20 17:05:25 +01:00
LiuGuH
8f92cda35c
HDFS-17509. RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file. (#6784) 2024-05-17 10:37:50 +08:00
ZanderXu
cab0f4c9ec
HDFS-17520. [BugFix] TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed (#6812) Contributed by Zengqiang Xu.
Reviewed-by: Vinayakumar B <vinayakumarb@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-05-15 07:55:24 +08:00
Simbarashe Dzinamarira
6a4f0be854
HDFS-17514: RBF: Routers should unset cached stateID when namenode does not set stateID in RPC response header. (#6804) 2024-05-14 08:09:56 -07:00
ConfX
8d9d58dfc8
HDFS-17099. Fix Null Pointer Exception when stop namesystem in HDFS.(#6034). Contributed by ConfX.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-05-14 11:14:55 +08:00
zhihui wang
12e0ca6b24
HDFS-17522. JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection (#6814). Contributed by wangzhihui.
Signed-off-by: Vinayakumar B <vinayakumarb@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-05-13 22:10:08 +08:00
Felix Nguyen
fb0519253d
HDFS-17488. DN can fail IBRs with NPE when a volume is removed (#6759) 2024-05-11 15:37:43 +08:00
Zilong Zhu
700b3e4800
HDFS-17503. Unreleased volume references because of OOM. (#6782) 2024-05-10 10:34:40 +08:00
kulkabhay
edf985e269
HDFS-17500: Add missing operation name while authorizing some operations (#6776). Contributed by kulkabhay.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-05-06 12:44:30 +08:00
dannytbecker
881034ad45
CachedRecordStore should check if the record state is expired (#6783) 2024-05-01 13:56:53 -07:00
fuchaohong
0c9e0b4398
HDFS-17456. Fix the incorrect dfsused statistics of datanode when appending a file. (#6713). Contributed by fuchaohong.
Reviewed-by: ZanderXu <zanderxu@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-04-30 12:22:53 +08:00
fuchaohong
ddb805951e
HDFS-17471. Correct the percentage of sample range. (#6742). Contributed by fuchaohong.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-04-30 12:18:47 +08:00
Tsz-Wo Nicholas Sze
78987a71a6
HADOOP-19151. Support configurable SASL mechanism. (#6740) 2024-04-29 10:02:23 -07:00
zhtttylz
daafc8a0b8
HDFS-17367. Add PercentUsed for Different StorageTypes in JMX (#6735) Contributed by Hualong Zhang.
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-04-27 20:36:11 +08:00
dannytbecker
027b4c3259
Remove empty queues from the queueByBlockId map (#6772) 2024-04-26 14:25:15 -07:00
cxzl25
23286b0632
HDFS-17469. Audit log for reportBadBlocks RPC (#6731) 2024-04-24 09:39:57 +08:00
Jian Zhang
782c501441
HDFS-17451. RBF: fix spotbugs for redundant nullcheck of dns. (#6697) 2024-04-23 19:11:51 +08:00
Madhan Neethiraj
e8b2c28dec
HDFS-17478. FSPermissionChecker optimization by initializing AccessControlEnforcer in constructor (#6749) 2024-04-18 15:43:31 -07:00
dannytbecker
0c35cf0982
HDFS-17477. IncrementalBlockReport race condition additional edge cases (#6748) 2024-04-18 09:04:08 -07:00
章锡平
87cc2f1a1f
HDFS-17465. RBF: Use ProportionRouterRpcFairnessPolicyController get “'ava.Lang. Error: Maximum permit count exceeded' (#6727) 2024-04-15 09:28:05 -07:00
Lei313
f49a4df797
HDFS-17383:Datanode current block token should come from active NameNode in HA mode (#6562). Contributed by lei w.
Reviewed-by: Shuyan Zhang <zhangshuyan@apache.org>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-04-15 18:35:53 +08:00
huhaiyang
6ccb223c9c
HDFS-17461. Fix spotbugs in PeerCache#getInternal (#6721). Contributed by Haiyang Hu.
Reviewed-by: ZanderXu <zanderxu@apache.org>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-04-12 19:53:25 +08:00
huhaiyang
81b05977f2
HDFS-17455. Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt (#6710). Contributed by Haiyang Hu.
Reviewed-by: ZanderXu <zanderxu@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-04-11 18:04:57 +08:00
dannytbecker
05964ad07a
HDFS-17453. IncrementalBlockReport can have race condition with Edit Log Tailer (#6708) 2024-04-10 09:30:24 -07:00
ConfX
73e6931ed0
HDFS-17449. Fix ill-formed decommission host name and port pair triggers IndexOutOfBound error (#6691). Contributed by ConfX
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-04-06 13:38:09 +05:30
Steve Loughran
87fb977777
HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604
Clarifies behaviour of VectorIO methods with contract tests as well as
specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads was broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
   toString() output logs this.

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting to only completeExceptionally() those ranges
  which had not yet read data in.
* Improved logging.

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.

+ AbstractSTestS3AHugeFiles enhancements.
+ ADDENDUM: test fix in ITestS3AContractVectoredRead

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback
implementation

Contributed by Steve Loughran

Change-Id: Ia4ed71864c595f175c275aad83a2ff5741693432
2024-04-03 13:17:52 +01:00
Steve Loughran
b4f9d8e6fa
Revert "HADOOP-19098. Vector IO: Specify and validate ranges consistently."
This reverts commit ba7faf90c8.
2024-04-03 13:15:05 +01:00
Steve Loughran
ba7faf90c8
HADOOP-19098. Vector IO: Specify and validate ranges consistently.
Clarifies behaviour of VectorIO methods with contract tests as well as specification.

* Add precondition range checks to all implementations
* Identify and fix bug where direct buffer reads was broken
  (HADOOP-19101; this surfaced in ABFS contract tests)
* Logging in VectoredReadUtils.
* TestVectoredReadUtils verifies validation logic.
* FileRangeImpl toString() improvements
* CombinedFileRange tracks bytes in range which are wanted;
   toString() output logs this.

HDFS
* Add test TestHDFSContractVectoredRead

ABFS
* Add test ITestAbfsFileSystemContractVectoredRead

S3A
* checks for vector IO being stopped in all iterative
  vector operations, including draining
* maps read() returning -1 to failure
* passes in file length to validation
* Error reporting to only completeExceptionally() those ranges
  which had not yet read data in.
* Improved logging.  

readVectored()
* made synchronized. This is only for the invocation;
  the actual async retrieves are unsynchronized.
* closes input stream on invocation
* switches to random IO, so avoids keeping any long-lived connection around.

+ AbstractSTestS3AHugeFiles enhancements.

Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback implementation

Contributed by Steve Loughran
2024-04-02 20:16:38 +01:00
Lei313
36c22400b2
HDFS-17408:Reduce the number of quota calculations in FSDirRenameOp (#6653). Contributed by lei w.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-04-02 10:40:28 +08:00
PJ Fanning
f7d1ec2d9e
HADOOP-19077. Remove use of javax.ws.rs.core.HttpHeaders (#6554). Contributed by PJ Fanning
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-04-01 12:43:39 +05:30
PJ Fanning
59976f1be2
HDFS-17450. Add explicit dependency on httpclient jar (#6130). Contributed by PJ Fanning
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-30 23:24:15 +05:30
huhaiyang
4807815e1c
HDFS-17448. Enhance the stability of the unit test TestDiskBalancerCommand (#6690). Contributed by Haiyang Hu
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-30 22:51:05 +05:30
PJ Fanning
06db6289cb
HADOOP-19024. Use bouncycastle jdk18 1.77 (#6410). Contributed 2024-03-30 19:58:12 +05:30
PJ Fanning
97c5a6efba
HADOOP-19041. Use StandardCharsets in more places (#6449) 2024-03-28 23:17:18 -04:00
ConfX
f3f6340746
HDFS-17443. add null check for fileSys and cluster before shutting down (#6683) 2024-03-28 11:09:50 -04:00
Alex
50370769cd
HDFS-17429. Fixing wrong log file name in datatransfer Sender.java (#6670) Contributed by Zhongkun Wu.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-27 06:59:15 +08:00
Zilong Zhu
37f9ccdc86
HDFS-17368. HA: Standby should exit safemode when resources are available. (#6518). Contributed by Zilong Zhu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-26 17:35:55 +08:00
Takanobu Asanuma
8bc4939ee2
HDFS-17441. Fix junit dependency by adding missing library in hadoop-hdfs-rbf. (#6669)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-26 09:40:28 +09:00
Takanobu Asanuma
c4f7a3625b
HDFS-17435. Fix TestRouterRpc failed (#6650)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2024-03-26 09:38:35 +09:00
huhaiyang
8cd4704e0a
HDFS-17430. RecoveringBlock will skip no live replicas when get block recovery command. (#6635) 2024-03-22 09:43:12 -04:00
XiaobaoWu
062836c020
HDFS-17436. Supplement log information for AccessControlException (#6651) 2024-03-22 11:21:03 +08:00
Takanobu Asanuma
adab3a22aa
HDFS-17432. Fix junit dependency to enable JUnit4 tests to run in hadoop-hdfs-rbf (#6639)
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
2024-03-21 14:53:02 +09:00
hfutatzhanghb
2eb7246ea7
HDFS-17433. metrics sumOfActorCommandQueueLength should only record valid commands. (#6644) 2024-03-20 23:41:35 -04:00
huhaiyang
77c600d769
HDFS-17426. Remove Invalid FileSystemECReadStats logic in DFSInputStream (#6628) 2024-03-21 10:33:41 +08:00
huhaiyang
12a26d8b19
HDFS-17431. Fix log format for BlockRecoveryWorker#recoverBlocks (#6643) 2024-03-19 23:22:45 -04:00
slfan1989
ff3f2255d2
HADOOP-19112. Hadoop 3.4.0 release wrap-up. (#6640) Contributed by Shilun Fan.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-03-19 20:08:03 +08:00
Tsz-Wo Nicholas Sze
b25b28e5bb
HDFS-17380. FsImageValidation: remove inaccessible nodes. (#6549). Contributed by Tsz-wo Sze.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-18 11:52:17 +08:00
Lei313
e211f6f83d
HDFS-17391. Adjust the checkpoint io buffer size to the chunk size (#6594). Contributed by lei w.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-12 18:36:43 +08:00
Lei313
dbf08c872a
HDFS-17422. Enhance the stability of the unit test TestDFSAdmin (#6621). Contributed by lei w and Hualong Zhang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-03-12 12:25:33 +08:00
ritegarg
58afe43769
HDFS-17299. Adding rack failure tolerance when creating a new file (#6566) 2024-03-06 13:08:05 -08:00