hadoop

Author	SHA1	Message	Date
Steve Loughran	57e62ae07f	Revert "YARN-11664. Remove HDFS Binaries/Jars Dependency From Yarn (#6631 )" This reverts commit `6c01490f14`.	2024-09-05 14:35:50 +01:00
Syed Shameerur Rahman	6c01490f14	YARN-11664. Remove HDFS Binaries/Jars Dependency From Yarn (#6631 ) To support YARN deployments in clusters without HDFS some changes have been made in packaging * new hadoop-common class org.apache.hadoop.fs.HdfsCommonConstants * hdfs class org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair moved from hdfs-client to hadoop-common * YARN handlers for DSQuotaExceededException replaced by use of superclass ClusterStorageCapacityExceededException. Contributed by Syed Shameerur Rahman	2024-09-04 13:26:42 +01:00
Cheng Pan	9486844610	HADOOP-16928. Make javadoc work on Java 17 (#6976 ) Contributed by Cheng Pan	2024-09-04 11:50:59 +01:00
litao	a962aa37e0	HDFS-17599. EC: Fix the mismatch between locations and indices for mover (#6980 )	2024-08-30 12:56:33 +08:00
Ayush Saxena	0837c84a9f	Revert "HADOOP-19231. Add JacksonUtil to manage Jackson classes (#6953 )" This reverts commit `fa9bb0d1ac`.	2024-08-29 14:42:03 +05:30
Sung Dong Kim	89e38f08ae	HDFS-17573. Allow turn on both FSImage parallelization and compression (#6929 ). Contributed by Sung Dong Kim. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-08-25 17:51:14 +08:00
Heagan A	f6c45e0bcf	HDFS-17546. Follow-up backport from branch3.3 (#6908 )	2024-08-21 13:11:32 -07:00
Tsz-Wo Nicholas Sze	012ae9d1aa	HDFS-17606. Do not require implementing CustomizedCallbackHandler. (#7005 )	2024-08-20 16:32:53 -07:00
Stephen O'Donnell	df08e0de41	HDFS-17605. Reduce memory overhead of TestBPOfferService (#6996 )	2024-08-19 11:35:11 +01:00
PJ Fanning	59dba6e1bd	HADOOP-19134. Use StringBuilder instead of StringBuffer. (#6692 ). Contributed by PJ Fanning	2024-08-18 21:29:12 +05:30
PJ Fanning	fa9bb0d1ac	HADOOP-19231. Add JacksonUtil to manage Jackson classes (#6953 ) New class org.apache.hadoop.util.JacksonUtil centralizes construction of Jackson ObjectMappers and JsonFactories. Contributed by PJ Fanning	2024-08-15 16:44:54 +01:00
Steve Loughran	55a576906d	HADOOP-19131. Assist reflection IO with WrappedOperations class (#6686 ) 1. The class WrappedIO has been extended with more filesystem operations - openFile() - PathCapabilities - StreamCapabilities - ByteBufferPositionedReadable All these static methods raise UncheckedIOExceptions rather than checked ones. 2. The adjacent class org.apache.hadoop.io.wrappedio.WrappedStatistics provides similar access to IOStatistics/IOStatisticsContext classes and operations. Allows callers to: * Get a serializable IOStatisticsSnapshot from an IOStatisticsSource or IOStatistics instance * Save an IOStatisticsSnapshot to file * Convert an IOStatisticsSnapshot to JSON * Given an object which may be an IOStatisticsSource, return an object whose toString() value is a dynamically generated, human readable summary. This is for logging. * Separate getters to the different sections of IOStatistics. * Mean values are returned as a Map.Entry<Long, Long> of (samples, sum) from which means may be calculated. There are examples of the dynamic bindings to these classes in: org.apache.hadoop.io.wrappedio.impl.DynamicWrappedIO org.apache.hadoop.io.wrappedio.impl.DynamicWrappedStatistics These use DynMethods and other classes in the package org.apache.hadoop.util.dynamic which are based on the Apache Parquet equivalents. This should ease re-implementing these in that library and in others which have their own fork of the classes (example: Apache Iceberg). 3. The openFile() option "fs.option.openfile.read.policy" has added specific file format policies for the core filetypes * avro * columnar * csv * hbase * json * orc * parquet S3A chooses the appropriate sequential/random policy. A policy `parquet, columnar, vector, random, adaptive` will use the parquet policy for any filesystem aware of it, falling back to the first entry in the list which the specific version of the filesystem recognizes. 4. New Path capability fs.capability.virtual.block.locations Indicates that locations are generated client side and don't refer to real hosts. Contributed by Steve Loughran	2024-08-14 14:43:00 +01:00
Tsz-Wo Nicholas Sze	b189ef8197	HDFS-17575. SaslDataTransferClient should use SaslParticipant to create messages. (#6954 )	2024-08-05 10:42:12 -07:00
Aswin M Prabhu	e2a0dca43b	HDFS-16690. Automatically format unformatted JNs with JournalNodeSyncer (#6925 ). Contributed by Aswin M Prabhu. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-07-23 20:55:57 +08:00
Viraj Jasani	e000cbf277	HADOOP-19218. Addendum. Update TestFSNamesystemLockReport to exclude hostname resolution from regex. (#6951 ). Contributed by Viraj Jasani. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-07-23 20:47:36 +08:00
Tsz-Wo Nicholas Sze	a5eb5e9611	HDFS-17576. Support user defined auth Callback. (#6945 )	2024-07-20 15:21:06 +08:00
gavin.wang	5730656660	HDFS-17574. Make NNThroughputBenchmark support human-friendly units about blocksize. (#6931 ). Contributed by wangzhongwei. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-07-16 20:57:50 +08:00
zhengchenyu	8913d379fd	HDFS-17566. Got wrong sorted block order when StorageType is considered. (#6934 ). Contributed by Chenyu Zheng. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-07-11 17:41:24 +08:00
gavin.wang	783a852029	HDFS-17555. Fix NumberFormatException of NNThroughputBenchmark when configured dfs.blocksize. (#6894 ). Contributed by wangzhongwei Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2024-07-09 13:52:15 +05:30
huhaiyang	8ca4627a0d	HDFS-17557. Fix bug for TestRedundancyMonitor#testChooseTargetWhenAllDataNodesStop (#6897 ). Contributed by Haiyang Hu. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2024-07-06 13:18:12 +05:30
huhaiyang	5a8f70a72e	HDFS-17559. Fix the uuid as null in NameNodeMXBean (#6906 ). Contributed by Haiyang Hu. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2024-07-06 13:16:25 +05:30
huhaiyang	ae76e9475c	HDFS-17564. EC: Fix the issue of inaccurate metrics when decommission mark busy DN. (#6911 ). Contributed by Haiyang Hu. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-07-05 20:45:01 +08:00
Yu Zhang	b4ddb2d3bb	HDFS-13603: do not propagate ExecutionException and add maxRetries limit to NameNode edek cache warmup (#6774 )	2024-06-24 09:34:52 -07:00
Hexiaoqiao	6545b7eeef	HDFS-17098. DatanodeManager does not handle null storage type properly. (#6840 ). Contributed by ConfX. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-19 20:58:57 +08:00
Tsz-Wo Nicholas Sze	1e6411c9ec	HDFS-17528. FsImageValidation: set txid when saving a new image (#6828 )	2024-06-19 11:38:17 +08:00
Fateh Singh	90024d8cb1	HDFS-17439. Support -nonSuperUser for NNThroughputBenchmark: useful for testing auth frameworks such as Ranger (#6677 )	2024-06-18 13:52:24 +01:00
Heagan A	2fbbfe3cc9	HDFS-17546. Implementing HostsFileReader timeout (#6873 )	2024-06-14 20:47:21 -07:00
hfutatzhanghb	4b1b16a846	HDFS-17551. Fix unit test failure caused by HDFS-17464. (#6883 ). Contributed by farmmamba.	2024-06-12 22:21:15 +05:30
Felix Nguyen	776c0a3ab9	HDFS-17539. Make TestFileChecksum fields static (#6853 )	2024-06-11 15:26:21 +08:00
hfutatzhanghb	fb156e8f05	HDFS-17464. Improve some logs output in class FsDatasetImpl (#6724 )	2024-05-21 09:46:21 +08:00
Mukund Thakur	47be1ab3b6	HADOOP-18679. Add API for bulk/paged delete of files (#6726 ) Applications can create a BulkDelete instance from a BulkDeleteSource; the BulkDelete interface provides the pageSize(): the maximum number of entries which can be deleted, and a bulkDelete(Collection paths) method which can take a collection up to pageSize() long. This is optimized for object stores with bulk delete APIs; the S3A connector will offer the page size of fs.s3a.bulk.delete.page.size unless bulk delete has been disabled. Even with a page size of 1, the S3A implementation is more efficient than delete(path) as there are no safety checks for the path being a directory or probes for the need to recreate directories. The interface BulkDeleteSource is implemented by all FileSystem implementations, with a page size of 1 and mapped to delete(pathToDelete, false). This means that callers do not need to have special case handling for object stores versus classic filesystems. To aid use through reflection APIs, the class org.apache.hadoop.io.wrappedio.WrappedIO has been created with "reflection friendly" methods. Contributed by Mukund Thakur and Steve Loughran	2024-05-20 17:05:25 +01:00
ZanderXu	cab0f4c9ec	HDFS-17520. [BugFix] TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed (#6812 ) Contributed by Zengqiang Xu. Reviewed-by: Vinayakumar B <vinayakumarb@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-05-15 07:55:24 +08:00
ConfX	8d9d58dfc8	HDFS-17099. Fix Null Pointer Exception when stop namesystem in HDFS. (#6034 ). Contributed by ConfX. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-14 11:14:55 +08:00
zhihui wang	12e0ca6b24	HDFS-17522. JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection (#6814 ). Contributed by wangzhihui. Signed-off-by: Vinayakumar B <vinayakumarb@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-13 22:10:08 +08:00
Felix Nguyen	fb0519253d	HDFS-17488. DN can fail IBRs with NPE when a volume is removed (#6759 )	2024-05-11 15:37:43 +08:00
Zilong Zhu	700b3e4800	HDFS-17503. Unreleased volume references because of OOM. (#6782 )	2024-05-10 10:34:40 +08:00
kulkabhay	edf985e269	HDFS-17500: Add missing operation name while authorizing some operations (#6776 ). Contributed by kulkabhay. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-06 12:44:30 +08:00
fuchaohong	0c9e0b4398	HDFS-17456. Fix the incorrect dfsused statistics of datanode when appending a file. (#6713 ). Contributed by fuchaohong. Reviewed-by: ZanderXu <zanderxu@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-04-30 12:22:53 +08:00
fuchaohong	ddb805951e	HDFS-17471. Correct the percentage of sample range. (#6742 ). Contributed by fuchaohong. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-04-30 12:18:47 +08:00
Tsz-Wo Nicholas Sze	78987a71a6	HADOOP-19151. Support configurable SASL mechanism. (#6740 )	2024-04-29 10:02:23 -07:00
zhtttylz	daafc8a0b8	HDFS-17367. Add PercentUsed for Different StorageTypes in JMX (#6735 ) Contributed by Hualong Zhang. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-04-27 20:36:11 +08:00
dannytbecker	027b4c3259	Remove empty queues from the queueByBlockId map (#6772 )	2024-04-26 14:25:15 -07:00
cxzl25	23286b0632	HDFS-17469. Audit log for reportBadBlocks RPC (#6731 )	2024-04-24 09:39:57 +08:00
Madhan Neethiraj	e8b2c28dec	HDFS-17478. FSPermissionChecker optimization by initializing AccessControlEnforcer in constructor (#6749 )	2024-04-18 15:43:31 -07:00
dannytbecker	0c35cf0982	HDFS-17477. IncrementalBlockReport race condition additional edge cases (#6748 )	2024-04-18 09:04:08 -07:00
Lei313	f49a4df797	HDFS-17383: Datanode current block token should come from active NameNode in HA mode (#6562 ). Contributed by lei w. Reviewed-by: Shuyan Zhang <zhangshuyan@apache.org> Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>	2024-04-15 18:35:53 +08:00
huhaiyang	81b05977f2	HDFS-17455. Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt (#6710 ). Contributed by Haiyang Hu. Reviewed-by: ZanderXu <zanderxu@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-04-11 18:04:57 +08:00
dannytbecker	05964ad07a	HDFS-17453. IncrementalBlockReport can have race condition with Edit Log Tailer (#6708 )	2024-04-10 09:30:24 -07:00
ConfX	73e6931ed0	HDFS-17449. Fix ill-formed decommission host name and port pair triggering IndexOutOfBound error (#6691 ). Contributed by ConfX. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2024-04-06 13:38:09 +05:30
Steve Loughran	87fb977777	HADOOP-19098. Vector IO: Specify and validate ranges consistently. #6604 Clarifies behaviour of VectorIO methods with contract tests as well as specification. * Add precondition range checks to all implementations * Identify and fix bug where direct buffer reads were broken (HADOOP-19101; this surfaced in ABFS contract tests) * Logging in VectoredReadUtils. * TestVectoredReadUtils verifies validation logic. * FileRangeImpl toString() improvements * CombinedFileRange tracks bytes in range which are wanted; toString() output logs this. HDFS * Add test TestHDFSContractVectoredRead ABFS * Add test ITestAbfsFileSystemContractVectoredRead S3A * checks for vector IO being stopped in all iterative vector operations, including draining * maps read() returning -1 to failure * passes in file length to validation * Error reporting to only completeExceptionally() those ranges which had not yet read data in. * Improved logging. readVectored() * made synchronized. This is only for the invocation; the actual async retrieves are unsynchronized. * closes input stream on invocation * switches to random IO, so avoids keeping any long-lived connection around. + AbstractSTestS3AHugeFiles enhancements. + ADDENDUM: test fix in ITestS3AContractVectoredRead Contains: HADOOP-19101. Vectored Read into off-heap buffer broken in fallback implementation Contributed by Steve Loughran Change-Id: Ia4ed71864c595f175c275aad83a2ff5741693432	2024-04-03 13:17:52 +01:00
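
Note on HADOOP-18679 (commit `47be1ab3b6`): the message says bulkDelete() takes a collection of at most pageSize() paths, so a caller holding a longer list presumably has to split it into pages first. A minimal sketch of that splitting step follows, written in Python for brevity. The function `paginate` and the sample paths are hypothetical illustrations and are not part of the Hadoop API.

```python
from typing import Iterator, List, Sequence


def paginate(paths: Sequence[str], page_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each at most `page_size` long."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(paths), page_size):
        # Each slice would be handed to a single bulk delete call.
        yield list(paths[start:start + page_size])


if __name__ == "__main__":
    # Hypothetical paths; a real caller would take page_size from pageSize().
    sample = [f"/data/part-{i:04d}" for i in range(5)]
    for page in paginate(sample, 2):
        print(page)
```

With a page size of 1, as the commit describes for classic filesystems, every page holds a single path, so the same loop works unchanged for object stores and filesystems alike.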

9155 Commits
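
Note on HADOOP-19131 (commit `55a576906d`): the message describes a read policy list such as `parquet, columnar, vector, random, adaptive` resolving to the first entry which the filesystem version recognizes. A minimal sketch of that selection rule follows, in Python for brevity. The function `select_read_policy` and the two sets of recognized policies are assumptions made for illustration; they do not reproduce Hadoop's implementation or any real filesystem's policy list.

```python
from typing import Optional, Set


def select_read_policy(requested: str, recognized: Set[str]) -> Optional[str]:
    """Return the first entry of a comma-separated list that is recognized."""
    for name in requested.split(","):
        policy = name.strip()
        if policy in recognized:
            return policy
    # The commit message does not say what happens when nothing matches;
    # returning None is a choice made for this sketch only.
    return None


if __name__ == "__main__":
    requested = "parquet, columnar, vector, random, adaptive"
    # Hypothetical policy sets for two filesystem versions.
    format_aware = {"parquet", "columnar", "vector", "random", "adaptive"}
    older = {"sequential", "random", "adaptive"}
    print(select_read_policy(requested, format_aware))
    print(select_read_policy(requested, older))
```

Under these assumed sets, the format-aware filesystem resolves the list to `parquet`, while the older one falls back to `random`, its first recognized entry.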