hadoop

Author	SHA1	Message	Date
Cheng Pan	d8d3d538e4	HADOOP-19193. Create orphan commit for website deployment (#6864) This stops gh-pages deployments from increasing the size of the git repository on every run. Contributed by Cheng Pan	2024-06-05 15:25:48 +01:00
Mukund Thakur	f92a8ab8ae	HADOOP-19190. Skip ITestS3AEncryptionWithDefaultS3Settings.testEncryptionFileAttributes when bucket not encrypted with sse-kms (#6859) Follow up of HADOOP-19190	2024-06-03 12:00:31 -05:00
Yu Zhang	f1e2ceb823	HDFS-13603: Do not propagate ExecutionException while initializing EDEK queues for keys. (#6860)	2024-06-03 09:10:06 -07:00
Yang Jiandan	167d4c8447	YARN-11699. Diagnostics lacks userlimit info when user capacity has reached its maximum limit (#6849) Contributed by Jiandan Yang. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-01 06:18:28 +08:00
slfan1989	9f6c997662	YARN-11471. [Federation] FederationStateStoreFacade Cache Support Caffeine. (#6795) Contributed by Shilun Fan. Reviewed-by: Inigo Goiri <inigoiri@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-01 06:15:20 +08:00
Anuj Modi	d8b485a512	HADOOP-18516: [ABFS][Authentication] Support Fixed SAS Token for ABFS Authentication (#6552) Contributed by Anuj Modi	2024-05-30 20:46:19 +01:00
Steve Loughran	d00b3acd5e	HADOOP-18679. Followup: change method name case (#6854) WrappedIO.bulkDelete_PageSize() => bulkDelete_pageSize() Makes it consistent with the HADOOP-19131 naming scheme. The name needs to be fixed before invoking it through reflection, as once that is attempted the binding won't work at run time, though compilation will be happy. Contributed by Steve Loughran	2024-05-30 19:34:30 +01:00
Mukund Thakur	d107931fc7	HADOOP-19188. Fix TestHarFileSystem and TestFilterFileSystem failing after bulk delete API got added. (#6848) Follow up to: HADOOP-18679 Add API for bulk/paged delete of files and objects Contributed by Mukund Thakur	2024-05-29 17:27:09 +01:00
K0K0V0K	ccb8ff4360	YARN-11687. CGroupV2 resource calculator (#6835) Co-authored-by: Benjamin Teke <brumi1024@users.noreply.github.com>	2024-05-29 17:20:23 +02:00
刘斌	6c08e8e2aa	HADOOP-19156. ZooKeeper based state stores use different ZK address configs. (#6767). Contributed by liu bin. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-29 20:44:36 +08:00
Mukund Thakur	f4fde40524	HADOOP-19184. S3A Fix TestStagingCommitter.testJobCommitFailure (#6843) Follow up on HADOOP-18679 Contributed by: Mukund Thakur	2024-05-28 11:27:33 -05:00
Felix Nguyen	74d30a5dce	HDFS-17532. RBF: Allow router state store cache update to overwrite and delete in parallel (#6839)	2024-05-28 11:17:08 +08:00
Murali Krishna	1baf0e889f	HADOOP-18962. Upgrade kafka to 3.4.0 (#6247) Upgrade Kafka Client due to CVEs * CVE-2023-25194 * CVE-2021-38153 * CVE-2018-17196 Contributed by Murali Krishna	2024-05-24 17:40:37 +01:00
Felix Nguyen	f5c5d35eb0	HDFS-17529. RBF: Improve router state store cache entry deletion (#6833)	2024-05-24 09:41:08 +08:00
Anmol Asrani	d168d3ffee	HADOOP-18325: ABFS: Add correlated metric support for ABFS operations (#6314) Adds support for metric collection at the filesystem instance level. Metrics are pushed to the store upon the closure of a filesystem instance, encompassing all operations that utilized that specific instance. Collected Metrics: - Number of successful requests without any retries. - Count of requests that succeeded after a specified number of retries (x retries). - Request count subjected to throttling. - Number of requests that failed despite exhausting all retry attempts. etc. Implementation Details: Incorporated logic in the AbfsClient to facilitate metric pushing through an additional request. This occurs in scenarios where no requests are sent to the backend for a defined idle period. By implementing these enhancements, we ensure comprehensive monitoring and analysis of filesystem interactions, enabling a deeper understanding of success rates, retry scenarios, throttling instances, and exhaustive failure scenarios. Additionally, the AbfsClient logic ensures that metrics are proactively pushed even during idle periods, maintaining a continuous and accurate representation of filesystem performance. Contributed by Anmol Asrani	2024-05-23 15:10:10 +01:00
Benjamin Teke	d876505b67	YARN-11681. Update the cgroup documentation with v2 support (#6834) Co-authored-by: Benjamin Teke <bteke@cloudera.com> Co-authored-by: K0K0V0K <109747532+K0K0V0K@users.noreply.github.com>	2024-05-21 17:41:32 +02:00
hfutatzhanghb	fb156e8f05	HDFS-17464. Improve some logs output in class FsDatasetImpl (#6724)	2024-05-21 09:46:21 +08:00
slfan1989	be28467374	Revert "Bump org.apache.derby:derby in /hadoop-project (#6816)" (#6841) This reverts commit `b5a90d9500`.	2024-05-21 08:46:14 +08:00
Sebb	f11a8cfa6e	HADOOP-13147. Constructors must not call overrideable methods in PureJavaCrc32C (#6408). Contributed by Sebb.	2024-05-21 00:08:08 +05:30
Mukund Thakur	47be1ab3b6	HADOOP-18679. Add API for bulk/paged delete of files (#6726) Applications can create a BulkDelete instance from a BulkDeleteSource; the BulkDelete interface provides the pageSize(): the maximum number of entries which can be deleted, and a bulkDelete(Collection paths) method which can take a collection up to pageSize() long. This is optimized for object stores with bulk delete APIs; the S3A connector will offer the page size of fs.s3a.bulk.delete.page.size unless bulk delete has been disabled. Even with a page size of 1, the S3A implementation is more efficient than delete(path) as there are no safety checks for the path being a directory or probes for the need to recreate directories. The interface BulkDeleteSource is implemented by all FileSystem implementations, with a page size of 1 and mapped to delete(pathToDelete, false). This means that callers do not need to have special case handling for object stores versus classic filesystems. To aid use through reflection APIs, the class org.apache.hadoop.io.wrappedio.WrappedIO has been created with "reflection friendly" methods. Contributed by Mukund Thakur and Steve Loughran	2024-05-20 17:05:25 +01:00
Kaiyao Ke	41eacf4914	MAPREDUCE-7475. Fix non-idempotent unit tests (#6785) Contributed by Kaiyao Ke	2024-05-17 14:51:47 +01:00
LiuGuH	8f92cda35c	HDFS-17509. RBF: Fix ClientProtocol.concat will throw NPE if tgr is an empty file. (#6784)	2024-05-17 10:37:50 +08:00
skyskyhu	3c00093cb5	HADOOP-19167 Bug Fix: Change of Codec configuration does not work (#6807)	2024-05-17 10:27:39 +08:00
Vikas Kumar	f8dce6c501	HADOOP-18851. Performance improvement for DelegationTokenSecretManager (#6803)	2024-05-16 12:30:52 +08:00
Mukund Thakur	a97e3022de	HADOOP-19013. Adding x-amz-server-side-encryption-aws-kms-key-id in the get file attributes for S3A. (#6646) Contributed by: Mukund Thakur	2024-05-15 11:54:54 -05:00
Peter Szucs	129d91f7b2	YARN-11692. Support mixed cgroup v1/v2 controller structure (#6821)	2024-05-15 16:32:49 +02:00
Steve Loughran	cfdf1f5e8e	HADOOP-19172. S3A: upgrade AWS v1 sdk to 1.12.720 (#6823) +remove reference in LICENSE-binary as it is no longer shipped Contributed by Steve Loughran	2024-05-15 14:40:39 +01:00
xuzifu666	cf9559eb27	HADOOP-19073 WASB: Fix connection leak in FolderRenamePending (#6534) Contributed by xuyu	2024-05-15 14:38:06 +01:00
ZanderXu	cab0f4c9ec	HDFS-17520. [BugFix] TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed (#6812) Contributed by Zengqiang Xu. Reviewed-by: Vinayakumar B <vinayakumarb@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-05-15 07:55:24 +08:00
Christopher Tubbs	2e77b7b02c	[HADOOP-18786] Use CDN instead of ASF archive (#5789) * Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper * Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile * Point users to downloads.apache.org for JSVC * Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile Contributed by Christopher Tubbs	2024-05-14 20:09:52 +01:00
zhihui wang	39dee8ea19	HADOOP-18958. Improve UserGroupInformation debug log. (#6255) Contributed by zhihui wang	2024-05-14 20:03:49 +01:00
Tsz-Wo Nicholas Sze	bda7045070	HADOOP-19152. Do not hard code security providers. (#6739)	2024-05-14 11:19:57 -07:00
Simbarashe Dzinamarira	6a4f0be854	HDFS-17514: RBF: Routers should unset cached stateID when namenode does not set stateID in RPC response header. (#6804)	2024-05-14 08:09:56 -07:00
ConfX	8d9d58dfc8	HDFS-17099. Fix Null Pointer Exception when stop namesystem in HDFS. (#6034). Contributed by ConfX. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-14 11:14:55 +08:00
zhengchenyu	4cb4d5dd08	HADOOP-19170. Fixes compilation issues on non-Linux systems (#6822) Reviewed-by: Steve Loughran <stevel@apache.org> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>	2024-05-13 20:04:01 -07:00
Steve Loughran	c9270600b7	MAPREDUCE-7474. Improve Manifest committer resilience (#6716) Improve task commit resilience everywhere and add an option to reduce delete IO requests on job cleanup (relevant for ABFS and HDFS). Task Commit Resilience: Task manifest saving is re-attempted on failure; the number of attempts made is configurable with the option: mapreduce.manifest.committer.manifest.save.attempts * The default is 5. * The minimum is 1; asking for less is ignored. * A retry policy adds 500ms of sleep per attempt. * Move from classic rename() to commitFile() to rename the file, after calling getFileStatus() to get its length and possibly etag. This becomes a rename() on gcs/hdfs anyway, but on abfs it does reach the ResilientCommitByRename callbacks in abfs, which report on the outcome to the caller...which is then logged at WARN. * New statistic task_stage_save_summary_file to distinguish from other saving operations (job success/report file). This is only saved to the manifest on task commit retries, and provides statistics on all previous unsuccessful attempts to save the manifests. * Test changes to match the codepath changes, including improvements in fault injection. Directory size for deletion: New option mapreduce.manifest.committer.cleanup.parallel.delete.base.first This makes an initial attempt at deleting the base dir, only falling back to parallel deletes if there's a timeout. This option is disabled by default; consider enabling it for abfs to reduce IO load. Consult the documentation for more details. Success file printing: The command to print a JSON _SUCCESS file from this committer and any S3A committer is now something which can be invoked from the mapred command: mapred successfile <path to file> Contributed by Steve Loughran	2024-05-13 21:12:34 +01:00
zhihui wang	12e0ca6b24	HDFS-17522. JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection (#6814). Contributed by wangzhihui. Signed-off-by: Vinayakumar B <vinayakumarb@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-13 22:10:08 +08:00
Benjamin Teke	ce7d01fac8	YARN-11689. Update the cgroup v2 init error handling (#6810)	2024-05-13 12:56:26 +02:00
dependabot[bot]	b5a90d9500	Bump org.apache.derby:derby in /hadoop-project (#6816) Bumps org.apache.derby:derby from 10.14.2.0 to 10.17.1.0. --- updated-dependencies: - dependency-name: org.apache.derby:derby dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-13 12:47:31 +08:00
dependabot[bot]	1d09a64e34	Bump org.bouncycastle:bcprov-jdk18on in /hadoop-project (#6811) Bumps [org.bouncycastle:bcprov-jdk18on](https://github.com/bcgit/bc-java) from 1.77 to 1.78. - [Changelog](https://github.com/bcgit/bc-java/blob/main/docs/releasenotes.html) - [Commits](https://github.com/bcgit/bc-java/commits) --- updated-dependencies: - dependency-name: org.bouncycastle:bcprov-jdk18on dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-12 18:38:36 +05:30
Felix Nguyen	fb0519253d	HDFS-17488. DN can fail IBRs with NPE when a volume is removed (#6759)	2024-05-11 15:37:43 +08:00
Zilong Zhu	700b3e4800	HDFS-17503. Unreleased volume references because of OOM. (#6782)	2024-05-10 10:34:40 +08:00
Sammi Chen	43e8ca428e	Revert "HADOOP-18851: Performance improvement for DelegationTokenSecretManager. (#6001). Contributed by Vikas Kumar." This reverts commit `e283375cdf`.	2024-05-07 13:29:32 +08:00
kulkabhay	edf985e269	HDFS-17500: Add missing operation name while authorizing some operations (#6776). Contributed by kulkabhay. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-06 12:44:30 +08:00
Doroszlai, Attila	2645898450	HADOOP-19160. hadoop-auth should not depend on kerb-simplekdc (#6788)	2024-05-03 12:57:26 +02:00
dannytbecker	881034ad45	CachedRecordStore should check if the record state is expired (#6783)	2024-05-01 13:56:53 -07:00
Viraj Jasani	a8a58944bd	HADOOP-19146. S3A: noaa-cors-pds test bucket access with global endpoint fails (#6723) HADOOP-19057 switched the hadoop-aws test bucket from landsat-pds to noaa-cors-pds This new bucket isn't accessible if the client configuration sets an fs.s3a.endpoint/region value other than us-east-1. Contributed by Viraj Jasani	2024-04-30 12:16:36 +01:00
Peter Szucs	910cb6b887	YARN-11685. Create a config to enable/disable cgroup v2 functionality (#6770)	2024-04-30 11:25:16 +02:00
fuchaohong	0c9e0b4398	HDFS-17456. Fix the incorrect dfsused statistics of datanode when appending a file. (#6713). Contributed by fuchaohong. Reviewed-by: ZanderXu <zanderxu@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-04-30 12:22:53 +08:00
fuchaohong	ddb805951e	HDFS-17471. Correct the percentage of sample range. (#6742). Contributed by fuchaohong. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-04-30 12:18:47 +08:00

27362 Commits