hadoop

Author	SHA1	Message	Date
Yu Zhang	b4ddb2d3bb	HDFS-13603: do not propagate ExecutionException and add maxRetries limit to NameNode edek cache warmup (#6774 )	2024-06-24 09:34:52 -07:00
HarshitGupta11	d3b98cb1b2	HADOOP-19194: Add test to find unshaded dependencies in the aws sdk (#6865) The new test TestAWSV2SDK scans the aws sdk bundle.jar and prints out all classes which are unshaded, and so at risk of creating classpath problems. It does not fail the test if this holds, because the current SDKs do ship with unshaded classes; the test would always fail. The SDK upgrade process should include inspecting the output of this test to see if it has got worse (do a before/after check). Once the AWS SDK does shade everything, we can have this test fail on any regression. Contributed by Harshit Gupta	2024-06-24 10:41:11 +01:00
Steve Loughran	8ac9c1839a	HADOOP-19203. WrappedIO BulkDelete API to raise IOEs as UncheckedIOExceptions (#6885) * WrappedIO methods raise UncheckedIOExceptions * New class org.apache.hadoop.util.functional.FunctionalIO with wrap/unwrap and the ability to generate a java.util.function.Supplier around a CallableRaisingIOE. Contributed by Steve Loughran	2024-06-19 18:47:29 +01:00
Hexiaoqiao	6545b7eeef	HDFS-17098. DatanodeManager does not handle null storage type properly. (#6840 ). Contributed by ConfX. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-19 20:58:57 +08:00
Steve Loughran	56c8aa5f1c	HADOOP-19204. VectorIO regression: empty ranges are now rejected (#6887) - restore old outcome: no-op - test this - update spec. This is a critical fix for vector IO and MUST be cherrypicked to all branches with that feature. Contributed by Steve Loughran	2024-06-19 12:05:24 +01:00
Tsz-Wo Nicholas Sze	1e6411c9ec	HDFS-17528. FsImageValidation: set txid when saving a new image (#6828 )	2024-06-19 11:38:17 +08:00
slfan1989	9710a8d52f	YARN-11701. [Federation] Enhance Federation Cache Clean Conditions. (#6889 ) Contributed by Shilun Fan. Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>	2024-06-19 08:34:19 +08:00
Fateh Singh	90024d8cb1	HDFS-17439. Support -nonSuperUser for NNThroughputBenchmark: useful for testing auth frameworks such as Ranger (#6677 )	2024-06-18 13:52:24 +01:00
Heagan A	2fbbfe3cc9	HDFS-17546. Implementing HostsFileReader timeout (#6873 )	2024-06-14 20:47:21 -07:00
Steve Loughran	2d5fa9e016	HADOOP-18508. S3A: Support parallel integration test runs on same bucket (#5081) It is now possible to provide a job ID in the maven "job.id" property for hadoop-aws test runs to isolate paths under the test bucket under which all tests will be executed. This will allow independent builds in different source trees to test against the same bucket in parallel, and is designed for CI testing. Example: "mvn verify -Dparallel-tests -Droot.tests.enabled=false -Djob.id=1" and "mvn verify -Droot.tests.enabled=false -Djob.id=2" - Root tests must be disabled to stop them cleaning up the test paths of other test runs. - Do still regularly run the root tests just to force cleanup of the output of any interrupted test suites. Contributed by Steve Loughran	2024-06-14 19:34:52 +01:00
Viraj Jasani	240fddcf17	HADOOP-18931. FileSystem.getFileSystemClass() to log the jar the .class came from (#6197 ) Set the log level of logger org.apache.hadoop.fs.FileSystem to DEBUG to see this. Contributed by Viraj Jasani	2024-06-14 19:14:54 +01:00
Cheng Pan	2bde5ccb81	HADOOP-19192. Log level is WARN when fail to load native hadoop libs (#6863 ) Updates the documentation to be consistent with the logging. Contributed by Cheng Pan	2024-06-14 19:05:27 +01:00
Tengting Xu	a1f5dc5865	Minor, fix cpu arch compare to use correct Dockerfile (#6852 )	2024-06-13 00:37:28 +05:30
hfutatzhanghb	4b1b16a846	HDFS-17551. Fix unit test failure caused by HDFS-17464. (#6883 ). Contributed by farmmamba.	2024-06-12 22:21:15 +05:30
Mukund Thakur	06dd3bfee8	HADOOP-19196. Allow base path to be deleted as well using Bulk Delete. (#6872 ) Contributed by: Mukund Thakur	2024-06-11 14:06:53 -05:00
Anuj Modi	005030f7a0	HADOOP-18610: [ABFS] OAuth2 Token Provider support for Azure Workload Identity (#6787 ) Add support for Azure Active Directory (Azure AD) workload identities which integrate with the Kubernetes's native capabilities to federate with any external identity provider. Contributed By: Anuj Modi	2024-06-11 13:06:39 -05:00
PJ Fanning	bb30545583	HADOOP-19163. Use hadoop-shaded-protobuf_3_25 (#6858 ) Contributed by PJ Fanning	2024-06-11 17:10:00 +01:00
Felix Nguyen	776c0a3ab9	HDFS-17539. Make TestFileChecksum fields static (#6853 )	2024-06-11 15:26:21 +08:00
Pranav Saxena	2e1deee87a	HADOOP-19137. [ABFS] Prevent ABFS initialization for non-hierarchical-namespace account if Customer-provided-key configs given. (#6752) Customer-provided-keys (CPK) configs are not allowed with non-hierarchical-namespace (non-HNS) accounts for ABFS. This patch aims to prevent ABFS initialization for non-HNS accounts if CPK configs are provided. Contributed by: Pranav Saxena	2024-06-10 15:03:41 -05:00
slfan1989	10df59e421	Revert "HADOOP-19071. Update maven-surefire-plugin from 3.0.0 to 3.2.5. (#6664 )" (#6875 ) This reverts commit `88ad7db80d`. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-08 14:51:28 +08:00
Steve Loughran	01d257d5aa	HADOOP-19189. ITestS3ACommitterFactory failing (#6857 ) * parameterize the test run rather than do it from within the test suite. * log what the committer factory is up to (and improve its logging) * close all filesystems, then create the test filesystem with cache enabled. The cache is critical, we want the fs from cache to be used when querying filesystem properties, rather than one created from the committer jobconf, which will have the same options as the task committer, so not actually validate the override logic. Contributed by Steve Loughran	2024-06-07 17:34:01 +01:00
Anuj Modi	bbb17e76a7	HADOOP-19178: [WASB Deprecation] Updating Documentation on Upcoming Plans for Hadoop-Azure (#6862 ) Contributed by Anuj Modi	2024-06-07 14:28:24 +01:00
PJ Fanning	2ee0bf9534	HADOOP-19154. Upgrade bouncycastle to 1.78.1 due to CVEs (#6755 ) Addresses * CVE-2024-29857 - Importing an EC certificate with specially crafted F2m parameters can cause high CPU usage during parameter evaluation. * CVE-2024-30171 - Possible timing based leakage in RSA based handshakes due to exception processing eliminated. * CVE-2024-30172 - Crafted signature and public key can be used to trigger an infinite loop in the Ed25519 verification code. * CVE-2024-301XX - When endpoint identification is enabled and an SSL socket is not created with an explicit hostname (as happens with HttpsURLConnection), hostname verification could be performed against a DNS-resolved IP address. Contributed by PJ Fanning	2024-06-05 15:31:23 +01:00
Cheng Pan	d8d3d538e4	HADOOP-19193. Create orphan commit for website deployment (#6864) This stops gh-pages deployments from increasing the size of the git repository on every run. Contributed by Cheng Pan	2024-06-05 15:25:48 +01:00
Mukund Thakur	f92a8ab8ae	HADOOP-19190. Skip ITestS3AEncryptionWithDefaultS3Settings.testEncryptionFileAttributes when bucket not encrypted with sse-kms (#6859 ) Follow up of HADOOP-19190	2024-06-03 12:00:31 -05:00
Yu Zhang	f1e2ceb823	HDFS-13603: Do not propagate ExecutionException while initializing EDEK queues for keys. (#6860 )	2024-06-03 09:10:06 -07:00
Yang Jiandan	167d4c8447	YARN-11699. Diagnostics lacks userlimit info when user capacity has reached its maximum limit (#6849 ) Contributed by Jiandan Yang. Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-01 06:18:28 +08:00
slfan1989	9f6c997662	YARN-11471. [Federation] FederationStateStoreFacade Cache Support Caffeine. (#6795 ) Contributed by Shilun Fan. Reviewed-by: Inigo Goiri <inigoiri@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2024-06-01 06:15:20 +08:00
Anuj Modi	d8b485a512	HADOOP-18516: [ABFS][Authentication] Support Fixed SAS Token for ABFS Authentication (#6552 ) Contributed by Anuj Modi	2024-05-30 20:46:19 +01:00
Steve Loughran	d00b3acd5e	HADOOP-18679. Followup: change method name case (#6854 ) WrappedIO.bulkDelete_PageSize() => bulkDelete_pageSize() Makes it consistent with the HADOOP-19131 naming scheme. The name needs to be fixed before invoking it through reflection, as once that is attempted the binding won't work at run time, though compilation will be happy. Contributed by Steve Loughran	2024-05-30 19:34:30 +01:00
Mukund Thakur	d107931fc7	HADOOP-19188. Fix TestHarFileSystem and TestFilterFileSystem failing after bulk delete API got added. (#6848 ) Follow up to: HADOOP-18679 Add API for bulk/paged delete of files and objects Contributed by Mukund Thakur	2024-05-29 17:27:09 +01:00
K0K0V0K	ccb8ff4360	YARN-11687. CGroupV2 resource calculator (#6835 ) Co-authored-by: Benjamin Teke <brumi1024@users.noreply.github.com>	2024-05-29 17:20:23 +02:00
刘斌	6c08e8e2aa	HADOOP-19156. ZooKeeper based state stores use different ZK address configs. (#6767 ). Contributed by liu bin. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2024-05-29 20:44:36 +08:00
Mukund Thakur	f4fde40524	HADOOP-19184. S3A Fix TestStagingCommitter.testJobCommitFailure (#6843 ) Follow up on HADOOP-18679 Contributed by: Mukund Thakur	2024-05-28 11:27:33 -05:00
Felix Nguyen	74d30a5dce	HDFS-17532. RBF: Allow router state store cache update to overwrite and delete in parallel (#6839 )	2024-05-28 11:17:08 +08:00
Murali Krishna	1baf0e889f	HADOOP-18962. Upgrade kafka to 3.4.0 (#6247 ) Upgrade Kafka Client due to CVEs * CVE-2023-25194 * CVE-2021-38153 * CVE-2018-17196 Contributed by Murali Krishna	2024-05-24 17:40:37 +01:00
Felix Nguyen	f5c5d35eb0	HDFS-17529. RBF: Improve router state store cache entry deletion (#6833 )	2024-05-24 09:41:08 +08:00
Anmol Asrani	d168d3ffee	HADOOP-18325: ABFS: Add correlated metric support for ABFS operations (#6314 ) Adds support for metric collection at the filesystem instance level. Metrics are pushed to the store upon the closure of a filesystem instance, encompassing all operations that utilized that specific instance. Collected Metrics: - Number of successful requests without any retries. - Count of requests that succeeded after a specified number of retries (x retries). - Request count subjected to throttling. - Number of requests that failed despite exhausting all retry attempts. etc. Implementation Details: Incorporated logic in the AbfsClient to facilitate metric pushing through an additional request. This occurs in scenarios where no requests are sent to the backend for a defined idle period. By implementing these enhancements, we ensure comprehensive monitoring and analysis of filesystem interactions, enabling a deeper understanding of success rates, retry scenarios, throttling instances, and exhaustive failure scenarios. Additionally, the AbfsClient logic ensures that metrics are proactively pushed even during idle periods, maintaining a continuous and accurate representation of filesystem performance. Contributed by Anmol Asrani	2024-05-23 15:10:10 +01:00
Benjamin Teke	d876505b67	YARN-11681. Update the cgroup documentation with v2 support (#6834 ) Co-authored-by: Benjamin Teke <bteke@cloudera.com> Co-authored-by: K0K0V0K <109747532+K0K0V0K@users.noreply.github.com>	2024-05-21 17:41:32 +02:00
hfutatzhanghb	fb156e8f05	HDFS-17464. Improve some logs output in class FsDatasetImpl (#6724 )	2024-05-21 09:46:21 +08:00
slfan1989	be28467374	Revert "Bump org.apache.derby:derby in /hadoop-project (#6816 )" (#6841 ) This reverts commit `b5a90d9500`.	2024-05-21 08:46:14 +08:00
Sebb	f11a8cfa6e	HADOOP-13147. Constructors must not call overrideable methods in PureJavaCrc32C (#6408 ). Contributed by Sebb.	2024-05-21 00:08:08 +05:30
Mukund Thakur	47be1ab3b6	HADOOP-18679. Add API for bulk/paged delete of files (#6726 ) Applications can create a BulkDelete instance from a BulkDeleteSource; the BulkDelete interface provides the pageSize(): the maximum number of entries which can be deleted, and a bulkDelete(Collection paths) method which can take a collection up to pageSize() long. This is optimized for object stores with bulk delete APIs; the S3A connector will offer the page size of fs.s3a.bulk.delete.page.size unless bulk delete has been disabled. Even with a page size of 1, the S3A implementation is more efficient than delete(path) as there are no safety checks for the path being a directory or probes for the need to recreate directories. The interface BulkDeleteSource is implemented by all FileSystem implementations, with a page size of 1 and mapped to delete(pathToDelete, false). This means that callers do not need to have special case handling for object stores versus classic filesystems. To aid use through reflection APIs, the class org.apache.hadoop.io.wrappedio.WrappedIO has been created with "reflection friendly" methods. Contributed by Mukund Thakur and Steve Loughran	2024-05-20 17:05:25 +01:00
Kaiyao Ke	41eacf4914	MAPREDUCE-7475. Fix non-idempotent unit tests (#6785 ) Contributed by Kaiyao Ke	2024-05-17 14:51:47 +01:00
LiuGuH	8f92cda35c	HDFS-17509. RBF: Fix ClientProtocol.concat will throw NPE if tgr is an empty file. (#6784)	2024-05-17 10:37:50 +08:00
skyskyhu	3c00093cb5	HADOOP-19167 Bug Fix: Change of Codec configuration does not work (#6807 )	2024-05-17 10:27:39 +08:00
Vikas Kumar	f8dce6c501	HADOOP-18851. Performance improvement for DelegationTokenSecretManager (#6803 )	2024-05-16 12:30:52 +08:00
Mukund Thakur	a97e3022de	HADOOP-19013. Adding x-amz-server-side-encryption-aws-kms-key-id in the get file attributes for S3A. (#6646 ) Contributed by: Mukund Thakur	2024-05-15 11:54:54 -05:00
Peter Szucs	129d91f7b2	YARN-11692. Support mixed cgroup v1/v2 controller structure (#6821 )	2024-05-15 16:32:49 +02:00
Steve Loughran	cfdf1f5e8e	HADOOP-19172. S3A: upgrade AWS v1 sdk to 1.12.720 (#6823 ) +remove reference in LICENSE-binary as it is no longer shipped Contributed by Steve Loughran	2024-05-15 14:40:39 +01:00


27385 Commits