Commit Graph

26976 Commits

Author SHA1 Message Date
Viraj Jasani
cf3a4b3bb7
HADOOP-18850. S3A: Enable dual-layer server-side encryption with AWS KMS keys (#6140)
Contributed by Viraj Jasani
2023-11-01 13:30:35 +00:00
Brian Goerlitz
4c04a6768c
YARN-11584. Safely fail on leaf queue with empty name (#6148) 2023-10-31 17:25:41 +01:00
mudit1289
f1ce273150
MAPREDUCE-7457: Added support to limit count of spill files (#6155) Contributed by Mudit Sharma.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-31 06:58:22 +08:00
slfan1989
254dbab5a3
YARN-9013. [BackPort] [GPG] fix order of steps cleaning Registry entries in ApplicationCleaner. (#6147) Contributed by Botong Huang, Shilun Fan.
Co-authored-by: Botong Huang <botong@apache.org>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-31 06:56:00 +08:00
PJ Fanning
a079f6261d
HADOOP-18917. Addendum. Fix deprecation issues after commons-io upgrade. (#6228). Contributed by PJ Fanning. 2023-10-30 09:35:02 +05:30
ConfX
7c6af6a5f6
HADOOP-18905. Negative timeout in ZKFailovercontroller due to overflow. (#6092). Contributed by ConfX.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-10-29 13:30:28 +05:30
PJ Fanning
b9c9c42b29
HADOOP-18936. Upgrade to jetty 9.4.53 (#6181). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-10-29 13:09:12 +05:30
slfan1989
40e8300719
YARN-11592. Add timeout to GPGUtils#invokeRMWebService. (#6189) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-28 07:09:09 +08:00
Junfan Zhang
e4eda40ac9
YARN-11597. Fix NPE when loading static files in SLSWebApp (#6216) Contributed by Junfan Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-27 22:11:01 +08:00
Steve Loughran
7ec636deec
HADOOP-18930. Make fs.s3a.create.performance a bucket-wide setting. (#6168)
If fs.s3a.create.performance is set on a bucket
- All file overwrite checks are skipped, even if the caller says
  otherwise.
- All directory existence checks are skipped.
- Marker deletion is *always* skipped.

This eliminates a HEAD and a LIST for every creation.

* New path capability "fs.s3a.create.performance.enabled" true
  if the option is enabled.
* Parameterize ITestS3AContractCreate to expect the different
  outcomes
* Parameterize ITestCreateFileCost similarly, with
  changed cost assertions there.
* create(/) raises an IOE. This is an existing bug which was only noticed here.

Contributed by Steve Loughran
2023-10-27 12:23:55 +01:00
Hiroaki Segawa
93a3c6e2cd
HDFS-17024. Potential data race introduced by HDFS-15865 (#6223) 2023-10-26 22:25:00 -07:00
slfan1989
652908519e
YARN-11588. [Federation] [Addendum] Fix uncleaned threads in yarn router thread pool executor. (#6222) 2023-10-26 13:39:06 -07:00
Wei-Chiu Chuang
821ed83873
HDFS-15273. CacheReplicationMonitor hold lock for long time and lead to NN out of service. Contributed by Xiaoqiao He. 2023-10-26 10:35:10 -07:00
slfan1989
d18410221b
YARN-11593. [Federation] Improve command line help information. (#6199) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-26 08:22:18 +08:00
Steve Loughran
8bd1f65efc
HADOOP-18948. S3A. Add option fs.s3a.directory.operations.purge.uploads to purge on rename/delete (#6218)
S3A directory delete and rename will optionally abort all pending multipart uploads
under their to-be-deleted paths when

fs.s3a.directory.operations.purge.uploads is true

It is off by default.

The filesystem's hasPathCapability("fs.s3a.directory.operations.purge.uploads")
probe will return true when this feature is enabled.

Multipart uploads may accrue from interrupted data writes, uncommitted 
staging/magic committer jobs and other operations/applications. On AWS S3
lifecycle rules are the recommended way to clean these; this change improves
support for stores which lack these rules.

Contributed by Steve Loughran
2023-10-25 17:39:16 +01:00
PJ Fanning
bbf905dc99
HADOOP-18933. upgrade to netty 4.1.100 due to CVE (#6173)
Mitigates Netty security advisory GHSA-xpw8-rcwv-8f8p
"HTTP/2 Rapid Reset Attack - DDoS vector in the HTTP/2 protocol due RST frames"

Contributed by PJ Fanning
2023-10-25 14:06:13 +01:00
huhaiyang
f85ac5b60d
HADOOP-18920. RPC Metrics : Optimize logic for log slow RPCs (#6146) 2023-10-25 13:56:39 +08:00
gp1314
a170d58501
HDFS-17231. HA: Safemode should exit when resources are from low to available. (#6207). Contributed by Gu Peng.
Reviewed-by: Xing Lin <xinglin@linkedin.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-10-25 11:43:12 +08:00
Stephen O'Donnell
882f08b4bc
HDFS-17237. Remove IPCLoggerChannelMetrics when the logger is closed (#6217) 2023-10-24 21:39:03 +01:00
Steve Loughran
8b974bcc1f
HADOOP-18889. Third party storage followup. (#6186)
Followup to HADOOP-18889 third party store support;

Fix some minor review comments which came in after the merge.
2023-10-24 18:17:52 +01:00
PJ Fanning
0042544bf2
HADOOP-18949. upgrade maven dependency plugin due to CVE-2021-26291. (#6219)
Addresses CVE-2021-26291. "Origin Validation Error in Apache Maven"

Contributed by PJ Fanning.
2023-10-24 12:28:40 +01:00
slfan1989
9c7e5b66fa
YARN-11576. Improve FederationInterceptorREST AuditLog. (#6117) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-24 09:36:06 +08:00
slfan1989
80a22a736e
YARN-11500. [Addendum] Fix typos in hadoop-yarn-server-common#federation. (#6212) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-24 09:28:05 +08:00
huhaiyang
9d48af8d70
HADOOP-18868. Optimize the configuration and use of callqueue overflow trigger failover (#5998) 2023-10-23 14:06:02 -07:00
Zita Dombi
4c04818d3d
HADOOP-18919. Zookeeper SSL/TLS support in HDFS ZKFC (#6194) 2023-10-23 11:03:15 -07:00
huhaiyang
5eeab5e1b9
HDFS-17235. Fix javadoc errors in BlockManager (#6214). Contributed by Haiyang Hu. 2023-10-23 20:12:39 +05:30
jianghuazhu
6e13e4addc
HDFS-17228. Improve documentation related to BlockManager. (#6195). Contributed by JiangHua Zhu.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-10-23 20:05:33 +05:30
Ayush Saxena
fbd653be9b
Revert "HDFS-17228. Improve documentation related to BlockManager. (#6195). Contributed by JiangHua Zhu."
This reverts commit 81ba2e8484.
2023-10-23 19:35:12 +05:30
Steve Loughran
3e0fcda7a5
HADOOP-18945. S3A. IAMInstanceCredentialsProvider failing. (#6202)
This restores asynchronous retrieval/refresh of any AWS credentials provided by the
EC2 instance/container in which the process is running.

Contributed by Steve Loughran
2023-10-23 14:24:30 +01:00
slfan1989
d7d772d684
YARN-11595. Fix hadoop-yarn-client#java.lang.NoClassDefFoundError (#6210) Contributed by Shilun Fan.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-22 22:22:14 +08:00
Masatake Iwasaki
24fe1ef4dd HADOOP-18942 addendum. update LICENSE-binary. 2023-10-22 22:22:56 +09:00
Viraj Jasani
acaf8ef3ca
HADOOP-18918. ITestS3GuardTool fails if SSE/DSSE encryption is used (#6165)

Contributed by Viraj Jasani.
2023-10-20 10:47:44 +01:00
Steve Loughran
215cb15beb
HADOOP-18946. TestErrorTranslation failure (#6205)
Fixes TestErrorTranslation.testMultiObjectExceptionFilledIn() failure
which came in with HADOOP-18939.

Contributed by Steve Loughran
2023-10-20 10:13:05 +01:00
PeterWright
9a411fcf9d
HADOOP-18941. Modify HBase version in BUILDING.txt (#6206) 2023-10-20 16:20:17 +09:00
Masatake Iwasaki
8bf72346a5
HADOOP-18942. Upgrade ZooKeeper to 3.7.2. (#6200)
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2023-10-19 18:47:45 +09:00
GuoPhilipse
615a2a42cf
HDFS-17220. Fix same available space policy in AvailableSpaceVolumeChoosingPolicy (#6174)
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-10-18 13:01:32 +08:00
Masatake Iwasaki
13843f4a88
HADOOP-18867. Upgrade ZooKeeper to 3.6.4. (#5988) 2023-10-18 10:31:41 +09:00
jianghuazhu
81ba2e8484
HDFS-17228. Improve documentation related to BlockManager. (#6195). Contributed by JiangHua Zhu.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-10-18 05:05:33 +05:30
Steve Loughran
e0563fed50
HADOOP-18908. Improve S3A region handling. (#6187)
S3A region logic improved for better inference and
to be compatible with previous releases

1. If you are using an AWS S3 AccessPoint, its region is determined
   from the ARN itself.
2. If fs.s3a.endpoint.region is set and non-empty, it is used.
3. If fs.s3a.endpoint is an s3.*.amazonaws.com url,
   the region is determined by parsing the URL.
   Note: vpce endpoints are not handled by this.
4. If fs.s3a.endpoint.region==null, and none could be determined
   from the endpoint, use us-east-2 as default.
5. If fs.s3a.endpoint.region=="" then it is handed off to
   the default AWS SDK resolution process.

Consult the AWS SDK documentation for the details on its resolution
process, knowing that it is complicated and may use environment variables,
entries in ~/.aws/config, IAM instance information within
EC2 deployments and possibly even JSON resources on the classpath.
Put differently: it is somewhat brittle across deployments.
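
The five rules above can be modelled as a single lookup function. This is a minimal sketch, not the S3A code: the class name, method signature and host pattern are hypothetical, and it assumes the rules are applied strictly in the order listed, with an empty result meaning "hand off to the AWS SDK".

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model of the region rules in HADOOP-18908; not the S3A implementation. */
final class RegionResolutionSketch {

  /** Default named in rule 4 of the commit message. */
  static final String DEFAULT_REGION = "us-east-2";

  /** Assumed pattern for s3.REGION.amazonaws.com hosts; vpce hosts do not match it. */
  private static final Pattern AWS_S3_HOST =
      Pattern.compile("^(?:.*\\.)?s3\\.([a-z0-9-]+)\\.amazonaws\\.com$");

  /**
   * @param accessPointArn   access point ARN, or null if none is in use
   * @param configuredRegion value of fs.s3a.endpoint.region: null if unset, may be ""
   * @param endpoint         value of fs.s3a.endpoint, or null if unset
   * @return the chosen region, or empty when the AWS SDK should resolve it
   */
  static Optional<String> resolve(String accessPointArn,
                                  String configuredRegion,
                                  String endpoint) {
    // Rule 1: an access point ARN carries its region as the fourth ':' field.
    if (accessPointArn != null) {
      String[] parts = accessPointArn.split(":");
      if (parts.length > 3 && !parts[3].isEmpty()) {
        return Optional.of(parts[3]);
      }
    }
    // Rule 2: an explicit, non-empty region setting is used as-is.
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return Optional.of(configuredRegion);
    }
    // Rule 3: try to parse the region out of an AWS S3 endpoint host name.
    if (endpoint != null) {
      String host = endpoint.replaceFirst("^[a-zA-Z]+://", "").split("[/:]", 2)[0];
      Matcher m = AWS_S3_HOST.matcher(host.toLowerCase(Locale.ROOT));
      if (m.matches()) {
        return Optional.of(m.group(1));
      }
    }
    // Rule 4: region unset and nothing inferred from the endpoint.
    if (configuredRegion == null) {
      return Optional.of(DEFAULT_REGION);
    }
    // Rule 5: region explicitly set to "": defer to the SDK's own resolution.
    return Optional.empty();
  }
}
```

Under these assumptions, resolve(null, null, "https://s3.eu-west-1.amazonaws.com") yields eu-west-1, while a vpce or third-party endpoint with no region set falls through to the us-east-2 default.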

Contributed by Ahmar Suhail
2023-10-17 15:37:36 +01:00
Steve Loughran
e5eb404bb3
HADOOP-18939. NPE in AWS v2 SDK RetryOnErrorCodeCondition.shouldRetry() (#6193)
MultiObjectDeleteException to fill in the error details

See also: https://github.com/aws/aws-sdk-java-v2/issues/4600

Contributed by Steve Loughran
2023-10-17 15:17:16 +01:00
Steve Loughran
42e695d510
HADOOP-18932. S3A. upgrade AWS v2 SDK to 2.20.160 and v1 to 1.12.565 (#6178)
v1 => 1.12.565
v2 => 2.20.160
Only the v2 one is distributed; v1 is needed in deployments only to support v1 credential providers

Contributed by Steve Loughran
2023-10-17 12:59:50 +01:00
Szilard Nemeth
2736f88561 YARN-11590. RM process stuck after calling confStore.format() when ZK SSL/TLS is enabled, as netty thread waits indefinitely. Contributed by Ferenc Erdelyi 2023-10-16 15:17:58 -04:00
GuoPhilipse
c8abca3004
HDFS-17210. Optimize AvailableSpaceBlockPlacementPolicy. (#6113). Contributed by GuoPhilipse.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2023-10-16 16:34:40 +08:00
slfan1989
00f8cdcb0f
YARN-11571. [GPG] Add Information About YARN GPG in Federation.md (#6158) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-10-14 10:00:28 +08:00
jianghuazhu
8963b25ab3
HADOOP-18926. Add some comments related to NodeFencer. (#6162) 2023-10-13 15:34:44 -07:00
PJ Fanning
3d7b58d8a5
HADOOP-18916. Exclude all module-info classes from uber jars (#6131)
Removes java9 and java11 from all modules pulled into the hadoop-client
and hadoop-client-minicluster modules.

Contributed by PJ Fanning
2023-10-13 20:01:44 +01:00
Steve Loughran
9bc159f4ac
HADOOP-18487. Make protobuf 2.5 an optional runtime dependency. (#4996)
Protobuf 2.5 JAR is no longer needed at runtime. 

The option common.protobuf2.scope defines whether the protobuf 2.5.0
dependency is marked as provided or not.

* New package org.apache.hadoop.ipc.internal for internal only protobuf classes
  ...with a ShadedProtobufHelper in there which has shaded protobuf refs
  only, so guaranteed not to need protobuf-2.5 on the CP
* All uses of org.apache.hadoop.ipc.ProtobufHelper have
  been replaced by uses of org.apache.hadoop.ipc.internal.ShadedProtobufHelper
* The scope of protobuf-2.5 is set by the option common.protobuf2.scope
  In this patch it is still "compile"
* There is explicit reference to it in modules where it may be needed.
*  The maven scope of the dependency can be set with the common.protobuf2.scope
   option. It can be set to "provided" in a build:
       -Dcommon.protobuf2.scope=provided
* Add new ipc(callable) method to catch and convert shaded protobuf
  exceptions raised during invocation of the supplied lambda expression
* This is adopted in the code where the migration is not traumatically
  over-complex. RouterAdminProtocolTranslatorPB is left alone for this
  reason.
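
The ipc(callable) idea in the list above can be illustrated with a small stand-alone sketch. It is not the Hadoop class: ShadedServiceException and IpcCall are hypothetical stand-ins for the shaded protobuf exception and the callable type, and the conversion rule (unwrap an IOException cause, otherwise wrap) is an assumption.

```java
import java.io.IOException;

/** Hypothetical model of an ipc(callable) wrapper; not the Hadoop implementation. */
final class IpcCallSketch {

  /** Stand-in for the shaded protobuf service exception (assumed name). */
  static final class ShadedServiceException extends Exception {
    ShadedServiceException(Throwable cause) {
      super(cause);
    }
  }

  /** A call which may raise the shaded exception (assumed interface). */
  @FunctionalInterface
  interface IpcCall<T> {
    T call() throws ShadedServiceException;
  }

  /**
   * Invoke the lambda, converting any shaded exception into an IOException
   * so that callers never need the protobuf classes on their classpath.
   */
  static <T> T ipc(IpcCall<T> call) throws IOException {
    try {
      return call.call();
    } catch (ShadedServiceException e) {
      Throwable cause = e.getCause();
      // Assumption: an IOException cause is rethrown as-is, anything else is wrapped.
      throw (cause instanceof IOException) ? (IOException) cause : new IOException(e);
    }
  }
}
```

A caller would then write something like ipc(() -> stub.someCall(request)), where stub and request are placeholders, and only ever handle IOException.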

Contributed by Steve Loughran
2023-10-13 13:48:38 +01:00
Steve Loughran
81edbebdd8
HADOOP-18889. S3A v2 SDK third party support (#6141)
Tune AWS v2 SDK changes based on testing with third party stores
including GCS. 

Contains HADOOP-18889. S3A v2 SDK error translations and troubleshooting docs

* Changes needed to work with multiple third party stores
* New third_party_stores document on how to bind to and test
  third party stores, including google gcs (which works!)
* Troubleshooting docs mostly updated for v2 SDK

Exception translation/resilience

* New AWSUnsupportedFeatureException for unsupported/unavailable errors
* Handle 501 method unimplemented as one of these
* Error codes > 500 mapped to the AWSStatus500Exception if no explicit
  handler.
* Precondition errors handled a bit better
* GCS throttle exception also recognized.
* GCS raises 404 on a delete of a file which doesn't exist: swallow it.
* Error translation uses reflection to create IOE of the right type.
  All IOEs at the bottom of an AWS stack chain are regenerated:
  a new exception of that specific type is created, with the top level ex
  as its cause. This is done to retain the whole stack chain.
* Reduce the number of retries within the AWS SDK
* And those of s3a code.
* S3ARetryPolicy explicitly declares SocketException as connectivity failure
  but subclasses BindException
* SocketTimeoutException also considered connectivity  
* Log at debug whenever retry policies looked up
* Reorder exceptions to alphabetical order, with commentary
* Review use of the Invoke.retry() method 

 The reduction in retries is because it's clear, when you try to create a bucket
 which doesn't resolve, that the time for even an UnknownHostException to
 eventually fail is over 90s, after which it then hits the s3a retry code.
 - Reducing the SDK retries means these escalate to our code better.
 - Cutting back on our own retries makes it a bit more responsive for most real
 deployments.
 - maybeTranslateNetworkException() and the s3a retry policy mean that
   unknown host exception is recognised and fails fast.
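
The reflection-based error translation described in the list above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and it assumes the innermost IOException type has a public constructor taking a single String, falling back to a plain IOException otherwise.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Hypothetical model of re-creating the innermost IOE; not the S3A implementation. */
final class IoeRewrapSketch {

  /**
   * Find the innermost IOException in the cause chain of {@code top} and
   * create a new exception of the same class whose cause is {@code top},
   * so the whole stack chain is retained.
   */
  static IOException rewrap(Throwable top) {
    IOException innermost = null;
    for (Throwable t = top; t != null; t = t.getCause()) {
      if (t instanceof IOException) {
        innermost = (IOException) t;
      }
    }
    if (innermost == null) {
      // No IOException anywhere in the chain: wrap the top-level exception.
      return new IOException(top.toString(), top);
    }
    try {
      // Assumption: the exception class offers a public (String) constructor.
      Constructor<? extends IOException> ctor =
          innermost.getClass().getConstructor(String.class);
      IOException copy = ctor.newInstance(innermost.getMessage());
      copy.initCause(top);
      return copy;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Fallback when reflection fails: lose the specific type, keep the chain.
      return new IOException(innermost.getMessage(), top);
    }
  }
}
```

With this shape, a retry policy keyed on exception class can still see, say, an UnknownHostException even when the SDK has wrapped it several layers deep.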

Contributed by Steve Loughran
2023-10-12 17:47:44 +01:00
huhaiyang
0ed484ac62
HDFS-17208. Add the metrics PendingAsyncDiskOperations in datanode (#6109). Contributed by Haiyang Hu.
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-10-12 23:27:15 +08:00
Kevin Risden
5c22934d90
HADOOP-18922. Race condition in ZKDelegationTokenSecretManager creating znode (#6150). Contributed by Kevin Risden.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-10-12 23:21:26 +08:00