Commit Graph

27211 Commits

Author SHA1 Message Date
LiuGuH
2a1ee8dfcd
HDFS-17311. RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue. (#6392) Contributed by liuguanghua.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-20 07:55:23 +08:00
slfan1989
15e1789baf
Revert "HDFS-16016. BPServiceActor to provide new thread to handle IBR (#2998)" (#6457) Contributed by Shilun Fan.
This reverts commit c1bf3cb0.

Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-20 07:51:55 +08:00
Susheel Gupta
d0df0689b4
YARN-11607: TestTimelineAuthFilterForV2 fails intermittently (#6459) Contributed by Susheel Gupta.
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-20 07:42:08 +08:00
Jian Zhang
1036544480
HDFS-17302. RBF: ProportionRouterRpcFairnessPolicyController-Sharing and isolation. (#6380)
2024-01-19 14:02:21 -08:00
slfan1989
8444f69511
Preparing for 3.5.0 development (#6411)
Co-authored-by: slfan1989 <slfan1989@apache.org>
2024-01-19 15:05:22 +08:00
Xing Lin
27ecc23ae7
HDFS-17332 DFSInputStream: avoid logging stacktrace until when we really need to fail a read request with a MissingBlockException (#6446)
Print a warn log message for read retries and only print the full stack trace for a read request failure.

Contributed by: Xing Lin
2024-01-18 18:03:28 -08:00
Lei313
cc4c4be1b7
HDFS-17331:Fix Blocks are always -1 and DataNode version are always UNKNOWN in federationhealth.html (#6429). Contributed by lei w.
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-01-18 21:10:54 +08:00
slfan1989
4c3d4e6a57
HADOOP-19038. Improve create-release RUN script. (#6448) Contributed by Shilun Fan.
Reviewed-by: Steve Loughran <stevel@cloudera.com>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-18 19:12:12 +08:00
PJ Fanning
04e447cfa7
YARN-11647. use StandardCharsets.UTF_8 (#6447) Contributed by PJ Fanning.
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-18 13:53:18 +08:00
hfutatzhanghb
ba6ada73ac
HDFS-17337. RPC RESPONSE time seems not exactly accurate when using FSEditLogAsync. (#6439). Contributed by farmmamba.
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-01-18 11:10:05 +08:00
Steve Loughran
eeb657e85f
HADOOP-19033. S3A: disable checksums when fs.s3a.checksum.validation = false (#6441)
Add new option fs.s3a.checksum.validation, default false, which
is used when creating s3 clients to enable/disable checksum
validation.

When false, GET response processing is measurably faster.

Contributed by Steve Loughran.
2024-01-17 18:34:14 +00:00
Hexiaoqiao
9634bd31e6
HADOOP-19031. Enhance access control for RunJar. (#6427). Contributed by He Xiaoqiao.
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-01-17 15:00:06 +08:00
Mukund Thakur
7b1570e2f1
HADOOP-19015. Increase fs.s3a.connection.maximum to 500 to minimize risk of Timeout waiting for connection from pool. (#6372)
HADOOP-19015.  Increase fs.s3a.connection.maximum to 500 to minimize the risk of Timeout waiting for connection from the pool

Contributed By: Mukund Thakur
2024-01-16 17:06:28 -06:00
Steve Loughran
d378853790
HADOOP-18975 S3A: Add option fs.s3a.endpoint.fips to use AWS FIPS endpoints (#6277)
Adds a new option `fs.s3a.endpoint.fips` to switch the SDK client to use
FIPS endpoints, as an alternative to explicitly declaring them.


* The option is available as a path capability for probes.
* SDK v2 itself doesn't know that some regions don't have FIPS endpoints
* SDK only fails with endpoint + fips flag as a retried exception; with this
  change the S3A client should fail fast.
* Adds a new "connecting.md" doc; moves existing docs there and restructures.
* New Tests in ITestS3AEndpointRegion

bucket-info command support:

* added to list of path capabilities
* added -fips flag and test for explicit probe
* also now prints bucket region
* and removed some of the obsolete s3guard options
* updated docs

Contributed by Steve Loughran
2024-01-16 14:16:12 +00:00
Steve Loughran
36198b5edf
HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions (#6425)
Differentiate "EOF out of range/end of GET" from
"EOF channel problems" through
two different subclasses of EOFException, so that input streams always
retry on http channel errors; out of range GET requests are not retried.
Currently an EOFException is always treated as a fail-fast call in read().

This allows for all existing external code catching EOFException to handle
both, but S3AInputStream to cleanly differentiate range errors (map to -1)
from channel errors (retry)

- HttpChannelEOFException is subclass of EOFException, so all code
  which catches EOFException is still happy.
  retry policy: connectivityFailure
- RangeNotSatisfiableEOFException is the subclass of EOFException
  raised on 416 GET range errors.
  retry policy: fail
- Method ErrorTranslation.maybeExtractChannelException() to create this
  from shaded/unshaded NoHttpResponseException, using string match to
  avoid classpath problems.
- And do this for SdkClientExceptions with OpenSSL error code WFOPENSSL0035.
  We believe this is the OpenSSL equivalent.
- ErrorTranslation.maybeExtractIOException() to perform this translation as
  appropriate.

S3AInputStream.reopen() code retries on EOF, except on
 RangeNotSatisfiableEOFException,
 which is converted to a -1 response to the caller
 as is done historically.

S3AInputStream knows to handle these with
 read(): HttpChannelEOFException: stream aborting close then retry
 lazySeek(): Map RangeNotSatisfiableEOFException to -1, but do not map
  any other EOFException class raised.

This means that
* out of range reads map to -1
* channel problems in reopen are retried
* channel problems in read() abort the failed http connection so it
  isn't recycled

Tests for this using/abusing mocking.

Testing through actually raising 416 exceptions and verifying that
readFully(), char read() and vector reads are all good.

There is no attempt to recover within a readFully(); there's
a boolean constant switch to turn this on, but if anyone does
it a test will spin forever as the inner PositionedReadable.read(position, buffer, len)
downgrades all EOF exceptions to -1.
A new method would need to be added which controls whether to downgrade/rethrow
exceptions.

What does that mean? Possibly reduced resilience to non-retried failures
on the inner stream, even though more channel exceptions are retried on.

Contributed by Steve Loughran
2024-01-16 14:14:03 +00:00
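The retry split described in HADOOP-19027 can be pictured with a minimal Java sketch. This is not the S3A implementation: the two exception classes below are local stand-ins that only reuse the names from the commit message, and `ByteSource`, `readWithRetry` and `maxAttempts` are hypothetical names introduced for illustration. It assumes the behaviour stated above: range errors map to -1, channel errors are retried, and both remain catchable as `EOFException`.

```java
import java.io.EOFException;
import java.io.IOException;

public class EofRetrySketch {

  /** Illustrative stand-in: EOF caused by a broken HTTP channel; retried. */
  static class HttpChannelEOFException extends EOFException {
    HttpChannelEOFException(String message) {
      super(message);
    }
  }

  /** Illustrative stand-in: EOF raised on a 416 out-of-range GET; never retried. */
  static class RangeNotSatisfiableEOFException extends EOFException {
    RangeNotSatisfiableEOFException(String message) {
      super(message);
    }
  }

  /** Hypothetical single-byte source standing in for the wrapped HTTP stream. */
  interface ByteSource {
    int read() throws IOException;
  }

  /**
   * Reads one byte, mapping range errors to -1 and retrying channel errors
   * up to maxAttempts times. Any other EOFException propagates unchanged.
   */
  static int readWithRetry(ByteSource source, int maxAttempts) throws IOException {
    if (maxAttempts <= 0) {
      throw new IllegalArgumentException("maxAttempts must be positive");
    }
    HttpChannelEOFException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return source.read();
      } catch (RangeNotSatisfiableEOFException e) {
        return -1; // out of range: report end of stream to the caller
      } catch (HttpChannelEOFException e) {
        last = e; // a real stream would abort the connection and reopen here
      }
    }
    throw last; // every attempt hit a channel error
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    ByteSource flaky = () -> {
      if (calls[0]++ == 0) {
        throw new HttpChannelEOFException("connection reset");
      }
      return 42;
    };
    System.out.println(readWithRetry(flaky, 3)); // prints 42

    ByteSource pastEnd = () -> {
      throw new RangeNotSatisfiableEOFException("416");
    };
    System.out.println(readWithRetry(pastEnd, 3)); // prints -1
  }
}
```

Because both stand-ins extend `EOFException`, a caller that only catches `EOFException` still sees either failure, which is the compatibility property the commit message calls out.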
slfan1989
6652922333
HADOOP-19040. mvn site commands fails due to MetricsSystem And MetricsSystemImpl changes. (#6450) Contributed by Shilun Fan.
Reviewed-by: Steve Loughran <stevel@cloudera.com>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-16 22:11:16 +08:00
slfan1989
827e33601e
YARN-11638. [GPG] GPG Support CLI. (#6396) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-16 21:49:51 +08:00
Benjamin Teke
f6fea5da2a
MAPREDUCE-7468. [Addendum] Fix TestMapReduceChildJVM unit tests. (#6451)
2024-01-15 14:24:56 +01:00
slfan1989
6ebce65ae8
YARN-11634. [Addendum] Speed-up TestTimelineClient. (#6419)
Co-authored-by: slfan1989 <slfan1989@apache.org>
2024-01-15 08:44:17 +01:00
slfan1989
0f8b74b03f
HADOOP-19034. Fix Download Maven Url Not Found. (#6438). Contributed by Shilun Fan.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-01-14 18:30:40 +08:00
hfutatzhanghb
a30681077b
HDFS-17291. DataNode metric bytesWritten is not totally accurate in some situations. (#6360). Contributed by farmmamba.
Reviewed-by: huangzhaobo <huangzhaobo99@126.com>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-01-13 20:45:00 +08:00
hfutatzhanghb
ead7b7f565
HDFS-17289. Considering the size of non-lastBlocks equals to complete block size can cause append failure. (#6357). Contributed by farmmamba.
Reviewed-by: Haiyang Hu <haiyang.hu@shopee.com>
Reviewed-by: huangzhaobo <huangzhaobo99@126.com>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-01-13 20:34:02 +08:00
Steve Loughran
2f1e1558b6
HADOOP-19004. S3A: Support Authentication through HttpSigner API (#6324)
Move to the new auth flow based signers for aws.

* Implement a new Signer Initialization Chain
* Add a new instantiation method
* Add a new test
* Fix Reflection Code for SignerInitialization

Contributed by Harshit Gupta
2024-01-11 17:13:31 +00:00
Xing Lin
453e264eb4
HADOOP-18981. Move oncrpc and portmap packages to hadoop-common (#6280)
Move the org.apache.hadoop.{oncrpc, portmap} packages from the hadoop-nfs module
to the hadoop-common module.

This allows for use of the protocol beyond just NFS, including within HDFS itself.

Contributed by Xing Lin
2024-01-11 14:06:15 +00:00
Benjamin Teke
ef636c4278
MAPREDUCE-7468: Change add-opens flag's default value from true to false (#6436)
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2024-01-11 14:51:59 +01:00
hfutatzhanghb
6a053765ee
HDFS-17312. packetsReceived metric should ignore heartbeat packet. (#6394)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2024-01-11 22:08:37 +09:00
Tamas Domok
55b9f87698
YARN-11646. Do not ignore zero memory capacity config in QueueCapacityConfigParser. (#6433)
2024-01-11 13:47:00 +01:00
slfan1989
bc159b5a87
YARN-10125. [Federation] Kill application from client does not kill Unmanaged AM's and containers launched by Unmanaged AM. (#6363) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-11 20:01:59 +08:00
xuzifu666
99a59ae9e6
HDFS-17317. Improve the resource release for metaOut in DebugAdmin (#6402). Contributed by xy.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2024-01-07 00:59:31 +05:30
slfan1989
64beecb7cb
YARN-11631. [GPG] Add GPGWebServices. (#6354) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-06 17:50:20 +08:00
slfan1989
60033fd581
YARN-11642. Fix Flaky Test TestTimelineAuthFilterForV2#testPutTimelineEntities. (#6417) Contributed by Shilun Fan.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-06 16:26:01 +08:00
huangzhaobo
08713665c0
HDFS-17315. Optimize the namenode format code logic. (#6400)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2024-01-06 01:47:17 +09:00
LiuGuH
5f9932acc4
HDFS-17325. Fix the documentation of fs expunge command in FileSystemShell.md. (#6413) Contributed by liuguanghua.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-05 18:42:55 +08:00
LiuGuH
2369f0cddb
HDFS-17309. RBF: Fix Router Safemode check condition error (#6390) Contributed by liuguanghua.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-05 18:36:28 +08:00
Lei Yang
661c784662
HDFS-17290: Adds disconnected client rpc backoff metrics (#6359)
2024-01-04 20:24:10 -08:00
LiuGuH
7d3b6a36b8
HDFS-17306. RBF: Router should not return nameservices that does not enable observer nodes in RpcResponseHeaderProto (#6385)
2024-01-04 14:43:11 -08:00
hfutatzhanghb
8c26d4e9e0
HDFS-17322. Renames RetryCache#MAX_CAPACITY to be MIN_CAPACITY to fit usage.
2024-01-04 14:31:53 -08:00
hfutatzhanghb
d5468d84ba
HDFS-17283. Change the name of variable SECOND in HdfsClientConfigKeys. (#6339). Contributed by farmmamba.
Reviewed-by: Xing Lin <xinglin@linkedin.com>
Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
2024-01-04 19:53:47 +08:00
huhaiyang
7a7db7f0dc
HDFS-17310. DiskBalancer: Enhance the log message for submitPlan (#6391) Contributed by Haiyang Hu.
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-04 00:07:51 +08:00
LiuGuH
335587df9e
Add synchronized on lockLeakCheck() because threadCountMap is not thread safe. (#6029)
Co-authored-by: lgh <liuguanghua@kanzhun.com>
2024-01-03 21:12:38 +08:00
Anuj Modi
e3c135b0b3
HADOOP-18971. [ABFS] Read and cache file footer with fs.azure.footer.read.request.size (#6270)
The option fs.azure.footer.read.request.size sets the size of the footer to
read and cache; the default value of 524288 has been measured to
be good for most workloads running on parquet, ORC and similar file formats.

Contributed by Anuj Modi
2024-01-03 12:49:52 +00:00
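To make the footer size in HADOOP-18971 concrete, here is a small Java sketch of the arithmetic for "read the trailing N bytes of a file". It is an illustration only, not the ABFS code: `footerRange` and the class name are hypothetical, and the only value taken from the commit message is the default of 524288 bytes for `fs.azure.footer.read.request.size`.

```java
public class FooterRangeSketch {

  /** Default footer read size quoted in the commit message (512 KiB). */
  static final long DEFAULT_FOOTER_READ_SIZE = 524288L;

  /**
   * Returns {offset, length} of the trailing range covering the footer.
   * Files shorter than the footer size are covered from offset 0.
   */
  static long[] footerRange(long fileLength, long footerReadSize) {
    if (fileLength < 0 || footerReadSize <= 0) {
      throw new IllegalArgumentException("invalid file length or footer size");
    }
    long length = Math.min(fileLength, footerReadSize);
    return new long[] {fileLength - length, length};
  }

  public static void main(String[] args) {
    long tenMiB = 10L * 1024 * 1024;
    long[] range = footerRange(tenMiB, DEFAULT_FOOTER_READ_SIZE);
    // prints offset=9961472 length=524288
    System.out.println("offset=" + range[0] + " length=" + range[1]);
  }
}
```

A single trailing read of this size is what lets formats that keep metadata at the end of the file, such as Parquet and ORC, be served from one cached range instead of several small reads.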
slfan1989
556fbcf025
YARN-11632. [Doc] Add allow-partial-result description to Yarn Federation documentation. (#6340) Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-01-03 07:17:37 +08:00
Pranav Saxena
0b43026cab
HADOOP-17912. ABFS: Support for Encryption Context (#6221)
Contributed by Pranav Saxena and others.
2024-01-01 19:09:44 +00:00
Murali Krishna
9edcf42c78
HADOOP-18540. Upgrade Bouncy Castle to 1.70 (#5166)
This addresses
- [sonatype-2021-4916] CWE-327: Use of a Broken or Risky Cryptographic Algorithm
- [sonatype-2019-0673] CWE-400: Uncontrolled Resource Consumption ('Resource Exhaustion')

Contributed by Murali Krishna
2024-01-01 19:04:06 +00:00
Ayush Saxena
9a4d10763c
HADOOP-19020. Update the year to 2024. (#6397). Contributed by Ayush Saxena.
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
2024-01-01 12:51:54 +05:30
zzccctv
9f76fba6a4
Delete invalid code logic in namenode format (#6323)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-12-29 22:25:05 +09:00
Gautham B A
98656db736
HADOOP-19017. Setup pre-commit CI for Windows 10 (#5820)
* This PR adds a Jenkinsfile for pre-commit CI
  to validate the Hadoop PRs on Windows 10.
2023-12-29 16:41:10 +05:30
huangzhaobo
e26139beaa
HDFS-17301. Add read and write dataXceiver threads count metrics to datanode. (#6377)
Reviewed-by: hfutatzhanghb <hfutzhanghb@163.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-12-29 12:43:46 +09:00
huhaiyang
3aceec711b
HDFS-17297. The NameNode should remove block from the BlocksMap if the block is marked as deleted (#6369)
Reviewed-by: ZanderXu <zanderxu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-12-28 22:13:16 +09:00
zhtttylz
fc71dc3e94
HDFS-17284. Fix int overflow in calculating numEcReplicatedTasks and numReplicationTasks during block recovery (#6348)
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-12-27 10:43:10 +09:00