Commit Graph

24805 Commits

Author SHA1 Message Date
Steve Loughran
94a0a04113
HADOOP-18136. Verify FileUtils.unTar() handling of missing .tar files.
Contributed by Steve Loughran

Change-Id: I3856afa821dbc8c2e3cb1cbe33793ec1734e2e24
2022-02-21 17:09:36 +00:00
PJ Fanning
a302a19b48 HADOOP-18126. update junit 5 version due to build issues (#3993)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 5f6a294fab)
2022-02-17 14:07:57 +09:00
Chentao Yu
d14a7c6ee5 HADOOP-18109. Ensure that default permissions of directories under internal ViewFS directories are the same as directories on target filesystems. Contributed by Chentao Yu. (#3953)
(cherry picked from commit 19d90e62fb)
2022-02-15 16:48:13 -08:00
litao
db67952f9f HDFS-16396. Reconfig slow peer parameters for datanode (#3827)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 0c194f2157)
2022-02-16 09:45:07 +09:00
Takanobu Asanuma
4c57fb4d6b
HDFS-15745. Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable. Contributed by Haibin Huang. (#3992)
(cherry picked from commit 1cd96e8dd8)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2022-02-16 09:42:43 +09:00
Akira Ajisaka
352656999f
YARN-10788. TestCsiClient fails (#3989)
Create unix domain socket in java.io.tmpdir instead of
test.build.dir to avoid 'File name too long' error.

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 7fd90cdcbe)
2022-02-15 01:14:31 +09:00
GuoPhilipse
7512714475
HDFS-16449. Fix hadoop web site release notes and changelog not available (#3967)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit b68964336d)
2022-02-14 05:40:16 +09:00
luoyuan3471
752a7b6d49
HADOOP-18044. Hadoop - Upgrade to jQuery 3.6.0 (#3791)
Co-authored-by: luoyuan <luoyuan@shopee.com>
(cherry picked from commit e2d620192a)
2022-02-11 23:18:25 +08:00
daimin
9071c9646c
Fix thread safety of EC decoding during concurrent preads (#3881)
(cherry picked from commit 0e74f1e467)
2022-02-11 10:20:45 +08:00
Ayush Saxena
5b47b9f360
HADOOP-18096. Distcp: Sync moves filtered file to home directory rather than deleting. (#3940). Contributed by Ayush Saxena.
Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: stack <stack@apache.org>
2022-02-11 02:05:14 +05:30
Steve Loughran
088684ec60
HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930)
Adds a new map type WeakReferenceMap, which stores weak
references to values, and a WeakReferenceThreadMap subclass
to more closely resemble a thread local type, as it is a
map of threadId to value.

Construct it with a factory method and optional callback
for notification on loss and regeneration.

 WeakReferenceThreadMap<WrappingAuditSpan> activeSpan =
      new WeakReferenceThreadMap<>(
          (k) -> getUnbondedSpan(),
          this::noteSpanReferenceLost);

This is used in ActiveAuditManagerS3A for span tracking.
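
For illustration only, the idea described above (a map of thread ID to weakly referenced value, built from a factory plus an optional loss callback) might be sketched as follows. This is not the Hadoop WeakReferenceMap code: the class name WeakThreadMapSketch and its methods are hypothetical, and the sketch makes no attempt at atomic updates.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch: a map of thread ID to weakly referenced value. */
final class WeakThreadMapSketch<V> {
  private final Map<Long, WeakReference<V>> map = new ConcurrentHashMap<>();
  private final Function<Long, V> factory;     // creates a value on demand
  private final Consumer<Long> referenceLost;  // optional callback, may be null

  WeakThreadMapSketch(Function<Long, V> factory, Consumer<Long> referenceLost) {
    this.factory = factory;
    this.referenceLost = referenceLost;
  }

  /** Return the value for a key, regenerating it if absent or collected. */
  V get(long key) {
    WeakReference<V> ref = map.get(key);
    V value = (ref == null) ? null : ref.get();
    if (value == null) {
      if (ref != null && referenceLost != null) {
        // An entry existed, but its value has been garbage collected.
        referenceLost.accept(key);
      }
      value = factory.apply(key);
      map.put(key, new WeakReference<>(value));
    }
    return value;
  }

  /** Look up the value using the current thread's ID as the key. */
  V getForCurrentThread() {
    return get(Thread.currentThread().getId());
  }

  /** Number of entries, including any whose values were collected. */
  int size() {
    return map.size();
  }
}
```

Because only weak references are stored, a caller has to hold its own strong reference to the returned value for as long as it needs it; otherwise the next lookup may regenerate the value through the factory.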

Relates to
* HADOOP-17511. Add an Audit plugin point for S3A
* HADOOP-18094. Disable S3A auditing by default.

Contributed by Steve Loughran.

Change-Id: Ibf7bb082fd47298f7ebf46d92f56e80ca9b2aaf8
2022-02-10 12:33:40 +00:00
Joey Krabacher
84de16028d
HADOOP-18114. Documentation correction in assumed_roles.md (#3949)
Fixes typo in hadoop-aws/assumed_roles.md

Contributed by Joey Krabacher

Change-Id: I2b77bd7793ae0433196b77042d5f400d0bcbe745
2022-02-09 10:47:24 +00:00
singer-bin
ce7cabb771
HDFS-16437 ReverseXML processor doesn't accept XML files without the … (#3926)
(cherry picked from commit 125e3b6160)
2022-02-06 13:36:57 +08:00
daimin
709e617a84
HDFS-16403. Improve FUSE IO performance by supporting FUSE parameter max_background (#3842)
Reviewed-by: Istvan Fajth <pifta@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit d69938994e)
2022-02-06 13:06:35 +08:00
Abhishek Das
8b03514eaf HADOOP-18100: Change scope of inner classes in InodeTree to make them accessible outside package
Fixes #3950

Signed-off-by: Owen O'Malley <omalley@apache.org>

Cherry-picked from 3684c7f6 by Owen O'Malley
2022-02-04 12:13:10 -08:00
Petre Bogdan Stolojan
87ff57765a
HADOOP-18085. S3 SDK Upgrade causes AccessPoint ARN endpoint mistranslation (#3902)
Part of HADOOP-17198. Support S3 Access Points.

HADOOP-18068. "upgrade AWS SDK to 1.12.132" broke the access point endpoint
translation.

Correct endpoints should start with "s3-accesspoint.", after SDK upgrade they start with
"s3.accesspoint-" which messes up tests + region detection by the SDK.

Contributed by Bogdan Stolojan

Change-Id: I0c0181628ab803afc39036003777eaec79aa378c
2022-02-04 16:22:24 +00:00
Petre Bogdan Stolojan
a8d7acf1a8
HADOOP-17951. Improve S3A checking of S3 Access Point existence (#3516)
Follow-on to HADOOP-17198. Support S3 Access Points

Contributed by Bogdan Stolojan

Change-Id: I0932476c64e1967eb0cb3e0f00060fac5d2bae72
2022-02-04 16:22:04 +00:00
Petre Bogdan Stolojan
664075f35d
HADOOP-17198. Support S3 Access Points (#3260)
Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belonging to third parties.

To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern

fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN

* The global/bucket option `fs.s3a.accesspoint.required` mandates
that buckets declare their access point.
* This is not compatible with S3Guard.

Consult the documentation for further details.
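
As a rough sketch of the naming pattern above, the per-bucket key can be assembled like this. java.util.Properties stands in for the Hadoop configuration object, and the class name, bucket name and ARN value are placeholders invented for the example.

```java
import java.util.Properties;

/** Hypothetical helper showing the per-bucket access point key pattern. */
final class AccessPointConfigSketch {
  /** Global option named in the commit message. */
  static final String ACCESSPOINT_REQUIRED = "fs.s3a.accesspoint.required";

  /** Builds fs.s3a.bucket.$BUCKET.accesspoint.arn for one bucket. */
  static String arnKey(String bucket) {
    return "fs.s3a.bucket." + bucket + ".accesspoint.arn";
  }

  /** Returns properties binding the bucket to the given access point ARN. */
  static Properties bind(String bucket, String arn, boolean required) {
    Properties conf = new Properties();
    conf.setProperty(arnKey(bucket), arn);
    conf.setProperty(ACCESSPOINT_REQUIRED, Boolean.toString(required));
    return conf;
  }

  public static void main(String[] args) {
    // Placeholder bucket name and ARN; substitute real values.
    Properties conf = bind("example-bucket",
        "arn:aws:s3:eu-west-1:123456789012:accesspoint/example-ap", true);
    conf.forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```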

Contributed by Bogdan Stolojan

(this commit contains the changes to TestArnResource from HADOOP-18068,
 "upgrade AWS SDK to 1.12.132" so that it works with the later SDK.)

Change-Id: I3fac213e52ca6ec1c813effb8496c353964b8e1b
2022-02-04 16:21:35 +00:00
KevinWikant
7171e2190e HDFS-16443. Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception (#3942)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 089e06de21)
2022-01-31 13:19:40 +09:00
KevinWikant
5e2eac6c41
HDFS-16303. Improve handling of datanode lost while decommissioning (#3921)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-31 13:18:36 +09:00
secfree
110104da38 HDFS-16169. Fix TestBlockTokenWithDFSStriped#testEnd2End failure (#3850)
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 39cad5f28f)
2022-01-28 17:05:32 +09:00
Akira Ajisaka
8032b680fb YARN-10561. Upgrade node.js to 12.22.1 and yarn to 1.22.5 in YARN application catalog webapp (#2591)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
(cherry picked from commit 9cb535caf2)
2022-01-28 15:52:33 +09:00
litao
b5d2e00f81 HDFS-16427. Add debug log for BlockManager#chooseExcessRedundancyStriped (#3888)
(cherry picked from commit 6136d630a3)
2022-01-27 13:44:03 +09:00
Xing Lin
d613776b64
HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin. (#3929)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 0d17b629ff)
2022-01-26 21:55:32 +05:30
litao
ef1a2b478b HDFS-16398. Reconfig block report parameters for datanode (#3831)
(cherry picked from commit c2ff39006f)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
2022-01-26 17:36:17 +09:00
Wei-Chiu Chuang
ff3a88b9c2
HDFS-16423. Balancer should not get blocks on stale storages (#3883) (#3924)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit db2c3200e6)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java

Co-authored-by: qinyuren <1476659627@qq.com>
2022-01-26 11:54:13 +08:00
Bryan Beaudreault
bd13d73334 HDFS-16262. Async refresh of cached locations in DFSInputStream (#3527)
(cherry picked from commit 94b884ae55)
2022-01-25 11:43:47 +00:00
daimin
728ed10a7c
HDFS-16430. Add validation to maximum blocks in EC group when adding an EC policy (#3899). Contributed by daimin.
Reviewed-by: tomscut <litao@bigo.sg>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 5ef335da1e)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ErasureCodingPolicyManager.java
2022-01-25 09:58:20 +08:00
Steve Loughran
4fd0389153
HADOOP-18094. Disable S3A auditing by default.
See HADOOP-18091. S3A auditing leaks memory through ThreadLocal references

* Adds a new option fs.s3a.audit.enabled to control whether or not auditing
is enabled. This is false by default.

* When false, the S3A auditing manager is NoopAuditManagerS3A,
which was formerly only used for unit tests and
during filesystem initialization.

* When true, ActiveAuditManagerS3A is used for managing auditing,
allowing auditing events to be reported.

* updates documentation and tests.
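
The selection described in the bullets above could look roughly like the following. This is an assumption-laden sketch, not the S3A code: java.util.Properties stands in for the Hadoop configuration, the class AuditSwitchSketch is hypothetical, and the managers are represented only by their class names as strings.

```java
import java.util.Properties;

/** Hypothetical sketch of choosing an audit manager from the option. */
final class AuditSwitchSketch {
  static final String AUDIT_ENABLED = "fs.s3a.audit.enabled";

  /** Returns the name of the manager the option selects; false by default. */
  static String chooseManager(Properties conf) {
    boolean enabled =
        Boolean.parseBoolean(conf.getProperty(AUDIT_ENABLED, "false"));
    return enabled ? "ActiveAuditManagerS3A" : "NoopAuditManagerS3A";
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(chooseManager(conf));  // option unset
    conf.setProperty(AUDIT_ENABLED, "true");
    System.out.println(chooseManager(conf));  // option enabled
  }
}
```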

This patch does not fix the underlying leak. When auditing is enabled,
long-lived threads will retain references to the audit managers
of S3A filesystem instances which have already been closed.

Contributed by Steve Loughran.

Change-Id: I671e594cd59e8ca77a1f65be791ad0ae9530b8d9
2022-01-24 14:04:23 +00:00
dependabot[bot]
55192570a1 YARN-11065. Bump follow-redirects from 1.13.3 to 1.14.7 in hadoop-yarn-ui (#3890)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.13.3 to 1.14.7.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.13.3...v1.14.7)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit dae33cf935)
2022-01-20 21:45:53 +09:00
liubingxing
d6ff60df65
HDFS-16352. return the real datanode numBlocks in #getDatanodeStorageReport (#3714). Contributed by liubingxing.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit d8dea6f52a)
2022-01-20 18:47:00 +08:00
Anmol Asrani
9b221b9599
HADOOP-18084. ABFS: Add testfilePath while verifying test contents are read correctly (#3903)
Contributed by: Anmol Asrani

Change-Id: I6e71bf349a74032f453398c7ae66f9c3305be190
2022-01-19 10:18:05 +00:00
litao
f9c0bc094a HDFS-16399. Reconfig cache report parameters for datanode (#3841)
(cherry picked from commit e355646330)
2022-01-19 18:43:15 +09:00
litao
11fe5279b0 HDFS-16400. Reconfig DataXceiver parameters for datanode (#3843)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit f02374df92)
2022-01-19 18:42:48 +09:00
litao
cdaf4d89f9 HDFS-16331. Make dfs.blockreport.intervalMsec reconfigurable (#3676)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit 52ec65fd10)
2022-01-19 18:40:41 +09:00
Viraj Jasani
831c11c47a HDFS-16139. Update BPServiceActor Scheduler's nextBlockReportTime atomically (#3228). Contributed by Viraj Jasani.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit b038042ece)
2022-01-19 16:01:00 +09:00
qinyuren
1c71d6e9fe HDFS-16426. Fix nextBlockReportTime when trigger full block report force (#3887)
(cherry picked from commit fcb1076699)
2022-01-19 13:44:02 +09:00
Steve Loughran
8ccc586af6
HADOOP-17409. Remove s3guard from S3A module (#3534)
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different Hadoop releases;
it must be turned off completely.

The "hadoop s3guard" command has been retained, but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is a major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
2022-01-18 18:04:48 +00:00
Steve Loughran
47ba977ca9
HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)
With this update, the versions of key shaded dependencies are

  jackson    2.12.3
  httpclient 4.5.13

This backport patch does not include the TestArn changes needed
for the test to work with this version of the SDK; it is only
to be applied to branches without HADOOP-17198. "Support S3 Access Points".
If that patch is backported later, that test suite MUST be
updated to the latest version.

Contributed by Steve Loughran

Change-Id: I8d2b71781ee8472b16469531f9cd0de32dd3356f
2022-01-18 12:20:12 +00:00
Viraj Jasani
5e9e779ed2
HADOOP-17152. Provide Hadoop's own Lists utility to reduce dependency on Guava (#3061)
Change-Id: I52e55b9d9826ad661e9ad7dc15f007aa168f0fe1
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-01-18 11:57:25 +00:00
Gera Shegalov
6c58f83b78 YARN-11055. Add missing newline in cgroups-operations.c (#3851)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit a94e9fcbde)
2022-01-17 16:21:13 +09:00
Xiangyi Zhu
b5e7f59e53
HDFS-16043. Add markedDeleteBlockScrubberThread to delete blocks asynchronously (#3882). Contributed by Xiangyi Zhu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-01-15 23:18:05 +08:00
Jackson Wang
926222a0d0 HDFS-16420. Avoid deleting unique data blocks when deleting redundancy striped blocks. (#3880)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit d8862822d2)
2022-01-14 22:40:58 +09:00
ahmarsuhail
6649c2813e
HADOOP-16223. Remove misleading fs.s3a.delegation.tokens.enabled prompt (#3879)
Contributed by Ahmar Suhail

Change-Id: I6a33043831a059325c58b0f76c925e52c6ae14f7
2022-01-12 17:27:53 +00:00
Mukund Thakur
60c1c6d93c HADOOP-18065. ExecutorHelper.logThrowableFromAfterExecute() is too noisy. (#3860)
Downgrading warn logs to debug in case of InterruptedException

Contributed By: Mukund Thakur
2022-01-10 13:52:02 +05:30
monthonk
7dd8e900f8
HADOOP-14334. S3 SSEC tests to downgrade when running against a mandatory encryption object store (#3870)
Contributed by Monthon Klongklaew

Change-Id: Ib275c9690bbc90170c6a442ded198fe006c20bc1
2022-01-09 18:06:27 +00:00
Ayush Saxena
5edb33b5ed
HADOOP-18056. DistCp: Filter duplicates in the source paths. (#3825). Contributed by Ayush Saxena.
Reviewed-by: tomscut <litao@bigo.sg>
Reviewed-by: Steve Loughran <stevel@apache.org>
2022-01-05 23:53:55 +05:30
Ashutosh Gupta
6b83fe4a00 HDFS-16410. Insecure Xml parsing in OfflineEditsXmlLoader (#3854)
Contributed by Ashutosh Gupta
2022-01-05 10:00:23 -08:00
liever18
3a82899493 HDFS-16408. Ensure LeaseRecheckIntervalMs is greater than zero (#3856)
(cherry picked from commit e1d0aa9ee5)
2022-01-05 15:46:21 +00:00
Wei-Chiu Chuang
350b51f287 Make upstream aware of 3.3.1 release 2022-01-04 14:48:49 -08:00