Commit Graph

25251 Commits

Author SHA1 Message Date
Steve Loughran
936e9e15d0
MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519)
This modifies the manifest committer so that the list of files
to rename is passed between stages as a file of
writeable entries on the local filesystem.

The map of directories to create is still passed in memory;
this map is built across all tasks, so even if many tasks
created files, if they all write into the same set of directories
the memory needed is O(directories) with the
task count not a factor.

The _SUCCESS file reports on heap size through gauges.
This should give a warning if there are problems.

Contributed by Steve Loughran
2023-06-12 13:43:43 +01:00
hfutatzhanghb
0fd2b10bdc
HDFS-16898. Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat. (#5408). Contributed by ZhangHB.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-06-12 13:42:39 +08:00
Wei-Chiu Chuang
03a548d4e5
HADOOP-18646. Upgrade Netty to 4.1.89.Final to fix CVE-2022-41881 (#5435) (#5729)
This fixes CVE-2022-41881.

This also upgrades io.opencensus dependencies to 0.12.3

Contributed by Aleksandr Nikolaev

(cherry picked from commit 734f7abfb8)

 Conflicts:
	hadoop-project/pom.xml

Change-Id: I26b8961725706370ac5f0fa248d0b0333034a047

Co-authored-by: nao <56360298+nao-it@users.noreply.github.com>
2023-06-10 11:05:44 -07:00
monthonk
30dcd044c3
HADOOP-17386. Change default fs.s3a.buffer.dir to be under Yarn container path on yarn applications (#3908)
Co-authored-by: Monthon Klongklaew <monthonk@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2023-06-09 13:40:11 +01:00
Viraj Jasani
afb863acf4
HADOOP-18740. S3A prefetch cache blocks should be accessed by RW locks (#5675)
Contributed by Viraj Jasani
2023-06-08 16:34:41 +01:00
hfutatzhanghb
a804f37ed5
HDFS-17003. Erasure Coding: invalidate wrong block after reporting bad blocks from datanode (#5643). Contributed by hfutatzhanghb.
Reviewed-by: Stephen O'Donnell <sodonnel@apache.org>
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit 0e6bd09ae3)
2023-06-08 18:22:57 +08:00
Doroszlai, Attila
60b37bbdf7
HADOOP-18433. Fix main thread name for . (#4838) (#5692)
(cherry picked from commit f68f1a4578)

Co-authored-by: zhengchenyu <zhengchenyu16@gmail.com>
2023-06-05 07:51:53 +02:00
Steve Loughran
c9f2c45209
HADOOP-18755. openFile builder new optLong() methods break hbase-filesystem (#5704)
This is a followup to
HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem

Contributed by Steve Loughran
2023-06-01 14:32:08 +01:00
hfutatzhanghb
c2385c021b
HDFS-16882. RBF: Add cache hit rate metric in MountTableResolver#getDestinationForPath (#5276) (#5423). Contributed by farmmamba
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-23 09:10:29 +05:30
huhaiyang
19a6762639
HDFS-17017. Fix the issue of arguments number limit in report command in DFSAdmin (#5667). Contributed by Haiyang Hu.
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-21 09:06:53 +05:30
NishthaShah
adb673fdaf
HDFS-17022. Fix the exception message to print the Identifier pattern (#5678). Contributed by Nishtha Shah.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-20 14:41:49 +05:30
Patrick GRANDJEAN
9029bba5dc
HADOOP-18652. Path.suffix raises NullPointerException (#5653). Contributed by Patrick Grandjean.
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 05:17:40 +05:30
wangzhaohui
5cafa1fdb9
HDFS-17011. Fix the metric of "HttpPort" at DataNodeInfo (#5657). Contributed by Zhaohui Wang.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-18 12:13:56 +05:30
Steve Loughran
ab594ec77e
HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem (#5611)
This:

1. Adds optLong, optDouble, mustLong and mustDouble
   methods to the FSBuilder interface to let callers explicitly
   passin long and double arguments.
2. The opt() and must() builder calls which take float/double values
   now only set long values instead, so as to avoid problems
   related to overloaded methods resulting in a ".0" being appended
   to a long value.
3. All of the relevant opt/must calls in the hadoop codebase move to
   the new methods
4. And the s3a code is resilient to parse errors in is numeric options
   -it will downgrade to the default.

This is nominally incompatible, but the floating-point builder methods
were never used: nothing currently expects floating point numbers.

For anyone who wants to safely set numeric builder options across all compatible
releases, convert the number to a string and then use the opt(String, String)
and must(String, String) methods.

Contributed by Steve Loughran
2023-05-16 13:41:17 +01:00
Viraj Jasani
949d5ca20b
HADOOP-18688. S3A audit header to include count of items in delete ops (#5621)
The auditor-generated http referrer URL now includes the count of keys
to delete in the "ks" query parameter

Contributed by Viraj Jasani
2023-05-16 10:41:52 +01:00
susheel-gupta
5e8663d0f5
YARN-11312: [UI2] Refresh buttons don't work after EmberJS upgrade (#5654) 2023-05-15 16:08:18 +02:00
Steve Loughran
0f42c311b8
HADOOP-18695. S3A: reject multipart copy requests when disabled (#5548)
Contributed by Steve Loughran.
2023-05-15 14:19:58 +01:00
HarshitGupta11
f312a0c784
HADOOP-18637: S3A to support upload of files greater than 2 GB using DiskBlocks (#5630) (#5641)
Contributed by Harshit Gupta.
2023-05-15 10:46:33 +01:00
Mukund Thakur
86ad35c94c Revert "HADOOP-18637. S3A to support upload of files greater than 2 GB using DiskBlocks (#5630)"
This reverts commit df209dd2e3.

Caused test failures because of incorrect merge conflict resolution.
2023-05-10 14:19:21 -05:00
HarshitGupta11
df209dd2e3
HADOOP-18637. S3A to support upload of files greater than 2 GB using DiskBlocks (#5630)
Contributed By: Harshit Gupta and Steve Loughran
2023-05-10 15:58:56 +01:00
rohit-kb
771c89a83a
HADOOP-18687. Remove json-smart dependency. (#5549 + #5524)
Contains 

* HADOOP-18687. hadoop-auth: remove unnecessary dependency on json-smart (#5524)
 Contributed by Michiel de Jong
* HADOOP-18687. Remove json-smart dependency. (#5549).
  Contributed by PJ Fanning.
2023-05-09 17:34:36 +01:00
Wei-Chiu Chuang
99312bdfdb
HADOOP-18671. Add recoverLease(), setSafeMode(), isFileClosed() as interfaces to hadoop-common (#5553) (#5619)
* HADOOP-18671. Add recoverLease(), setSafeMode(), isFileClosed() as interfaces to hadoop-common (#5553)

The HDFS lease APIs have been replicated as interfaces in hadoop-common so other filesystems can
also implement them.  Applications which use the leasing APIs should migrate to the new
interface where possible.

Contributed by Stephen Wu

(cherry picked from commit 0e46388474)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
	hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgrade.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestViewDistributedFileSystem.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithAcl.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRetryCacheMetrics.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestFSImageWithOrderedSnapshotDeletion.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOrderedSnapshotDeletion.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForErasureCodingPolicy.java

Co-authored-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
2023-05-09 06:20:56 +08:00
Hexiaoqiao
f6850b89f3
YARN-11482. Fix bug of DRF comparision DominantResourceFairnessComparator2 in fair scheduler. (#5607). Contributed by Xiaoqiao He.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
(cherry picked from commit 70c0aa342e)
2023-05-05 10:42:03 +08:00
Dongjoon Hyun
4670f9e8b0 HADOOP-18727. Fix WriteOperations.listMultipartUploads function description (#5613)
Contributed by Dongjoon Hyun
2023-05-04 13:06:07 +01:00
PJ Fanning
1756b492ca
HADOOP-18658. snakeyaml dependency: upgrade to v2.0 (#5595). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-02 20:53:16 +05:30
Viraj Jasani
0ad7d7c677
HADOOP-18697. S3A prefetch: failure of ITestS3APrefetchingInputStream#testRandomReadLargeFile (#5580)
Contributed by Viraj Jasani
2023-05-02 15:45:37 +01:00
Ayush Saxena
a226016c52
HADOOP-18662. ListFiles with recursive fails with FNF. (#5477). Contributed by Ayush Saxena.
Reviewed-by: Steve Loughran <stevel@apache.org>
2023-05-02 20:12:22 +05:30
Pralabh Kumar
6b6bd82bf0
HADOOP-18715. Add debug log for getting details of tokenKindMap (#5608). Contributed by Pralabh Kumar.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-29 17:30:00 +05:30
fanluoo
408c5c53b1
HDFS-16897. Fix abundant Broken pipe exception in BlockSender (#5329). Contributed by fanluo.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-29 17:29:55 +05:30
Viraj Jasani
05edfee1f3
HADOOP-18399. S3A Prefetch - SingleFilePerBlockCache to use LocalDirAllocator (#5054)
Contributed by Viraj Jasani
2023-04-28 12:03:30 +01:00
Daniel Carl Jones
0e51a9b55e
HADOOP-18482. ITestS3APrefetchingInputStream to skip if CSV test file unavailable (#4983)
Contributed by Danny Jones
2023-04-28 12:03:30 +01:00
Steve Loughran
8fafc83749
HADOOP-18410. S3AInputStream.unbuffer() does not release http connections -prefetch changes(#4766)
Changes in HADOOP-18410 which are needed for the S3A prefetching stream; needed
as part of the HADOOP-18703 backport

Change-Id: Ib403ca793e29a4416e5d892f9081de5832da3b68
2023-04-28 12:03:30 +01:00
Ankit Saurabh
312b776833
HADOOP-18351. Reduce excess logging of errors during S3A prefetching reads (#5274)
Contributed by Ankit Saurabh
2023-04-28 12:03:30 +01:00
Viraj Jasani
a71c708d17
HADOOP-18189 S3APrefetchingInputStream to support status probes when closed (#5036)
Contributed by Viraj Jasani
2023-04-28 12:03:30 +01:00
Ashutosh Gupta
5ba5980731
HADOOP-18531. Fix assertion failure in ITestS3APrefetchingInputStream (#5149)
This patch MUST be applied to all branches containing HADOOP-18378
so as to ensure reliable test runs.

Contributed by Ashutosh Gupta
2023-04-28 12:03:30 +01:00
Alessandro Passaro
0f1a3f23a5
HADOOP-18378. Implement lazy seek in S3A prefetching. (#4955)
Make S3APrefetchingInputStream.seek() completely lazy. Calls to seek() will not affect the current buffer nor interfere with prefetching, until read() is called.

This change allows various usage patterns to benefit from prefetching, e.g. when calling readFully(position, buffer) in a loop for contiguous positions the intermediate internal calls to seek() will be noops and prefetching will have the same performance as in a sequential read.

Contributed by Alessandro Passaro.
2023-04-28 12:03:30 +01:00
Steve Loughran
bb08c90228
HADOOP-18416. fix ITestS3AIOStatisticsContext test failure (#4931)
Uncomment the S3ATestUtils-side part of the original patch.
2023-04-28 12:03:30 +01:00
Viraj Jasani
0fd36df1d2
HADOOP-18377. hadoop-aws build to add a -prefetch profile to run all tests with prefetching (#4914)
Contributed by Viraj Jasani
2023-04-28 12:03:30 +01:00
Viraj Jasani
76e243aacb
HADOOP-18466. Limit the findbugs suppression IS2_INCONSISTENT_SYNC to S3AFileSystem field (#4926)
Follow-on to HADOOP-18455.

Contributed by Viraj Jasani
2023-04-28 12:03:30 +01:00
Viraj Jasani
f07be3bec2
HADOOP-18455. S3A prefetching executor should be closed (#4879)
follow-on patch to HADOOP-18186. 

Contributed by: Viraj Jasani
2023-04-28 12:03:30 +01:00
Viraj Jasani
1c2c6785a0
HADOOP-18186. s3a prefetching to use SemaphoredDelegatingExecutor for submitting work (#4796)
Contributed by Viraj Jasani
2023-04-28 12:03:30 +01:00
Viraj Jasani
f00d77fda4
HADOOP-18380. fs.s3a.prefetch.block.size to be read through longBytesOption (#4762)
Contributed by Viraj Jasani.
2023-04-28 12:03:30 +01:00
Steve Loughran
4ce763a322
HADOOP-18028. High performance S3A input stream (#4752)
This is the the preview release of the HADOOP-18028 S3A performance input stream.
It is still stabilizing, but ready to test.

Contains

HADOOP-18028. High performance S3A input stream (#4109)
	Contributed by Bhalchandra Pandit.

HADOOP-18180. Replace use of twitter util-core with java futures (#4115)
	Contributed by PJ Fanning.

HADOOP-18177. Document prefetching architecture. (#4205)
	Contributed by Ahmar Suhail

HADOOP-18175. fix test failures with prefetching s3a input stream (#4212)
 Contributed by Monthon Klongklaew

HADOOP-18231.  S3A prefetching: fix failing tests & drain stream async.  (#4386)

	* adds in new test for prefetching input stream
	* creates streamStats before opening stream
	* updates numBlocks calculation method
	* fixes ITestS3AOpenCost.testOpenFileLongerLength
	* drains stream async
	* fixes failing unit test

	Contributed by Ahmar Suhail

HADOOP-18254. Disable S3A prefetching by default. (#4469)
	Contributed by Ahmar Suhail

HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458)

	This adds iOStatisticsConnection to the S3PrefetchingInputStream class, with
	new statistic names in StreamStatistics.

	This stream is not (yet) IOStatisticsContext aware.

	Contributed by Ahmar Suhail

HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk
HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums.
HADOOP-18318. Update class names to be clear they belong to S3A prefetching
	Contributed by Steve Loughran
2023-04-28 12:03:29 +01:00
PJ Fanning
040c23c768
HADOOP-18712. Upgrade to jetty 9.4.51 due to cve. Contributed by PJ Fanning. (#5574) (#5585)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-26 18:51:04 +05:30
cxzl25
4d052a2456
HDFS-16672. Fix lease interval comparison in BlockReportLeaseManager (#4598). Contributed by dzcxzl.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org
2023-04-26 02:12:40 +05:30
Sebastian Baunsgaard
919c3f615b
HADOOP-18660. Filesystem Spelling Mistake (#5475).
Contributed by Sebastian Baunsgaard.

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-25 19:59:54 +01:00
Steve Loughran
0b56be3ca4
MAPREDUCE-7437. MR Fetcher class to use an AtomicInteger to generate IDs. (#5579)
...as until now it wasn't thread safe

Contributed by Steve Loughran
2023-04-25 19:56:18 +01:00
Ayush Saxena
d7d36b9d2a
HADOOP-18689. Bump jettison from 1.5.3 to 1.5.4 in /hadoop-project (#5502) (#5586)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-25 21:26:59 +05:30
Steve Loughran
21cf507db3
HADOOP-17450. Add Public IOStatistics API -missed backport (#5590)
This cherrypicks SemaphoredDelegatingExecutor HADOOP-17450 changes
from trunk somehow they didn't get into the main IOStatistics backport
to branch-3.3
2023-04-25 15:02:56 +01:00
Tamas Domok
1b59e3123b
HADOOP-18705. ABFS should exclude incompatible credential providers. (#5560)
Contributed by Tamas Domok.
2023-04-24 15:48:02 +01:00