24899 Commits

Author SHA1 Message Date
Steve Loughran
9db61adeda
HADOOP-16202. Enhanced openFile(): hadoop-aws changes. (#2584/3)
S3A input stream support for the few fs.option.openfile settings.
As well as supporting the read policy option and values,
if the file length is declared in fs.option.openfile.length
then no HEAD request will be issued when opening a file.
This can cut a few tens of milliseconds off the operation.

The patch adds a new openfile parameter/FS configuration option
fs.s3a.input.async.drain.threshold (default: 16000).
It declares the number of bytes remaining in the http input stream
above which any operation to read and discard the rest of the stream,
"draining", is executed asynchronously.
This asynchronous draining offers some performance benefit on seek-heavy
file IO.
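
The threshold rule described above can be sketched in plain Java. This is a minimal illustration, not the S3A implementation: the class `DrainSketch` and its methods are hypothetical names, and treating a remaining count exactly equal to the threshold as synchronous is an assumption drawn from the wording "above which". Only the default of 16000 comes from the commit message.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of the drain decision; not the S3A code.
final class DrainSketch {
    // Default stated in the commit message for fs.s3a.input.async.drain.threshold.
    static final long DEFAULT_ASYNC_DRAIN_THRESHOLD = 16_000L;

    // Drain asynchronously only when more than `threshold` bytes remain (assumed strict).
    static boolean shouldDrainAsync(long remainingBytes, long threshold) {
        return remainingBytes > threshold;
    }

    // Read and discard the rest of the stream, then close it; returns the bytes discarded.
    static long drain(InputStream in) {
        byte[] buffer = new byte[8192];
        long drained = 0;
        try (InputStream stream = in) {
            int n;
            while ((n = stream.read(buffer)) != -1) {
                drained += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return drained;
    }

    // Small remainders are drained on the calling thread; large ones on a pool thread.
    static CompletableFuture<Long> drainStream(InputStream in, long remainingBytes, long threshold) {
        if (shouldDrainAsync(remainingBytes, threshold)) {
            return CompletableFuture.supplyAsync(() -> drain(in));
        }
        return CompletableFuture.completedFuture(drain(in));
    }

    public static void main(String[] args) {
        InputStream small = new ByteArrayInputStream(new byte[1_000]);
        InputStream large = new ByteArrayInputStream(new byte[20_000]);
        System.out.println(drainStream(small, 1_000, DEFAULT_ASYNC_DRAIN_THRESHOLD).join());
        System.out.println(drainStream(large, 20_000, DEFAULT_ASYNC_DRAIN_THRESHOLD).join());
    }
}
```

Returning a future in both branches lets a caller continue with a seek immediately and wait for the drain only if it needs the result.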

Contributed by Steve Loughran.

Change-Id: I9b0626bbe635e9fd97ac0f463f5e7167e0111e39
2022-04-27 19:23:56 +01:00
Steve Loughran
e123de9f19
HADOOP-16202. Enhanced openFile(): mapreduce and YARN changes. (#2584/2)
These changes ensure that sequential files are opened with the
right read policy, and split start/end is passed in.

As well as offering opportunities for filesystem clients to
choose fetch/cache/seek policies, the settings ensure that
text files on an S3 bucket where the default policy
is "random" will still be processed efficiently.

This commit depends on the associated hadoop-common patch,
which must be committed first.

Contributed by Steve Loughran.

Change-Id: Ic6713fd752441cf42ebe8739d05c2293a5db9f94
2022-04-27 19:23:25 +01:00
Steve Loughran
75950e47e7
HADOOP-16202. Enhanced openFile(): hadoop-common changes. (#2584/1)
This defines standard options and values for the
openFile() builder API for opening a file:

fs.option.openfile.read.policy
 A list of the desired read policies, in preferred order.
 Standard values are
 adaptive, default, random, sequential, vector, whole-file

fs.option.openfile.length
 How long the file is.

fs.option.openfile.split.start
 start of a task's split

fs.option.openfile.split.end
 end of a task's split

These can be used by filesystem connectors to optimize their
reading of the source file, including but not limited to
* skipping existence/length probes when opening a file
* choosing a policy for prefetching/caching data

The hadoop shell commands which read files all declare "whole-file"
and "sequential", as appropriate.
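
As an illustration of how a connector might interpret the read policy list, the sketch below picks the first entry it supports and otherwise falls back to a default. It rests on assumptions: the class `ReadPolicySketch` and the method `choosePolicy` are hypothetical, and the comma separator and case-insensitive matching are not stated in the commit message. Only the six standard policy names are taken from it.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of read policy selection; not the hadoop-common code.
final class ReadPolicySketch {
    // Standard values listed in the commit message.
    static final Set<String> STANDARD_POLICIES = new LinkedHashSet<>(Arrays.asList(
            "adaptive", "default", "random", "sequential", "vector", "whole-file"));

    // Return the first policy in the (assumed comma-separated) list that the
    // connector supports, or the fallback if none is recognised.
    static String choosePolicy(String option, Set<String> supported, String fallback) {
        if (option == null) {
            return fallback;
        }
        for (String raw : option.split(",")) {
            String policy = raw.trim().toLowerCase(Locale.ROOT);
            if (!policy.isEmpty() && supported.contains(policy)) {
                return policy;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Set<String> supported = new LinkedHashSet<>(Arrays.asList("random", "sequential"));
        // "vector" is skipped because this imaginary connector does not support it.
        System.out.println(choosePolicy("vector, random", supported, "default"));
    }
}
```

Ignoring unknown entries rather than failing lets a caller list a newer policy first and still work against a connector that only knows the older ones.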

Contributed by Steve Loughran.

Change-Id: Ia290f79ea7973ce8713d4f90f1315b24d7a23da1
2022-04-27 19:23:10 +01:00
sumangala-patki
77eea7a11b
HADOOP-17682. ABFS: Support FileStatus input to OpenFileWithOptions() via OpenFileParameters (#2975)
Change-Id: I039a0c3cb1c9b603f7dd1be0df03f795525d92bc
2022-04-27 19:22:49 +01:00
Viraj Jasani
bb13e228bc
HADOOP-17956. Replace all default Charset usage with UTF-8 (#3529)
Change-Id: I0094a84619ce19acf340d8dd1040cfe9bd88184e
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-04-27 10:30:07 +01:00
hchaverri
4043ef0485 HADOOP-18167. Add metrics to track delegation token secret manager op… (#4092)
* HADOOP-18167. Add metrics to track delegation token secret manager operations
2022-04-26 14:28:25 -07:00
Ashutosh Gupta
f4290055c6 MAPREDUCE-7246. In MapredAppMasterRest#Mapreduce_Application_Master_Info_API, updating the datatype of appId to "string". (#4223)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit fb13c1e4a8)
2022-04-25 14:31:15 +09:00
daimin
b62a460fd9 HDFS-16519. Add throttler to EC reconstruction (#4101)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit aebd55f788)
2022-04-23 12:47:34 +09:00
litao
9132eeb4dd HDFS-16552. Fix NPE for TestBlockManager (#4210)
(cherry picked from commit 5ebbacc480)
2022-04-23 12:15:52 +09:00
Ashutosh Gupta
261b11f7db HADOOP-17564. Fix typo in UnixShellGuide.html (#4195)
contributed by Ashutosh Gupta
2022-04-22 18:00:58 +01:00
Boyina, Hemanth Kumar
30061940af
HADOOP-17588. CryptoInputStream#close() should be synchronized.
Contributed by RenukaPrasad C

Change-Id: Ia0a19ccc55a67e5434f0be23a500496bc7682f40
2022-04-21 22:11:37 +01:00
Giovambattista Vieri
c4de4add5b
HADOOP-18214. Update BUILDING.txt (#3811)
java-8-openjdk becomes openjdk-8-jdk (see the Ubuntu and Debian package names)

Contributed by Giovambattista Vieri
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>

Change-Id: I63b2bbfdd575cf56d20cd6c8fff33a70cadda7f2
2022-04-21 18:39:51 +01:00
Ashutosh Gupta
2526dbf428
HADOOP-17551. Upgrade maven-site-plugin to 3.11.0 (#4196)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 56cfd60617)
2022-04-21 22:16:58 +09:00
S O'Donnell
5e137ac33e Revert "HDFS-16531. Avoid setReplication writing an edit record if old replication equals the new value (#4148). Contributed by Stephen O'Donnell."
This reverts commit 8ae033d1a3.
2022-04-20 20:45:17 +01:00
Xing Lin
7ae328949e HADOOP-18172: Changed scope for isRootInternalDir/getRootFallbackLink for InodeTree (#4106)
Co-authored-by: Xing Lin <xinglin@linkedin.com>
(cherry picked from commit 98b9c435f2)
2022-04-20 10:55:48 -07:00
Steve Loughran
caecec45f5
HADOOP-17650. Bump solr to unblock build failure with Maven 3.8.1 (#2939)
Reviewed-by: Siyao Meng <siyao@apache.org>

Contributed by Viraj Jasani
2022-04-20 16:36:51 +01:00
Dongjoon Hyun
af3558d61a
HADOOP-17341. Upgrade commons-codec to 1.15 (#2428)
Change-Id: Iab26db901570b507ab25ddbf316a9579a9e92620
Reviewed-by: Chao Sun <sunchao@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-04-20 12:29:00 +01:00
qinyuren
2ff91232bc HDFS-16544. EC decoding failed due to invalid buffer (#4179)
(cherry picked from commit 76bbd17374)
2022-04-20 15:07:38 +09:00
Stephen O'Donnell
8ae033d1a3 HDFS-16531. Avoid setReplication writing an edit record if old replication equals the new value (#4148). Contributed by Stephen O'Donnell.
(cherry picked from commit dbeeee0363)
2022-04-19 11:12:36 +01:00
qinyuren
c913dc3072 HDFS-16538. EC decoding failed due to not enough valid inputs (#4167)
Co-authored-by: liubingxing <liubingxing@bigo.sg>
(cherry picked from commit 52e152f8b0)
2022-04-19 13:38:58 +09:00
jianghuazhu
cfe2d8aa79
HDFS-16389. Improve NNThroughputBenchmark test mkdirs. (#3819)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 900682e712)
2022-04-19 11:13:40 +08:00
Steve Loughran
51a532a532
HADOOP-18202. create-release fails fatal: unsafe repository (#4188)
Since April 2022/CVE-2022-24765, git refuses to work in directories
whose owner != the current user, unless explicitly told to trust it.

This patches the create-release script to trust the /build/source
dir mounted from the hosting OS, whose userid is inevitably different
from that of the account in the container running git.

Contributed by: Steve Loughran, Ayush Saxena and the new git error messages

Change-Id: I855a105e6d0ab533468f9436578c8d4f81b0840b
2022-04-18 19:28:44 +01:00
Ashutosh Gupta
1eb4f9ef04
HDFS-16536. TestOfflineImageViewer fails on branch-3.3 (#4182)
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-04-18 16:05:42 +09:00
Quanlong Huang
9ae903dd1b
HDFS-16535. SlotReleaser should reuse the domain socket based on socket paths (#4158)
Reviewed-by: Lisheng Sun <sunlisheng@apache.org>
(cherry picked from commit 35d4c02bcc)
2022-04-18 10:35:08 +08:00
Daniel Carl Jones
eab586d566
HADOOP-18201. Remove endpoint config overrides for ITestS3ARequesterPays (#4169)
Contributed by Daniel Carl Jones.

Change-Id: Icf99cc878e2b0ba92df630a8aa578616f96e09cc
2022-04-14 16:47:24 +01:00
daimin
0ef1a13f01
HDFS-16509. Fix decommission UnsupportedOperationException (#4077). Contributed by daimin.
(cherry picked from commit c65c383b7e)
2022-04-14 16:16:36 +08:00
Takanobu Asanuma
52abc9f132 HDFS-16479. EC: NameNode should not send a reconstruction work when the source datanodes are insufficient (#4138)
(cherry picked from commit 2efab92959)
2022-04-14 11:50:42 +09:00
Steve Loughran
52c6d77274
HDFS-14478. Add libhdfs APIs for openFile (#4166)
Contributed by Sahil Takiar

Change-Id: I2f9e82a69010df7496704754515b031f2a907167
2022-04-13 16:01:53 +01:00
Akira Ajisaka
2112ef61e0
YARN-10553. Refactor TestDistributedShell (#4159)
(cherry picked from commit 890f2da624)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSWithMultipleNodeManager.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java

Co-authored-by: Ahmed Hussein <50450311+amahussein@users.noreply.github.com>
2022-04-13 12:22:17 +09:00
qinyuren
07dface36a HDFS-16484. [SPS]: Fix an infinite loop bug in SPSPathIdProcessor thread (#4032)
(cherry picked from commit 45394433a1)
2022-04-13 11:48:19 +09:00
Steve Loughran
44e662272f
HADOOP-18198. Preparing for 3.3.4 development
Change-Id: I2bf19beb541739af22fced38c2545f09c4e1bd53
2022-04-12 14:09:08 +01:00
Viraj Jasani
e5516cdfaf HADOOP-18191. Log retry count while handling exceptions in RetryInvocationHandler (#4133)
(cherry picked from commit b69ede7154)
2022-04-11 15:23:55 +09:00
Hanisha Koneru
9da7d80c4e HADOOP-17116. Skip Retry INFO logging on first failover from a proxy
(cherry picked from commit e62d8f8412)
2022-04-11 15:19:18 +09:00
singer-bin
26705bbc60
HDFS-16457. Make fs.getspaceused.classname reconfigurable (apache#4069) (#4156)
2022-04-11 14:59:34 +09:00
Akira Ajisaka
603367c54f
HADOOP-18178. Upgrade jackson to 2.13.2 and jackson-databind to 2.13.2.2 (#4147)
(cherry picked from commit 4b786c797a)

 Conflicts:
	LICENSE-binary

Co-authored-by: PJ Fanning <pjfanning@users.noreply.github.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-04-11 14:58:28 +09:00
Takanobu Asanuma
30afe7ca20 HDFS-16497. EC: Add param comment for liveBusyBlockIndices with HDFS-14768. Contributed by caozhiqiang.
(cherry picked from commit 37650ced81)
2022-04-08 18:39:26 +09:00
Masatake Iwasaki
160b6d106d
HADOOP-18088. Replace log4j 1.x with reload4j. (#4052)
Co-authored-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-04-07 08:33:13 +09:00
Stephen O'Donnell
bd0dbf319a HDFS-16530. setReplication debug log creates a new string even if debug is disabled (#4142)
(cherry picked from commit bbfe3500cf)
2022-04-06 11:54:58 +01:00
Viraj Jasani
b2eee14f2e HDFS-16522. Set Http and Ipc ports for Datanodes in MiniDFSCluster (#4108)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 7c20602b17)
2022-04-06 18:55:09 +09:00
Viraj Jasani
a6fb77f7eb HDFS-16481. Provide support to set Http and Rpc ports in MiniJournalCluster (#4028). Contributed by Viraj Jasani.
(cherry picked from commit 278568203b)
2022-04-06 18:40:07 +09:00
wangzhaohui
0e621c890d HDFS-16529. Remove unnecessary setObserverRead in TestConsistentReadsObserver (#4131)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 61bbdfd3a7)
2022-04-06 17:30:39 +09:00
Xing Lin
20483f6dc7
HADOOP-18169. getDelegationTokens in ViewFs should also fetch the token from fallback FS (#4094)
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-31 15:17:26 -07:00
Xing Lin
ecafd38c09
HADOOP-18144: getTrashRoot in ViewFileSystem should return a path in ViewFS. (#4123)
To get the new behavior, define fs.viewfs.trash.force-inside-mount-point to be true.

If the trash root for path p is in the same mount point as path p,
and one of:
* The mount point isn't at the top of the target fs.
* The resolved path of p is root (e.g. it is the fallback FS).
* The trash root isn't in user's target fs home directory.
get the corresponding viewFS path for the trash root and return it.
Otherwise, use <mnt>/.Trash/<user>.
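
The decision rule above can be written as a small pure function. This is only a sketch under stated assumptions: the class `TrashRootSketch`, the method `chooseTrashRoot` and its boolean inputs are hypothetical stand-ins for checks the real ViewFileSystem code would perform against the mount table, and paths are modelled as plain strings.

```java
// Hypothetical sketch of the trash root choice; not the ViewFileSystem code.
final class TrashRootSketch {
    static String chooseTrashRoot(
            boolean trashRootInSameMountPoint,
            boolean mountPointNotAtTopOfTargetFs,
            boolean resolvedPathIsRoot,
            boolean trashRootOutsideUserHome,
            String viewFsTrashRoot,   // the trash root translated to a viewFS path
            String mountPoint,
            String user) {
        boolean anyCondition = mountPointNotAtTopOfTargetFs
                || resolvedPathIsRoot
                || trashRootOutsideUserHome;
        if (trashRootInSameMountPoint && anyCondition) {
            return viewFsTrashRoot;
        }
        // Otherwise, use <mnt>/.Trash/<user>; avoid a doubled slash when the mount point is "/".
        String base = mountPoint.endsWith("/")
                ? mountPoint.substring(0, mountPoint.length() - 1)
                : mountPoint;
        return base + "/.Trash/" + user;
    }

    public static void main(String[] args) {
        System.out.println(chooseTrashRoot(
                true, true, false, false, "/data/projects/.Trash/alice", "/data", "alice"));
        System.out.println(chooseTrashRoot(
                false, false, false, false, "/unused", "/data", "alice"));
    }
}
```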

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
(cherry picked from commit 8b8158f02d)

Co-authored-by: Xing Lin <xinglin@linkedin.com>
2022-03-31 20:26:09 +00:00
litao
0ecb34f8f6
HDFS-16413. Reconfig dfs usage parameters for datanode (#3863) (#4125)
2022-03-31 19:24:05 +09:00
litao
cfca024190 HDFS-16507. [SBN read] Avoid purging edit log which is in progress (#4082)
2022-03-30 23:03:27 -07:00
Owen O'Malley
e24bd1c15b HDFS-16517 Distance metric is wrong for non-DN machines in 2.10. Fixed in HADOOP-16161, but
this test case adds value to ensure the two getWeight methods stay in sync.

Fixes #4091

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-30 14:58:03 -07:00
Kengo Seki
85843f2158 MAPREDUCE-7373. Building MapReduce NativeTask fails on Fedora 34+ (#4120)
(cherry picked from commit dc4a680da8)
2022-03-30 13:49:45 +00:00
zhongjingxiong
1ee93f7947
HADOOP-18145. Fileutil's unzip method causes unzipped files to lose their original permissions (#4036)
Contributed by jingxiong zhong

Change-Id: Ie252e609798719dc658364f9bae48b34dc72a79c
2022-03-30 12:52:52 +01:00
Lei Yang
4f85c9a73b HDFS-16518: Add shutdownhook to invalidate the KeyProviders in the cache
Fixes #4100
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-28 13:28:38 -07:00
Masatake Iwasaki
f8314cd469 Make upstream aware of 3.2.3 release.
(cherry picked from commit 0fbd96a244)
2022-03-28 08:08:59 +00:00