Commit Graph

2241 Commits

Author SHA1 Message Date
Ashutosh Gupta
0aa08ef543
HADOOP-18363. Fix bug preventing hadoop-metrics2 from emitting metrics to > 1 Ganglia servers (#4627)
* HADOOP-18363. Fix bug preventing hadoop-metrics2 from emitting metrics to > 1 Ganglia servers
2022-08-04 18:26:38 +05:30
Mukund Thakur
66dec9d322
HADOOP-18355. Update previous index properly while validating overlapping ranges. (#4647)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-04 04:08:04 +05:30
Mukund Thakur
a5b12c8010
HADOOP-18227. Add input stream IOStats for vectored IO api in S3A. (#4636)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-07-28 21:57:37 +05:30
KevinWikant
213ea03758
YARN-11210. Fix YARN RMAdminCLI retry logic for non-retryable kerbero… (#4563)
Co-authored-by: Kevin Wikant <wikak@amazon.com>
2022-07-26 09:21:37 +05:30
xuzq
2c96357051
HDFS-15079. RBF: Namenode needs to use the actual client Id and callId when going through RBF proxy. (#4530) 2022-07-23 22:19:37 +08:00
xuzq
8774f17868
HADOOP-13144. Enhancing IPC client throughput via multiple connections per user (#4542) 2022-07-15 14:18:46 -07:00
HerCath
4c4a940da2
HADOOP-18217. ExitUtil synchronized blocks reduced. #4255
Reduce the ExitUtil synchronized block scopes so System.exit 
and Runtime.halt calls aren't within their boundaries,
so ExitUtil wrappers do not block each other.

Enlarged catches to all Throwables (not just Exceptions).

Contributed by Remi Catherinot
2022-07-13 12:35:44 +01:00
lmccay
e11ba5930e
HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP… (#4503)
* HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP groups lookup
2022-07-11 01:03:44 -04:00
Ashutosh Gupta
a432925f74
HADOOP-18321.Fix when to read an additional record from a BZip2 text file split (#4521)
* HADOOP-18321.Fix when to read an additional record from a BZip2 text file split

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> and Reviewed by Akira Ajisaka.
2022-07-06 10:00:14 +05:30
slfan1989
073b8ea1d5
HADOOP-18284. Remove Unnecessary semicolon ';' (#4422). Contributed by fanshilun. 2022-06-29 15:20:41 +05:30
hchaverr
cf33164857
HDFS-16591. Setup JaasConfiguration in ZKCuratorManager when SASL is enabled
Fixes #4447
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-06-28 16:44:02 -07:00
Ashutosh Gupta
dd819f7904
HADOOP-18271.Remove unused Imports in Hadoop Common project (#4392)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-23 12:30:28 +05:30
Mukund Thakur
4d1f6f9b99 HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445)
part of HADOOP-18103.
Handling memory fragmentation in S3A vectored IO implementation by
allocating smaller user range requested size buffers and directly
filling them from the remote S3 stream and skipping undesired
data in between ranges.
This patch also adds aborting active vectored reads when stream is
closed or unbuffer() is called.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur
0d49bd2004 HADOOP-18105 Implement buffer pooling with weak references (#4263)
part of HADOOP-18103.
Required for vectored IO feature. None of current buffer pool
implementation is complete. ElasticByteBufferPool doesn't use
weak references and could lead to memory leak errors and
DirectBufferPool doesn't support caller preferences of direct
and heap buffers and has only fixed length buffer implementation.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur
1408dd89a7 HADOOP-18107 Adding scale test for vectored reads for large file (#4273)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur
5db0f34e29 HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads (#3964)
Part of HADOOP-18103.
Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size
to configure min seek and max read during a vectored IO operation in S3A connector.
These properties actually define how the ranges will be merged. To completely
disable merging set fs.s3a.max.readsize.vectored.read to 0.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur
2daf0a814f HADOOP-11867. Add a high-performance vectored read API. (#3904)
part of HADOOP-18103.
Add support for multiple ranged vectored read api in PositionedReadable.
The default iterates through the ranges to read each synchronously,
but the intent is that FSDataInputStream subclasses can make more
efficient readers especially in object stores implementation.

Also added implementation in S3A where smaller ranges are merged and
sliced byte buffers are returned to the readers. All the merged ranged are
fetched from S3 asynchronously.

Contributed By: Owen O'Malley and Mukund Thakur
2022-06-22 17:29:32 +01:00
Samrat
477b67a335
HADOOP-18266. Using HashSet/ TreeSet Constructor for hadoop-common (#4365)
* HADOOP-18266. Using HashSet/ TreeSet Constructor for hadoop-common

Co-authored-by: Deb <dbsamrat@3c22fba1b03f.ant.amazon.com>
2022-06-20 12:11:04 +05:30
Viraj Jasani
e38e13be03
HADOOP-18288. Total requests and total requests per sec served by RPC servers (#4431)
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-06-18 12:17:20 +08:00
Steve Loughran
e199da3fae
HADOOP-17833. Improve Magic Committer performance (#3289)
Speed up the magic committer with key changes being

* Writes under __magic always retain directory markers

* File creation under __magic skips all overwrite checks,
  including the LIST call intended to stop files being
	created over dirs.
* mkdirs under __magic probes the path for existence
  but does not look any further.  	

Extra parallelism in task and job commit directory scanning
Use of createFile and openFile with parameters which all for
HEAD checks to be skipped.

The committer can write the summary _SUCCESS file to the path
`fs.s3a.committer.summary.report.directory`, which can be in a
different file system/bucket if desired, using the job id as
the filename. 

Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance`

Application code can set the createFile() option
fs.s3a.create.performance to true to disable the same
safety checks when writing under magic directories.
Use with care.

The createFile option prefix `fs.s3a.create.header.`
can be used to add custom headers to S3 objects when
created.


Contributed by Steve Loughran.
2022-06-17 19:11:35 +01:00
HanleyYang
835f39cefc
HDFS-15878. RBF: Fix TestRouterWebHDFSContractCreate#testSyncable. (#4340). Contributed by Hanley Yang.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-05-28 09:56:07 +05:30
Ritesh H Shukla
78008bc0ee
HADOOP-18245 Extend KMS related exceptions that get mapped to ConnectException (#4329) 2022-05-20 04:20:24 +08:00
slfan1989
f6fa5bd1aa
HADOOP-18229. Fix Hadoop-Common JavaDoc Errors (#4292)
Contributed by slfan1989
2022-05-18 12:12:04 +01:00
Lei Yang
6a95c3a039
HADOOP-18193:Support nested mount points in INodeTree
Fixes #4181

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-05-11 17:01:21 -07:00
hchaverr
99a83fd4bd
HADOOP-18222. Prevent DelegationTokenSecretManagerMetrics from registering multiple times
Fixes #4266

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-05-10 13:58:39 -07:00
hchaverri
d60262fe00
HADOOP-18167. Add metrics to track delegation token secret manager op… (#4092)
* HADOOP-18167. Add metrics to track delegation token secret manager operations
2022-04-26 16:20:11 +00:00
Steve Loughran
1b4dba99b5
HADOOP-16202. Enhanced openFile(): hadoop-common changes. (#2584/1)
This defines standard option and values for the
openFile() builder API for opening a file:

fs.option.openfile.read.policy
 A list of the desired read policy, in preferred order.
 standard values are
 adaptive, default, random, sequential, vector, whole-file

fs.option.openfile.length
 How long the file is.

fs.option.openfile.split.start
 start of a task's split

fs.option.openfile.split.end
 end of a task's split

These can be used by filesystem connectors to optimize their
reading of the source file, including but not limited to
* skipping existence/length probes when opening a file
* choosing a policy for prefetching/caching data

The hadoop shell commands which read files all declare "whole-file"
and "sequential", as appropriate.

Contributed by Steve Loughran.

Change-Id: Ia290f79ea7973ce8713d4f90f1315b24d7a23da1
2022-04-24 17:33:04 +01:00
Viraj Jasani
f70935522b
HADOOP-18188. Support touch command for directory (#4135)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-04-07 17:29:45 +09:00
zhongjingxiong
08e6d0ce60
HADOOP-18145. Fileutil's unzip method causes unzipped files to lose their original permissions (#4036)
Contributed by jingxiong zhong
2022-03-30 12:42:50 +01:00
Owen O'Malley
eb16421386 HDFS-16517 Distance metric is wrong for non-DN machines in 2.10. Fixed in HADOOP-16161, but
this test case adds value to ensure the two getWeight methods stay in sync.

Fixes #4091

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-28 12:52:46 -07:00
PJ Fanning
61e809b245
HADOOP-13386. Upgrade Avro to 1.9.2 (#3990)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-03-26 20:31:16 +09:00
Steve Loughran
708a0ce21b
HADOOP-13704. Optimized S3A getContentSummary()
Optimize the scan for s3 by performing a deep tree listing,
inferring directory counts from the paths returned.

Contributed by Ahmar Suhail.

Change-Id: I26ffa8c6f65fd11c68a88d6e2243b0eac6ffd024
2022-03-22 13:21:12 +00:00
Abhishek Das
da9970dd69 HADOOP-18129: Change URI to String in INodeLink to reduce memory footprint of ViewFileSystem
Fixes #3996
2022-03-17 17:25:55 -07:00
Steve Loughran
9037f9a334
HADOOP-18162. hadoop-common support for MAPREDUCE-7341 Manifest Committer
* New statistic names in StoreStatisticNames
  (for joint use with s3a committers)
* Improvements to IOStatistics implementation classes
* RateLimiting wrapper to guava RateLimiter
* S3A committer Tasks moved over as TaskPool and
  added support for RemoteIterator
* JsonSerialization.load() to fail fast if source does not exist

+ tests.

This commit is a prerequisite for the main MAPREDUCE-7341 Manifest Committer
patch.

Contributed by Steve Loughran

Change-Id: Ia92e2ab5083ac3d8d3d713a4d9cb3e9e0278f654
2022-03-17 11:20:53 +00:00
Xing Lin
8b8158f02d
HADOOP-18144: getTrashRoot in ViewFileSystem should return a path in ViewFS.
To get the new behavior, define fs.viewfs.trash.force-inside-mount-point to be true.

If the trash root for path p is in the same mount point as path p,
and one of:
* The mount point isn't at the top of the target fs.
* The resolved path of path is root (eg it is the fallback FS).
* The trash root isn't in user's target fs home directory.
get the corresponding viewFS path for the trash root and return it.
Otherwise, use <mnt>/.Trash/<user>.

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-14 11:29:48 -07:00
Mukund Thakur
672e380c4f
HADOOP-18112: Implement paging during multi object delete. (#4045)
Multi object delete of size more than 1000 is not supported by S3 and 
fails with MalformedXML error. So implementing paging of requests to 
reduce the number of keys in a single request. Page size can be configured
using "fs.s3a.bulk.delete.page.size" 

 Contributed By: Mukund Thakur
2022-03-11 13:05:45 +05:30
Gautham B A
d0fa9b5775
HADOOP-18155. Refactor tests in TestFileUtil (#4053) 2022-03-10 22:02:38 +05:30
Duo Zhang
db36747e83
HADOOP-17526 Use Slf4jRequestLog for HttpRequestLog (#4050)
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-03-10 10:15:09 +08:00
jianghuazhu
589695c6a9
HDFS-16316.Improve DirectoryScanner: add regular file check related block. (#3861) 2022-02-22 10:15:19 +08:00
Steve Loughran
cae749b076
HADOOP-18136. Verify FileUtils.unTar() handling of missing .tar files.
Contributed by Steve Loughran

Change-Id: I73af19d2e2e41f4ba686c470726a80c3903a1950
2022-02-21 17:08:56 +00:00
Xing Lin
ca8ba24051 HADOOP-18110. ViewFileSystem: Add Support for Localized Trash Root
Fixes #3956
2022-02-10 16:43:04 -08:00
Steve Loughran
efdec92cab
HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930)
Adds a new map type WeakReferenceMap, which stores weak
references to values, and a WeakReferenceThreadMap subclass
to more closely resemble a thread local type, as it is a
map of threadId to value.

Construct it with a factory method and optional callback
for notification on loss and regeneration.

 WeakReferenceThreadMap<WrappingAuditSpan> activeSpan =
      new WeakReferenceThreadMap<>(
          (k) -> getUnbondedSpan(),
          this::noteSpanReferenceLost);

This is used in ActiveAuditManagerS3A for span tracking.

Relates to
* HADOOP-17511. Add an Audit plugin point for S3A
* HADOOP-18094. Disable S3A auditing by default.

Contributed by Steve Loughran.
2022-02-10 12:31:41 +00:00
Ayush Saxena
aeae5716cc
Revert "HADOOP-18024. SocketChannel is not closed when IOException happens in Server$Listener.doAccept (#3719)"
This reverts commit 6ed01585eb.

Breaks TestIPC#testIOEOnListenerAccept
2022-02-01 14:11:25 +05:30
Viraj Jasani
4faac58841
HADOOP-18089. Test coverage for Async profiler servlets (#3913)
Reviewed-by: Akira Ajisaka <akiraaj@amazon.com>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-01-26 11:24:16 +08:00
Xing Lin
0d17b629ff
HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-01-25 19:40:18 +05:30
Viraj Jasani
f64fda0f00
HADOOP-18055. Async Profiler endpoint for Hadoop daemons (#3824)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-06 17:56:49 +08:00
jianghuazhu
43afd1753a
HDFS-16394.RPCMetrics increases the number of handlers in processing. (#3822) 2021-12-31 16:40:14 +08:00
Ashutosh Gupta
caab29ec88
HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-28 21:44:38 +09:00
Dhananjay Badaya
4483607a4e
HADOOP-13500. Synchronizing iteration of Configuration properties object (#3775)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-17 16:05:46 +09:00
Haoze Wu
6ed01585eb
HADOOP-18024. SocketChannel is not closed when IOException happens in Server$Listener.doAccept (#3719) 2021-12-08 18:48:43 +09:00