Commit Graph

5277 Commits

Author SHA1 Message Date
Mehakmeet Singh
d20b2deac3
HADOOP-17272. ABFS Streams to support IOStatistics API (#2604)
Contributed by Mehakmeet Singh.

Change-Id: I3445dec84b9b9e43bb1e41f709944ea05416bd74
2021-01-22 14:21:31 +00:00
He Xiaoqiao
26cd02fb29
HADOOP-16947. Stale record should be removed when MutableRollingAverages generating aggregate data. Contributed by Haibin Huang. 2021-01-19 23:29:45 +08:00
Steve Loughran
56576f080b
HADOOP-17451. IOStatistics test failures in S3A code. (#2594)
Caused by HADOOP-16830 and HADOOP-17271.

Fixes tests which fail intermittently depending on configuration.
In the case of the HugeFile tests, bulk runs with existing
FS instances meant statistic probes sometimes ended up probing the
statistics of a previous FS.

Contributed by Steve Loughran.

Change-Id: I65ba3f44444e59d298df25ac5c8dc5a8781dfb7d
2021-01-14 13:21:20 +00:00
Steve Loughran
57abfae136
HADOOP-17450. Add Public IOStatistics API. (#2577)
This is the API and implementation classes of HADOOP-16830,
which allows callers to query IO object instances
(filesystems, streams, remote iterators, ...) and other classes
for statistics on their I/O usage: operation count and min/max/mean
durations.

New Packages

org.apache.hadoop.fs.statistics.
  Public API, including:
    IOStatisticsSource
    IOStatistics
    IOStatisticsSnapshot (serializable to Java objects and JSON)
    +helper classes for logging and integration
    BufferedIOStatisticsInputStream
       implements IOStatisticsSource and StreamCapabilities
    BufferedIOStatisticsOutputStream
       implements IOStatisticsSource, Syncable and StreamCapabilities

org.apache.hadoop.fs.statistics.impl
  Implementation classes for internal use.

org.apache.hadoop.util.functional
  functional programming support for RemoteIterators and
  other operations which raise IOEs; all wrapper classes
  implement and propagate IOStatisticsSource

Contributed by Steve Loughran.

Change-Id: If56e8db2981613ff689c39239135e44feb25f78e
2021-01-14 13:20:17 +00:00
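The HADOOP-17450 message above describes objects that can be queried for operation counts and min/max/mean durations. The following is only a rough sketch of that pattern in plain Java. `StatsSource`, `TrackedReader` and `StatsProbe` are hypothetical stand-ins invented for illustration; they are not the classes in org.apache.hadoop.fs.statistics, and the real interfaces may differ.

```java
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a statistics source; NOT the Hadoop interface. */
interface StatsSource {
  /** Per-operation duration statistics, keyed by operation name. */
  Map<String, LongSummaryStatistics> durations();
}

/** Hypothetical I/O object that records how long each operation took. */
final class TrackedReader implements StatsSource {
  private final Map<String, LongSummaryStatistics> stats = new ConcurrentHashMap<>();

  /** Record one completed operation; a sketch only, not safe for concurrent updates to one key. */
  void record(String op, long millis) {
    stats.computeIfAbsent(op, k -> new LongSummaryStatistics()).accept(millis);
  }

  @Override
  public Map<String, LongSummaryStatistics> durations() {
    return stats;
  }
}

/** Probes an arbitrary object and summarizes its statistics if it publishes any. */
final class StatsProbe {
  static String describe(Object o, String op) {
    if (!(o instanceof StatsSource)) {
      return ""; // the object does not publish statistics
    }
    LongSummaryStatistics s = ((StatsSource) o).durations().get(op);
    if (s == null) {
      return ""; // nothing recorded for this operation
    }
    return op + " count=" + s.getCount()
        + " min=" + s.getMin()
        + " max=" + s.getMax()
        + " mean=" + s.getAverage();
  }
}
```

A caller holding any object can pass it to `StatsProbe.describe(obj, "read")`; objects that do not implement the stand-in interface simply yield an empty string, which mirrors the idea of probing an instance for statistics rather than requiring them.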
stack
b74d642220 Revert "HADOOP-16524. Reloading SSL keystore for both DataNode and NameNode (#2470)"
This reverts commit f7d2a5d7a52c41cba14b17eb0c9189d983f202cf.
2021-01-11 08:56:24 -08:00
He Xiaoqiao
e95ee67632
Make upstream aware of 3.2.2 release. 2021-01-09 18:07:10 +08:00
Michael Stack
f046ed27d6
HADOOP-16524. Reloading SSL keystore for both DataNode and NameNode (#2470) (#2609)
Co-authored-by: Borislav Iordanov <biordanov@apple.com>
Signed-off-by: stack <stack@apache.org>

Co-authored-by: Borislav Iordanov <borislav.iordanov@gmail.com>
Co-authored-by: Borislav Iordanov <biordanov@apple.com>
2021-01-08 13:45:44 -08:00
Ahmed Hussein
18e2835766 HADOOP-17408. Optimize NetworkTopology sorting block locations. (#2601). Contributed by Ahmed Hussein and Daryn Sharp.
(cherry picked from commit 77435a025e)
2021-01-08 19:29:14 +00:00
Steve Loughran
a2ae0d7079
Revert "HADOOP-17430. Restore ability to set Text to empty byte array (#2545)"
This reverts commit 9e85eb9a2e.

Change-Id: Id1ac803b29931b0f643cb37bbe58534726c36f1e
2021-01-08 10:50:28 +00:00
dgzdot
9e85eb9a2e HADOOP-17430. Restore ability to set Text to empty byte array (#2545)
Contributed by gaozhan.ding

Change-Id: Ib2ad9120c15c46a3fa2de9e3206875cbbc2363c2
2021-01-05 21:15:14 +00:00
Wei-Chiu Chuang
94c126cc9e HDFS-15719. [Hadoop 3] Both NameNodes can crash simultaneously due to the short JN socket timeout (#2533)
(cherry picked from commit 2b4febcf57)
2021-01-04 20:56:18 -08:00
Wei-Chiu Chuang
6340ac857b HADOOP-17371. Bump Jetty to the latest version 9.4.34. Contributed by Wei-Chiu Chuang. (#2453)
(cherry picked from commit 66ee0a6df0)
2021-01-04 11:28:26 -08:00
He Xiaoqiao
cfcd17ffe7
HDFS-15751. Add documentation for msync() API to filesystem.md. Contributed by Konstantin V Shvachko.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Chao Sun <sunchao@apache.org>
2021-01-03 16:32:24 +08:00
Gautham B A
98fe00e208 HDFS-15699 Remove lz4 references in vcxproj (#2498) 2020-12-29 13:34:54 -08:00
Liang-Chi Hsieh
87064df1f2 HADOOP-17292. Using lz4-java in Lz4Codec (#2350)
Contributed by Liang-Chi Hsieh.
2020-12-29 13:17:26 -08:00
Masatake Iwasaki
b8a4361d7b HADOOP-17270. Fix testCompressorDecompressorWithExeedBufferLimit to c… (#2311) 2020-12-29 13:11:51 -08:00
He Xiaoqiao
3a860e876e HADOOP-17068. Client fails forever when namenode ipaddr changed. Contributed by Sean Chow.
(cherry picked from commit fa14e4bc00)
2020-12-15 14:01:48 -08:00
Chao Sun
81e533de8f
HADOOP-16080. hadoop-aws does not work with hadoop-client-api. Contributed by Chao Sun (#2522) 2020-12-12 09:37:13 -08:00
Jim Brennan
e5f11ea5b2 HADOOP-13571. ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0. Contributed by Eric Badger
(cherry picked from commit 6de1a8eb67)
2020-12-11 20:19:08 +00:00
Akira Ajisaka
71bda1a2e8
HADOOP-17138. Fix spotbugs warnings surfaced after upgrade to 4.0.6. (#2155) (#2538)
(cherry picked from commit 1b29c9bfee)

Co-authored-by: Masatake Iwasaki <iwasakims@apache.org>
2020-12-11 13:58:02 +09:00
Ayush Saxena
8378ab9f92 HADOOP-17288. Use shaded guava from thirdparty. Contributed by Ayush Saxena. #2505 2020-12-10 05:50:55 +05:30
Hui Fei
cb2dce30d4 HDFS-15240. Erasure Coding: dirty buffer causes reconstruction block error. Contributed by HuangTao. 2020-12-08 10:40:14 +08:00
Jim Brennan
5bfb97bc7d HADOOP-17392. Remote exception messages should not include the exception class (#2486). Contributed by Daryn Sharp and Ahmed Hussein 2020-12-03 17:59:01 +00:00
Andrea Scarpino
c5b9c5dfe5
YARN-10511. Update yarn.nodemanager.env-whitelist value in docs (#2512)
Reviewed-by: Adam Antal <adamantal@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 9170eb566b)
2020-12-04 00:16:45 +09:00
Steve Loughran
1eeb9d9d67
HADOOP-17318. Support concurrent S3A commit jobs with same app attempt ID. (#2399)
See also [SPARK-33402]: Jobs launched in same second have duplicate MapReduce JobIDs

Contributed by Steve Loughran.

Change-Id: Iae65333cddc84692997aae5d902ad8765b45772a
2020-11-26 17:22:56 +00:00
Steve Loughran
1ef34d0819
HADOOP-17313. FileSystem.get to support slow-to-instantiate FS clients. (#2396)
This adds a semaphore to throttle the number of FileSystem instances which
can be created simultaneously, set in "fs.creation.parallel.count".

This is designed to reduce the impact of many threads in an application calling
FileSystem.get() on a filesystem which takes time to instantiate, for example
an object store where HTTPS connections are set up during initialization.
Many threads trying to do this at once may create spurious delays by contending
for access to synchronized blocks; simply limiting the parallelism
reduces the contention, and so speeds up all threads trying to access
the store.

The default value, 64, is larger than is likely to deliver any speedup, but
it does mean that there should be no adverse effects from the change.

If a service appears to be blocking on all threads initializing connections to
abfs, s3a or another store, try a smaller (possibly significantly smaller) value.

Contributed by Steve Loughran.

Change-Id: I57161b026f28349e339dc8b9d74f6567a62ce196
2020-11-25 14:55:29 +00:00
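The HADOOP-17313 message above describes using a semaphore to cap how many FileSystem instances are created at once. A minimal sketch of that throttling idea, using only the JDK, might look like the following. `ThrottledCreator` and `ThrottleDemo` are hypothetical names for illustration; this is not the code in Hadoop's FileSystem class, and the permit count here merely plays the role of the "fs.creation.parallel.count" setting.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Illustrative only: limits how many callers may run a slow factory at the same time. */
final class ThrottledCreator<T> {
  private final Semaphore permits;
  private final AtomicInteger active = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  ThrottledCreator(int parallelCount) {
    this.permits = new Semaphore(parallelCount);
  }

  /** Blocks until a permit is free, then runs the factory. */
  T create(Supplier<T> slowFactory) {
    permits.acquireUninterruptibly();
    try {
      // Track the highest number of factories running together.
      peak.accumulateAndGet(active.incrementAndGet(), Math::max);
      return slowFactory.get();
    } finally {
      active.decrementAndGet();
      permits.release();
    }
  }

  int peakConcurrency() {
    return peak.get();
  }
}

/** Starts several threads against one creator and reports the observed peak. */
final class ThrottleDemo {
  static int peakWith(int parallelCount, int threads) {
    ThrottledCreator<String> creator = new ThrottledCreator<>(parallelCount);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> creator.create(() -> {
        try {
          Thread.sleep(20); // stands in for slow client initialization
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        return "client";
      }));
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return creator.peakConcurrency();
  }
}
```

Calling `ThrottleDemo.peakWith(2, 8)` starts eight threads but never lets more than two run the factory together, which is the effect the commit message attributes to a smaller parallel count.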
Eric Payne
8459f1d955 HADOOP-17346. Fair call queue is defeated by abusive service principals. Contributed by Ahmed Hussein (ahussein). 2020-11-23 20:37:33 +00:00
Jim Brennan
e24a6b550e HADOOP-17367. Add InetAddress api to ProxyUsers.authorize (#2449). Contributed by Daryn Sharp and Ahmed Hussein 2020-11-19 21:26:47 +00:00
Steve Loughran
4687c25389 HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2310)
This fixes the S3Guard/Directory Marker Retention integration so that when
fs.s3a.directory.marker.retention=keep, failures during multipart delete
are handled correctly, as are incremental deletes during
directory tree operations.

In both cases, when a directory marker with children is deleted from
S3, the directory entry in S3Guard is not deleted, because it is still
critical to representing the structure of the store.

Contributed by Steve Loughran.

Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
2020-11-18 12:30:43 +00:00
Ahmed Hussein
df4edb99f7 HADOOP-17360. Log the remote address for authentication success (#2441)
Co-authored-by: ahussein <ahmed.hussein@verizonmedia.com>
(cherry picked from commit 1ea3f74246)
2020-11-16 21:48:37 +00:00
Ahmed Hussein
75ca0c0f23 HADOOP-17362. reduce RPC calls doing ls on HAR file (#2444). Contributed by Daryn Sharp and Ahmed Hussein
(cherry picked from commit ebe1d1fbf7)
2020-11-13 21:14:47 +00:00
Ahmed Hussein
23fe3bdab3 HADOOP-17358. Improve excessive reloading of Configurations (#2436)
Co-authored-by: ahussein <ahmed.hussein@verizonmedia.com>
(cherry picked from commit 71071e5c0f)
2020-11-12 10:35:28 -08:00
Doroszlai, Attila
47131cdf7c
HADOOP-17365. Contract test for renaming over existing file is too lenient (#2447)
Contributed by Attila Doroszlai.

Change-Id: I21c29256b52449b7fea335704b3afa02e39c6a39
2020-11-11 21:21:11 +00:00
Stephen Jung
0712505b59 HADOOP-17096. Fix ZStandardCompressor input buffer offset (#2104). Contributed by Stephen Jung (Stripe).
(cherry picked from commit 45434c93e8)
2020-11-10 11:41:21 -08:00
Steve Loughran
7cb5325dda HADOOP-17340. TestLdapGroupsMapping failing -string mismatch in exception validation. (#2427). Contributed by Steve Loughran. 2020-11-07 17:05:23 +05:30
hchaverr
043cca01b1 HDFS-15623. Respect configured values of rpc.engine (#2403) Contributed by Hector Chaverri.
(cherry picked from commit 6eacaffeea)
2020-11-06 14:31:31 -08:00
Eric Badger
c6fee0a2c8 HADOOP-17342. Creating a token identifier should not do kerberos name
resolution. Contributed by Jim Brennan.

(cherry picked from commit af389d9897)
2020-11-05 21:56:46 +00:00
Jim Brennan
41d58d190d Revert "HADOOP-17306. RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09 (#2387)"
This reverts commit e21b81276e.
2020-11-05 17:31:39 +00:00
Wei-Chiu Chuang
cfa0986d00 Revert "HADOOP-17255. JavaKeyStoreProvider fails to create a new key if the keystore is HDFS. (#2291)"
This reverts commit dd1634ec3b.
2020-11-04 16:18:23 -08:00
Akira Ajisaka
dd1634ec3b HADOOP-17255. JavaKeyStoreProvider fails to create a new key if the keystore is HDFS. (#2291)
Reviewed-by: Steve Loughran <stevel@cloudera.com>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 7f5caca04c)
2020-11-03 11:22:48 -08:00
Sunil G
91a3d298b9 HADOOP-17329. mvn site command fails due to MetricsSystemImpl changes. Contributed by Xiaoqiao He.
(cherry picked from commit f17e067d52)
2020-10-29 07:20:46 +05:30
Ayush Saxena
af5f90623c HADOOP-17328. LazyPersist Overwrite fails in direct write mode. (#2413)
(cherry picked from commit 872440610f)
2020-10-27 01:40:25 +09:00
Vinayakumar B
e21b81276e
HADOOP-17306. RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09 (#2387) 2020-10-23 11:34:14 +05:30
Takanobu Asanuma
0bb1f0df27 HDFS-15639. [JDK 11] Fix Javadoc errors in hadoop-hdfs-client. (#2394)
(cherry picked from commit 30f06e0c74)
2020-10-20 19:12:26 +09:00
Ayush Saxena
54c40cbf49
HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same (#2383)
Contributed by Gabor Bota.
2020-10-17 01:34:01 +05:30
Konstantin V Shvachko
b6423d2780 HDFS-15567. [SBN Read] HDFS should expose msync() API to allow downstream applications call it explicitly. Contributed by Konstantin V Shvachko.
(cherry picked from commit b3786d6c3c)
2020-10-12 17:38:42 -07:00
Jinglun
44ff4c1058
HADOOP-17021. Add concat fs command (#1993)
Contributed by Jinglun

Change-Id: Ia10ad2205ed0f3594c391ee78f7df4c3c31c796d
2020-10-08 10:36:40 +01:00
Mukund Thakur
475dba1ddf
HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem (#2354)
Contains HADOOP-17300: FileSystem.DirListingIterator.next() call should
throw NoSuchElementException

Contributed by Mukund Thakur

Change-Id: I4e7e5c6e295525db9e2de6f416f32bbb81e146d3
2020-10-07 14:00:23 +01:00
Liang-Chi Hsieh
8f60a90688 HADOOP-17125. Use snappy-java in SnappyCodec (#2297)
This switches the SnappyCodec to use the snappy-java library, rather than the native one.

To use the codec, snappy-java.jar (from org.xerial.snappy) needs to be on the classpath.

This comes in as an Avro dependency, so it is already on the hadoop-common classpath,
as well as in hadoop-common/lib.
The version used is now managed in the hadoop-project POM; initially 1.1.7.7.

Contributed by DB Tsai and Liang-Chi Hsieh

Change-Id: Id52a404a0005480e68917cd17f0a27b7744aea4e
2020-10-06 17:15:17 +01:00
Karen Coppage
43c9959b3a
HADOOP-17267. Add debug-level logs in FileSystem.close() (#2321)
When a filesystem is closed, the FileSystem log will, at debug level,
log the method calling close/closeAll.

At trace level: the full calling stack.

Contributed by Karen Coppage.

Change-Id: I1444f065c171fd31d42b497c92ba4517969f67f0
2020-09-29 16:09:14 +01:00
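The HADOOP-17267 message above describes logging the caller of close() at debug level and the full calling stack at trace level. One way to sketch that idea with only the JDK is shown below. `CloseLogging` and `CloseLoggingDemo` are hypothetical names, and java.util.logging's FINE and FINEST levels are used here as assumed stand-ins for debug and trace; this is not the code in Hadoop's FileSystem class.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: reports who called close(), with more detail at finer log levels. */
final class CloseLogging {
  private static final Logger LOG = Logger.getLogger("CloseLogging");

  /** Returns "Class.method" for the frame that invoked the method where the marker was created. */
  static String callerOf(Throwable marker) {
    StackTraceElement[] stack = marker.getStackTrace();
    // stack[0] is the method that created the marker; stack[1] is its caller.
    if (stack.length < 2) {
      return "unknown";
    }
    return stack[1].getClassName() + "." + stack[1].getMethodName();
  }

  /** Stand-in for a close() method; returns the caller so the result can be inspected. */
  static String close() {
    // A real implementation would likely build this only when logging is enabled.
    Throwable marker = new Throwable("close() call site");
    String caller = callerOf(marker);
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("close() called by " + caller); // debug-like level: just the caller
    }
    if (LOG.isLoggable(Level.FINEST)) {
      LOG.log(Level.FINEST, "close() full calling stack", marker); // trace-like level
    }
    return caller;
  }
}

/** Hypothetical application code that closes the resource. */
final class CloseLoggingDemo {
  static String shutdown() {
    return CloseLogging.close();
  }
}
```

Here `CloseLoggingDemo.shutdown()` returns a string ending in `CloseLoggingDemo.shutdown`, the frame that called close(). With the default logging configuration neither message is printed, since both levels are finer than INFO.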