5183 Commits

Author SHA1 Message Date
Uma Maheswara Rao G
120ee793fc HDFS-15387. FSUsage#DF should consider ViewFSOverloadScheme in processPath. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 785b1def95)
2020-06-16 20:02:44 -07:00
Uma Maheswara Rao G
0b5e202614 HDFS-15321. Make DFSAdmin tool to work with ViewFileSystemOverloadScheme. Contributed by Uma Maheswara Rao G.
(cherry picked from commit ed83c865dd)
2020-06-16 16:53:38 -07:00
Uma Maheswara Rao G
8e71e85af7 HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 4734c77b4b)
2020-06-16 16:53:10 -07:00
Abhishek Das
5b248de42d HADOOP-17024. ListStatus on ViewFS root (ls "/") should list the linkFallBack root (configured target root). Contributed by Abhishek Das.
(cherry picked from commit ce4ec74453)
2020-06-16 16:52:29 -07:00
Uma Maheswara Rao G
544996c857 HDFS-15306. Make mount-table to read from central place ( Let's say from HDFS). Contributed by Uma Maheswara Rao G.
(cherry picked from commit ac4a2e11d9)
2020-06-16 16:50:57 -07:00
Vinayakumar B
534b15caf9
HADOOP-17046. Support downstreams' existing Hadoop-rpc implementations using non-shaded protobuf classes (#2026) 2020-06-12 23:20:10 +05:30
Mingliang Liu
fa723aa7f8
HADOOP-17047. TODO comment exist in trunk while related issue HADOOP-6223 is already fixed. Contributed by Rungroj Maipradit 2020-06-08 11:31:42 -07:00
Mingliang Liu
543075b845
HADOOP-17059. ArrayIndexOfboundsException in ViewFileSystem#listStatus. Contributed by hemanthboyina 2020-06-08 10:38:17 -07:00
Mike
cf84bec6e3 HADOOP-14566. Add seek support for SFTP FileSystem. (#1999)
Contributed by Mikhail Pryakhin
2020-06-03 11:38:49 +01:00
Dhiraj
910d88eeed
HADOOP-17052. NetUtils.connect() throws unchecked exception (UnresolvedAddressException) causing clients to abort (#2036)
Contributed by Dhiraj Hegde.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
2020-06-01 10:50:22 -07:00
S O'Donnell
90f57965e9 HADOOP-7002. Wrong description of copyFromLocal and copyToLocal in documentation. Contributed by Andras Bokor.
(cherry picked from commit 19f26a020e)
2020-05-29 14:49:40 +01:00
S O'Donnell
b803efbdce HADOOP-14698. Make copyFromLocals -t option available for put as well. Contributed by Andras Bokor. 2020-05-29 11:44:48 +01:00
Wanqiang Ji
9b84a637b7
HADOOP-17055. Remove residual code of Ozone (#2039)
(cherry picked from commit d9838f2d42)
2020-05-29 16:50:10 +09:00
Akira Ajisaka
6b54f259e7
HADOOP-17049. javax.activation-api and jakarta.activation-api define overlapping classes (#2027)
* Removed javax.activation-api from dependency

(cherry picked from commit 52b21de1d8)
2020-05-22 11:20:16 +09:00
Akira Ajisaka
77587ffb1e
HADOOP-17042. Hadoop distcp throws 'ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found'. Contributed by Aki Tanaka.
(cherry picked from commit 27601fc79e)
2020-05-18 15:37:22 +09:00
S O'Donnell
433aaeefa4 HDFS-15255. Consider StorageType when DatanodeManager#sortLocatedBlock(). Contributed by Lisheng Sun. 2020-05-12 15:25:05 +01:00
Akira Ajisaka
282427f6d1
HADOOP-16768. SnappyCompressor test cases wrongly assume that the compressed data is always smaller than the input data. (#2003)
(cherry picked from commit 328eae9a14)
2020-05-11 14:46:43 +09:00
Akira Ajisaka
763a79916d
HDFS-15343. TestConfiguredFailoverProxyProvider is failing. (#2001)
(cherry picked from commit c784ba370e)
2020-05-08 17:19:48 +09:00
Uma Maheswara Rao G
edf52d29f1 HDFS-15305. Extend ViewFS and provide ViewFileSystemOverloadScheme implementation with scheme configurable. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 9c8236d04d)
2020-05-06 15:13:33 -07:00
Akira Ajisaka
dfa7f160a5
Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
S O'Donnell
a34174fea3 HDFS-15285. The same distance and load nodes don't shuffle when consider DataNode load. Contributed by Lisheng Sun.
(cherry picked from commit 9ca6298a9a)
2020-04-29 16:01:48 +01:00
Mike
68d8802624 HDFS-1820. FTPFileSystem attempts to close the outputstream even when it is not initialised. (#1952)
Contributed by Mikhail Pryakhin.
2020-04-27 14:46:52 +01:00
Dhiraj
1c19107ce8
HDFS-15281. Make sure ZKFC uses dfs.namenode.rpc-address to bind to host address (#1964)
Contributed by Dhiraj Hegde.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Inigo Goiri <inigoiri@apache.org>
2020-04-25 13:06:08 -07:00
Ayush Saxena
c2769384ac HADOOP-16886. Add hadoop.http.idle_timeout.ms to core-default.xml. Contributed by Lisheng Sun. 2020-04-25 13:23:16 +05:30
Mingliang Liu
5b92d73a74
HADOOP-17001. The suffix name of the unified compression class. Contributed by bianqi 2020-04-22 12:48:39 -07:00
Steve Loughran
de9a6b4588
HADOOP-16986. S3A to not need wildfly on the classpath. (#1948)
Contributed by Steve Loughran.

This is a successor to HADOOP-16346, which enabled the S3A connector
to load the native openssl SSL libraries for better HTTPS performance.

That patch required wildfly.jar to be on the classpath. This
update:

* Makes wildfly.jar optional except in the special case that
"fs.s3a.ssl.channel.mode" is set to "openssl"

* Retains the declaration of wildfly.jar as a compile-time
dependency in the hadoop-aws POM. This means that unless
explicitly excluded, applications importing that published
maven artifact will, transitively, add the specified
wildfly JAR into their classpath for compilation/testing/
distribution.

This is done for packaging and to offer that optional
speedup. It is not mandatory: applications importing
the hadoop-aws POM can exclude it if they choose.

Change-Id: I7ed3e5948d1e10ce21276b3508871709347e113d
2020-04-20 14:42:36 +01:00
Masatake Iwasaki
0a90df76bc HADOOP-16647. Support OpenSSL 1.1.1 LTS. Contributed by Rakesh Radhakrishnan.
(cherry picked from commit 8f8be6b92a)
2020-04-04 08:16:46 +09:00
Chao Sun
e3fbdcbc14 HADOOP-16912. Emit per priority RPC queue time and processing time from DecayRpcScheduler. Contributed by Fengnan Li. 2020-03-25 10:21:20 -07:00
Toshihiro Suzuki
d353b30baf
HDFS-15215. The Timestamp for longest write/read lock held log is wrong 2020-03-24 14:50:15 -07:00
Wei-Chiu Chuang
f197f05cff
HADOOP-16661. Support TLS 1.3 (#1880) 2020-03-16 10:56:30 -07:00
Steve Loughran
c734d69a55
HADOOP-16898. Batch listing of multiple directories via an (unstable) interface
Contributed by Steve Loughran.

This moves the new API of HDFS-13616 into an interface which is implemented by
HDFS RPC filesystem client (not WebHDFS or any other connector)

This new interface, BatchListingOperations, is in hadoop-common,
so applications do not need to be compiled with HDFS on the classpath.
They must cast the FS into the interface.

instanceof can probe the client for having the new interface -the patch
also adds a new path capability to probe for this.

The FileSystem implementation is cut; tests updated as appropriate.

All new interfaces/classes/constants are marked as @unstable.

Change-Id: I5623c51f2c75804f58f915dd7e60cb2cffdac681
2020-03-09 14:51:16 +00:00
Steve Loughran
d4d4c37810
HADOOP-14630 Contract Tests to verify create, mkdirs and rename under a file is forbidden
Contributed by Steve Loughran.

Not all stores do complete validation here; in particular the S3A
Connector does not: checking up the entire directory tree to see if a path
is a file significantly slows things down.

This check does take place in S3A mkdirs(), which walks backwards up the list of
parent paths until it finds a directory (success) or a file (failure).
In practice production applications invariably create destination directories
before writing 1+ file into them -restricting the check purely to the mkdirs()
call delivers a significant speedup while implicitly including the checks.

Change-Id: I2c9df748e92b5655232e7d888d896f1868806eb0
2020-03-09 14:44:28 +00:00
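
The parent walk described for S3A mkdirs() in HADOOP-14630 can be illustrated with a minimal sketch. This is not the S3A implementation: the function name `check_ancestors` and the dictionary standing in for the object store are hypothetical, chosen only to show the "walk up until a directory (success) or a file (failure)" rule.

```python
from pathlib import PurePosixPath
from typing import Dict


def check_ancestors(path: str, entries: Dict[str, str]) -> bool:
    """Walk from `path` up towards the root.

    `entries` is a hypothetical stand-in for the store: it maps a path
    string to either "dir" or "file". Paths not present do not exist.
    Returns True once a directory (or the root) is reached; raises
    NotADirectoryError if a file is found on the way up.
    """
    current = PurePosixPath(path)
    while True:
        kind = entries.get(str(current))
        if kind == "dir":
            return True  # found an existing directory: success
        if kind == "file":
            raise NotADirectoryError(f"{current} is a file")
        if current == current.parent:
            return True  # reached the root without meeting a file
        current = current.parent


store = {"/data": "dir", "/data/file.txt": "file"}
print(check_ancestors("/data/a/b", store))
```

Calling `check_ancestors("/data/file.txt/sub", store)` raises `NotADirectoryError`, since the walk meets a file before any directory.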
Sebastian Nagel
18050bc583
HADOOP-16909 Typo in distcp counters.
Contributed by Sebastian Nagel.
2020-03-09 14:37:08 +00:00
Steve Loughran
d0a7c790c6
HADOOP-16885. Fix hadoop-commons TestCopy failure
Followup to HADOOP-16885: Encryption zone file copy failure leaks a temp file

Moving the delete() call broke a mocking test, which slipped through the review process.

Contributed by Steve Loughran.

Change-Id: Ia13faf0f4fffb1c99ddd616d823e4f4d0b7b0cbb
2020-03-03 17:25:22 +00:00
cpugputpu
5678b19b01
HADOOP-16897. Sort fields in ReflectionUtils.java.
Contributed by cpugputpu.
2020-03-02 17:53:38 +00:00
Xiaoyu Yao
0dd8956f2e
HADOOP-16885. Encryption zone file copy failure leaks a temp file
Contributed by Xiaoyu Yao.

Contains HDFS-14892. Close the output stream if createWrappedOutputStream() fails

Copying a file through the FsShell command into an HDFS encryption zone where
the caller lacks permissions leaks a temp ._COPYING file
and potentially leaves a wrapped stream unclosed.

This is a convergence of a fix for S3 meeting an issue in HDFS.

S3: a HEAD against a file can cache a 404, 
 -you must not do any existence checks, including deleteOnExit(),
  until the file is written. 

Hence: HADOOP-16490, only register files for deletion if the create worked
and the upload is not direct. 

HDFS-14892. HDFS doesn't close wrapped streams when IOEs are raised on
create() failures. Which means that an entry is retained on the NN.
-you need to register a file with deleteOnExit() even if the file wasn't
created.

This patch:

* Moves the deleteOnExit to ensure the created file gets deleted cleanly.
* Fixes HDFS to close the wrapped stream on failures.
2020-03-02 13:22:00 +00:00
Kihwal Lee
352a4ec16d HDFS-15147. LazyPersistTestCase wait logic is flawed. Contributed by Ahmed Hussein. 2020-02-26 09:33:29 -06:00
Takanobu Asanuma
5cbc4c5461 HADOOP-16841. The description of hadoop.http.authentication.signature.secret.file contains outdated information. Contributed by Xieming Li. 2020-02-25 11:08:13 +09:00
Sahil Takiar
42dfd270a1
HADOOP-16859: ABFS: Add unbuffer support to ABFS connector.
Contributed by Sahil Takiar
2020-02-24 16:28:00 +00:00
Ayush Saxena
b5698e0c33 HDFS-15176. Enable GcTimePercentage Metric in NameNode's JvmMetrics. Contributed by Jinglun. 2020-02-24 00:07:18 +05:30
Wei-Chiu Chuang
cb3f3cca01 HADOOP-16868. ipc.Server readAndProcess threw NullPointerException. Contributed by Tsz-wo Sze. 2020-02-18 21:53:08 -08:00
Ayush Saxena
ac4b556e2d HDFS-13739. Add option to disable rack local write preference. Contributed by Ayush Saxena. 2020-02-19 08:20:59 +05:30
Arpit Agarwal
0cfff16ac0
HADOOP-16833. InstrumentedLock should log lock queue time. Contributed by Stephen O'Donnell.
Change-Id: Idddff05051b6f642b88e51694b40c5bb1bef0026
2020-02-18 09:50:11 -08:00
Steve Loughran
a562942b05
HADOOP-16759. FileSystem Javadocs to list what breaks on API changes
Followup to the main openFile().withStatus() patch.
It turns out that this broke the hive builds, which
was not well appreciated.

This patch lists places to review in the hadoop codebase,
and external projects where changes are likely to cause problems.

Contributed by Steve Loughran

Change-Id: Ifac815c65b74d083cd277764b780ac2b5b0f6b36
2020-02-17 22:14:39 +00:00
Ayush Saxena
84f7638840 HADOOP-13666. Supporting rack exclusion in countNumOfAvailableNodes in NetworkTopology. Contributed by Inigo Goiri. 2020-02-18 00:43:33 +05:30
Akira Ajisaka
954930e9d9
HADOOP-16850. Support getting thread info from thread group for JvmMetrics to improve the performance. Contributed by Tao Yang. 2020-02-14 15:20:28 +09:00
Steve Loughran
56dee66770
HADOOP-16823. Large DeleteObject requests are their own Thundering Herd.
Contributed by Steve Loughran.

During S3A rename() and delete() calls, the list of objects to delete is
built up into batches of a thousand and then POSTed in a single large
DeleteObjects request.

But as the IO capacity allowed on an S3 partition may only be 3500 writes
per second *and* each entry in that POST counts as a single write, then
one of those posts alone can trigger throttling on an already loaded
S3 directory tree. Which can trigger backoff and retry, with the same
thousand entry post, and so recreate the exact same problem.

Fixes

* Page size for delete object requests is set in
  fs.s3a.bulk.delete.page.size; the default is 250.
* The property fs.s3a.experimental.aws.s3.throttling (default=true)
  can be set to false to disable throttle retry logic in the AWS
  client SDK -it is all handled in the S3A client. This
  gives more visibility into when operations are being throttled.
* Bulk delete throttling events are logged to the
  org.apache.hadoop.fs.s3a.throttled log at INFO; if this appears
  often then choose a smaller page size.
* The metric "store_io_throttled" adds the entire count of delete
  requests when a single DeleteObjects request is throttled.
* A new quantile, "store_io_throttle_rate" can track throttling
  load over time.
* DynamoDB metastore throttle resilience issues have also been
  identified and fixed. Note: the fs.s3a.experimental.aws.s3.throttling
  flag does not apply to DDB IO precisely because there may still be
  lurking issues there and it is safest to rely on the DynamoDB client
  SDK.

Change-Id: I00f85cdd94fc008864d060533f6bd4870263fd84
2020-02-13 19:09:49 +00:00
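
The paging idea in HADOOP-16823 can be sketched as follows. This is an illustration, not the S3A code: the helper name `paged` is hypothetical, and only the page sizes (1000 before, 250 as the new default) and the 3500 writes per second figure come from the commit message. With those numbers, one 1000-entry request uses roughly 29% of a partition's per-second write capacity, while a 250-entry page uses about 7%.

```python
from typing import Iterator, List


def paged(keys: List[str], page_size: int = 250) -> Iterator[List[str]]:
    """Yield successive pages of at most `page_size` keys.

    Each page would be sent as one bulk delete request; 250 mirrors the
    default of fs.s3a.bulk.delete.page.size named in the commit message.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(keys), page_size):
        yield keys[start:start + page_size]


# Hypothetical object keys, purely for illustration.
keys = [f"dir/object-{i}" for i in range(1000)]
pages = list(paged(keys))
print(len(pages), [len(p) for p in pages])
```

A smaller page means a throttled request retries fewer entries, which is why the commit message suggests reducing the page size if throttling is logged often.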
Stephen O'Donnell
d7c136b9ed HDFS-15150. Introduce read write lock to Datanode. Contributed by Stephen O'Donnell.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2020-02-11 08:00:15 -08:00
Jan Hentschel
cc8ae59104
HADOOP-16851. Removed unused import in Configuration
Contributed by Jan Hentschel.
2020-02-11 11:51:45 +00:00
testfixer
d36cd37e60
HADOOP-16847. Test can fail if HashSet iterates in a different order.
Contributed by Testfixer
2020-02-11 11:22:07 +00:00