Commit Graph

2170 Commits

Author SHA1 Message Date
Masatake Iwasaki
9e2936f8d1
HADOOP-17424. Replace HTrace with No-Op tracer (#3520)
(cherry picked from commit 1a205cc3ad)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTracing.java

Co-authored-by: Siyao Meng <50227127+smengcl@users.noreply.github.com>
2021-10-12 00:07:09 +09:00
Viraj Jasani
77ee5a4266
HADOOP-17950. Provide replacement for deprecated APIs of commons-io IOUtils (#3515)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 8071dbb9c6)
2021-10-07 11:00:19 +09:00
Ahmed Hussein
2cdc6a245d HADOOP-17930. implement non-guava Precondition checkState (#3522)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit c36f9402dc)
2021-10-07 10:57:20 +09:00
Mehakmeet Singh
aee975a136
HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706)
This (big!) patch adds support for client side encryption in AWS S3,
with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before
use and consider it unstable.

S3-CSE is enabled in the existing configuration option
"fs.s3a.server-side-encryption-algorithm":

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although
you can still enable a default SSE option in the S3 console.

* Filesystem list/get status operations subtract 16 bytes from the length
  of all files >= 16 bytes long to compensate for the padding which CSE
  adds.
* The SDK always warns about the specific algorithm chosen being
  deprecated. It is critical to use this algorithm for ranged
  GET requests to work (i.e. random IO). Ignore the warning.
* Unencrypted files CANNOT BE READ.
  The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now
  written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.
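
The length adjustment in the first bullet can be sketched as follows. This is
an illustration only, not the S3A code: the class and method names
(CsePaddingSketch, visibleLength) are hypothetical, and only the 16-byte
figure and the ">= 16 bytes" threshold come from the message above.

```java
/** Hypothetical sketch of the length adjustment described above; not the S3A code. */
final class CsePaddingSketch {
    /** Padding, in bytes, that the commit message says CSE adds to each file. */
    static final long CSE_PADDING_LENGTH = 16;

    /** Returns the length to report for a file whose stored length is given. */
    static long visibleLength(long storedLength) {
        // Files shorter than the padding length are reported unchanged.
        return storedLength >= CSE_PADDING_LENGTH
            ? storedLength - CSE_PADDING_LENGTH
            : storedLength;
    }

    public static void main(String[] args) {
        System.out.println(visibleLength(1040)); // prints 1024
        System.out.println(visibleLength(10));   // prints 10
    }
}
```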

Contributed by Mehakmeet Singh

Change-Id: Ie1a27a036a39db66a67e9c6d33bc78d54ea708a0
2021-10-05 11:37:41 +01:00
Ahmed Hussein
31b44c519c
HADOOP-17929. implement non-guava Precondition checkArgument (#3473)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 0c498f21de)
2021-10-01 16:49:07 +08:00
Chao Sun
6931b70a00
HADOOP-17936. Fix test failure after reverting HADOOP-16878 from branch-3.3 (#3478) 2021-09-27 13:56:44 -07:00
Chao Sun
ff26a7700d Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same (#2383)"
This reverts commit 54c40cbf49.
2021-09-23 15:04:27 -07:00
Mehakmeet Singh
8e5620cd9e
HADOOP-17195. ABFS: OutOfMemory error while uploading huge files (#3446)
Addresses the problem of processes running out of memory when
there are many ABFS output streams queuing data to upload,
especially when the network upload bandwidth is less than the rate
at which data is generated.

ABFS output streams now buffer their blocks of data to
"disk", "bytebuffer" or "array", as set in
"fs.azure.data.blocks.buffer".

When buffering via disk, the location for temporary storage
is set in "fs.azure.buffer.dir".

For safe scaling: use "disk" (default); for performance, when
confident that upload bandwidth will never be a bottleneck,
experiment with the memory options.

The number of blocks a single stream can have queued for uploading
is set in "fs.azure.block.upload.active.blocks".
The default value is 20.
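
As a rough sketch, the three options above could be collected like this. A
plain map stands in for a Hadoop Configuration so that the example is
self-contained; the class name and the buffer directory path are assumptions,
while the option names and the "disk" and 20 defaults come from the message
above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: gathers the ABFS upload options named above in a plain map. */
final class AbfsUploadOptionsSketch {
    /** Options for the "safe scaling" case; the directory path is an assumed example. */
    static Map<String, String> safeScalingOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // "disk" is the default buffer type according to the commit message.
        options.put("fs.azure.data.blocks.buffer", "disk");
        // Assumed location for temporary block storage; pick a real local directory.
        options.put("fs.azure.buffer.dir", "/tmp/abfs-buffer");
        // Default number of blocks one stream may have queued for upload.
        options.put("fs.azure.block.upload.active.blocks", "20");
        return options;
    }

    public static void main(String[] args) {
        safeScalingOptions().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real deployment the same key/value pairs would go into core-site.xml or
be set on a Configuration object.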

Contributed by Mehakmeet Singh.
2021-09-22 11:19:16 +01:00
Neil
9700d98eac
HADOOP-17893. Improve PrometheusSink for Namenode TopMetrics (#3426)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit ae2c5ccfcf)
2021-09-21 10:44:51 +09:00
Steve Loughran
9188fa8cce
HADOOP-17126. implement non-guava Precondition checkNotNull
This adds a new class org.apache.hadoop.util.Preconditions which is

* @Private/@Unstable
* Intended to allow us to move off Google Guava
* Is designed to be trivially backportable
  (i.e. contains no references to guava classes internally)

Please use this instead of the guava equivalents, where possible.
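
A minimal sketch of what guava-free precondition helpers can look like is
given below. It is not the org.apache.hadoop.util.Preconditions source: the
class name PreconditionsSketch and the exact signatures are assumptions,
chosen to mirror the exception types Guava uses for the three checks that
appear in this log (checkNotNull, checkArgument, checkState).

```java
/** Hypothetical guava-free precondition helpers; not the actual Hadoop class. */
final class PreconditionsSketch {
    private PreconditionsSketch() {
    }

    /** Returns the reference if it is non-null, otherwise throws NullPointerException. */
    static <T> T checkNotNull(T reference, String message) {
        if (reference == null) {
            throw new NullPointerException(message);
        }
        return reference;
    }

    /** Throws IllegalArgumentException if a caller-supplied argument is invalid. */
    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    /** Throws IllegalStateException if the object is in the wrong state for the call. */
    static void checkState(boolean expression, String message) {
        if (!expression) {
            throw new IllegalStateException(message);
        }
    }

    public static void main(String[] args) {
        String path = checkNotNull("/user/data", "path must not be null");
        checkArgument(!path.isEmpty(), "path must not be empty");
        System.out.println("checks passed for " + path);
    }
}
```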

Contributed by: Ahmed Hussein

Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e
2021-09-17 11:06:59 +01:00
Adam Binford
59a955dfa0
HADOOP-17804. Expose prometheus metrics only after a flush and dedupe with tag values (#3369)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 4ced012f33)
2021-09-09 16:51:04 +09:00
Masatake Iwasaki
76393e1359 HADOOP-17899. Avoid using implicit dependency on junit-jupiter-api. (#3399)
(cherry picked from commit ce7a5bfbd3)
2021-09-08 09:11:39 +00:00
Yellow Flash
09e8e5c5cb
HADOOP-17870. Http Filesystem to qualify relative paths. (#3338)
Contributed by Yellowflash

Change-Id: I217da06a1a2e5c0ca2b324f8e21baa0846f64858
2021-09-07 10:54:35 +01:00
Chris Nauroth
cc90b4f987 HADOOP-15129. Datanode caches namenode DNS lookup failure and cannot startup (#3348)
Co-authored-by: Karthik Palaniappan

Change-Id: Id079a5319e5e83939d5dcce5fb9ebe3715ee864f
2021-09-03 18:48:07 +00:00
jianghuazhu
7c663043b2
HDFS-16173. Improve CopyCommands#Put#executor queue configurability. (#3302)
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
Reviewed-by: Hui Fei <ferhui@apache.org>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 4c94831364)
2021-08-27 12:06:26 +08:00
jianghuazhu
2b2f8f575b
HDFS-16175. Improve the configurable value of Server#PURGE_INTERVAL_NANOS. (#3307)
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit ad54f5195c)
2021-08-25 17:35:50 +08:00
Bryan Beaudreault
2fda130260 HADOOP-17837: Add unresolved endpoint value to UnknownHostException (ADDENDUM) (#3276)
(cherry picked from commit b0b867e977)
2021-08-06 21:57:46 +05:30
Bryan Beaudreault
7659b62682
HADOOP-17837: Add unresolved endpoint value to UnknownHostException (#3272)
(cherry picked from commit 5e54d92e6e)
2021-08-06 17:32:01 +08:00
Steve Loughran
26514b6534 HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240)
This patch cuts down the size of directory trees used for
distcp contract tests against object stores, so making
them much faster against distant/slow stores.

On abfs, the test only runs with -Dscale (as was the case for s3a already),
and has the larger scale test timeout.

After every test case, the FileSystem IOStatistics are logged,
to provide information about what IO is taking place and
what its performance is.

There are some test cases which upload files of 1+ MiB; you can
increase the size of the upload in the option
"scale.test.distcp.file.size.kb".
Set it to zero and the large file tests are skipped.

Contributed by Steve Loughran.
2021-08-02 12:58:37 +01:00
Petre Bogdan Stolojan
f2cec5cb88
HADOOP-17139 Re-enable optimized copyFromLocal implementation in S3AFileSystem (#3101)
This work
* Defines the behavior of FileSystem.copyFromLocal in filesystem.md
* Adds a high-performance implementation of copyFromLocalOperation
  for S3
* Adds a contract test for the operation: AbstractContractCopyFromLocalTest
* Implements the contract tests for Local and S3A FileSystems

Contributed by: Bogdan Stolojan

Change-Id: I25d502102775c3626c4264e5a14c649879730050
2021-08-02 11:58:36 +01:00
Viraj Jasani
ec3311975c
HADOOP-16290. Enable RpcMetrics units to be configurable (#3198)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit e1d00addb5)
2021-07-20 14:56:28 +08:00
Abhishek Das
450dae7383 HADOOP-17028. ViewFS should initialize mounted target filesystems lazily. Contributed by Abhishek Das (#2260)
(cherry picked from commit 1dd03cc4b5)
2021-07-13 18:23:27 -07:00
Rafal Wojdyla
e3fb63f33f
HADOOP-17402. Add GCS config to the core-site (#2638)
Contributed by Rafal Wojdyla
2021-07-07 22:43:31 +01:00
liangxs
24b780820c HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout (#3080)
(cherry picked from commit a5db6831bc)
2021-07-07 09:41:11 +08:00
Viraj Jasani
b8a98e4f82 HDFS-16075. Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects (#3115)
Reviewed-by: Hui Fei <ferhui@apache.org>
(cherry picked from commit c488abbc79)
2021-06-21 10:28:05 +09:00
Takanobu Asanuma
25138c98bf HADOOP-17760. Delete hadoop.ssl.enabled and dfs.https.enable from docs and core-default.xml (#3099)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 9e7c7ad129)
2021-06-17 10:00:36 +09:00
Steve Loughran
4ac9123619
HADOOP-17631. Configuration ${env.VAR:-FALLBACK} to eval FALLBACK when restrictSystemProps=true (#2977)
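
The behaviour named in the title can be sketched as below. This is an
illustration under stated assumptions, not the Configuration code: the class
EnvFallbackSketch, its expand method and the "restricted" flag are
hypothetical stand-ins, and the sketch handles only the ":-" form, using the
fallback when the variable is unset, empty, or not readable because lookups
are restricted.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of ${env.VAR:-FALLBACK} expansion; not the Hadoop Configuration code. */
final class EnvFallbackSketch {
    // Matches ${env.NAME:-FALLBACK}; group 1 is the variable name, group 2 the fallback.
    private static final Pattern ENV_REF =
        Pattern.compile("\\$\\{env\\.([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\\}");

    /**
     * Expands every reference in value. When restricted is true the environment
     * map is never consulted, so the fallback is always used.
     */
    static String expand(String value, Map<String, String> env, boolean restricted) {
        Matcher matcher = ENV_REF.matcher(value);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String fromEnv = restricted ? null : env.get(matcher.group(1));
            String replacement =
                (fromEnv == null || fromEnv.isEmpty()) ? matcher.group(2) : fromEnv;
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = Collections.singletonMap("DATA_DIR", "/data");
        System.out.println(expand("${env.DATA_DIR:-/tmp}/logs", env, false)); // prints /data/logs
        System.out.println(expand("${env.DATA_DIR:-/tmp}/logs", env, true));  // prints /tmp/logs
    }
}
```
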
Contributed by Steve Loughran.

Change-Id: I9b82109eddeb659c01896152cf603d458e2a04cd
2021-06-08 22:05:00 +01:00
Steve Loughran
464bbd5b7c
HADOOP-17511. Add audit/telemetry logging to S3A connector (#2807)
The S3A connector supports
"an auditor", a plugin which is invoked
at the start of every filesystem API call,
and whose issued "audit span" provides a context
for all REST operations against the S3 object store.

The standard auditor sets the HTTP Referer header
on the requests with information about the API call,
such as process ID, operation name, path,
and even job ID.

If the S3 bucket is configured to log requests, this
information will be preserved there and so can be used
to analyze and troubleshoot storage IO.

Contributed by Steve Loughran.

Change-Id: Ic0a105c194342ed2d529833ecc42608e8ba2f258
2021-05-25 12:55:38 +01:00
Vinayakumar B
dbf1ef4aff
HDFS-15790. Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist (#2767)
(cherry picked from commit 2bbeae3240)

 Conflicts:
	hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java
2021-05-24 18:00:38 +08:00
Wei-Chiu Chuang
86c28f0639
Revert "HADOOP-17669. Backport HADOOP-17079, HADOOP-17505 to branch-3.3 (#2959)"
This reverts commit 4ffe5eb1dd.
2021-05-24 17:37:18 +08:00
Wei-Chiu Chuang
4ffe5eb1dd
HADOOP-17669. Backport HADOOP-17079, HADOOP-17505 to branch-3.3 (#2959)
* HADOOP-17079. Optimize UGI#getGroups by adding UGI#getGroupsSet.

Co-authored-by: Wei-Chiu Chuang <weichiu@apache.org>
Change-Id: I0f31409923ece24a82dfba4c4610d8a38c52d9fb

* HADOOP-17505. public interface GroupMappingServiceProvider needs default impl for getGroupsSet() (#2661). Contributed by Vinayakumar B.

(cherry picked from commit c4c0683dff)

Co-authored-by: Xiaoyu Yao <xyao@apache.org>
Co-authored-by: Vinayakumar B <vinayakumarb@apache.org>
2021-05-17 18:57:46 -07:00
Xiaoyu Yao
3f9c9ccf46
HADOOP-17284. Support BCFKS keystores for Hadoop Credential Provider.… (#3010)
* HADOOP-17284. Support BCFKS keystores for Hadoop Credential Provider. (#2334)

(cherry picked from commit 4c5ad57818)
2021-05-13 16:57:58 -07:00
Mike
0f12f3e125
HADOOP-17036. TestFTPFileSystem failing as ftp server dir already exists.
Contributed by Mikhail Pryakhin.

(cherry picked from commit 017d24e970)
2021-05-07 14:20:29 +09:00
Viraj Jasani
be34c1222a HADOOP-11616. Remove workaround for Curator's ChildReaper requiring Guava 15+ (#2973)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit b93e448f9a)
2021-05-06 04:53:06 +09:00
Takanobu Asanuma
65bf544118 HADOOP-16954. Add -S option in "Count" command to show only Snapshot Counts. Contributed by hemanthboyina.
(cherry picked from commit b89d875f7b)
2021-05-04 17:44:34 +01:00
kishendas
98aa4fc32c
HADOOP-17657: implement StreamCapabilities in SequenceFile.Writer and fall back to flush, if hflush is not supported (#2949)
Co-authored-by: Kishen Das <kishen@cloudera.com>
Reviewed-by: Steve Loughran <stevel@apache.org>
(cherry picked from commit e571025f5b)
2021-05-04 16:35:00 +08:00
Akira Ajisaka
72355c7b6e
HADOOP-17630. [JDK 15] TestPrintableString fails due to Unicode 13.0 support. (#2890)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 156ecc89be)
2021-04-13 17:10:00 +09:00
touchida
dca2bf9dd5 HDFS-15759. EC: Verify EC reconstruction correctness on DataNode (#2585)
(cherry picked from commit 95e6892675)
2021-04-08 17:20:08 +08:00
Viraj Jasani
8b4b3d6fe6 HADOOP-17622. Avoid usage of deprecated IOUtils#cleanup API. (#2862)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit 3f2682b92b)
2021-04-06 14:18:31 +09:00
Borislav Iordanov
c365149e16 HADOOP-16524. Automatic keystore reloading for HttpServer2
Reapplies the change, which was first reverted because it caused YARN failures.

Signed-off-by: stack <stack@apache.org>
2021-03-31 10:50:28 -07:00
Stephen O'Donnell
56ef16468a
HADOOP-17222. Create socket address leveraging URI cache (#2817)
Contributed by fanrui.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2021-03-30 11:59:44 +01:00
Ayush Saxena
9c9b16c957
HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2808). Contributed by Ayush Saxena.
* HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2732).

* HADOOP-17531. Addendum: DistCp: Reduce memory usage on copying huge directories. (#2820)

Signed-off-by: Steve Loughran <stevel@apache.org>
2021-03-27 09:25:25 +05:30
Xiaoyu Yao
67d52af225 HADOOP-16828. Zookeeper Delegation Token Manager fetch sequence number by batch. Contributed by Fengnan Li.
(cherry picked from commit 6288e15118)
2021-03-25 14:44:02 +00:00
Ayush Saxena
27944772d3 HADOOP-17310. Touch command with -c option is broken. (#2393). Contributed by Ayush Saxena. 2021-03-19 00:13:31 +05:30
Jim Brennan
ad74038e02 MAPREDUCE-7322. revisiting TestMRIntermediateDataEncryption. Contributed by Ahmed Hussein.
(cherry picked from commit 299b8062f1)
2021-03-15 20:17:02 +00:00
He Xiaoqiao
7fb49a48d1 HADOOP-17585. Correct timestamp format in the docs for the touch command. Contributed by Stephen O'Donnell.
(cherry picked from commit b1dc6c40a0)
2021-03-14 14:56:16 +00:00
Mike
5ffcee8979
HADOOP-17528. SFTP File System: close the connection pool when closing a FileSystem (#2701)
Contributed by Mike Pryakhin.

Change-Id: I59ef67c38c313f30c5e000b2fe41fcf715cf3a4b
2021-03-09 19:58:11 +00:00
Ahmed Hussein
792329fde9 MAPREDUCE-7320. organize test directories for ClusterMapReduceTestCase (#2722). Contributed by Ahmed Hussein
(cherry picked from commit e04bcb3a06)
2021-02-26 19:56:07 +00:00
Steve Loughran
4423a7e736
HADOOP-16906. Abortable (#2684)
Adds an Abortable.abort() interface for streams to enable output streams to be terminated; this
is implemented by the S3A connector's output stream. It allows for commit protocols
to be implemented which commit/abort work by writing to the final destination and
using the abort() call to cancel any write which is not intended to be committed.
Consult the specification document for information about the interface and its use.
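
The commit-by-close, cancel-by-abort pattern described above can be sketched
with an in-memory stand-in. Everything here is hypothetical: AbortableSketch
is a simplified interface invented for the example (the real Abortable
interface is defined in the specification document), and BufferedStoreStream
with its map-backed "store" only imitates an object store that makes data
visible when the stream is closed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the interface named in the commit. */
interface AbortableSketch {
    void abort();
}

/** Buffers writes and publishes them to the store on close(), unless aborted first. */
final class BufferedStoreStream extends OutputStream implements AbortableSketch {
    private final Map<String, byte[]> store;
    private final String key;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean finished;

    BufferedStoreStream(Map<String, byte[]> store, String key) {
        this.store = store;
        this.key = key;
    }

    @Override
    public void write(int b) throws IOException {
        if (finished) {
            throw new IOException("stream is closed or aborted");
        }
        buffer.write(b);
    }

    /** Commits: the buffered data becomes visible at the final destination. */
    @Override
    public void close() {
        if (!finished) {
            finished = true;
            store.put(key, buffer.toByteArray());
        }
    }

    /** Cancels: the buffered data is discarded and a later close() is a no-op. */
    @Override
    public void abort() {
        finished = true;
        buffer.reset();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> store = new HashMap<>();
        BufferedStoreStream out = new BufferedStoreStream(store, "task-1/part-0000");
        out.write(new byte[] {1, 2, 3});
        out.abort();
        out.close();
        System.out.println(store.containsKey("task-1/part-0000")); // prints false
    }
}
```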

Contributed by Jungtaek Lim and Steve Loughran.

Change-Id: I7fcc25e9dd8c10ce6c29f383529f3a2642a201ae
2021-02-17 11:29:19 +00:00
Steve Loughran
98e4d516ea
HADOOP-13327 Output Stream Specification. (#2587)
This defines what output streams and especially those which implement
Syncable are meant to do, and documents where implementations (HDFS; S3)
don't. With tests.

The file:// FileSystem now supports Syncable if an application calls
FileSystem.setWriteChecksum(false) before creating a file; checksumming
and Syncable.hsync() are incompatible.

Contributed by Steve Loughran.

Change-Id: I892d768de6268f4dd6f175b3fe3b7e5bcaa91194
2021-02-10 10:31:22 +00:00
Anton Kutuzov
dcf6d77279 HDFS-15632. AbstractContractDeleteTest should set recursive parameter to true for recursive test cases. Contributed by Anton Kutuzov.
(cherry picked from commit 91d4ba57c5)
2021-01-22 18:09:57 -08:00
Steve Loughran
57abfae136
HADOOP-17450. Add Public IOStatistics API. (#2577)
This is the API and implementation classes of HADOOP-16830,
which allows callers to query IO object instances
(filesystems, streams, remote iterators, ...) and other classes
for statistics on their I/O usage: operation counts and min/max/mean
durations.

New Packages

org.apache.hadoop.fs.statistics
  Public API, including:
    IOStatisticsSource
    IOStatistics
    IOStatisticsSnapshot (serializable to Java objects and JSON)
    + helper classes for logging and integration
    BufferedIOStatisticsInputStream
      implements IOStatisticsSource and StreamCapabilities
    BufferedIOStatisticsOutputStream
      implements IOStatisticsSource, Syncable and StreamCapabilities

org.apache.hadoop.fs.statistics.impl
  Implementation classes for internal use.

org.apache.hadoop.util.functional
  functional programming support for RemoteIterators and
  other operations which raise IOEs; all wrapper classes
  implement and propagate IOStatisticsSource

Contributed by Steve Loughran.

Change-Id: If56e8db2981613ff689c39239135e44feb25f78e
2021-01-14 13:20:17 +00:00
stack
b74d642220 Revert "HADOOP-16524. Reloading SSL keystore for both DataNode and NameNode (#2470)"
This reverts commit f7d2a5d7a52c41cba14b17eb0c9189d983f202cf.
2021-01-11 08:56:24 -08:00
Michael Stack
f046ed27d6
HADOOP-16524. Reloading SSL keystore for both DataNode and NameNode (#2470) (#2609)
Co-authored-by: Borislav Iordanov <biordanov@apple.com>
Signed-off-by: stack <stack@apache.org>

Co-authored-by: Borislav Iordanov <borislav.iordanov@gmail.com>
Co-authored-by: Borislav Iordanov <biordanov@apple.com>
2021-01-08 13:45:44 -08:00
Steve Loughran
a2ae0d7079
Revert "HADOOP-17430. Restore ability to set Text to empty byte array (#2545)"
This reverts commit 9e85eb9a2e.

Change-Id: Id1ac803b29931b0f643cb37bbe58534726c36f1e
2021-01-08 10:50:28 +00:00
dgzdot
9e85eb9a2e HADOOP-17430. Restore ability to set Text to empty byte array (#2545)
Contributed by gaozhan.ding

Change-Id: Ib2ad9120c15c46a3fa2de9e3206875cbbc2363c2
2021-01-05 21:15:14 +00:00
Liang-Chi Hsieh
87064df1f2 HADOOP-17292. Using lz4-java in Lz4Codec (#2350)
Contributed by Liang-Chi Hsieh.
2020-12-29 13:17:26 -08:00
Masatake Iwasaki
b8a4361d7b HADOOP-17270. Fix testCompressorDecompressorWithExeedBufferLimit to c… (#2311) 2020-12-29 13:11:51 -08:00
Chao Sun
81e533de8f
HADOOP-16080. hadoop-aws does not work with hadoop-client-api. Contributed by Chao Sun (#2522) 2020-12-12 09:37:13 -08:00
Jim Brennan
e5f11ea5b2 HADOOP-13571. ServerSocketUtil.getPort() should use loopback address, not 0.0.0.0. Contributed by Eric Badger
(cherry picked from commit 6de1a8eb67)
2020-12-11 20:19:08 +00:00
Ayush Saxena
8378ab9f92 HADOOP-17288. Use shaded guava from thirdparty. Contributed by Ayush Saxena. #2505 2020-12-10 05:50:55 +05:30
Jim Brennan
5bfb97bc7d HADOOP-17392. Remote exception messages should not include the exception class (#2486). Contributed by Daryn Sharp and Ahmed Hussein 2020-12-03 17:59:01 +00:00
Steve Loughran
1ef34d0819
HADOOP-17313. FileSystem.get to support slow-to-instantiate FS clients. (#2396)
This adds a semaphore to throttle the number of FileSystem instances which
can be created simultaneously, set in "fs.creation.parallel.count".

This is designed to reduce the impact of many threads in an application calling
FileSystem.get() on a filesystem which takes time to instantiate, for example
an object store where HTTPS connections are set up during initialization.
Many threads trying to do this may create spurious delays by contending
for access to synchronized blocks; simply limiting the parallelism
reduces the contention and so speeds up all threads trying to access
the store.

The default value, 64, is larger than is likely to deliver any speedup, but
it does mean that there should be no adverse effects from the change.

If a service appears to be blocking on all threads initializing connections to
abfs, s3a or another store, try a smaller (possibly significantly smaller) value.
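
The throttling idea can be sketched with a plain semaphore. This is a minimal
illustration, not the FileSystem.get() code: ThrottledFactorySketch and its
Supplier argument are hypothetical, and the permit count plays the role that
"fs.creation.parallel.count" plays in the description above.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: limits how many slow instantiations may run at the same time. */
final class ThrottledFactorySketch<T> {
    private final Semaphore permits;
    private final Supplier<T> slowFactory;

    ThrottledFactorySketch(int parallelCount, Supplier<T> slowFactory) {
        this.permits = new Semaphore(parallelCount);
        this.slowFactory = slowFactory;
    }

    /** Blocks until a permit is free, then runs the slow factory. */
    T create() {
        permits.acquireUninterruptibly();
        try {
            return slowFactory.get();
        } finally {
            // Always hand the permit back, even if instantiation fails.
            permits.release();
        }
    }

    public static void main(String[] args) {
        ThrottledFactorySketch<String> factory =
            new ThrottledFactorySketch<>(2, () -> "client");
        System.out.println(factory.create()); // prints client
    }
}
```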

Contributed by Steve Loughran.

Change-Id: I57161b026f28349e339dc8b9d74f6567a62ce196
2020-11-25 14:55:29 +00:00
Eric Payne
8459f1d955 HADOOP-17346. Fair call queue is defeated by abusive service principals. Contributed by Ahmed Hussein (ahussein). 2020-11-23 20:37:33 +00:00
Jim Brennan
e24a6b550e HADOOP-17367. Add InetAddress api to ProxyUsers.authorize (#2449). Contributed by Daryn Sharp and Ahmed Hussein 2020-11-19 21:26:47 +00:00
Steve Loughran
4687c25389 HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2310)
This fixes the S3Guard/Directory Marker Retention integration so that when
fs.s3a.directory.marker.retention=keep, failures during multipart delete
are handled correctly, as are incremental deletes during
directory tree operations.

In both cases, when a directory marker with children is deleted from
S3, the directory entry in S3Guard is not deleted, because it is still
critical to representing the structure of the store.

Contributed by Steve Loughran.

Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
2020-11-18 12:30:43 +00:00
Ahmed Hussein
75ca0c0f23 HADOOP-17362. reduce RPC calls doing ls on HAR file (#2444). Contributed by Daryn Sharp and Ahmed Hussein
(cherry picked from commit ebe1d1fbf7)
2020-11-13 21:14:47 +00:00
Ahmed Hussein
23fe3bdab3 HADOOP-17358. Improve excessive reloading of Configurations (#2436)
Co-authored-by: ahussein <ahmed.hussein@verizonmedia.com>
(cherry picked from commit 71071e5c0f)
2020-11-12 10:35:28 -08:00
Doroszlai, Attila
47131cdf7c
HADOOP-17365. Contract test for renaming over existing file is too lenient (#2447)
Contributed by Attila Doroszlai.

Change-Id: I21c29256b52449b7fea335704b3afa02e39c6a39
2020-11-11 21:21:11 +00:00
Steve Loughran
7cb5325dda HADOOP-17340. TestLdapGroupsMapping failing -string mismatch in exception validation. (#2427). Contributed by Steve Loughran. 2020-11-07 17:05:23 +05:30
hchaverr
043cca01b1 HDFS-15623. Respect configured values of rpc.engine (#2403) Contributed by Hector Chaverri.
(cherry picked from commit 6eacaffeea)
2020-11-06 14:31:31 -08:00
Jim Brennan
41d58d190d Revert "HADOOP-17306. RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09 (#2387)"
This reverts commit e21b81276e.
2020-11-05 17:31:39 +00:00
Ayush Saxena
af5f90623c HADOOP-17328. LazyPersist Overwrite fails in direct write mode. (#2413)
(cherry picked from commit 872440610f)
2020-10-27 01:40:25 +09:00
Vinayakumar B
e21b81276e
HADOOP-17306. RawLocalFileSystem's lastModifiedTime() loses milliseconds in JDK < 10.b09 (#2387) 2020-10-23 11:34:14 +05:30
Ayush Saxena
54c40cbf49
HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same (#2383)
Contributed by Gabor Bota.
2020-10-17 01:34:01 +05:30
Jinglun
44ff4c1058
HADOOP-17021. Add concat fs command (#1993)
Contributed by Jinglun

Change-Id: Ia10ad2205ed0f3594c391ee78f7df4c3c31c796d
2020-10-08 10:36:40 +01:00
Mukund Thakur
475dba1ddf
HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem (#2354)
Contains HADOOP-17300: FileSystem.DirListingIterator.next() call should
throw NoSuchElementException

Contributed by Mukund Thakur

Change-Id: I4e7e5c6e295525db9e2de6f416f32bbb81e146d3
2020-10-07 14:00:23 +01:00
Liang-Chi Hsieh
8f60a90688 HADOOP-17125. Use snappy-java in SnappyCodec (#2297)
This switches the SnappyCodec to use the snappy-java codec, rather than the native one.

To use the codec, snappy-java.jar (from org.xerial.snappy) needs to be on the classpath.

This comes in as an Avro dependency, so it is already on the hadoop-common classpath,
as well as in hadoop-common/lib.
The version used is now managed in the hadoop-project POM; initially 1.1.7.7.

Contributed by DB Tsai and Liang-Chi Hsieh

Change-Id: Id52a404a0005480e68917cd17f0a27b7744aea4e
2020-10-06 17:15:17 +01:00
crossfire
c3cb86ba42
HADOOP-17088. Failed to load XInclude files with relative path. (#2097)
Contributed by Yushi Hayasaka.

Change-Id: I8aad5143c34fb831bef0077f7b659643f8ae073a
2020-09-21 19:13:20 +01:00
Uma Maheswara Rao G
2d9c5395ef HDFS-15578: Fix the rename issues with fallback fs enabled (#2305). Contributed by Uma Maheswara Rao G.
Co-authored-by: Uma Maheswara Rao G <umagangumalla@cloudera.com>
(cherry picked from commit e4cb0d3514)
2020-09-16 23:01:03 -07:00
zz
2d5ca83078 HADOOP-15891. provide Regex Based Mount Point In Inode Tree (#2185). Contributed by Zhenzhao Wang.
Co-authored-by: Zhenzhao Wang <zhenzhaowang@gmail.com>
(cherry picked from commit 12a316cdf9)
2020-09-12 20:42:06 -07:00
Steve Loughran
aa80bcb1ec
Revert "HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280)"
This reverts commit 0c82eb0324.

Change-Id: I6bd100d9de19660b0f28ee0ab16faf747d6d9f05
2020-09-11 18:07:05 +01:00
Steve Loughran
0c82eb0324
HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2280)
This changes directory tree deletion so that only files are incrementally deleted
from S3Guard after the objects are deleted; the directories are left alone
until metadataStore.deleteSubtree(path) is invoked.

This avoids directory tombstones being added above files/child directories,
which stop the treewalk and delete phase from working.

Also:

* Callback to delete objects splits files and dirs so that
  any problems deleting the dirs don't trigger s3guard updates
* New statistic to measure the number of objects deleted, alongside request count.
* Callback listFilesAndEmptyDirectories renamed listFilesAndDirectoryMarkers
  to clarify behavior.
* Test enhancements to replicate the failure and verify the fix

Contributed by Steve Loughran

Change-Id: I0e6ea2c35e487267033b1664228c8837279a35c7
2020-09-10 17:29:33 +01:00
Steve Loughran
262c575fab
HADOOP-17181. Handle transient stream read failures in FileSystem contract tests (#2286)
Contributed by Steve Loughran.

* Fixes AbstractContractSeekTest test to use readFully
* Doesn't do this to AbstractContractUnbufferTest test as it changes the test too much.
Instead just notes in the error that this may be transient

The issue is that read(buffer) doesn't guarantee that the buffer is filled, only that it will
read up to a point, and that may just be the amount of data left in the TCP packet.
readFully corrects for this, but using it in the unbuffer test runs the risk that what
is tested for in terms of unbuffering doesn't actually get validated.
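
The difference between the two calls can be shown with JDK streams alone. In
this sketch the chunked() helper is a hypothetical stand-in for a network
stream that hands back only part of the requested data per call, and
java.io.DataInputStream.readFully stands in for the readFully call used by the
contract tests.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch contrasting a single read(buffer) with readFully(buffer). */
final class ShortReadSketch {
    /** Returns a stream that supplies at most chunk bytes per read call, like a slow socket. */
    static InputStream chunked(byte[] data, int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    /** One read call: may return fewer bytes than the buffer holds. */
    static int singleRead(InputStream in, byte[] buffer) throws IOException {
        return in.read(buffer);
    }

    /** Loops until the buffer is full, or throws EOFException if the data runs out. */
    static void readFully(InputStream in, byte[] buffer) throws IOException {
        new DataInputStream(in).readFully(buffer);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        System.out.println(singleRead(chunked(data, 4), new byte[10])); // prints 4
        readFully(chunked(data, 4), new byte[10]); // fills all 10 bytes
    }
}
```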

Change-Id: I046eadb69b80ba0aac468b354c82c4d510dc3699
2020-09-09 12:01:47 +01:00
Steve Loughran
1b9109d237
HDFS-15471. TestHDFSContractMultipartUploader failing (#2252)
Contributed by Steve Loughran
(Was: broken by Steve Loughran)

Change-Id: If6a82706f3ea6d802bc6da03c2a2ca734e30388f
2020-08-28 15:47:06 +01:00
sguggilam
fcb80c1ade
HADOOP-17159. Make UGI support forceful relogin from keytab ignoring the last login time (#2249)
Contributed by Sandeep Guggilam.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Steve Loughran <stevel@apache.org>
2020-08-26 23:49:31 -07:00
Mingliang Liu
ee7d214118
Revert "HADOOP-17159 Ability for forceful relogin in UserGroupInformation class (#2197)"
This reverts commit da129a67bb.
2020-08-26 11:22:46 -07:00
sguggilam
da129a67bb
HADOOP-17159 Ability for forceful relogin in UserGroupInformation class (#2197)
Contributed by Sandeep Guggilam.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Steve Loughran <stevel@apache.org>
2020-08-24 23:40:56 -07:00
Joey
ce51048e8c HADOOP-16925. MetricsConfig incorrectly loads the configuration whose value is String list in the properties file (#1896)
Contributed by Jiayi Liu
2020-08-24 14:03:36 +01:00
Steve Loughran
49f8ae965e
HADOOP-13230. S3A to optionally retain directory markers.
This adds an option to disable "empty directory" marker deletion,
to avoid throttling and other scale problems.

This feature is *not* backwards compatible.
Consult the documentation and use with care.

Contributed by Steve Loughran.

Change-Id: I69a61e7584dc36e485d5e39ff25b1e3e559a1958
2020-08-15 20:19:49 +01:00
sguggilam
97dd1cb57e
HADOOP-17164. UGI loginUserFromKeytab doesn't set the last login time (#2178)
Contributed by Sandeep Guggilam.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Steve Loughran <stevel@apache.org>
2020-08-04 10:31:26 -07:00
Uma Maheswara Rao G
4fe491d10e HDFS-15478: With empty mount points, the fallback link is assigned to self, but it should not use the full URI for the target fs. (#2160). Contributed by Uma Maheswara Rao G.
(cherry picked from commit ac9a07b51a)
2020-07-31 01:31:37 -07:00
Uma Maheswara Rao G
ae8261c671 HDFS-15464: ViewFsOverloadScheme should work when -fs option pointing to remote cluster without mount links (#2132). Contributed by Uma Maheswara Rao G.
(cherry picked from commit 3e70006639)
2020-07-31 01:31:15 -07:00
Ayush Saxena
e3b8d4eb05 HADOOP-17100. Replace Guava Supplier with Java8+ Supplier in Hadoop. Contributed by Ahmed Hussein. 2020-07-22 18:21:14 +05:30
Chen Liang
c8c40be761 HDFS-15404. ShellCommandFencer should expose info about source. Contributed by Chen Liang.
(cherry picked from commit 3833c616e0)
2020-07-20 15:22:34 -07:00
Ahmed Hussein
9e7266df6c HADOOP-17099. Replace Guava Predicate with Java8+ Predicate
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 1f71c4ae71)
2020-07-15 11:40:13 -05:00
Erik Krogen
67e01ed2ca HADOOP-17127. Use RpcMetrics.TIMEUNIT to initialize rpc queueTime and processingTime. Contributed by Jim Brennan.
(cherry picked from 317fe4584a)
2020-07-15 08:26:38 -07:00
Steve Loughran
a51d72f0c6 HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext.
Contributed by Steve Loughran.

Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3
2020-07-13 13:32:04 +01:00
Siyao Meng
358934059f HDFS-15462. Add fs.viewfs.overload.scheme.target.ofs.impl to core-default.xml (#2131)
(cherry picked from commit 0e694b20b9)
2020-07-09 16:30:58 -07:00
Uma Maheswara Rao G
f85ce2570e HDFS-15394. Add all available fs.viewfs.overload.scheme.target.<scheme>.impl classes in core-default.xml by default. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 3ca15292c5)
2020-07-09 16:26:04 -07:00