Commit Graph

2149 Commits

Author SHA1 Message Date
Liang-Chi Hsieh
6014a089fd
HADOOP-17825. Add BuiltInGzipCompressor (#3250)
Currently, GzipCodec only supports BuiltInGzipDecompressor, if native zlib is not loaded. So, without Hadoop native codec installed, saving SequenceFile using GzipCodec will throw exception like "SequenceFile doesn't work with GzipCodec without native-hadoop code!"

Same as other codecs which we migrated to using prepared packages (lz4, snappy), it will be better if we support GzipCodec generally without Hadoop native codec installed. Similar to BuiltInGzipDecompressor, we can use Java Deflater to support BuiltInGzipCompressor.
2021-08-16 10:08:03 -07:00
Viraj Jasani
6342d5e523
HDFS-16171. De-flake testDecommissionStatus (#3280)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-08-16 14:54:25 +09:00
Viraj Jasani
23e2a0b202
HADOOP-17835. Use CuratorCache implementation instead of PathChildrenCache / TreeCache (#3266)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-08-07 11:20:35 +09:00
Bryan Beaudreault
b0b867e977
HADOOP-17837: Add unresolved endpoint value to UnknownHostException (ADDENDUM) (#3276) 2021-08-06 21:54:07 +05:30
Bryan Beaudreault
5e54d92e6e
HADOOP-17837: Add unresolved endpoint value to UnknownHostException (#3272) 2021-08-06 17:00:20 +08:00
Viraj Jasani
ccfa072dc7
HADOOP-17612. Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 (#3241)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-08-03 14:44:00 +09:00
Steve Loughran
ee466d4b40
HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240)
This patch cuts down the size of directory trees used for
distcp contract tests against object stores, so making
them much faster against distant/slow stores.

On abfs, the test only runs with -Dscale (as was the case for s3a already),
and has the larger scale test timeout.

After every test case, the FileSystem IOStatistics are logged,
to provide information about what IO is taking place and
what it's performance is.

There are some test cases which upload files of 1+ MiB; you can
increase the size of the upload in the option
"scale.test.distcp.file.size.kb" 
Set it to zero and the large file tests are skipped.

Contributed by Steve Loughran.
2021-08-02 11:36:43 +01:00
Petre Bogdan Stolojan
a218038960
HADOOP-17139 Re-enable optimized copyFromLocal implementation in S3AFileSystem (#3101)
This work
* Defines the behavior of FileSystem.copyFromLocal in filesystem.md
* Implements a high performance implementation of copyFromLocalOperation
  for S3 
* Adds a contract test for the operation: AbstractContractCopyFromLocalTest
* Implements the contract tests for Local and S3A FileSystems

Contributed by: Bogdan Stolojan
2021-07-30 19:42:08 +01:00
Viraj Jasani
e001f8ee39
HADOOP-17814. Provide fallbacks for identity/cost providers and backoff enable (#3230)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-07-29 02:10:07 +09:00
Mehakmeet Singh
f813554769
HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706)
This (big!) patch adds support for client side encryption in AWS S3,
with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before
use and consider it unstable.

S3-CSE is enabled in the existing configuration option
"fs.s3a.server-side-encryption-algorithm":

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although
you can still enable a default SSE option in the S3 console. 
  
* Filesystem list/get status operations subtract 16 bytes from the length
  of all files >= 16 bytes long to compensate for the padding which CSE
  adds.
* The SDK always warns about the specific algorithm chosen being
  deprecated. It is critical to use this algorithm for ranged
  GET requests to work (i.e. random IO). Ignore.
* Unencrypted files CANNOT BE READ.
  The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now
  written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.

Contributed by Mehakmeet Singh
2021-07-27 11:08:51 +01:00
Viraj Jasani
e1d00addb5
HADOOP-16290. Enable RpcMetrics units to be configurable (#3198)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-07-19 23:55:49 -07:00
Viraj Jasani
df44178eb6
HADOOP-17795. Provide fallbacks for callqueue.impl and scheduler.impl (#3192)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-07-14 20:58:32 +09:00
Abhishek Das
1dd03cc4b5 HADOOP-17028. ViewFS should initialize mounted target filesystems lazily. Contributed by Abhishek Das (#2260) 2021-07-13 18:11:50 -07:00
Viraj Jasani
618c9218ee
HADOOP-17788. Replace IOUtils#closeQuietly usages by Hadoop's own utility (#3171)
Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-07-08 16:03:40 +09:00
liangxs
a5db6831bc
HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout (#3080) 2021-07-06 09:11:03 +08:00
Rafal Wojdyla
f639fbc29f
HADOOP-17402. Add GCS config to the core-site (#2638)
Contributed by Rafal Wojdyla
2021-07-05 21:07:12 +01:00
Akira Ajisaka
20a4b1ae36
HADOOP-17331. [JDK 16] TestDNS fails (#2884) 2021-06-30 03:06:29 -07:00
Viraj Jasani
c488abbc79
HDFS-16075. Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects (#3115)
Reviewed-by: Hui Fei <ferhui@apache.org>
2021-06-21 10:25:12 +09:00
Takanobu Asanuma
9e7c7ad129
HADOOP-17760. Delete hadoop.ssl.enabled and dfs.https.enable from docs and core-default.xml (#3099)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2021-06-17 09:58:47 +09:00
Steve Loughran
762a83e044
HADOOP-17631. Configuration ${env.VAR:-FALLBACK} to eval FALLBACK when restrictSystemProps=true (#2977)
Contributed by Steve Loughran.
2021-06-08 21:56:40 +01:00
Viraj Jasani
f4b24c68e7
HADOOP-17743. Replace Guava Lists usage by Hadoop's own Lists in hadoop-common, hadoop-tools and cloud-storage projects (#3072) 2021-06-07 13:24:09 +09:00
Viraj Jasani
59fc4061cb
HADOOP-17152. Provide Hadoop's own Lists utility to reduce dependency on Guava (#3061)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-06-03 18:56:00 +09:00
Steve Loughran
832a3c6a89
HADOOP-17511. Add audit/telemetry logging to S3A connector (#2807)
The S3A connector supports
"an auditor", a plugin which is invoked
at the start of every filesystem API call,
and whose issued "audit span" provides a context
for all REST operations against the S3 object store.

The standard auditor sets the HTTP Referrer header
on the requests with information about the API call,
such as process ID, operation name, path,
and even job ID.

If the S3 bucket is configured to log requests, this
information will be preserved there and so can be used
to analyze and troubleshoot storage IO.

Contributed by Steve Loughran.
2021-05-25 10:25:41 +01:00
Vinayakumar B
2bbeae3240
HDFS-15790. Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist (#2767) 2021-05-24 02:45:39 -07:00
Viraj Jasani
e4062ad027
HADOOP-17115. Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools (#2985)
Signed-off-by: Sean Busbey <busbey@apache.org>
2021-05-20 10:47:04 -05:00
Hongbing Wang
f7247922b7
HDFS-16018. Optimize the display of hdfs "count -e" or "count -t" com… (#2994) 2021-05-20 11:23:54 +08:00
Xiaoyu Yao
86729e130f
HADOOP-17699. Remove hardcoded SunX509 usage from SSLFactory. (#3016) 2021-05-18 10:11:36 -07:00
Akira Ajisaka
35ca1dcb9d
HADOOP-17685. Fix junit deprecation warnings in hadoop-common module. (#2983)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-05-13 14:22:25 +09:00
Viraj Jasani
b93e448f9a
HADOOP-11616. Remove workaround for Curator's ChildReaper requiring Guava 15+ (#2973)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-05-06 04:52:02 +09:00
kishendas
e571025f5b
HADOOP-17657: implement StreamCapabilities in SequenceFile.Writer and fall back to flush, if hflush is not supported (#2949)
Co-authored-by: Kishen Das <kishen@cloudera.com>
Reviewed-by: Steve Loughran <stevel@apache.org>
2021-05-04 01:20:56 -07:00
Wei-Chiu Chuang
b2e54762a4
HDFS-15624. fix the function of setting quota by storage type (#2377) (#2955)
1. puts NVDIMM to the end of storage type enum to make sure compatibility.
2. adds check to make sure the software layout version is satisfied

Co-authored-by: su xu <kevinbrandon@163.com>
Co-authored-by: huangtianhua <huangtianhua223@gmail.com>
Co-authored-by: YaYun-Wang <34060507+YaYun-Wang@users.noreply.github.com>

Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Vinayakumar B <vinayakumarb@apache.org>

Change-Id: I3c58beef50730827a09b3c968e9ad637baa57d44
2021-04-28 23:54:39 -07:00
Wei-Chiu Chuang
90c6caf650 Revert "HDFS-15624. fix the function of setting quota by storage type (#2377)"
This reverts commit 394b9f7a5c.

Ref: HDFS-15995.
Had to revert this commit, so we can commit HDFS-15566 (a critical bug preventing rolling upgrade to Hadoop 3.3)
Will re-work this fix again later.
2021-04-26 11:27:15 +08:00
Viraj Jasani
9179638017
HADOOP-17524. Remove EventCounter and Log counters from JVM Metrics (#2909)
Reviewed-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-04-15 18:04:46 +09:00
Akira Ajisaka
156ecc89be
HADOOP-17630. [JDK 15] TestPrintableString fails due to Unicode 13.0 support. (#2890)
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
2021-04-13 17:08:49 +09:00
Viraj Jasani
3f2682b92b
HADOOP-17622. Avoid usage of deprecated IOUtils#cleanup API. (#2862)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-04-06 13:39:10 +09:00
Borislav Iordanov
2c482fbacf HADOOP-16524. Automatic keystore reloading for HttpServer2
Reapply of issue reverted first because it caused yarn failures and
then again because the commit message was incorrectly formatted
(and yet again because of commit message format).

Signed-off-by: stack <stack@apache.org>
2021-03-31 10:46:35 -07:00
stack
22961a615d Revert "HADOOP-16524. Automatic keystore reloading for HttpServer2"
This reverts commit a2975d2153.
2021-03-31 10:43:09 -07:00
stack
a2975d2153 HADOOP-16524. Automatic keystore reloading for HttpServer2
Reapply of issue reverted first because it caused yarn failures and
then again because the commit message was incorrectly formatted.
2021-03-31 10:40:20 -07:00
stack
5183aaeda2 Revert "Hadoop 16524 - resubmission following some unit test fixes (#2693)"
Revert to fix the summary message.

This reverts commit 9509bebf7f.
2021-03-31 10:39:55 -07:00
Borislav Iordanov
9509bebf7f
Hadoop 16524 - resubmission following some unit test fixes (#2693)
Signed-off-by: stack <stack@apache.org>
2021-03-31 10:07:42 -07:00
Ayush Saxena
f5c1557288
HADOOP-17531.Addendum: DistCp: Reduce memory usage on copying huge directories. (#2820). Contributed by Ayush Saxena.
Signed-off-by: Steve Loughran <stevel@apache.org>
2021-03-27 03:01:41 +05:30
Akira Ajisaka
af1f9f43ea
HADOOP-17133. Implement HttpServer2 metrics (#2145) 2021-03-25 12:09:43 -07:00
touchida
95e6892675
HDFS-15759. EC: Verify EC reconstruction correctness on DataNode (#2585) 2021-03-24 16:56:09 +08:00
Ayush Saxena
03cfc85279
HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2732). Contributed by Ayush Saxena.
Signed-off-by: Steve Loughran <stevel@apache.org>
2021-03-24 02:36:26 +05:30
Jim Brennan
299b8062f1 MAPREDUCE-7322. revisiting TestMRIntermediateDataEncryption. Contributed by Ahmed Hussein. 2021-03-15 20:13:17 +00:00
He Xiaoqiao
b1dc6c40a0
HADOOP-17585. Correct timestamp format in the docs for the touch command. Contributed by Stephen O'Donnell. 2021-03-14 18:09:50 +08:00
Chao Sun
176bd88890
HADOOP-16080. hadoop-aws does not work with hadoop-client-api. (#2522)
Contributed by Chao Sun.

(Cherry-picked via PR #2575)
2021-03-09 20:01:29 +00:00
Haoze Wu
ef7ab535c5
HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang. (#2727) 2021-03-06 22:26:16 +09:00
Ahmed Hussein
e04bcb3a06
MAPREDUCE-7320. organize test directories for ClusterMapReduceTestCase (#2722). Contributed by Ahmed Hussein 2021-02-26 13:42:33 -06:00
Mike
7b7c0019f4
HADOOP-17528. SFTP File System: close the connection pool when closing a FileSystem (#2701)
Contributed by Mike Pryakhin.
2021-02-23 17:03:27 +00:00