5198 Commits

Author SHA1 Message Date
Steve Loughran
9221704f85
HADOOP-16490. Avoid/handle cached 404s during S3A file creation.
Contributed by Steve Loughran.

This patch avoids issuing any HEAD path request when creating a file with overwrite=true,
so 404s will not end up in the S3 load balancers unless someone calls getFileStatus/exists/isFile
in their own code.
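
The effect can be pictured with a deliberately simplified model, assuming the existence probe is only needed when overwrite is false. The names create, head and put below are hypothetical stand-ins, not Hadoop or AWS APIs, and the real S3A code performs further checks that this sketch leaves out.

```python
# Simplified model: HEAD is only issued when overwrite is False.
# `head` and `put` are hypothetical callables, not Hadoop or AWS APIs.
def create(path, overwrite, head, put):
    if not overwrite and head(path):
        raise FileExistsError(path)
    put(path)


head_calls = []
put_calls = []


def fake_head(path):
    """Record the probe and report that the object does not exist yet."""
    head_calls.append(path)
    return False


# overwrite=True: the file is written without any existence probe.
create("bucket/new-file", True, fake_head, put_calls.append)
heads_with_overwrite = len(head_calls)

# overwrite=False: one probe is needed to honour the no-overwrite contract.
create("bucket/other-file", False, fake_head, put_calls.append)
heads_without_overwrite = len(head_calls)
```

In this model the only probe recorded comes from the overwrite=false call, which is the case where a 404 could still be cached.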

The Hadoop FsShell CommandWithDestination class is modified to not register uncreated files
for deleteOnExit(), because that calls exists() and so can place the 404 in the cache, even
after S3A is patched to not do it itself.

Because S3Guard knows when a file should be present, it adds a special FileNotFound retry policy
independently configurable from other retry policies; it is also exponential, but with
different parameters. This is because every HEAD request will refresh any 404 cached in
the S3 Load Balancers. It's not enough to retry: we have to have a suitable gap between
attempts to (hopefully) ensure any cached entry will be gone.

The options and values are:

fs.s3a.s3guard.consistency.retry.interval: 2s
fs.s3a.s3guard.consistency.retry.limit: 7
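
As a rough illustration of the spacing these values imply, the sketch below assumes the delay simply doubles from the base interval on each attempt, with no jitter. The commit message states only that the policy is exponential, so the exact formula is an assumption, and backoff_schedule is a hypothetical helper rather than anything in Hadoop.

```python
def backoff_schedule(interval_s, limit):
    """Delay in seconds before each retry, doubling every attempt.

    Plain doubling without jitter is an assumption for illustration.
    """
    return [interval_s * (2 ** attempt) for attempt in range(limit)]


# Values taken from the two options listed above: 2s interval, 7 retries.
delays = backoff_schedule(2.0, 7)
total_wait = sum(delays)
```

Under that assumption the gaps grow from 2 seconds to 128 seconds, for a little over four minutes of waiting in total before the operation gives up.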

The S3A copy() method used during rename() raises a RemoteFileChangedException which is not caught,
and so is not downgraded to a return value of false. Thus, when a rename is unrecoverable, this fact is propagated.

Copy operations without S3Guard lack the confidence that the file exists, so they don't retry the same way:
they fail fast with a different error message. However, because create(path, overwrite=true) no
longer does HEAD path, we can at least be confident that S3A itself is not creating those cached
404 markers.

Change-Id: Ia7807faad8b9a8546836cb19f816cccf17cca26d
2019-09-11 16:46:25 +01:00
Daisuke Kobayashi
bc2d3a71d6 HADOOP-16549. Remove Unsupported SSL/TLS Versions from Docs/Properties. Contributed by Daisuke Kobayashi.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2019-09-10 10:51:47 +08:00
Sneha Vijayarajan
147f98629c
HADOOP-16438. ADLS Gen1 OpenSSL config control.
Contributed by Sneha Vijayarajan.

Change-Id: Ib79ea6b4a90ad068033e175f3f59c5185868872d
2019-09-09 17:09:32 +01:00
Jungtaek Lim (HeartSaVioR)
bb0b922a71
HADOOP-16255. Add ChecksumFs.rename(path, path, boolean)
Contributed by Jungtaek Lim

Change-Id: If00a4d7d30456c08eb2b0f7e2b242197bc4ee05d
2019-09-06 21:53:00 +01:00
Erik Krogen
a23417533e HADOOP-16531. Log more timing information for slow RPCs. Contributed by Chen Zhang. 2019-09-06 10:28:21 -07:00
Erik Krogen
c92a3e94d8 HADOOP-15565. Add an inner FS cache to ViewFileSystem, separate from the global cache, to avoid file system leaks. Contributed by Jinglun. 2019-09-06 10:22:28 -07:00
Steve Loughran
511df1e837 HADOOP-16430. S3AFilesystem.delete to incrementally update s3guard with deletions
Contributed by Steve Loughran.

This overlaps the scanning for directory entries with batched calls to S3 DELETE and updates of the S3Guard tables.
It also uses S3Guard to list the files to delete, so it finds newly created files even when S3 listings are not consistent.

For paths for which the client considers S3Guard to be authoritative, we also do a recursive LIST of the store and delete files; this is to find unindexed files and to guarantee that the delete(path, true) call really does delete everything underneath.
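
The batching idea can be sketched as follows. This is an illustrative model only: incremental_delete, delete_batch and update_index are hypothetical names, the pages are processed one after another rather than truly overlapped with the scan, and the default page size of 1000 reflects the per-request key limit of the S3 bulk delete call.

```python
from itertools import islice


def batches(keys, size):
    """Yield successive pages of at most `size` keys from any iterable."""
    it = iter(keys)
    while True:
        page = list(islice(it, size))
        if not page:
            return
        yield page


def incremental_delete(listing, delete_batch, update_index, page_size=1000):
    """Delete keys page by page, updating the index after each page.

    `delete_batch` and `update_index` are hypothetical callbacks standing in
    for the bulk DELETE request and the S3Guard table update.
    """
    deleted = 0
    for page in batches(listing, page_size):
        delete_batch(page)   # remove this page of objects from the store
        update_index(page)   # then record the deletions in the metadata index
        deleted += len(page)
    return deleted


store_calls, index_calls = [], []
listing = (f"dir/file{i}" for i in range(5))  # lazily produced, like a scan
count = incremental_delete(listing, store_calls.append, index_calls.append,
                           page_size=2)
```

Because the listing is consumed lazily, the first page can be deleted and indexed before the scan has produced the later entries.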

Change-Id: Ice2f6e940c506e0b3a78fa534a99721b1698708e
2019-09-05 14:25:15 +01:00
Erik Krogen
337e9b794d HADOOP-16268. Allow StandbyException to be thrown as CallQueueOverflowException when RPC call queue is filled. Contributed by CR Hota. 2019-09-04 08:22:02 -07:00
Surendra Singh Lilhore
5ff76cb8bc HDFS-14630. Configuration.getTimeDurationHelper() should not log time unit warning in info log. Contributed by hemanthboyina. 2019-09-03 12:37:09 +05:30
Stephen O'Donnell
915cbc91c0 HDFS-14706. Checksums are not checked if block meta file is less than 7 bytes. Contributed by Stephen O'Donnell.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-09-02 09:47:04 -07:00
Akira Ajisaka
dc0acceabb
YARN-9783. Remove low-level zookeeper test to be able to build Hadoop against zookeeper 3.5.5. Contributed by Mate Szalay-Beko. 2019-08-30 10:13:10 +09:00
Surendra Singh Lilhore
29bd6f3fc3 HDFS-8631. WebHDFS : Support setQuota. Contributed by Chao Sun. 2019-08-28 23:58:23 +05:30
Akira Ajisaka
55cc115878 HADOOP-16527. Add a whitelist of endpoints to skip Kerberos authentication (#1336) Contributed by Akira Ajisaka. 2019-08-28 14:28:41 +09:00
Steve Loughran
61b2df2331
HADOOP-16470. Make last AWS credential provider in default auth chain EC2ContainerCredentialsProviderWrapper.
Contributed by Steve Loughran.

Contains HADOOP-16471. Restore (documented) fs.s3a.SharedInstanceProfileCredentialsProvider.

Change-Id: I06b99b57459cac80bf743c5c54f04e59bb54c2f8
2019-08-22 17:27:56 +01:00
Akira Ajisaka
30ce8546f1
HADOOP-16496. Apply HDDS-1870 (ConcurrentModification at PrometheusMetricsSink) to Hadoop common.
This closes #1317

Reviewed-by: Bharat Viswanadham <bharat@apache.org>
2019-08-21 10:10:11 +09:00
Wei-Chiu Chuang
51b65370b9 HADOOP-14784. [KMS] Improve KeyAuthorizationKeyProvider#toString(). Contributed by Yeliang Cang.
Reviewed-by: Dinesh Chitlangia <dchitlangia@cloudera.com>
2019-08-19 11:12:09 -07:00
David Mollitor
a707bb7c1b HADOOP-15246. SpanReceiverInfo - Prefer ArrayList over LinkedList. Contributed by David Mollitor.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-18 18:44:35 -07:00
Wei-Chiu Chuang
971a4c8e83 HDFS-14523. Remove excess read lock for NetworkTopology. Contributed by Wu Weiwei.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
Reviewed-by: Chen Liang <cliang@apache.org>
2019-08-16 17:19:58 -07:00
Erik Krogen
e356e4f4b7 HADOOP-16391 Add a prefix to the metric names for MutableRatesWithAggregation used for deferred RPC metrics to avoid collision with non-deferred metrics. Contributed by Bilwa S T. 2019-08-16 09:01:44 -07:00
Wei-Chiu Chuang
5882cf94ea HADOOP-16504. Increase ipc.server.listen.queue.size default from 128 to 256. Contributed by Lisheng Sun. 2019-08-15 15:20:54 -07:00
Akira Ajisaka
0f8add8a60
HADOOP-16495. Fix invalid metric types in PrometheusMetricsSink (#1244) 2019-08-14 12:24:03 +09:00
Masatake Iwasaki
da0006fe04 HDFS-14423. Percent (%) and plus (+) characters no longer work in WebHDFS.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2019-08-14 08:39:40 +09:00
Inigo Goiri
6b4564f1d5 HADOOP-16453. Update how exceptions are handled in NetUtils. Contributed by Lisheng Sun. 2019-08-11 20:34:36 -07:00
Steve Loughran
e25a5c2eab HADOOP-16499. S3A retry policy to be exponential (#1246). Contributed by Steve Loughran. 2019-08-09 15:52:37 +02:00
Szilard Nemeth
df30d8ea09 YARN-9727: Allowed Origin pattern is discouraged if regex contains *. Contributed by Zoltan Siegl 2019-08-09 09:34:23 +02:00
Zsombor Gegesy
b0131bc265 HADOOP-15014. Addendum: KMS should log the IP address of the clients. Contributed by Zsombor Gegesy.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-07 20:57:42 -07:00
Yiqun Lin
a5bb1e8ee8 HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun. 2019-08-07 10:18:11 +08:00
Wei-Chiu Chuang
8cef9f89f4 HDFS-14652. Addendum: HealthMonitor connection retry times should be configurable. Contributed by Chen Zhang. 2019-08-06 15:24:11 -07:00
Eric Yang
22430c10e2 HADOOP-16457. Fixed Kerberos activation in ServiceAuthorizationManager.
Contributed by Prabhu Joseph
2019-08-06 17:04:17 -04:00
Jianfei Jiang
71aad60e51
HDFS-14691. Wrong usage hint for hadoop fs command "test".
Contributed by Jianfei Jiang.

Change-Id: I9f5e89721ff210641375fbf42a70043f0d74458e
2019-08-05 13:08:47 +01:00
Wei-Chiu Chuang
61180f4656 HADOOP-15942. Change the logging level from DEBUG to ERROR for RuntimeErrorException in JMXJsonServlet. Contributed by Anuhan Torgonshar. 2019-08-02 14:58:24 -07:00
Wei-Chiu Chuang
797d14e816 Revert "HADOOP-16336. finish variable is unused in ZStandardCompressor. Contributed by cxorm."
This reverts commit 076618677d.
2019-08-02 08:25:41 -07:00
Wei-Chiu Chuang
e872ceb810 HADOOP-15865. ConcurrentModificationException in Configuration.overlay() method. Contributed by Oleksandr Shevchenko. 2019-08-01 19:56:51 -07:00
Wei-Chiu Chuang
e20b19543b HADOOP-15681. AuthenticationFilter should generate valid date format for Set-Cookie header regardless of default Locale. Contributed by Cao Manh Dat. 2019-08-01 17:35:31 -07:00
Wei-Chiu Chuang
d086d058d8 HDFS-14652. HealthMonitor connection retry times should be configurable. Contributed by Chen Zhang. 2019-08-01 16:13:10 -07:00
Wei-Chiu Chuang
b94eba9f11 HADOOP-12282. Connection thread's name should be updated after address changing is detected. Contributed by Lisheng Sun. 2019-08-01 15:50:43 -07:00
He Xiaoqiao
f86de6f76a HDFS-13529. Fix default trash policy emptier trigger time correctly. Contributed by He Xiaoqiao.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-01 14:54:39 -07:00
Akira Ajisaka
8bda91d20a
HADOOP-16398. Exports Hadoop metrics to Prometheus (#1170) 2019-07-31 10:11:36 -07:00
Wei-Chiu Chuang
0f2dad6679 HDFS-14569. Result of crypto -listZones is not formatted properly. Contributed by hemanthboyina. 2019-07-30 16:52:42 -07:00
Siyao Meng
c75f16db79 HADOOP-16452. Increase ipc.maximum.data.length default from 64MB to 128MB. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-07-30 11:22:45 -07:00
Don Jeba
204a977f55
HADOOP-15910. Fix Javadoc for LdapAuthenticationHandler#ENABLE_START_TLS
Contributed by Don Jeba.

Change-Id: I2755bfb1263fc659078a1af8f0bdfd739fd1ae40
2019-07-30 12:39:48 +01:00
Akira Ajisaka
cbfa3f3e98
HADOOP-16435. RpcMetrics should not be retained forever. Contributed by Zoltan Haindrich. 2019-07-29 17:37:26 -07:00
Erik Krogen
62efb63006 HADOOP-16245. Restrict the effect of LdapGroupsMapping SSL configurations to avoid interfering with other SSL connections. Contributed by Erik Krogen. 2019-07-26 11:16:58 -07:00
Gopal V
aebac6d2d2
HADOOP-16461. Regression: FileSystem cache lock parses XML within the lock.
Contributed by Gopal V.

Change-Id: If6654f850e9c24ee0d9519a46fd6269b18e1a7a4
2019-07-26 11:32:13 +01:00
Steve Loughran
07530314c2
HADOOP-9844. NPE when trying to create an error message response of SASL RPC
This closes #55

Change-Id: I10a20380565fa89762f4aa564b2f1c83b9aeecdc
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-07-26 17:53:18 +09:00
sunlisheng
a1251addff
HADOOP-16431. Remove useless log in IOUtils.java and ExceptionDiags.java.
This closes #1091

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-07-24 10:04:39 +09:00
Steve Loughran
4317d33232
HADOOP-16380. S3Guard to determine empty directory status for all non-root directories.
Contributed by Steve Loughran and Gabor Bota.

This patch:
* Asks S3Guard to determine the empty directory status.
* Makes S3A's rm("/") on the root directory always return false (as abfs does).
* Documents that object stores MAY do this.
* Overloads ContractTestUtils.assertDeleted to let assertions declare that the source directory does not need to exist. This stops inconsistencies in directory listings from failing a root test.

It avoids a recent regression (HADOOP-16279) where, if there was a tombstone above the first element found in a directory listing, the directory would be considered empty when in fact there were child entries. That could downgrade an rm(path, recursive) to a no-op, while also confusing rename(src, dest), as dest could be mistaken for an empty directory and so permit the copy above it, rather than reject it with "destination path exists and is not empty".

Change-Id: I136a3d1a5a48a67e6155d790a40ff558d0d2c108
2019-07-23 14:52:03 +01:00
S O'Donnell
eb36b09cb7
HADOOP-16443. Improve help text for setfacl --set option.
Contributed by S O'Donnell.

Change-Id: I1da46c4c414a5d2b07ee15867508f0799440a413
2019-07-23 10:24:07 +01:00
Sean Mackrory
7f1b76ca35
HADOOP-13868. [s3a] New default for S3A multi-part configuration (#1125) 2019-07-19 09:49:59 -06:00
Josh Rosen
d545f9c290 HADOOP-16437 documentation typo fix: fs.s3a.experimental.input.fadvise
Fix fs.s3a.experimental.fadvise to fs.s3a.experimental.input.fadvise 

Contributed by: Josh Rosen
2019-07-18 23:19:38 +01:00