23417 Commits

Author SHA1 Message Date
Steve Loughran
49df838995
HADOOP-16697. Tune/audit S3A authoritative mode.
Contains:

HADOOP-16474. S3Guard ProgressiveRenameTracker to mark destination
              directory as authoritative on success.
HADOOP-16684. S3guard bucket info to list a bit more about
              authoritative paths.
HADOOP-16722. S3GuardTool to support FilterFileSystem.

This patch improves the marking of newly created/imported directory
trees in S3Guard DynamoDB tables as authoritative.

Specific changes:

 * Renamed directories are marked as authoritative if the entire
   operation succeeded (HADOOP-16474).
 * When updating parent table entries as part of any table write,
   there's no overwriting of their authoritative flag.

s3guard import changes:

* new -verbose flag to print out what is going on.

* The "s3guard import" command lets you declare that a directory tree
is to be marked as authoritative

  hadoop s3guard import -authoritative -verbose s3a://bucket/path

When importing a listing and a file is found, the import tool queries
the metastore and only updates the entry if the file is different from
before, where different == new timestamp, etag, or length. S3Guard can get
timestamp differences due to clock skew in PUT operations.

As the recursive list performed by the import command doesn't retrieve the
versionID, the existing entry may in fact be more complete.
When updating an existing entry due to clock skew, the existing version ID
is propagated to the new entry (note: the etags must match; this is needed
to deal with inconsistent listings).
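
The import decision described above can be sketched roughly as follows.
This is an illustrative sketch only, not the S3Guard code: the Entry class,
its fields and entryToWrite() are hypothetical names, and the rule "carry
the version ID over whenever the etags match" is an assumption drawn from
the note above.

```java
import java.util.Objects;
import java.util.Optional;

public class ImportDecisionSketch {

  /** Hypothetical, simplified stand-in for a metastore file entry. */
  static final class Entry {
    final long length;
    final long modTime;
    final String etag;
    final String versionId; // null when the listing did not supply one

    Entry(long length, long modTime, String etag, String versionId) {
      this.length = length;
      this.modTime = modTime;
      this.etag = etag;
      this.versionId = versionId;
    }
  }

  /**
   * Decide what to write for a file found in the listing.
   * Returns empty when the existing entry is unchanged.
   */
  static Optional<Entry> entryToWrite(Entry existing, Entry listed) {
    if (existing == null) {
      return Optional.of(listed); // nothing in the store yet
    }
    boolean sameEtag = Objects.equals(existing.etag, listed.etag);
    boolean different = existing.length != listed.length
        || existing.modTime != listed.modTime
        || !sameEtag;
    if (!different) {
      return Optional.empty(); // no update needed
    }
    if (sameEtag && listed.versionId == null) {
      // Same object, e.g. a timestamp change from clock skew:
      // keep the version ID the existing entry already had.
      return Optional.of(new Entry(listed.length, listed.modTime,
          listed.etag, existing.versionId));
    }
    return Optional.of(listed);
  }
}
```

With an existing entry (etag "e1", version "v1") and a listed entry that
differs only in timestamp, the sketch returns a new entry which still
carries "v1".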

There is a new s3guard command to audit an s3guard bucket/path's
authoritative state:

  hadoop s3guard authoritative -check-config s3a://bucket/path

This is primarily for testing/auditing.

The s3guard bucket-info command also provides some more details on the
authoritative state of a store (HADOOP-16684).

Change-Id: I58001341c04f6f3597fcb4fcb1581ccefeb77d91
2020-01-10 11:11:56 +00:00
Takanobu Asanuma
9da294a140 HDFS-15110. HttpFS: post requests are not supported for path "/". Contributed by hemanthboyina. 2020-01-10 17:53:19 +09:00
Adam Antal
20a90c0b5b
YARN-10071. Sync Mockito version with other modules
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2020-01-10 17:41:04 +09:00
Akira Ajisaka
0315ef8448 HDFS-15100. RBF: Print stacktrace when DFSRouter fails to fetch/parse JMX output from NameNode. (#1800) 2020-01-10 13:16:57 +09:00
Ayush Saxena
b32757c616 HDFS-15107. dfs.client.server-defaults.validity.period.ms to support time units. Contributed by Ayush Saxena. 2020-01-10 08:14:56 +05:30
Takanobu Asanuma
782c0556fb HDFS-15102. HttpFS: put requests are not supported for path "/". Contributed by hemanthboyina. 2020-01-10 09:52:13 +09:00
Eric E Payne
93233a7d6e YARN-9018. Add functionality to AuxiliaryLocalPathHandler to return all locations to read for a given path. Contributed by Kuhu Shukla (kshukla) 2020-01-09 17:18:44 +00:00
Akira Ajisaka
a40dc9ee31 HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
Surendra Singh Lilhore
bf45f3b80a HDFS-14957. INodeReference Space Consumed was not same in QuotaUsage and ContentSummary. Contributed by hemanthboyina. 2020-01-09 12:04:05 +05:30
Ayush Saxena
8fe01db34a HDFS-15094. RBF: Reuse ugi string in ConnectionPoolID. Contributed by Ayush Saxena. 2020-01-09 09:02:38 +05:30
Ayush Saxena
fd30f4c52b HDFS-15096. RBF: GetServerDefaults Should be Cached At Router. Contributed by Ayush Saxena. 2020-01-09 08:26:51 +05:30
Eric E Payne
b1e07d27cc YARN-7387: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently. Contributed by Jim Brennan (Jim_Brennan) 2020-01-08 19:26:01 +00:00
Eric E Payne
6899be5a17 YARN-10072: TestCSAllocateCustomResource failures. Contributed by Jim Brennan (Jim_Brennan) 2020-01-08 17:29:56 +00:00
Ahmed Hussein
cdd6efd3ab MAPREDUCE-7252. Handling 0 progress in SimpleExponential task runtime estimator
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
2020-01-08 11:08:13 -06:00
Steve Loughran
52cc20e9ea
HADOOP-16642. ITestDynamoDBMetadataStoreScale fails when throttled.
Contributed by Steve Loughran.

Change-Id: If9b4ebe937200c17d7fdfb9923e6ae0ab4c541ef
2020-01-08 14:28:20 +00:00
Steve Loughran
17aa8f6764
HADOOP-16785. Improve wasb and abfs resilience on double close() calls.
This hardens the wasb and abfs output streams' resilience to being invoked
in/after close().

wasb:
  Explicitly raise IOEs on operations invoked after close,
  rather than implicitly raise NPEs.
  This ensures that invocations which catch and swallow IOEs will perform as
  expected.
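
  A minimal sketch of the pattern, assuming a hypothetical GuardedStream
  wrapper rather than the actual wasb stream classes: the stream tracks
  whether it is closed, fails with an IOException instead of dereferencing
  a nulled field, and treats a second close() as a no-op.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ClosedCheckSketch {

  /** Hypothetical wrapper stream; not the wasb implementation. */
  static final class GuardedStream extends OutputStream {
    private OutputStream inner; // set to null once closed
    private boolean closed;

    GuardedStream(OutputStream inner) {
      this.inner = inner;
    }

    /** Fail with an IOException rather than an NPE on the nulled field. */
    private void checkOpen() throws IOException {
      if (closed) {
        throw new IOException("Stream is closed");
      }
    }

    @Override
    public void write(int b) throws IOException {
      checkOpen();
      inner.write(b);
    }

    @Override
    public void flush() throws IOException {
      checkOpen();
      inner.flush();
    }

    @Override
    public void close() throws IOException {
      if (closed) {
        return; // second and later close() calls do nothing
      }
      closed = true;
      try {
        inner.close();
      } finally {
        inner = null;
      }
    }
  }

  /** Returns true if writing after close() raised an IOException. */
  static boolean writeAfterCloseFails() throws IOException {
    GuardedStream out = new GuardedStream(new ByteArrayOutputStream());
    out.write(1);
    out.close();
    out.close(); // double close is harmless
    try {
      out.write(2);
      return false;
    } catch (IOException expected) {
      return true;
    }
  }
}
```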

abfs:
  When rethrowing an IOException in the close() call, explicitly wrap it
  with a new instance of the same subclass.
  This is needed to handle failures in try-with-resources clauses, where
  any exception in close() is added as a suppressed exception to the one
  thrown in the try {} clause
  *and you cannot attach the same exception to itself*.
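
  The failure mode can be reproduced with plain JDK classes. The sketch
  below uses a hypothetical FailingStream which remembers its last error;
  it is not the abfs code, and for brevity it wraps in a plain IOException
  rather than a new instance of the same subclass as the patch describes.

```java
import java.io.Closeable;
import java.io.IOException;

public class SelfSuppressionSketch {

  /** Hypothetical stream that rethrows its last failure in close(). */
  static final class FailingStream implements Closeable {
    private final boolean wrapOnClose;
    private IOException lastError;

    FailingStream(boolean wrapOnClose) {
      this.wrapOnClose = wrapOnClose;
    }

    void write() throws IOException {
      lastError = new IOException("write failed");
      throw lastError;
    }

    @Override
    public void close() throws IOException {
      if (lastError == null) {
        return;
      }
      if (wrapOnClose) {
        // A distinct exception instance can safely be added as suppressed.
        throw new IOException("close failed: " + lastError.getMessage(),
            lastError);
      }
      // Rethrowing the very same instance triggers self-suppression.
      throw lastError;
    }
  }

  /** Runs the try-with-resources block and returns whatever escaped it. */
  static Throwable run(boolean wrapOnClose) {
    try (FailingStream s = new FailingStream(wrapOnClose)) {
      s.write();
      return null;
    } catch (Throwable t) {
      return t;
    }
  }
}
```

  Without wrapping, run(false) returns an IllegalArgumentException raised
  by Throwable.addSuppressed(), hiding the original IOException. With
  wrapping, run(true) returns the original IOException with the close()
  failure attached as a suppressed exception.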

Contributed by Steve Loughran.

Change-Id: Ic44b494ff5da332b47d6c198ceb67b965d34dd1b
2020-01-08 11:46:54 +00:00
Steve Loughran
bb1aed475b
HADOOP-16751. Followup: move java import. (#1799)
This moves the import of a java module to the preferred place in the import ordering.

Change-Id: I1a594e3d954554a72c2b71c954eda0ae940a8f70
2020-01-08 11:32:31 +00:00
Tamás Pénzes
f1f3f23c3c HADOOP-16772. Extract version numbers to head of pom.xml (addendum) (#1773)
Follow up task of HADOOP-16729, extract even more version numbers.

Change-Id: I2aba142657f3978b24be2560ed6161f1132a8f9e
2020-01-08 12:25:01 +01:00
Rakesh Radhakrishnan
7030722e5d HDFS-15080. Fix the issue in reading persistent memory cached data with an offset. Contributed by Feilong He. 2020-01-08 14:25:17 +05:30
Masatake Iwasaki
aba3f6c3e1
HDFS-15077. Fix intermittent failure of TestDFSClientRetries#testLeaseRenewSocketTimeout. (#1797) 2020-01-08 16:45:39 +09:00
Prabhu Joseph
571795cd18 YARN-10068. Fix TimelineV2Client leaking File Descriptors.
Contributed by Anand Srinivasan. Reviewed by Adam Antal.
2020-01-08 12:01:30 +05:30
Masatake Iwasaki
a43c177f1d HDFS-15072. HDFS MiniCluster fails to start when run in directory path with a %. (#1775) 2020-01-08 11:28:34 +09:00
Hanisha Koneru
a7fccc1122 HADOOP-16727. KMS Jetty server does not startup if trust store password is null. 2020-01-07 15:46:14 -08:00
Sneha Vijayarajan
d1f5976c00
HADOOP-16699. Add verbose TRACE logging to ABFS.
Contributed by Sneha Vijayarajan.

Change-Id: Ic616a10406e6e9f11616c9cc05d8630ebbedaf65
2020-01-07 18:05:47 +00:00
Akira Ajisaka
bc366d4ea7
HADOOP-16773. Fix duplicate assertj-core dependency in hadoop-common module. Contributed by Xieming Li. 2020-01-07 20:49:24 +09:00
Steve Loughran
2bbf73f1df HADOOP-16645. S3A Delegation Token extension point to use StoreContext.
Contributed by Steve Loughran.

This is part of the ongoing refactoring of the S3A codebase, with the
delegation token support (HADOOP-14556) no longer given a direct reference
to the owning S3AFileSystem. Instead it gets a StoreContext and a new
interface, DelegationOperations, to access those operations offered by S3AFS
which are specifically needed by the DT bindings.

The sole operation needed is listAWSPolicyRules(), which is used to allow
S3A FS and the S3Guard metastore to return the AWS policy rules needed to
access their specific services/buckets/tables, allowing the AssumedRole
delegation token to be locked down.

As further restructuring takes place, that interface's implementation
can be moved to wherever the new home for those operations ends up.

Although it changes the API of an extension point, that feature (S3
Delegation Tokens) has not shipped; backwards compatibility is not a
problem except for anyone who has implemented DT support against trunk.
To those developers: sorry.

Change-Id: I770f58b49ff7634a34875ba37b7d51c94d7c21da
2020-01-07 11:17:37 +00:00
Takanobu Asanuma
59aac00283 HDFS-15066. HttpFS: Implement setErasureCodingPolicy , unsetErasureCodingPolicy , getErasureCodingPolicy. Contributed by hemanthboyina. 2020-01-07 11:10:32 +09:00
Mukund Thakur
819159fa06
HDFS-14788. Use dynamic regex filter to ignore copy of source files in Distcp.
Contributed by Mukund Thakur.

Change-Id: I781387ddce95ee300c12a160dc9a0f7d602403c3
2020-01-06 19:10:39 +00:00
Eric Yang
d81d45ff2f YARN-9956. Improved connection error message for YARN ApiServerClient.
Contributed by Prabhu Joseph
2020-01-06 13:24:16 -05:00
Szilard Nemeth
dd2607e3ec YARN-10026. Pull out common code pieces from ATS v1.5 and v2. Contributed by Adam Antal 2020-01-06 17:16:11 +01:00
Szilard Nemeth
768ee22e9e YARN-10035. Add ability to filter the Cluster Applications API request by name. Contributed by Adam Antal 2020-01-06 16:26:33 +01:00
Takanobu Asanuma
4a76ab777f HDFS-15090. RBF: MountPoint Listing Should Return Flag Values Of Destination. Contributed by Ayush Saxena. 2020-01-06 18:09:59 +09:00
Sergey Pogorelov
b343e1533b MAPREDUCE-7255. Fix typo in MapReduce documentation example (#1793) 2020-01-06 12:36:11 +09:00
luhuachao
77ae7b9ce2 HDFS-15089. RBF: SmallFix for RBFMetrics in doc (#1786) 2020-01-06 12:31:13 +09:00
Ayush Saxena
f8644fbe9f HDFS-15091. Cache Admin and Quota Commands Should Check SuperUser Before Taking Lock. Contributed by Ayush Saxena. 2020-01-04 19:02:59 +05:30
Masatake Iwasaki
037ec8cfb1 HDFS-15068. DataNode could meet deadlock if invoke refreshVolumes when register. Contributed by Aiphago.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2020-01-04 01:55:36 +09:00
Rajesh Balamohan
b19d87c2b7
HADOOP-16751. DurationInfo text parsing/formatting should be moved out of hotpath.
Contributed by Rajesh Balamohan

Change-Id: Icc3dcfa81aa69164f2c088f9b533d231138cbb8b
2020-01-02 17:03:07 +00:00
Ayush Saxena
1b04bcc0d9 HADOOP-16784. Update the year to 2020. Contributed by Ayush Saxena. 2020-01-02 21:42:36 +05:30
Steve Loughran
958764479d
HADOOP-16777. Add Tez to LimitedPrivate of ClusterStorageCapacityExceededException
Contributed by Wang Yan.

Change-Id: I92dfe7079ba8ebe89d70255bb845309be0603a8e
2020-01-02 15:49:42 +00:00
Steve Loughran
b6dc00f481
HADOOP-16775. DistCp reuses the same temp file within the task for different files.
Contributed by Amir Shenavandeh.

This avoids overwrite consistency issues with S3 and other stores, though
given S3's copy operation is O(data), you are still best off using -direct
when distcp-ing to it.

Change-Id: I8dc9f048ad0cc57ff01543b849da1ce4eaadf8c3
2020-01-02 15:36:33 +00:00
Prabhu Joseph
eca7e14c2f YARN-10053. Use Shared Group Mapping Service in Placement Rules.
Contributed by Wilfred Spiegelenburg.
2020-01-02 14:13:57 +05:30
Prabhu Joseph
21ada4d1b0 Revert "YARN-10053. Use Shared Group Mapping Service in Placement Rules."
This reverts commit 217b56ffdd5fa254f06734bc8cb6f04a02066f1a.
2020-01-02 14:12:43 +05:30
Prabhu Joseph
217b56ffdd YARN-10053. Use Shared Group Mapping Service in Placement Rules.
Contributed by Wilfred Spiegelenburg.
2020-01-02 14:07:49 +05:30
Rakesh Radhakrishnan
d79cce20ab HDFS-14740. Recover data blocks from persistent memory read cache during datanode restarts. Contributed by Feilong He. 2020-01-02 11:44:00 +05:30
Takanobu Asanuma
074050ca59 HDFS-15063. HttpFS: getFileStatus doesn't return ecPolicy. Contributed by hemanthboyina. 2020-01-01 11:26:38 +09:00
Ayush Saxena
62423910a4 HDFS-14937. [SBN read] ObserverReadProxyProvider should throw InterruptException. Contributed by xuzq. 2019-12-29 13:07:22 +05:30
Surendra Singh Lilhore
ee51eadda0 HDFS-15074. DataNode.DataTransfer thread should catch all the exception and log it. Contributed by hemanthboyina. 2019-12-29 11:15:54 +05:30
Takanobu Asanuma
dc32f583af HDFS-14934. [SBN Read] Standby NN throws many InterruptedExceptions when dfs.ha.tail-edits.period is 0. Contributed by Ayush Saxena. 2019-12-28 21:32:15 +09:00
Ayush Saxena
926d0b48f0 HDFS-15081. Typo in RetryCache#waitForCompletion annotation. Contributed by Fei Hui. 2019-12-27 18:32:15 +05:30
Liu sheng
0fed874adf YARN-10041. Should not use AbstractPath to create unix domain socket (#1771) 2019-12-27 16:50:15 +05:30