Commit Graph

24360 Commits

Author SHA1 Message Date
Mukund Thakur
29b19cd592
HADOOP-16900. Very large files can be truncated when written through the S3A FileSystem.
Contributed by Mukund Thakur and Steve Loughran.

This patch ensures that writes to S3A fail when more than 10,000 blocks are
written. That upper bound still exists. To write massive files, make sure
that the value of fs.s3a.multipart.size is set to a size which is large
enough to upload the files in fewer than 10,000 blocks.

Change-Id: Icec604e2a357ffd38d7ae7bc3f887ff55f2d721a
2020-05-20 13:42:25 +01:00
Prabhu Joseph
cef0756929 YARN-9606. Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient.
Contributed by Bilwa S T. Reviewed by Akira Ajisaka.
2020-05-20 11:36:52 +05:30
Yiqun Lin
1983eea62d HDFS-15340. RBF: Implement BalanceProcedureScheduler basic framework. Contributed by Jinglun. 2020-05-20 10:39:40 +08:00
Masatake Iwasaki
0b7799bf6e
HADOOP-16586. ITestS3GuardFsck, others fails when run using a local metastore. (#1950) 2020-05-20 08:47:04 +09:00
Sneha Vijayarajan
8f78aeb250
Hadoop-17015. ABFS: Handling Rename and Delete idempotency
Contributed by Sneha Vijayarajan.
2020-05-19 12:30:07 -07:00
Surendra Singh Lilhore
d4e36409d4 MAPREDUCE-6826. Job fails with InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at SUCCEEDED/COMMITTING. Contributed by Bilwa S T. 2020-05-19 11:06:36 +05:30
Abhishek Das
ce4ec74453
HADOOP-17024. ListStatus on ViewFS root (ls "/") should list the linkFallBack root (configured target root). Contributed by Abhishek Das. 2020-05-18 22:27:12 -07:00
bilaharith
bdbd59cfa0
HADOOP-17004. ABFS: Improve the ABFS driver documentation
Contributed by Bilahari T H.
2020-05-18 20:45:54 -07:00
Chen Liang
7bb902bc0d HDFS-15293. Relax the condition for accepting a fsimage when receiving a checkpoint. Contributed by Chen Liang 2020-05-18 10:58:52 -07:00
Ayush Saxena
c84e6beada HDFS-14999. Avoid Potential Infinite Loop in DFSNetworkTopology. Contributed by Ayush Saxena. 2020-05-18 22:24:34 +05:30
Wei-Chiu Chuang
2abcf7762a HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016) 2020-05-18 09:23:09 -07:00
Wei-Chiu Chuang
4525292d41 Revert "HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016)"
This reverts commit 86e6aa8eec.
2020-05-18 09:22:05 -07:00
Wei-Chiu Chuang
50caba1a92 HDFS-15207. VolumeScanner skip to scan blocks accessed during recent scan peroid. Contributed by Yang Yun. 2020-05-18 08:40:38 -07:00
He Xiaoqiao
a3f44dacc1 HDFS-13183. Standby NameNode process getBlocks request to reduce Active load. Contributed by Xiaoqiao He.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2020-05-18 07:09:43 -07:00
pustota2009
86e6aa8eec
HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016)
Added parameter dfs.client.short.circuit.num improving HDFS-client's massive reading performance by create few instances ShortCircuit caches instead of one. It helps avoid locks and lets CPU do job.
2020-05-18 07:04:04 -07:00
Akira Ajisaka
b65815d691
Revert "YARN-9606. Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient."
This reverts commit 7836bc4c35.
2020-05-18 16:29:07 +09:00
Akira Ajisaka
27601fc79e
HADOOP-17042. Hadoop distcp throws 'ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found'. Contributed by Aki Tanaka. 2020-05-18 15:36:20 +09:00
Ayush Saxena
a3809d2023 HDFS-15082. RBF: Check each component length of destination path when add/update mount entry. Contributed by Xiaoqiao He. 2020-05-17 19:45:34 +05:30
Ayush Saxena
6e416a83d1 HDFS-15358. RBF: Unify router datanode UI with namenode datanode UI. Contributed by Ayush Saxena. 2020-05-17 03:06:27 +05:30
Ayush Saxena
178336f8a8 HDFS-15356. Unify configuration dfs.ha.allow.stale.reads to DFSConfigKeys. Contributed by Xiaoqiao He. 2020-05-16 16:35:06 +05:30
Uma Maheswara Rao G
ac4a2e11d9
HDFS-15306. Make mount-table to read from central place ( Let's say from HDFS). Contributed by Uma Maheswara Rao G. 2020-05-14 17:29:35 -07:00
Steve Loughran
d08b9e94e3
Revert "HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by default)."
This reverts commit 44350fdf49.

It is related to the rollback of HADOOP-8143.

Change-Id: If48e3dd670c920ada702dc36461ff398fe9d35cc
2020-05-14 19:04:36 +01:00
Steve Loughran
4486220bb2
Revert "HADOOP-8143. Change distcp to have -pb on by default."
This reverts commit dd65eea74b.

Change-Id: I74180cf59d5bbad8c9f66cb331535addcbea863e
2020-05-14 19:03:56 +01:00
Mike
017d24e970
HADOOP-17036. TestFTPFileSystem failing as ftp server dir already exists.
Contributed by Mikhail Pryakhin.
2020-05-14 18:28:00 +01:00
Prabhu Joseph
7836bc4c35 YARN-9606. Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient.
Contributed by Bilwa S T. Reviewed by Brahma Reddy Battula.
2020-05-14 19:40:42 +05:30
Prabhu Joseph
6ce295b787 YARN-10259. Fix reservation logic in Multi Node Placement.
Reviewed by Wangda Tan.
2020-05-14 16:52:11 +05:30
Surendra Singh Lilhore
1958cb7c2b YARN-10265. Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue. Contributed by liusheng 2020-05-14 15:09:34 +05:30
Ayush Saxena
0918433b4d YARN-9898. Dependency netty-all-4.1.27.Final doesn't support ARM platform. Contributed by liusheng. 2020-05-14 00:36:20 +05:30
Ayush Saxena
c757cb61eb HADOOP-14254. Add a Distcp option to preserve Erasure Coding attributes. Contributed by Ayush Saxena. 2020-05-14 00:31:20 +05:30
Xiaoyu Yao
3cacf1ce56
HDFS-15344. DataNode#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442. (#2004) 2020-05-13 11:47:19 -07:00
Inigo Goiri
108ecf992f YARN-8942. PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value. Contributed by Bilwa S T. 2020-05-13 10:04:12 -07:00
Surendra Singh Lilhore
743c2e9071 HDFS-15316. Deletion failure should not remove directory from snapshottables. Contributed by hemanthboyina 2020-05-13 15:01:07 +05:30
Prabhu Joseph
450e5aa9dd YARN-10154. Addendum Patch which fixes below bugs
1. RM fails to start when LeafQueueTemplate max capacity is not specified.
2. Job stuck in ACCEPTED state with DominantResourceCalculator as Queue
   Capacity is set to NaN during RM startup with clusterResource is zero.

Reviewed by Sunil G and Manikandan R.
2020-05-13 14:35:37 +05:30
Akira Ajisaka
8ffc356b1e
Revert "SPNEGO TLS verification"
This reverts commit ba66f3b454.
2020-05-13 17:14:14 +09:00
Joseph Smith
d60496e6c6
BytesWritable causes OOME when array size reaches Integer.MAX_VALUE. (#393) 2020-05-13 00:20:35 +05:30
Thomas Marquardt
b214bbd2d9
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses ServiceSASGenerator and DelegationSASGenreator.  The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenrator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with a includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.

Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match.  Internally the code was using
"//" instead of "/" for the root path, sometimes.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a preceding forward / to paths.

To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-05-12 18:35:38 +00:00
Jonathan Hung
fff1d2c122 YARN-10260. Allow transitioning queue from DRAINING to RUNNING state. Contributed by Bilwa S T 2020-05-12 10:48:54 -07:00
Ayush Saxena
936bf09c37 HDFS-15300. RBF: updateActiveNamenode() is invalid when RPC address is IP. Contributed by xuzq. 2020-05-12 21:54:54 +05:30
Elixir Kook
a3f945fb84
HADOOP-17035. fixed typos (timeout, interruped) (#2007)
Co-authored-by: Sungpeo Kook <elixir.kook@kakaocorp.com>
2020-05-12 10:50:04 -05:00
Xiaoyu Yao
047d8879e7
HDFS-15345. RouterPermissionChecker#checkSuperuserPrivilege should use UGI#getGroups after HADOOP-13442. 2020-05-12 08:31:04 -07:00
Inigo Goiri
96bbc3bc97 YARN-9301. Too many InvalidStateTransitionException with SLS. Contributed by Bilwa S T. 2020-05-12 08:24:34 -07:00
Inigo Goiri
9cbd0cd2a9 YARN-9301. Too many InvalidStateTransitionException with SLS. Contributed by Bilwa S T. 2020-05-12 08:20:03 -07:00
S O'Donnell
29dddb8a14 HDFS-15255. Consider StorageType when DatanodeManager#sortLocatedBlock(). Contributed by Lisheng Sun. 2020-05-12 15:07:51 +01:00
Takanobu Asanuma
928b81a533
HDFS-15350. Set dfs.client.failover.random.order to true as default. (#2008) 2020-05-12 09:04:03 -05:00
Ayush Saxena
8dad38c0be HDFS-14367. EC: Parameter maxPoolSize in striped reconstruct thread pool isn't affecting number of threads. Contributed by Guo Lei. 2020-05-12 18:34:26 +05:30
Ayush Saxena
0fe49036e5 HDFS-15243. Add an option to prevent sub-directories of protected directories from deletion. Contributed by liuyanyu. 2020-05-12 13:11:31 +05:30
Wei-Chiu Chuang
bd342bef64
HADOOP-17033. Update commons-codec from 1.11 to 1.14. (#2000) 2020-05-11 08:41:14 -07:00
Ayush Saxena
4c53fb9ce1 HDFS-15338. listOpenFiles() should throw InvalidPathException in case of invalid paths. Contributed by Jinglun. 2020-05-11 16:48:34 +05:30
Akira Ajisaka
328eae9a14
HADOOP-16768. SnappyCompressor test cases wrongly assume that the compressed data is always smaller than the input data. (#2003) 2020-05-11 14:44:18 +09:00
Ayush Saxena
aab9e0b16e HDFS-15250. Setting dfs.client.use.datanode.hostname to true can crash the system because of unhandled UnresolvedAddressException. Contributed by Ctest. 2020-05-10 11:43:30 +05:30