Commit Graph

25217 Commits

Author SHA1 Message Date
Mehakmeet Singh
8e5620cd9e
HADOOP-17195. ABFS: OutOfMemory error while uploading huge files (#3446)
Addresses the problem of processes running out of memory when
there are many ABFS output streams queuing data to upload,
especially when the network upload bandwidth is less than the rate
at which data is generated.

ABFS output streams now buffer their blocks of data to
"disk", "bytebuffer" or "array", as set in
"fs.azure.data.blocks.buffer".

When buffering via disk, the location for temporary storage
is set in "fs.azure.buffer.dir".

For safe scaling: use "disk" (default); for performance, when
confident that upload bandwidth will never be a bottleneck,
experiment with the memory options.

The number of blocks a single stream can have queued for uploading
is set in "fs.azure.block.upload.active.blocks".
The default value is 20.
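
A minimal sketch of the queue limit described above, not the ABFS code itself: a semaphore caps how many blocks one stream may have queued or uploading, so a fast writer blocks instead of accumulating buffers. The class BoundedBlockUploader and its methods are hypothetical names chosen for illustration; only the limit of 20 comes from the commit message.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a per-stream cap on queued uploads. */
public class BoundedBlockUploader {
    private final Semaphore permits;          // one permit per queued or running block
    private final ExecutorService pool;       // stands in for the upload thread pool
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    public BoundedBlockUploader(int activeBlocks, int threads) {
        this.permits = new Semaphore(activeBlocks);
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Blocks the caller while activeBlocks uploads are already queued or running. */
    public void submit(Runnable upload) {
        permits.acquireUninterruptibly();
        // Record the highest number of blocks held at once.
        peak.accumulateAndGet(active.incrementAndGet(), Math::max);
        try {
            pool.execute(() -> {
                try {
                    upload.run();
                } finally {
                    active.decrementAndGet();
                    permits.release();
                }
            });
        } catch (RuntimeException e) {
            // The task was rejected, so give the permit back.
            active.decrementAndGet();
            permits.release();
            throw e;
        }
    }

    /** Waits for queued uploads to finish and returns the peak number of held blocks. */
    public int close() {
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 20 mirrors the documented default of fs.azure.block.upload.active.blocks.
        BoundedBlockUploader uploader = new BoundedBlockUploader(20, 4);
        for (int i = 0; i < 100; i++) {
            uploader.submit(() -> {
                try {
                    Thread.sleep(2); // simulated slow upload
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        System.out.println("peak queued blocks: " + uploader.close());
    }
}
```

The peak reported here can never exceed the limit passed to the constructor, which is the property that keeps memory use bounded when the memory buffer options are chosen.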

Contributed by Mehakmeet Singh.
2021-09-22 11:19:16 +01:00
sumangala-patki
dd30db78e7
HADOOP-17290. ABFS: Add Identifiers to Client Request Header (#2520)
Contributed by Sumangala Patki.

(cherry picked from commit 35570e414a)
2021-09-21 16:45:51 +01:00
Neil
9700d98eac
HADOOP-17893. Improve PrometheusSink for Namenode TopMetrics (#3426)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit ae2c5ccfcf)
2021-09-21 10:44:51 +09:00
Rintaro Ikeda
92af6cd3bc HADOOP-17919. Fix command line example in Hadoop Cluster Setup documentation. (#3453)
(cherry picked from commit 607c20c612)
2021-09-17 13:34:07 +00:00
Steve Loughran
9188fa8cce
HADOOP-17126. implement non-guava Precondition checkNotNull
This adds a new class org.apache.hadoop.util.Preconditions which is

* @Private/@Unstable
* Intended to allow us to move off Google Guava
* Is designed to be trivially backportable
  (i.e. contains no references to guava classes internally)

Please use this instead of the guava equivalents, where possible.
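
As a rough illustration of what a guava-free null check looks like, the sketch below uses only the JDK. PreconditionsSketch is a hypothetical stand-in written for this example; it is not the source of org.apache.hadoop.util.Preconditions, and its overloads are assumptions rather than that class's confirmed signatures.

```java
/** Hypothetical stand-in for a guava-free checkNotNull helper. */
public final class PreconditionsSketch {
    private PreconditionsSketch() {
        // utility class, no instances
    }

    /** Returns obj unchanged, or throws NullPointerException if it is null. */
    public static <T> T checkNotNull(T obj) {
        if (obj == null) {
            throw new NullPointerException();
        }
        return obj;
    }

    /** Same check, with a caller-supplied message on failure. */
    public static <T> T checkNotNull(T obj, String message) {
        if (obj == null) {
            throw new NullPointerException(message);
        }
        return obj;
    }

    public static void main(String[] args) {
        String path = checkNotNull("/tmp/data", "path must not be null");
        System.out.println(path);
    }
}
```

Because the helper returns its argument, it can be used inline in an assignment, which is how the guava equivalent is typically called.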

Contributed by: Ahmed Hussein

Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e
2021-09-17 11:06:59 +01:00
Liang-Chi Hsieh
103ef9c711 HADOOP-17891. Fix compilation error under skipShade (ADDENDUM) (#3441) 2021-09-16 10:12:28 -07:00
Eric Badger
52ba50fd3c YARN-10935. AM Total Queue Limit goes below per-user AM Limit if parent is full. Contributed by Eric Payne.
(cherry picked from commit 43f0a34dd4)
2021-09-16 16:46:44 +00:00
wangzhaohui
2f73ac1c14 HDFS-16181. [SBN Read] Fix display of JournalNode metric RpcRequestCacheMissAmount (#3317)
Co-authored-by: wangzhaohui8 <wangzhaohui8@jd.com>

(cherry picked from commit 232fd7cae170de8c6b52c14841a47dca8735c6d2)
2021-09-15 10:02:13 -07:00
Liang-Chi Hsieh
b9715c2931 HADOOP-17891. Exclude snappy-java and lz4-java from relocation in shaded hadoop client libraries (#3385) 2021-09-14 11:21:41 -07:00
Szilard Nemeth
6c68211062 YARN-10870. Missing user filtering check -> yarn.webapp.filter-entity-list-by-user for RM Scheduler page. Contributed by Gergely Pollak 2021-09-14 18:08:34 +02:00
Tamas Domok
8e4ac01135
YARN-10901. Permission checking error on an existing directory in LogAggregationFileController#verifyAndCreateRemoteLogDir (#3409)
Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-09-14 17:34:32 +02:00
EungsopYoo
51a4a23e37
HDFS-16198. Short circuit read leaks Slot objects when InvalidToken exception is thrown (#3359)
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit c4c5883d8b)
2021-09-14 16:27:59 +08:00
Symious
c0f32f3cf8 HDFS-16221. RBF: Add usage of refreshCallQueue for Router (#3421)
(cherry picked from commit 7f6553af75)
2021-09-13 10:48:09 +08:00
Symious
8affaa6312 HDFS-16210. RBF: Add the option of refreshCallQueue to RouterAdmin (#3379)
(cherry picked from commit c0890e6d04)
2021-09-10 10:01:27 +08:00
sumangala-patki
1cb9e747eb
HADOOP-17618. ABFS: Partially obfuscate SAS object IDs in Logs (#2845)
Contributed by Sumangala Patki

(cherry picked from commit 3450522c2f)
2021-09-09 14:04:12 +01:00
Adam Binford
59a955dfa0
HADOOP-17804. Expose prometheus metrics only after a flush and dedupe with tag values (#3369)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 4ced012f33)
2021-09-09 16:51:04 +09:00
Ahmed Hussein
1f61944e3b HDFS-16207. Remove NN logs stack trace for non-existent xattr query (#3375)
Change-Id: Ibde523b20a6b8ac92991da52583e625a018d2ee6
2021-09-09 05:27:13 +00:00
Steve Loughran
a2242df10a
HADOOP-17894. CredentialProviderFactory.getProviders() recursion loading JCEKS file from S3A (#3393)
* CredentialProviderFactory to detect and report on recursion.
* S3AFS to remove incompatible providers.
* Integration Test for this.

Contributed by Steve Loughran.

Change-Id: Ia247b3c9fe8488ffdb7f57b40eb6e37c57e522ef
2021-09-08 17:00:20 +01:00
Masatake Iwasaki
76393e1359 HADOOP-17899. Avoid using implicit dependency on junit-jupiter-api. (#3399)
(cherry picked from commit ce7a5bfbd3)
2021-09-08 09:11:39 +00:00
Masatake Iwasaki
5926ccde77 HADOOP-17897. Allow nested blocks in switch case in checkstyle settings. (#3394)
(cherry picked from commit e183ec8998)
2021-09-08 04:59:05 +00:00
Mukund Thakur
3b1c594355 HADOOP-17156. ABFS: Release the byte buffers held by input streams in close() (#3285)
Contributed By: Mukund Thakur
2021-09-07 15:29:22 +05:30
Yellow Flash
09e8e5c5cb
HADOOP-17870. Http Filesystem to qualify relative paths. (#3338)
Contributed by Yellowflash

Change-Id: I217da06a1a2e5c0ca2b324f8e21baa0846f64858
2021-09-07 10:54:35 +01:00
Chris Nauroth
cc90b4f987 HADOOP-15129. Datanode caches namenode DNS lookup failure and cannot startup (#3348)
Co-authored-by: Karthik Palaniappan

Change-Id: Id079a5319e5e83939d5dcce5fb9ebe3715ee864f
2021-09-03 18:48:07 +00:00
Viraj Jasani
7a4eaeb8bf
HADOOP-17874. ExceptionsHandler to add terse/suppressed Exceptions in thread-safe manner (#3343)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 99a157fa4a)
2021-09-03 10:27:06 +09:00
Ahmed Hussein
fc5d67dfb4 HADOOP-17886. Upgrade ant to 1.10.11 (#3371)
(cherry picked from commit 051207375b)
2021-09-02 16:17:32 -05:00
lzx404243
d2c02f5afc
MAPREDUCE-7311. Clear filesystem statistics after tests in TestTaskProgressReporter (#2500)
Co-authored-by: Zhengxi Li <zli89@illinois.edu>
(cherry picked from commit 6187f76f11)
2021-09-01 17:15:31 +09:00
Szilard Nemeth
0a726250ea
YARN-10428. Zombie applications in the YARN queue using FAIR + sizebasedweight. Contributed by Guang Yang, Andras Gyori
(cherry picked from commit 79a46599f7)
2021-09-01 10:44:15 +09:00
Uma Maheswara Rao G
580b6c400b
HDFS-16192: ViewDistributedFileSystem#rename wrongly using src in the place of dst. (#3353)
Co-authored-by: Uma Maheswara Rao G <umagangumalla@cloudera.com>
(cherry picked from commit 164608b546)
2021-08-31 12:27:43 +08:00
Dongjoon Hyun
8606b2cddd
HADOOP-17869. fs.s3a.connection.maximum should be bigger than fs.s3a.threads.max (#3337).
The value of `fs.s3a.connection.maximum` has been increased to 96
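
The relationship in the title can be checked mechanically. The following is a sketch over a plain map of settings, not Hadoop code: S3APoolCheck and its methods are hypothetical, and the thread count of 64 in the example is an illustrative value, not one taken from this commit.

```java
import java.util.Map;

/** Hypothetical check that the connection pool is larger than the thread pool. */
public final class S3APoolCheck {
    static final String CONNECTIONS = "fs.s3a.connection.maximum";
    static final String THREADS = "fs.s3a.threads.max";

    private S3APoolCheck() {
    }

    /** Reads an integer setting, failing clearly if it is absent. */
    private static int require(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null) {
            throw new IllegalArgumentException("missing setting: " + key);
        }
        return Integer.parseInt(value.trim());
    }

    /** True when there are strictly more connections than threads. */
    public static boolean connectionsExceedThreads(Map<String, String> conf) {
        return require(conf, CONNECTIONS) > require(conf, THREADS);
    }

    public static void main(String[] args) {
        // 96 is the value named in the commit message; 64 is an assumed example.
        Map<String, String> conf = Map.of(CONNECTIONS, "96", THREADS, "64");
        System.out.println(connectionsExceedThreads(conf)); // prints "true"
    }
}
```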

Contributed by Dongjoon Hyun

Change-Id: I9020a2bfd2a67fa7a2ec0598ed9d63e78ee99c73
2021-08-30 18:31:57 +01:00
lzx404243
4a93ca78f9
MAPREDUCE-7342. Stop RMService in TestClientRedirect.testRedirect() (#2968)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 7b5be74228)
2021-08-30 08:41:46 +09:00
jianghuazhu
7c663043b2
HDFS-16173. Improve CopyCommands#Put#executor queue configurability. (#3302)
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
Reviewed-by: Hui Fei <ferhui@apache.org>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 4c94831364)
2021-08-27 12:06:26 +08:00
jianghuazhu
2b2f8f575b
HDFS-16175. Improve the configurable value of Server#PURGE_INTERVAL_NANOS. (#3307)
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit ad54f5195c)
2021-08-25 17:35:50 +08:00
Viraj Jasani
0967483b7c HDFS-16184. De-flake TestBlockScanner#testSkipRecentAccessFile (#3329)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit 1b9927afe1)
2021-08-25 17:42:56 +09:00
Masatake Iwasaki
3645a13586 HADOOP-14922. Build of Mapreduce Native Task module fails with unknown opcode "bswap". Contributed by Anup Halarnkar.
(cherry picked from commit 0d59500e8c)
2021-08-25 01:54:36 +00:00
Viraj Jasani
fc6b1cafd4 HADOOP-17858. Avoid possible class loading deadlock with VerifierNone initialization (#3321)
(cherry picked from commit fc566ad9b0)
2021-08-24 22:44:11 +09:00
Szilard Nemeth
224b42108d YARN-10814. Fallback to RandomSecretProvider if the secret file is empty. Contributed by Tamas Domok 2021-08-24 14:16:15 +02:00
litao
5a1ed37893
HDFS-16177. Bug fix for Util#receiveFile (#3310)
Reviewed-by: Hui Fei <ferhui@apache.org>
(cherry picked from commit 07627ef19e)
2021-08-19 12:31:52 +08:00
Siyao Meng
226f94b4fc HADOOP-17834. Bump aliyun-sdk-oss to 3.13.0 (#3261)
Change-Id: I335d4a2cb08c75dc24ef36bdfab51111f87e0762
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 3aaac8a1f6)
2021-08-16 00:32:05 +09:00
Renukaprasad C
5b566c3914 HADOOP-17844. Upgrade JSON smart to 2.4.7 (#3299)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit b90389ae98)

 Conflicts:
	LICENSE-binary
2021-08-14 20:00:38 +09:00
jianghuazhu
0a5f76b814
HDFS-16151. Improve the parameter comments related to ProtobufRpcEngine2#Server(). (#3256)
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 0c7b951e03)
2021-08-08 14:57:08 +09:00
Akira Ajisaka
025ecf42be
HADOOP-17370. Upgrade commons-compress to 1.21 (#3274)
(cherry picked from commit 3565c9477d)
2021-08-08 11:25:26 +09:00
Akira Ajisaka
00c382d118
Fix potential heap buffer overflow in hdfs.c. Contributed by Igor Chervatyuk.
(cherry picked from commit 4972e7a246)
2021-08-07 11:27:52 +09:00
Bryan Beaudreault
2fda130260 HADOOP-17837: Add unresolved endpoint value to UnknownHostException (ADDENDUM) (#3276)
(cherry picked from commit b0b867e977)
2021-08-06 21:57:46 +05:30
wangzhaohui
7068e18842
HDFS-16154. TestMiniJournalCluster failing intermittently because of not resetting UserGroupInformation completely (#3270)
Co-authored-by: wangzhaohui8 <wangzhaohui8@jd.com>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit e85c44657c)
2021-08-06 18:25:00 +08:00
Bryan Beaudreault
7659b62682
HADOOP-17837: Add unresolved endpoint value to UnknownHostException (#3272)
(cherry picked from commit 5e54d92e6e)
2021-08-06 17:32:01 +08:00
wangzhaohui
e9ba4f4591 HDFS-16153. Avoid evaluation of LOG.debug statement in QuorumJournalManager (#3269). Contributed by wangzhaohui.
(cherry picked from commit a73b64f86b)
2021-08-06 09:28:39 +01:00
Viraj Jasani
b3077543cf HADOOP-17808. Avoid excessive logging for interruption (ADDENDUM) (#3267)
(cherry picked from commit 9fe1f24ec1)
2021-08-06 09:30:43 +08:00
Siyao Meng
72508e6430 HDFS-16055. Quota is not preserved in snapshot INode (#3078)
(cherry picked from commit ebee2aed00)
2021-08-03 20:24:18 +01:00
Steve Loughran
c1ad91e72d
HADOOP-17822. fs.s3a.acl.default not working after S3A Audit feature (#3249)
Fixes the regression caused by HADOOP-17511 by moving where the
option fs.s3a.acl.default is read: it is now read before the
RequestFactory is created.

Adds

* A unit test in TestRequestFactory to verify the ACLs are set
  on all file write operations.
* A new ITestS3ACannedACLs test which verifies that ACLs really
  do get all the way through.
* S3A Assumed Role delegation tokens to include the IAM permission
  s3:PutObjectAcl in the generated role.

Contributed by Steve Loughran

Change-Id: I3abac6a1b9e150b6b6df0af7c2c70093f8f518cb
2021-08-02 15:33:34 +01:00
Steve Loughran
26514b6534 HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240)
This patch cuts down the size of directory trees used for
distcp contract tests against object stores, making
them much faster against distant/slow stores.

On abfs, the test only runs with -Dscale (as was the case for s3a already),
and has the larger scale test timeout.

After every test case, the FileSystem IOStatistics are logged,
to provide information about what IO is taking place and
what its performance is.

There are some test cases which upload files of 1+ MiB; you can
increase the size of the upload in the option
"scale.test.distcp.file.size.kb".
Set it to zero and the large file tests are skipped.

Contributed by Steve Loughran.
2021-08-02 12:58:37 +01:00