hadoop

Author	SHA1	Message	Date
Mehakmeet Singh	acffe203b8	HADOOP-17195. ABFS: OutOfMemory error while uploading huge files (#3446 ) Addresses the problem of processes running out of memory when there are many ABFS output streams queuing data to upload, especially when the network upload bandwidth is less than the rate at which data is generated. ABFS output streams now buffer their blocks of data to "disk", "bytebuffer" or "array", as set in "fs.azure.data.blocks.buffer". When buffering via disk, the location for temporary storage is set in "fs.azure.buffer.dir". For safe scaling: use "disk" (default); for performance, when confident that upload bandwidth will never be a bottleneck, experiment with the memory options. The number of blocks a single stream can have queued for uploading is set in "fs.azure.block.upload.active.blocks". The default value is 20. Contributed by Mehakmeet Singh.	2021-09-21 12:48:06 +01:00
Neil	ae2c5ccfcf	HADOOP-17893. Improve PrometheusSink for Namenode TopMetrics (#3426 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-09-21 10:43:50 +09:00
Szilard Nemeth	4df4389325	YARN-10911. AbstractCSQueue: Create a separate class for usernames and weights that are travelling in a Map. Contributed by Szilard Nemeth	2021-09-20 16:47:46 +02:00
Tamas Domok	f93e8fbf2d	HDFS-16129. Fixing the signature secret file misusage in HttpFS. Contributed by Tamas Domok. The signature secret file was not used in HttpFS. - if the configuration did not contain the deprecated httpfs.authentication.signature.secret.file option then it used the random secret provider - if both options (httpfs. and hadoop.http.) were set then the HttpFSAuthenticationFilter could not read the file because the file path was not substituted properly !NOTE! behavioral change: the deprecated httpfs. configuration values are overwritten with the hadoop.http. values. The commit also contains a follow-up change to YARN-10814: empty secret files will result in a random secret provider. Co-authored-by: Tamas Domok <tdomok@cloudera.com>	2021-09-20 14:29:50 +02:00
Rintaro Ikeda	607c20c612	HADOOP-17919. Fix command line example in Hadoop Cluster Setup documentation. (#3453 )	2021-09-17 22:24:44 +09:00
Steve Loughran	5ebcd4bb92	HADOOP-17126. implement non-guava Precondition checkNotNull This adds a new class org.apache.hadoop.util.Preconditions which is * @Private/@Unstable * Intended to allow us to move off Google Guava * Is designed to be trivially backportable (i.e contains no references to guava classes internally) Please use this instead of the guava equivalents, where possible. Contributed by: Ahmed Hussein Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e	2021-09-17 11:06:13 +01:00
litao	71a601241c	HADOOP-17914. Print RPC response length in the exception message (#3436 )	2021-09-17 14:45:14 +08:00
Mehakmeet Singh	c54bf19978	HADOOP-17871. S3A CSE: minor tuning (#3412 ) This migrates the fs.s3a-server-side encryption configuration options to a name which covers client-side encryption too. fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key The existing keys remain valid, simply deprecated and remapped to the new values. If you want server-side encryption options to be picked up regardless of hadoop versions, use the old keys. (the old key also works for CSE, though as no version of Hadoop with CSE support has shipped without this remapping, it's less relevant) Contributed by: Mehakmeet Singh	2021-09-15 22:29:22 +01:00
Steve Loughran	10f3abeae7	Revert "HADOOP-17195. OutOfMemory error while performing hdfs CopyFromLocal to ABFS (#3406 )" (#3443 ) This reverts commit `52c024cc3a`.	2021-09-15 22:27:49 +01:00
Mehakmeet Singh	52c024cc3a	HADOOP-17195. OutOfMemory error while performing hdfs CopyFromLocal to ABFS (#3406 ) This migrates the fs.s3a-server-side encryption configuration options to a name which covers client-side encryption too. fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key The existing keys remain valid, simply deprecated and remapped to the new values. If you want server-side encryption options to be picked up regardless of hadoop versions, use the old keys. (the old key also works for CSE, though as no version of Hadoop with CSE support has shipped without this remapping, it's less relevant) Contributed by: Mehakmeet Singh	2021-09-15 22:27:28 +01:00
Weihao Zheng	3aa76f7e48	HADOOP-17907. FileUtil#fullyDelete deletes contents of sym-linked directory when symlink cannot be deleted because of local fs fault (#3431 ). Contributed by Weihao Zheng. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2021-09-15 23:07:21 +05:30
Ayush Saxena	d9eb5ad6d3	HADOOP-17900. Move ClusterStorageCapacityExceededException to Public from LimitedPrivate. (#3404 ). Contributed by Ayush Saxena.	2021-09-13 22:50:39 +05:30
LeonGao	90bc688c78	HDFS-16188. RBF: Router to support resolving monitored namenodes with DNS (#3346 ) Contributed by Leon Gao * Router to support resolving monitored namenodes with DNS * Style * fix style and test failure * Add test for NNHAServiceTarget const * Resolve comments * Fix test * Comments and style * Create a simple function to extract port * Use LambdaTestUtils.intercept * fix javadoc * Trigger Build	2021-09-10 16:40:08 -07:00
pbacsko	827e19271a	HADOOP-17901. Performance degradation in Text.append() after HADOOP-1… (#3411 )	2021-09-10 16:01:37 -07:00
9uapaw	811fd23f23	YARN-10852. Optimise CSConfiguration getAllUserWeightsForQueue (#3392 )	2021-09-10 16:59:46 +02:00
Adam Binford	4ced012f33	HADOOP-17804. Expose prometheus metrics only after a flush and dedupe with tag values (#3369 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-09-09 16:49:40 +09:00
Liang-Chi Hsieh	e708836641	HADOOP-17887. Remove the wrapper class GzipOutputStream (#3377 )	2021-09-08 21:23:25 -07:00
Szilard Nemeth	5428d36b56	HADOOP-17857. Check real user ACLs in addition to proxied user ACLs. Contributed by Eric Payne	2021-09-08 17:27:41 +02:00
Masatake Iwasaki	ce7a5bfbd3	HADOOP-17899. Avoid using implicit dependency on junit-jupiter-api. (#3399 )	2021-09-08 18:10:50 +09:00
Steve Loughran	6e3aeb1544	HADOOP-17894. CredentialProviderFactory.getProviders() recursion loading JCEKS file from S3A (#3393 ) * CredentialProviderFactory to detect and report on recursion. * S3AFS to remove incompatible providers. * Integration Test for this. Contributed by Steve Loughran.	2021-09-07 15:29:37 +01:00
Chris Nauroth	1d808f59d7	HADOOP-15129. Datanode caches namenode DNS lookup failure and cannot startup (#3348 ) Co-authored-by: Karthik Palaniappan Change-Id: Id079a5319e5e83939d5dcce5fb9ebe3715ee864f	2021-09-03 18:43:48 +00:00
Viraj Jasani	99a157fa4a	HADOOP-17874. ExceptionsHandler to add terse/suppressed Exceptions in thread-safe manner (#3343 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-09-03 10:25:33 +09:00
Yellow Flash	4ea60b5733	HADOOP-17870. Http Filesystem to qualify relative paths. (#3338 ) Contributed by Yellowflash	2021-08-31 13:55:52 +01:00
Uma Maheswara Rao G	164608b546	HDFS-16192: ViewDistributedFileSystem#rename wrongly using src in the place of dst. (#3353 ) Co-authored-by: Uma Maheswara Rao G <umagangumalla@cloudera.com>	2021-08-31 12:25:03 +08:00
Dongjoon Hyun	265a48e245	HADOOP-17869. `fs.s3a.connection.maximum` should be bigger than `fs.s3a.threads.max` (#3337 ). The value of `fs.s3a.connection.maximum` has been increased to 96 Contributed by Dongjoon Hyun	2021-08-30 18:30:43 +01:00
Akira Ajisaka	50dda774f1	HADOOP-17544. Mark KeyProvider as Stable. (#2776 ) Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>	2021-08-30 09:55:53 +09:00
Liang-Chi Hsieh	73a0c31370	HADOOP-17877. BuiltInGzipCompressor header and trailer should not be static variables (#3350 )	2021-08-29 08:21:55 -07:00
jianghuazhu	4c94831364	HDFS-16173.Improve CopyCommands#Put#executor queue configurability. (#3302 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Hui Fei <ferhui@apache.org> Reviewed-by: Viraj Jasani <vjasani@apache.org>	2021-08-27 11:41:44 +08:00
Viraj Jasani	aa9cdf2af6	HDFS-16143. Add Timer in EditLogTailer and de-flake TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits (#3235 ) Contributed by Viraj Jasani. Signed-off-by: Mingliang Liu <liuml07@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org> Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>	2021-08-26 00:37:38 -07:00
LeonGao	b53cae0ffb	HDFS-16157. Support configuring DNS record to get list of journal nodes contributed by Leon Gao. (#3284 ) * Add DNS resolution for QJM * Add log * Resolve comments * checkstyle * typo	2021-08-25 17:40:12 -07:00
jianghuazhu	ad54f5195c	HDFS-16175.Improve the configurable value of Server #PURGE_INTERVAL_NANOS. (#3307 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>	2021-08-25 17:34:45 +08:00
Viraj Jasani	fc566ad9b0	HADOOP-17858. Avoid possible class loading deadlock with VerifierNone initialization (#3321 )	2021-08-24 22:41:59 +09:00
Liang-Chi Hsieh	6014a089fd	HADOOP-17825. Add BuiltInGzipCompressor (#3250 ) Currently, GzipCodec only supports BuiltInGzipDecompressor, if native zlib is not loaded. So, without Hadoop native codec installed, saving SequenceFile using GzipCodec will throw exception like "SequenceFile doesn't work with GzipCodec without native-hadoop code!" Same as other codecs which we migrated to using prepared packages (lz4, snappy), it will be better if we support GzipCodec generally without Hadoop native codec installed. Similar to BuiltInGzipDecompressor, we can use Java Deflater to support BuiltInGzipCompressor.	2021-08-16 10:08:03 -07:00
Viraj Jasani	6342d5e523	HDFS-16171. De-flake testDecommissionStatus (#3280 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-08-16 14:54:25 +09:00
Viraj Jasani	6a7883431f	HADOOP-17841. Remove ListenerHandle from Hadoop registry (#3278 )	2021-08-09 16:57:53 +08:00
jianghuazhu	0c7b951e03	HDFS-16151. Improve the parameter comments related to ProtobufRpcEngine2#Server(). (#3256 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-08-08 14:55:55 +09:00
Viraj Jasani	23e2a0b202	HADOOP-17835. Use CuratorCache implementation instead of PathChildrenCache / TreeCache (#3266 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-08-07 11:20:35 +09:00
Bryan Beaudreault	b0b867e977	HADOOP-17837: Add unresolved endpoint value to UnknownHostException (ADDENDUM) (#3276 )	2021-08-06 21:54:07 +05:30
Bryan Beaudreault	5e54d92e6e	HADOOP-17837: Add unresolved endpoint value to UnknownHostException (#3272 )	2021-08-06 17:00:20 +08:00
Viraj Jasani	9fe1f24ec1	HADOOP-17808. Avoid excessive logging for interruption (ADDENDUM) (#3267 )	2021-08-06 09:27:30 +08:00
jianghuazhu	8616591b0c	HDFS-16149.Improve the parameter annotation in FairCallQueue#priorityLevels. (#3255 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>	2021-08-03 16:53:24 +08:00
Viraj Jasani	ccfa072dc7	HADOOP-17612. Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 (#3241 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-08-03 14:44:00 +09:00
Steve Loughran	4627e9c7ef	HADOOP-17822. fs.s3a.acl.default not working after S3A Audit feature (#3249 ) Fixes the regression caused by HADOOP-17511 by moving where the option fs.s3a.acl.default is read -doing it before the RequestFactory is created. Adds * A unit test in TestRequestFactory to verify the ACLs are set on all file write operations. * A new ITestS3ACannedACLs test which verifies that ACLs really do get all the way through. * S3A Assumed Role delegation tokens to include the IAM permission s3:PutObjectAcl in the generated role. Contributed by Steve Loughran	2021-08-02 15:26:56 +01:00
Steve Loughran	ee466d4b40	HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240 ) This patch cuts down the size of directory trees used for distcp contract tests against object stores, making them much faster against distant/slow stores. On abfs, the test only runs with -Dscale (as was the case for s3a already), and has the larger scale test timeout. After every test case, the FileSystem IOStatistics are logged, to provide information about what IO is taking place and what its performance is. There are some test cases which upload files of 1+ MiB; you can increase the size of the upload in the option "scale.test.distcp.file.size.kb". Set it to zero and the large file tests are skipped. Contributed by Steve Loughran.	2021-08-02 11:36:43 +01:00
Petre Bogdan Stolojan	a218038960	HADOOP-17139 Re-enable optimized copyFromLocal implementation in S3AFileSystem (#3101 ) This work * Defines the behavior of FileSystem.copyFromLocal in filesystem.md * Implements a high performance implementation of copyFromLocalOperation for S3 * Adds a contract test for the operation: AbstractContractCopyFromLocalTest * Implements the contract tests for Local and S3A FileSystems Contributed by: Bogdan Stolojan	2021-07-30 19:42:08 +01:00
Tamas Domok	798a0837c1	YARN-10814. Fallback to RandomSecretProvider if the secret file is empty (#3206 ) The rest endpoint would be unusable with an empty secret file (throwing IllegalArgumentExceptions). Any IO error would have resulted in the same fallback path. Co-authored-by: Tamas Domok <tdomok@cloudera.com>	2021-07-30 12:16:46 +02:00
hchaverr	3c8a48e681	HADOOP-17819. Add extensions to ProtobufRpcEngine RequestHeaderProto. Contributed by Hector Sandoval Chaverri. (#3242 )	2021-07-28 15:37:56 -07:00
Viraj Jasani	e001f8ee39	HADOOP-17814. Provide fallbacks for identity/cost providers and backoff enable (#3230 ) Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2021-07-29 02:10:07 +09:00
jianghuazhu	fd13970d94	HDFS-16137.Improve the comments related to FairCallQueue#queues. (#3226 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Viraj Jasani <vjasani@apache.org>	2021-07-28 03:18:04 -07:00
Mehakmeet Singh	f813554769	HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706 ) This (big!) patch adds support for client side encryption in AWS S3, with keys managed by AWS-KMS. Read the documentation in encryption.md very, very carefully before use and consider it unstable. S3-CSE is enabled in the existing configuration option "fs.s3a.server-side-encryption-algorithm": fs.s3a.server-side-encryption-algorithm=CSE-KMS fs.s3a.server-side-encryption.key=<KMS_KEY_ID> You cannot enable CSE and SSE in the same client, although you can still enable a default SSE option in the S3 console. * Filesystem list/get status operations subtract 16 bytes from the length of all files >= 16 bytes long to compensate for the padding which CSE adds. * The SDK always warns about the specific algorithm chosen being deprecated. It is critical to use this algorithm for ranged GET requests to work (i.e. random IO). Ignore the warning. * Unencrypted files CANNOT BE READ. The entire bucket SHOULD be encrypted with S3-CSE. * Uploading files may be a bit slower as blocks are now written sequentially. * The Multipart Upload API is disabled when S3-CSE is active. Contributed by Mehakmeet Singh	2021-07-27 11:08:51 +01:00
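
The HADOOP-13887 entry above says that list/get status operations subtract 16 bytes from the length of every file that is at least 16 bytes long, to compensate for the padding CSE adds. A minimal Java sketch of that rule follows. The class name `CseLengthSketch`, the constant `CSE_PADDING_LENGTH` and the method `reportedLength` are hypothetical names chosen for illustration; this is not the S3A source, only the arithmetic the commit message describes.

```java
/**
 * Illustrative sketch only: hypothetical helper, not taken from the S3A code.
 * It applies the length rule described in the HADOOP-13887 commit message.
 */
public final class CseLengthSketch {

    /** Padding in bytes that, per the commit message, CSE adds to each object. */
    static final long CSE_PADDING_LENGTH = 16;

    private CseLengthSketch() {
    }

    /**
     * Length to report for an object whose stored length is storedLength.
     * Objects of at least 16 bytes have the padding subtracted; shorter
     * objects are reported unchanged.
     */
    static long reportedLength(long storedLength) {
        if (storedLength < 0) {
            throw new IllegalArgumentException("negative length: " + storedLength);
        }
        return storedLength >= CSE_PADDING_LENGTH
            ? storedLength - CSE_PADDING_LENGTH
            : storedLength;
    }

    public static void main(String[] args) {
        // A 1040-byte stored object is reported as 1024 bytes of data.
        System.out.println(reportedLength(1040));
    }
}
```

Under this rule an object shorter than 16 bytes is reported at its stored length, which is one reason the commit message says the entire bucket should be encrypted with S3-CSE: an unencrypted file of 16 bytes or more would have its length under-reported.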


5667 Commits