hadoop

Author	SHA1	Message	Date
litao	0029f22d7d	HADOOP-18003. Add a method appendIfAbsent for CallerContext (#3644) Cherry-picked from 573b358f by Owen O'Malley	2022-03-14 10:29:23 -07:00
Fei Hui	5a38ed2f22	HADOOP-17276. Extend CallerContext to make it include many items (#2327) Cherry-picked from d0d10f7e by Owen O'Malley	2022-03-14 10:28:38 -07:00
Wei-Chiu Chuang	743db6e7b4	HADOOP-18155. Refactor tests in TestFileUtil (#4063) (cherry picked from commit d0fa9b5775185bd83e4a767a7dfc13ef89c5154a) Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileUtil.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFileUtil.java Co-authored-by: Gautham B A <gautham.bangalore@gmail.com>	2022-03-14 09:40:17 +09:00
Mukund Thakur	e0619b702a	HADOOP-18112: Implement paging during multi object delete. (#4045) Multi object delete of size more than 1000 is not supported by S3 and fails with MalformedXML error. So implementing paging of requests to reduce the number of keys in a single request. Page size can be configured using "fs.s3a.bulk.delete.page.size" Contributed By: Mukund Thakur	2022-03-11 13:16:51 +05:30
Steve Loughran	94a0a04113	HADOOP-18136. Verify FileUtils.unTar() handling of missing .tar files. Contributed by Steve Loughran Change-Id: I3856afa821dbc8c2e3cb1cbe33793ec1734e2e24	2022-02-21 17:09:36 +00:00
Steve Loughran	088684ec60	HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930) Adds a new map type WeakReferenceMap, which stores weak references to values, and a WeakReferenceThreadMap subclass to more closely resemble a thread local type, as it is a map of threadId to value. Construct it with a factory method and optional callback for notification on loss and regeneration. WeakReferenceThreadMap<WrappingAuditSpan> activeSpan = new WeakReferenceThreadMap<>( (k) -> getUnbondedSpan(), this::noteSpanReferenceLost); This is used in ActiveAuditManagerS3A for span tracking. Relates to * HADOOP-17511. Add an Audit plugin point for S3A * HADOOP-18094. Disable S3A auditing by default. Contributed by Steve Loughran. Change-Id: Ibf7bb082fd47298f7ebf46d92f56e80ca9b2aaf8	2022-02-10 12:33:40 +00:00
Xing Lin	d613776b64	HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin. (#3929) Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> (cherry picked from commit 0d17b629ffee2f645f405ad46b0afa65224f87d5)	2022-01-26 21:55:32 +05:30
Viraj Jasani	5e9e779ed2	HADOOP-17152. Provide Hadoop's own Lists utility to reduce dependency on Guava (#3061) Change-Id: I52e55b9d9826ad661e9ad7dc15f007aa168f0fe1 Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2022-01-18 11:57:25 +00:00
Ashutosh Gupta	6535a183b2	HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836) Co-authored-by: xuzq <xuzengqiang@kuaishou.com> Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit caab29ec889a0771191b58714c306439b2415d91)	2021-12-28 22:00:48 +09:00
Akira Ajisaka	cd30687a15	Revert "HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)" This reverts commit 05b43f205758a39ac5af25a9c7b699704e3b99d2.	2021-12-28 21:51:40 +09:00
Ashutosh Gupta	05b43f2057	HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836) Co-authored-by: xuzq <xuzengqiang@kuaishou.com> Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit caab29ec889a0771191b58714c306439b2415d91)	2021-12-28 21:49:06 +09:00
Dhananjay Badaya	e7b1f87665	HADOOP-13500. Synchronizing iteration of Configuration properties object (#3775) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit 4483607a4edc0f13264f4ea37abd90aba97e1ef0)	2021-12-17 16:07:46 +09:00
Steve Loughran	67eaf5aa9f	HADOOP-17979. Add Interface EtagSource to allow FileStatus subclasses to provide etags (#3633) Contributed by Steve Loughran Change-Id: I596205d788f623114c12962941445432e2036c34	2021-11-29 16:20:55 +00:00
smarthan	bc40a41064	HADOOP-18023. Allow cp command to run with multi threads. (#3721) (cherry picked from commit 932a78fe38b34a923f6852a1a19482075806ecba)	2021-11-29 12:47:02 +00:00
Viraj Jasani	6094e1ec9a	HDFS-16171. De-flake testDecommissionStatus (#3280) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit 6342d5e523941622a140fd877f06e9b59f48c48b) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java	2021-11-25 14:15:10 +09:00
Istvan Fajth	48e95d8109	HADOOP-17975. Fallback to simple auth does not work for a secondary DistributedFileSystem instance. (#3579) (cherry picked from commit ae3ba45db58467ce57b0a440e236fd80f6be9ec6)	2021-11-24 10:47:49 +00:00
smarthan	cbb3ba135c	HADOOP-17998. Allow get command to run with multi threads. (#3645) (cherry picked from commit 63018dc73f4d29632e93be08d035ab9a7e73531c) Conflicts: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommands.java	2021-11-22 12:14:32 +00:00
Abhishek Das	f456dc1837	HADOOP-17999. No-op implementation of setWriteChecksum and setVerifyChecksum in ViewFileSystem. Contributed by Abhishek Das. (#3639) (cherry picked from commit 54a1d78e16533e286455de62a545ee75cbc1eff5)	2021-11-16 22:40:24 -08:00
Mehakmeet Singh	bd077c3814	HADOOP-17953. S3A: Tests to lookup global or per-bucket configuration for encryption algorithm (#3525) Followup to S3-CSE work of HADOOP-13887 Contributed by Mehakmeet Singh	2021-10-21 12:03:50 +01:00
Szilard Nemeth	6f45666d0b	HADOOP-17857. Check real user ACLs in addition to proxied user ACLs. Contributed by Eric Payne (cherry picked from commit 5428d36b56fab319ab68258139d6133ded9bbafc)	2021-10-19 20:40:30 +00:00
Steve Loughran	b8f3e54ff7	HADOOP-17945. JsonSerialization raises EOFException reading JSON data stored on google GCS (#3501) Contributed By: Steve Loughran	2021-10-19 15:36:10 +05:30
Xing Lin	af920f138b	HADOOP-16532. Fix TestViewFsTrash to use the correct homeDir. Contributed by Xing Lin. (#3514) (cherry picked from commit 97c0f968792e1a45a1569a3184af7b114fc8c022)	2021-10-13 14:58:08 -07:00
Masatake Iwasaki	9e2936f8d1	HADOOP-17424. Replace HTrace with No-Op tracer (#3520) (cherry picked from commit 1a205cc3adffa568c814a5241e041b08e2fcd3eb) Conflicts: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTracing.java Co-authored-by: Siyao Meng <50227127+smengcl@users.noreply.github.com>	2021-10-12 00:07:09 +09:00
Viraj Jasani	77ee5a4266	HADOOP-17950. Provide replacement for deprecated APIs of commons-io IOUtils (#3515) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit 8071dbb9c6b4a654d5e1e7c8e3b4d2ca1a736d53)	2021-10-07 11:00:19 +09:00
Ahmed Hussein	2cdc6a245d	HADOOP-17930. implement non-guava Precondition checkState (#3522) Reviewed-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org> (cherry picked from commit c36f9402dc082a8903cf6e7fdca128658b11c59d)	2021-10-07 10:57:20 +09:00
Mehakmeet Singh	aee975a136	HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706) This (big!) patch adds support for client side encryption in AWS S3, with keys managed by AWS-KMS. Read the documentation in encryption.md very, very carefully before use and consider it unstable. S3-CSE is enabled in the existing configuration option "fs.s3a.server-side-encryption-algorithm": fs.s3a.server-side-encryption-algorithm=CSE-KMS fs.s3a.server-side-encryption.key=<KMS_KEY_ID> You cannot enable CSE and SSE in the same client, although you can still enable a default SSE option in the S3 console. * Filesystem list/get status operations subtract 16 bytes from the length of all files >= 16 bytes long to compensate for the padding which CSE adds. * The SDK always warns about the specific algorithm chosen being deprecated. It is critical to use this algorithm for ranged GET requests to work (i.e. random IO). Ignore. * Unencrypted files CANNOT BE READ. The entire bucket SHOULD be encrypted with S3-CSE. * Uploading files may be a bit slower as blocks are now written sequentially. * The Multipart Upload API is disabled when S3-CSE is active. Contributed by Mehakmeet Singh Change-Id: Ie1a27a036a39db66a67e9c6d33bc78d54ea708a0	2021-10-05 11:37:41 +01:00
Ahmed Hussein	31b44c519c	HADOOP-17929. implement non-guava Precondition checkArgument (#3473) Reviewed-by: Viraj Jasani <vjasani@apache.org> (cherry picked from commit 0c498f21dee7a5bbf91ad8afbfb372d08bacce6c)	2021-10-01 16:49:07 +08:00
Chao Sun	6931b70a00	HADOOP-17936. Fix test failure after reverting HADOOP-16878 from branch-3.3 (#3478)	2021-09-27 13:56:44 -07:00
Chao Sun	ff26a7700d	Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same (#2383)" This reverts commit 54c40cbf49f2ebf4bbc1976279a6eba7a2c5fe23.	2021-09-23 15:04:27 -07:00
Mehakmeet Singh	8e5620cd9e	HADOOP-17195. ABFS: OutOfMemory error while uploading huge files (#3446) Addresses the problem of processes running out of memory when there are many ABFS output streams queuing data to upload, especially when the network upload bandwidth is less than the rate data is generated. ABFS Output streams now buffer their blocks of data to "disk", "bytebuffer" or "array", as set in "fs.azure.data.blocks.buffer" When buffering via disk, the location for temporary storage is set in "fs.azure.buffer.dir" For safe scaling: use "disk" (default); for performance, when confident that upload bandwidth will never be a bottleneck, experiment with the memory options. The number of blocks a single stream can have queued for uploading is set in "fs.azure.block.upload.active.blocks". The default value is 20. Contributed by Mehakmeet Singh.	2021-09-22 11:19:16 +01:00
Neil	9700d98eac	HADOOP-17893. Improve PrometheusSink for Namenode TopMetrics (#3426) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit ae2c5ccfcf75a89c60ec6e4a339b46131f9134be)	2021-09-21 10:44:51 +09:00
Steve Loughran	9188fa8cce	HADOOP-17126. implement non-guava Precondition checkNotNull This adds a new class org.apache.hadoop.util.Preconditions which is * @Private/@Unstable * Intended to allow us to move off Google Guava * Is designed to be trivially backportable (i.e contains no references to guava classes internally) Please use this instead of the guava equivalents, where possible. Contributed by: Ahmed Hussein Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e	2021-09-17 11:06:59 +01:00
Adam Binford	59a955dfa0	HADOOP-17804. Expose prometheus metrics only after a flush and dedupe with tag values (#3369) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit 4ced012f3301d0848680fdf0ef2972da9b3e1298)	2021-09-09 16:51:04 +09:00
Masatake Iwasaki	76393e1359	HADOOP-17899. Avoid using implicit dependency on junit-jupiter-api. (#3399) (cherry picked from commit ce7a5bfbd3cb55afda265d105ff10ba2e2874a3f)	2021-09-08 09:11:39 +00:00
Yellow Flash	09e8e5c5cb	HADOOP-17870. Http Filesystem to qualify relative paths. (#3338) Contributed by Yellowflash Change-Id: I217da06a1a2e5c0ca2b324f8e21baa0846f64858	2021-09-07 10:54:35 +01:00
Chris Nauroth	cc90b4f987	HADOOP-15129. Datanode caches namenode DNS lookup failure and cannot startup (#3348) Co-authored-by: Karthik Palaniappan Change-Id: Id079a5319e5e83939d5dcce5fb9ebe3715ee864f	2021-09-03 18:48:07 +00:00
jianghuazhu	7c663043b2	HDFS-16173.Improve CopyCommands#Put#executor queue configurability. (#3302) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Hui Fei <ferhui@apache.org> Reviewed-by: Viraj Jasani <vjasani@apache.org> (cherry picked from commit 4c94831364e9258247029c22a222a665771ab4c0)	2021-08-27 12:06:26 +08:00
jianghuazhu	2b2f8f575b	HDFS-16175.Improve the configurable value of Server #PURGE_INTERVAL_NANOS. (#3307) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> (cherry picked from commit ad54f5195c8c01f333703c55cd70703109d75f29)	2021-08-25 17:35:50 +08:00
Bryan Beaudreault	2fda130260	HADOOP-17837: Add unresolved endpoint value to UnknownHostException (ADDENDUM) (#3276) (cherry picked from commit b0b867e977ab853d1dfc434195c486cf0ca32dab)	2021-08-06 21:57:46 +05:30
Bryan Beaudreault	7659b62682	HADOOP-17837: Add unresolved endpoint value to UnknownHostException (#3272) (cherry picked from commit 5e54d92e6ec866dc49a750110863a3fa8b2bcf7c)	2021-08-06 17:32:01 +08:00
Steve Loughran	26514b6534	HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240) This patch cuts down the size of directory trees used for distcp contract tests against object stores, so making them much faster against distant/slow stores. On abfs, the test only runs with -Dscale (as was the case for s3a already), and has the larger scale test timeout. After every test case, the FileSystem IOStatistics are logged, to provide information about what IO is taking place and what it's performance is. There are some test cases which upload files of 1+ MiB; you can increase the size of the upload in the option "scale.test.distcp.file.size.kb" Set it to zero and the large file tests are skipped. Contributed by Steve Loughran.	2021-08-02 12:58:37 +01:00
Petre Bogdan Stolojan	f2cec5cb88	HADOOP-17139 Re-enable optimized copyFromLocal implementation in S3AFileSystem (#3101) This work * Defines the behavior of FileSystem.copyFromLocal in filesystem.md * Implements a high performance implementation of copyFromLocalOperation for S3 * Adds a contract test for the operation: AbstractContractCopyFromLocalTest * Implements the contract tests for Local and S3A FileSystems Contributed by: Bogdan Stolojan Change-Id: I25d502102775c3626c4264e5a14c649879730050	2021-08-02 11:58:36 +01:00
Viraj Jasani	ec3311975c	HADOOP-16290. Enable RpcMetrics units to be configurable (#3198) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit e1d00addb5b6d7240884536aaa57846af34a0dd5)	2021-07-20 14:56:28 +08:00
Abhishek Das	450dae7383	HADOOP-17028. ViewFS should initialize mounted target filesystems lazily. Contributed by Abhishek Das (#2260) (cherry picked from commit 1dd03cc4b573270dc960117c3b6c74bb78215caa)	2021-07-13 18:23:27 -07:00
Rafal Wojdyla	e3fb63f33f	HADOOP-17402. Add GCS config to the core-site (#2638) Contributed by Rafal Wojdyla	2021-07-07 22:43:31 +01:00
liangxs	24b780820c	HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout (#3080) (cherry picked from commit a5db6831bc674a24a3251cf1b20f22a4fd4fac9f)	2021-07-07 09:41:11 +08:00
Viraj Jasani	b8a98e4f82	HDFS-16075. Use empty array constants present in StorageType and DatanodeInfo to avoid creating redundant objects (#3115) Reviewed-by: Hui Fei <ferhui@apache.org> (cherry picked from commit c488abbc79cc1ad2596cbf509a0cde14acc5ad6b)	2021-06-21 10:28:05 +09:00
Takanobu Asanuma	25138c98bf	HADOOP-17760. Delete hadoop.ssl.enabled and dfs.https.enable from docs and core-default.xml (#3099) Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> (cherry picked from commit 9e7c7ad129fcf466d9647e0672ecf7dd72213e72)	2021-06-17 10:00:36 +09:00
Steve Loughran	4ac9123619	HADOOP-17631. Configuration ${env.VAR:-FALLBACK} to eval FALLBACK when restrictSystemProps=true (#2977) Contributed by Steve Loughran. Change-Id: I9b82109eddeb659c01896152cf603d458e2a04cd	2021-06-08 22:05:00 +01:00
Steve Loughran	464bbd5b7c	HADOOP-17511. Add audit/telemetry logging to S3A connector (#2807) The S3A connector supports "an auditor", a plugin which is invoked at the start of every filesystem API call, and whose issued "audit span" provides a context for all REST operations against the S3 object store. The standard auditor sets the HTTP Referrer header on the requests with information about the API call, such as process ID, operation name, path, and even job ID. If the S3 bucket is configured to log requests, this information will be preserved there and so can be used to analyze and troubleshoot storage IO. Contributed by Steve Loughran. Change-Id: Ic0a105c194342ed2d529833ecc42608e8ba2f258	2021-05-25 12:55:38 +01:00
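
Note on the HADOOP-18112 entry above: the commit message describes splitting a multi-object delete into pages because S3 rejects requests of more than 1000 keys. The idea can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the class name BulkDeletePaging and the method partition are hypothetical, no S3 or Hadoop API is called, and the page size of 1000 is simply the limit quoted in the commit message (configurable there through "fs.s3a.bulk.delete.page.size").

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hadoop.
public class BulkDeletePaging {

    /** Split keys into consecutive pages of at most pageSize entries. */
    public static <T> List<List<T>> partition(List<T> keys, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += pageSize) {
            int end = Math.min(start + pageSize, keys.size());
            // Copy the slice so each page is independent of the source list.
            pages.add(new ArrayList<>(keys.subList(start, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("dir/file-" + i);
        }
        // Assumed page size of 1000, the per-request limit named in the commit message.
        for (List<String> page : partition(keys, 1000)) {
            // A real client would issue one bulk delete request per page here.
            System.out.println("would delete " + page.size() + " keys");
        }
    }
}
```

With 2500 keys and a page size of 1000, the sketch produces three pages of 1000, 1000 and 500 keys, so no single request exceeds the limit.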

2192 Commits