hadoop

Author	SHA1	Message	Date
Nikita Eshkeev	d07356e60e	HADOOP-18597. Simplify single node instructions for creating directories for Map Reduce. (#5305 ) Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-04-20 16:12:44 +05:30
Steve Loughran	405ed1dde6	HADOOP-18470. Hadoop 3.3.5 release wrap-up (#5558 ) Post-release updates of the branches * Add jdiff xml files from 3.3.5 release. * Declare 3.3.5 as the latest stable release. * Copy release notes.	2023-04-18 10:12:07 +01:00
Viraj Jasani	b4bcbb9515	HDFS-16959. RBF: State store cache loading metrics (#5497 )	2023-03-29 10:43:13 -07:00
rdingankar	0ca5686034	HDFS-16917 Add transfer rate quantile metrics for DataNode reads (#5397 ) Co-authored-by: Ravindra Dingankar <rdingankar@linkedin.com>	2023-02-27 18:26:32 +00:00
Arnout Engelen	02fd87a4d8	HADOOP-18627. Add stronger wording in 'secure mode' introduction (#5406 ) Make it more clear that when deploying Hadoop 'secure mode' is generally not optional. Contributed by Arnout Engelen	2023-02-17 16:30:41 +00:00
Steve Loughran	d56977e909	HADOOP-18470. More in the 3.3.5 index.html about security (#5383 ) Expands on the comments in cluster config to tell people they shouldn't be running a cluster without a private VLAN in cloud, that Knox is good here, and unsecured clusters without a VLAN are just computation-as-a-service to crypto miners Contributed by Steve Loughran	2023-02-14 17:22:59 +00:00
Nikita Eshkeev	4de31123ce	Fix "the the" and friends typos (#5267 ) Signed-off-by: Nikita Eshkeev <neshkeev@yandex.ru>	2023-01-17 03:33:59 +08:00
Steve Loughran	84b33b897c	HADOOP-18470. index.md update for 3.3.5 release	2022-12-05 16:13:24 +00:00
GuoPhilipse	069bd973d8	HADOOP-18532. Update command usage in FileSystemShell.md (#5141 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2022-11-21 15:55:46 +09:00
Ashutosh Gupta	a48e8c9beb	MAPREDUCE-5608. Replace and deprecate mapred.tasktracker.indexcache.mb (#5014 ) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2022-11-14 11:07:40 +09:00
Steve Loughran	38b2ed2151	HADOOP-18442. Remove openstack support (#4855 ) Contributed by Steve Loughran	2022-10-06 11:49:38 +01:00
Ayush Saxena	cc41ad63f9	HADOOP-18388. Allow dynamic groupSearchFilter in LdapGroupsMapping. (#4798 ) * HADOOP-18388. Allow dynamic groupSearchFilter in LdapGroupsMapping.	2022-09-06 18:38:51 -04:00
Mukund Thakur	231e095802	HADOOP-18407. Improve readVectored() api spec (#4760 ) part of HADOOP-18103. Contributed By: Mukund Thakur	2022-08-22 23:19:29 +05:30
Steve Loughran	62dbefd8f2	HADOOP-18305. Release Hadoop 3.3.4: upstream changelog and jdiff files Add the r3.3.4 changelog, release notes and jdiff xml files.	2022-08-05 14:06:22 +01:00
Masatake Iwasaki	3cce41a1f6	Make upstream aware of 3.2.4 release. (cherry picked from commit e1637a57dfd41385dbce5de90620c48a45abb263)	2022-07-22 02:27:19 +00:00
Mukund Thakur	4d1f6f9b99	HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445 ) part of HADOOP-18103. Handling memory fragmentation in S3A vectored IO implementation by allocating smaller user range requested size buffers and directly filling them from the remote S3 stream and skipping undesired data in between ranges. This patch also adds aborting active vectored reads when stream is closed or unbuffer() is called. Contributed By: Mukund Thakur	2022-06-22 17:29:32 +01:00
Mukund Thakur	5db0f34e29	HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads (#3964 ) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur	2022-06-22 17:29:32 +01:00
Mukund Thakur	2daf0a814f	HADOOP-11867. Add a high-performance vectored read API. (#3904 ) part of HADOOP-18103. Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially in object stores implementation. Also added implementation in S3A where smaller ranges are merged and sliced byte buffers are returned to the readers. All the merged ranged are fetched from S3 asynchronously. Contributed By: Owen O'Malley and Mukund Thakur	2022-06-22 17:29:32 +01:00
Ashutosh Gupta	a77d52284f	HADOOP-18255. Fix fsdatainputstreambuilder.md reference to hadoop branch-3.3 (#4378 ) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>	2022-06-20 10:54:21 +05:30
Viraj Jasani	e38e13be03	HADOOP-18288. Total requests and total requests per sec served by RPC servers (#4431 ) Reviewed-by: Steve Loughran <stevel@apache.org> Signed-off-by: Tao Li <tomscut@apache.org>	2022-06-18 12:17:20 +08:00
Steve Loughran	e199da3fae	HADOOP-17833. Improve Magic Committer performance (#3289 ) Speed up the magic committer with key changes being * Writes under __magic always retain directory markers * File creation under __magic skips all overwrite checks, including the LIST call intended to stop files being created over dirs. * mkdirs under __magic probes the path for existence but does not look any further. Extra parallelism in task and job commit directory scanning Use of createFile and openFile with parameters which allow for HEAD checks to be skipped. The committer can write the summary _SUCCESS file to the path `fs.s3a.committer.summary.report.directory`, which can be in a different file system/bucket if desired, using the job id as the filename. Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance` Application code can set the createFile() option fs.s3a.create.performance to true to disable the same safety checks when writing under magic directories. Use with care. The createFile option prefix `fs.s3a.create.header.` can be used to add custom headers to S3 objects when created. Contributed by Steve Loughran.	2022-06-17 19:11:35 +01:00
Masatake Iwasaki	3228142f53	Make upstream aware of 2.10.2 release. (cherry picked from commit 3dcb6367edd44c04a5ae0d195d851abac80d10a0) Conflicts: hadoop-project-dist/pom.xml	2022-06-01 00:53:01 +09:00
Steve Loughran	6f261ed4a2	HADOOP-18198. Release 3.3.3: release notes and jdiff files. * Add the changelog and release notes * add all jdiff XML files * update the project pom with the new stable version Change-Id: Iaea846c3e451bbd446b45de146845a48953d580d	2022-05-17 19:00:54 +01:00
Ashutosh Gupta	40a8b9a6a5	HADOOP-17479. Fix the examples of hadoop config prefix (#4197 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2022-05-08 08:09:24 +09:00
Steve Loughran	1b4dba99b5	HADOOP-16202. Enhanced openFile(): hadoop-common changes. (#2584/1) This defines standard option and values for the openFile() builder API for opening a file: fs.option.openfile.read.policy A list of the desired read policy, in preferred order. standard values are adaptive, default, random, sequential, vector, whole-file fs.option.openfile.length How long the file is. fs.option.openfile.split.start start of a task's split fs.option.openfile.split.end end of a task's split These can be used by filesystem connectors to optimize their reading of the source file, including but not limited to * skipping existence/length probes when opening a file * choosing a policy for prefetching/caching data The hadoop shell commands which read files all declare "whole-file" and "sequential", as appropriate. Contributed by Steve Loughran. Change-Id: Ia290f79ea7973ce8713d4f90f1315b24d7a23da1	2022-04-24 17:33:04 +01:00
Ashutosh Gupta	f84b88dd6b	HADOOP-17564. Fix typo in UnixShellGuide.html (#4195 ) contributed by Ashutosh Gupta	2022-04-22 17:59:41 +01:00
Renukaprasad C	4ff8a5dc73	HDFS-16526. Addendum Add metrics for slow DataNode (#4191 )	2022-04-20 18:57:43 +05:30
Renukaprasad C	f14f305051	HDFS-16526. Add metrics for slow DataNode (#4162 )	2022-04-15 21:37:05 +05:30
GuoPhilipse	5de78ceb0e	HDFS-16516. Fix Fsshell wrong params (#4090 ). Contributed by GuoPhilipse.	2022-04-11 15:54:00 +08:00
litao	34b3275bf4	HDFS-16477. [SPS]: Add metric PendingSPSPaths for getting the number of paths to be processed by SPS (#4009 ). Contributed by tomscut. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-04-03 01:36:03 +05:30
Masatake Iwasaki	10876333ac	Make upstream aware of 3.2.3 release.	2022-03-28 08:02:10 +00:00
Steve Loughran	708a0ce21b	HADOOP-13704. Optimized S3A getContentSummary() Optimize the scan for s3 by performing a deep tree listing, inferring directory counts from the paths returned. Contributed by Ahmar Suhail. Change-Id: I26ffa8c6f65fd11c68a88d6e2243b0eac6ffd024	2022-03-22 13:21:12 +00:00
Chao Sun	f800b65b40	Make upstream aware of 3.3.2 release	2022-03-02 19:14:50 -08:00
ted12138	902a7935e9	HADOOP-18128. Fix typo issues of outputstream.md (#4025 )	2022-03-02 18:25:56 +08:00
GuoPhilipse	b68964336d	HDFS-16449. Fix hadoop web site release notes and changelog not available (#3967 ) Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2022-02-14 05:38:28 +09:00
Steve Loughran	14ba19af06	HADOOP-17409. Remove s3guard from S3A module (#3534 ) Completely removes S3Guard support from the S3A codebase. If the connector is configured to use any metastore other than the null and local stores (i.e. DynamoDB is selected) the s3a client will raise an exception and refuse to initialize. This is to ensure that there is no mix of S3Guard enabled and disabled deployments with the same configuration but different hadoop releases - it must be turned off completely. The "hadoop s3guard" command has been retained - but the supported subcommands have been reduced to those which are not purely S3Guard related: "bucket-info" and "uploads". This is a major change in terms of the number of files changed; before cherry picking subsequent s3a patches into older releases, this patch will probably need backporting first. Goodbye S3Guard, your work is done. Time to die. Contributed by Steve Loughran.	2022-01-17 18:08:57 +00:00
Viraj Jasani	f64fda0f00	HADOOP-18055. Async Profiler endpoint for Hadoop daemons (#3824 ) Reviewed-by: Akira Ajisaka <aajisaka@apache.org>	2022-01-06 17:56:49 +08:00
jianghuazhu	43afd1753a	HDFS-16394.RPCMetrics increases the number of handlers in processing. (#3822 )	2021-12-31 16:40:14 +08:00
smarthan	932a78fe38	HADOOP-18023. Allow cp command to run with multi threads. (#3721 )	2021-11-29 12:45:08 +00:00
Steve Loughran	98fe0d0fc3	HADOOP-17979. Add Interface EtagSource to allow FileStatus subclasses to provide etags (#3633 ) Contributed by Steve Loughran	2021-11-24 17:33:12 +00:00
smarthan	63018dc73f	HADOOP-17998. Allow get command to run with multi threads. (#3645 )	2021-11-22 11:37:05 +00:00
litao	c9f95b01ef	HDFS-16315. Add metrics related to Transfer and NativeCopy for DataNode (#3643 ) Reviewed-by: Hui Fei <ferhui@apache.org> Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>	2021-11-16 11:19:14 +09:00
litao	60acf8434d	HDFS-16319. Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount (#3653 ). Contributed by tomscut. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2021-11-14 20:10:24 +05:30
Masatake Iwasaki	46a7117995	HADOOP-17908. Add missing RELEASENOTES and CHANGELOG to upstream. (#3433 )	2021-10-20 13:54:46 +09:00
huhaiyang	68c2accc20	HDFS-16247. RBF: Fix the ProcessingAvgTime and ProxyAvgTime code comments and document metrics describe ms unit (#3511 )	2021-10-04 23:52:26 +08:00
Renukaprasad C	4c516536be	HDFS-16236. Example command for daemonlog is not correct (#3476 )	2021-09-25 18:32:52 +08:00
Rintaro Ikeda	607c20c612	HADOOP-17919. Fix command line example in Hadoop Cluster Setup documentation. (#3453 )	2021-09-17 22:24:44 +09:00
jianghuazhu	4c94831364	HDFS-16173.Improve CopyCommands#Put#executor queue configurability. (#3302 ) Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local> Reviewed-by: Hui Fei <ferhui@apache.org> Reviewed-by: Viraj Jasani <vjasani@apache.org>	2021-08-27 11:41:44 +08:00
Petre Bogdan Stolojan	a218038960	HADOOP-17139 Re-enable optimized copyFromLocal implementation in S3AFileSystem (#3101 ) This work * Defines the behavior of FileSystem.copyFromLocal in filesystem.md * Implements a high performance implementation of copyFromLocalOperation for S3 * Adds a contract test for the operation: AbstractContractCopyFromLocalTest * Implements the contract tests for Local and S3A FileSystems Contributed by: Bogdan Stolojan	2021-07-30 19:42:08 +01:00
Viraj Jasani	e1d00addb5	HADOOP-16290. Enable RpcMetrics units to be configurable (#3198 ) Signed-off-by: Akira Ajisaka <aajisaka@apache.org>	2021-07-19 23:55:49 -07:00
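
The HADOOP-18104 and HADOOP-11867 entries above describe merging nearby ranges during a vectored read, controlled by fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size. The following is a minimal sketch of that idea in Python, not the S3A implementation: the function name `merge_ranges`, the `(offset, length)` tuple representation, and the exact comparison rules are assumptions made for illustration. It coalesces two ranges when the gap between them is smaller than the minimum seek and the combined span fits in the maximum merged size, so a maximum of 0 leaves every range unmerged, as the commit message states.

```python
from typing import List, Tuple

# Hypothetical representation of a read range: (offset, length) in bytes.
Range = Tuple[int, int]


def merge_ranges(ranges: List[Range], min_seek: int, max_merged_size: int) -> List[Range]:
    """Coalesce ranges whose gap is below min_seek, capped at max_merged_size.

    Illustrative only: the rules here are an assumption about how the two
    settings interact, not a copy of Hadoop's logic.
    """
    merged: List[Range] = []
    for offset, length in sorted(ranges):
        if merged:
            start, current_length = merged[-1]
            end = start + current_length
            new_end = max(end, offset + length)
            gap = offset - end
            # Merge only if the seek saved is small and the result stays within the cap.
            if gap < min_seek and new_end - start <= max_merged_size:
                merged[-1] = (start, new_end - start)
                continue
        merged.append((offset, length))
    return merged


requested = [(0, 100), (150, 100), (10000, 50)]
combined = merge_ranges(requested, min_seek=1024, max_merged_size=4096)
unmerged = merge_ranges(requested, min_seek=1024, max_merged_size=0)
```

In this sketch the first two ranges are 50 bytes apart, so they are fetched as one 250-byte read, while the third is too far away to be worth merging. The trade-off is reading some unwanted bytes in exchange for fewer remote requests.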

482 Commits