hadoop

Author	SHA1	Message	Date
Steve Loughran	e0563fed50	HADOOP-18908. Improve S3A region handling. (#6187 ) S3A region logic improved for better inference and to be compatible with previous releases: 1. If you are using an AWS S3 AccessPoint, its region is determined from the ARN itself. 2. If fs.s3a.endpoint.region is set and non-empty, it is used. 3. If fs.s3a.endpoint is an s3.*.amazonaws.com url, the region is determined by parsing the URL. Note: vpce endpoints are not handled by this. 4. If fs.s3a.endpoint.region==null, and none could be determined from the endpoint, use us-east-2 as default. 5. If fs.s3a.endpoint.region=="" then it is handed off to the default AWS SDK resolution process. Consult the AWS SDK documentation for the details on its resolution process, knowing that it is complicated and may use environment variables, entries in ~/.aws/config, IAM instance information within EC2 deployments and possibly even JSON resources on the classpath. Put differently: it is somewhat brittle across deployments. Contributed by Ahmar Suhail	2023-10-17 15:37:36 +01:00
jianghuazhu	8963b25ab3	HADOOP-18926. Add some comments related to NodeFencer. (#6162 )	2023-10-13 15:34:44 -07:00
Steve Loughran	9bc159f4ac	HADOOP-18487. Make protobuf 2.5 an optional runtime dependency. (#4996 ) Protobuf 2.5 JAR is no longer needed at runtime. The option common.protobuf2.scope defines whether the protobuf 2.5.0 dependency is marked as provided or not. * New package org.apache.hadoop.ipc.internal for internal only protobuf classes ...with a ShadedProtobufHelper in there which has shaded protobuf refs only, so guaranteed not to need protobuf-2.5 on the CP * All uses of org.apache.hadoop.ipc.ProtobufHelper have been replaced by uses of org.apache.hadoop.ipc.internal.ShadedProtobufHelper * The scope of protobuf-2.5 is set by the option common.protobuf2.scope. In this patch it is still "compile" * There is explicit reference to it in modules where it may be needed. * The maven scope of the dependency can be set with the common.protobuf2.scope option. It can be set to "provided" in a build: -Dcommon.protobuf2.scope=provided * Add new ipc(callable) method to catch and convert shaded protobuf exceptions raised during invocation of the supplied lambda expression * This is adopted in the code where the migration is not traumatically over-complex. RouterAdminProtocolTranslatorPB is left alone for this reason. Contributed by Steve Loughran	2023-10-13 13:48:38 +01:00
Steve Loughran	81edbebdd8	HADOOP-18889. S3A v2 SDK third party support (#6141 ) Tune AWS v2 SDK changes based on testing with third party stores including GCS. Contains HADOOP-18889. S3A v2 SDK error translations and troubleshooting docs * Changes needed to work with multiple third party stores * New third_party_stores document on how to bind to and test third party stores, including google gcs (which works!) * Troubleshooting docs mostly updated for v2 SDK Exception translation/resilience * New AWSUnsupportedFeatureException for unsupported/unavailable errors * Handle 501 method unimplemented as one of these * Error codes > 500 mapped to the AWSStatus500Exception if no explicit handler. * Precondition errors handled a bit better * GCS throttle exception also recognized. * GCS raises 404 on a delete of a file which doesn't exist: swallow it. * Error translation uses reflection to create IOE of the right type. All IOEs at the bottom of an AWS stack chain are regenerated: a new exception of that specific type is created, with the top level ex its cause. This is done to retain the whole stack chain. * Reduce the number of retries within the AWS SDK * And those of s3a code. * S3ARetryPolicy explicitly declares SocketException as connectivity failure but subclasses BindException * SocketTimeoutException also considered connectivity * Log at debug whenever retry policies are looked up * Reorder exceptions to alphabetical order, with commentary * Review use of the Invoke.retry() method The reduction in retries is because it's clear when you try to create a bucket which doesn't resolve that the time for even an UnknownHostException to eventually fail is over 90s, which then hit the s3a retry code. - Reducing the SDK retries means these escalate to our code better. - Cutting back on our own retries makes it a bit more responsive for most real deployments. - maybeTranslateNetworkException() and s3a retry policy means that unknown host exception is recognised and fails fast. Contributed by Steve Loughran	2023-10-12 17:47:44 +01:00
Kevin Risden	5c22934d90	HADOOP-18922. Race condition in ZKDelegationTokenSecretManager creating znode (#6150 ). Contributed by Kevin Risden. Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2023-10-12 23:21:26 +08:00
huangzhaobo	daa78adc88	HDFS-17200. Add some datanode related metrics to Metrics.md. (#6099 ). Contributed by huangzhaobo99 Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-10-06 12:40:44 +05:30
Viraj Jasani	27cb551821	HADOOP-18829. S3A prefetch LRU cache eviction metrics (#5893 ) Contributed by: Viraj Jasani	2023-09-21 14:31:44 +05:30
Pranav Saxena	f24b73e5f3	HADOOP-18873. ABFS: AbfsOutputStream doesn't close DataBlocks object. (#6010 ) AbfsOutputStream to close the dataBlock object created for the upload. Contributed By: Pranav Saxena	2023-09-20 14:24:36 +05:30
Hexiaoqiao	23c22b2823	HADOOP-18906. Increase default batch size of ZKDTSM token seqnum to reduce overflow speed of znode dataVersion. (#6097 )	2023-09-18 10:50:53 -07:00
章锡平	60f3a2b101	HDFS-17138 RBF: We changed the hadoop.security.auth_to_local configur… (#5921 )	2023-09-18 09:40:22 -07:00
Vikas Kumar	e283375cdf	HADOOP-18851: Performance improvement for DelegationTokenSecretManager. (#6001 ). Contributed by Vikas Kumar. Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2023-09-15 12:32:47 +08:00
ConfX	23360b3f6b	HADOOP-18824. ZKDelegationTokenSecretManager causes ArithmeticException due to improper numRetries value checking (#6052 )	2023-09-14 15:53:31 -07:00
Steve Loughran	81d90fd65b	HADOOP-18073. S3A: Upgrade AWS SDK to V2 (#5995 ) This patch migrates the S3A connector to use the V2 AWS SDK. This is a significant change at the source code level. Any applications using the internal extension/override points in the filesystem connector are likely to break. This includes but is not limited to: - Code invoking methods on the S3AFileSystem class which used classes from the V1 SDK. - The ability to define the factory for the `AmazonS3` client, and to retrieve it from the S3AFileSystem. There is a new factory API and a special interface S3AInternals to access a limited set of internal classes and operations. - Delegation token and auditing extensions. - Classes trying to integrate with the AWS SDK. All standard V1 credential providers listed in the option fs.s3a.aws.credentials.provider will be automatically remapped to their V2 equivalent. Other V1 Credential Providers are supported, but only if the V1 SDK is added back to the classpath. The SDK Signing plugin has changed; all v1 signers are incompatible. There is no support for the S3 "v2" signing algorithm. Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2 equivalent, "bundle.jar", which is now exported by the hadoop-aws module. Consult the document aws_sdk_upgrade for the full details. Contributed by Ahmar Suhail + some bits by Steve Loughran	2023-09-11 14:30:25 +01:00
Szilard Nemeth	9342ecf6cc	HADOOP-18870. CURATOR-599 change broke functionality introduced in HADOOP-18139 and HADOOP-18709. Contributed by Ferenc Erdelyi	2023-09-06 21:32:36 -04:00
huhaiyang	2831c7ce26	HADOOP-18880. Add some rpc related metrics to Metrics.md (#6015 ) Contributed by Yanghai Hu. Reviewed-by: Inigo Goiri <inigoiri@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2023-09-05 17:34:05 +08:00
Steve Loughran	28c533a582	Revert "HADOOP-18860. Upgrade mockito version to 4.11.0 (#5977 )" This reverts commit `1046f9cf98`.	2023-08-31 14:54:53 +01:00
Anmol Asrani	1046f9cf98	HADOOP-18860. Upgrade mockito version to 4.11.0 (#5977 ) As well as the POM update, this patch moves to the (renamed) verify methods. Backporting mockito test changes may now require cherrypicking this patch, otherwise use the old method names. Contributed by Anmol Asrani	2023-08-29 12:12:27 +01:00
Chunyi Yang	42b4525f75	HDFS-17156. Client may receive old state ID which will lead to inconsistent reads. (#5951 ) Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>	2023-08-18 01:56:34 +09:00
hchaverri	ad2f45c64f	HDFS-17148. RBF: SQLDelegationTokenSecretManager must cleanup expired tokens in SQL (#5936 )	2023-08-11 13:04:32 -07:00
Liangjun He	b6edcb9a84	HADOOP-18840. Add enQueue time to RpcMetrics (#5926 ). Contributed by Liangjun He. Reviewed-by: Shilun Fan <slfan1989@apache.org> Reviewed-by: Xing Lin <linxingnku@gmail.com> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2023-08-10 10:38:48 +08:00
hchaverri	bc48e5cbe8	HDFS-17128. Updating SQLDelegationTokenSecretManager to use LoadingCache so tokens are updated frequently. (#5897 ) Contributed by Hector Sandoval Chaverri. Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com> Reviewed-by: Inigo Goiri <inigoiri@apache.org> Reviewed-by: Shilun Fan <slfan1989@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2023-08-08 07:45:14 +08:00
WangYuanben	1e3e246934	HADOOP-18810. Document missing a lot of properties in core-default.xml. (#5912 ) Contributed by WangYuanben. Reviewed-by: Shilun Fan <slfan1989@apache.org> Signed-off-by: Shilun Fan <slfan1989@apache.org>	2023-08-08 07:37:26 +08:00
WangYuanben	440698eb07	HADOOP-18836. Some properties are missing from hadoop-policy.xml (#5922 )	2023-08-07 20:03:23 +08:00
zhangshuyan	c35f31640e	HADOOP-18807. Close child file systems in ViewFileSystem when cache is disabled. (#5847 ) Contributed by Shuyan Zhang	2023-07-20 11:39:13 +01:00
Steve Loughran	b3130056f5	HADOOP-18808. LogExactlyOnce to add a debug() method (#5850 ) Contributed by Steve Loughran	2023-07-18 14:23:19 +01:00
Viraj Jasani	38ac2f7349	HADOOP-18809. S3A prefetch read/write file operations should guard channel close (#5853 ) Contributed by Viraj Jasani	2023-07-18 14:16:12 +01:00
hfutatzhanghb	b95595158f	HADOOP-18801. Delete path directly when it can not be parsed in trash. (#5744 ). Contributed by farmmamba. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>	2023-07-16 12:20:46 +08:00
Viraj Jasani	e7d74f3d59	HADOOP-18291. S3A prefetch - Implement thread-safe LRU cache for SingleFilePerBlockCache (#5754 ) Contributed by Viraj Jasani	2023-07-14 10:21:01 +01:00
Mehakmeet Singh	fac7d26c5d	HADOOP-18781. ABFS backReference passed down to streams to avoid GC closing the FS. (#5780 ) To avoid the ABFS instance getting closed due to GC while the streams are working, attach the ABFS instance to a backReference opaque object and pass it down to the streams so that we have a hard reference while the streams are working. Contributed by: Mehakmeet Singh	2023-07-11 17:57:05 +05:30
WangYuanben	6843f8e4e0	HADOOP-18794. ipc.server.handler.queue.size missing from core-default.xml (#5819 ). Contributed by WangYuanben. Reviewed-by: Hualong Zhang <hualong.z@hotmail.com> Reviewed-by: Shilun Fan <slfan1989@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-07-11 16:39:50 +05:30
slfan1989	e8590adb7b	HADOOP-18779. Improve hadoop-function.sh#status script. (#5762 )	2023-07-03 08:46:57 -07:00
slfan1989	8a52990150	YARN-11519. [Federation] Add RouterAuditLog to log4j.properties. (#5785 )	2023-06-27 10:52:59 -07:00
Mehakmeet Singh	5db7107b77	HADOOP-18764. fs.azure.buffer.dir to be under Yarn container path on yarn applications (#5737 ) Changing fs.azure.buffer.dir for azure so things clean up better in long-lived yarn clusters. Contributed by: Mehakmeet Singh	2023-06-27 20:22:00 +05:30
Wei-Chiu Chuang	e239d40ab1	Post release update * Add jdiff xml files from 3.3.6 release. * Declare 3.3.6 as the latest stable release. * Copy release notes. (cherry picked from commit `7db9895000`) (cherry picked from commit cc121e2124aa01458dc296a060edc5e21a295268)	2023-06-26 16:08:24 +00:00
Xing Lin	427366b73b	HDFS-17042 Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode (#5730 )	2023-06-15 13:59:58 -07:00
Viraj Jasani	a75e378868	HADOOP-18756. S3A prefetch - CachingBlockManager to use AtomicBoolean for closed flag (#5718 ) Contributed by Viraj Jasani	2023-06-14 12:51:54 +01:00
Steve Loughran	7a45ef4164	MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519 ) This modifies the manifest committer so that the list of files to rename is passed between stages as a file of writeable entries on the local filesystem. The map of directories to create is still passed in memory; this map is built across all tasks, so even if many tasks created files, if they all write into the same set of directories the memory needed is O(directories) with the task count not a factor. The _SUCCESS file reports on heap size through gauges. This should give a warning if there are problems. Contributed by Steve Loughran	2023-06-09 17:00:59 +01:00
Viraj Jasani	1dbaba8e70	HADOOP-18740. S3A prefetch cache blocks should be accessed by RW locks (#5675 ) Contributed by Viraj Jasani	2023-06-07 14:05:52 +01:00
Ayush Saxena	1d0c9ab433	Revert "HADOOP-18207. Introduce hadoop-logging module (#5503 )" This reverts commit `03a499821c`.	2023-06-05 09:34:40 +05:30
Szilard Nemeth	e0a339223a	HADOOP-18709. Add curator based ZooKeeper communication support over SSL/TLS into the common library. Contributed by Ferenc Erdelyi	2023-06-04 14:40:41 -04:00
Viraj Jasani	03a499821c	HADOOP-18207. Introduce hadoop-logging module (#5503 ) Reviewed-by: Duo Zhang <zhangduo@apache.org>	2023-06-02 18:07:34 -07:00
Steve Loughran	160b9fc3c9	HADOOP-18755. openFile builder new optLong() methods break hbase-filesystem (#5704 ) This is a followup to HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem Contributed by Steve Loughran	2023-06-01 14:31:08 +01:00
Patrick GRANDJEAN	4627242c44	HADOOP-18652. Path.suffix raises NullPointerException (#5653 ). Contributed by Patrick Grandjean. Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-05-19 05:16:55 +05:30
LiuGuH	f6770dee47	HDFS-16979. RBF: Add proxyuser port in hdfsauditlog (#5552 ). Contributed by liuguanghua. Reviewed-by: Inigo Goiri <inigoiri@apache.org> Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-05-19 05:02:16 +05:30
Steve Loughran	a90c722143	HADOOP-18724. [FOLLOW-UP] cherrypick changes from branch-3.3 backport (#5662 ) * move FileContext.copy() onto optLong() * move FileUtil onto optLong() This brings trunk into sync with the branch-3.3 changes	2023-05-16 18:16:24 +01:00
Viraj Jasani	bef40e9427	HADOOP-18688. S3A audit header to include count of items in delete ops (#5621 ) The auditor-generated http referrer URL now includes the count of keys to delete in the "ks" query parameter Contributed by Viraj Jasani	2023-05-16 10:40:16 +01:00
Steve Loughran	ad1e3a0f5b	HADOOP-18724. (followup) remove deprecation on optLong/optDouble methods (#5650 ) Somehow @Deprecated crept in to the declaration of the new FSBuilder optLong/optDouble methods.	2023-05-12 15:22:37 +01:00
WangYuanben	905bfa84a8	HDFS-16965. Add switch to decide whether to enable native codec. (#5520 ). Contributed by WangYuanben. Reviewed-by: Tao Li <tomscut@apache.org> Reviewed-by: Shilun Fan <slfan1989@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-05-12 04:12:02 +05:30
Steve Loughran	e76c09ac3b	HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem (#5611 ) This: 1. Adds optLong, optDouble, mustLong and mustDouble methods to the FSBuilder interface to let callers explicitly pass in long and double arguments. 2. The opt() and must() builder calls which take float/double values now only set long values instead, so as to avoid problems related to overloaded methods resulting in a ".0" being appended to a long value. 3. All of the relevant opt/must calls in the hadoop codebase move to the new methods 4. And the s3a code is resilient to parse errors in its numeric options: it will downgrade to the default. This is nominally incompatible, but the floating-point builder methods were never used: nothing currently expects floating point numbers. For anyone who wants to safely set numeric builder options across all compatible releases, convert the number to a string and then use the opt(String, String) and must(String, String) methods. Contributed by Steve Loughran	2023-05-11 17:57:25 +01:00
slfan1989	a2dda0ce03	HADOOP-18359. Update commons-cli from 1.2 to 1.5. (#5095 ). Contributed by Shilun Fan. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-05-10 01:42:12 +05:30
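
The region precedence listed in the HADOOP-18908 message (e0563fed50) can be read as a short decision chain. The Java below is a minimal sketch of that ordering only, not the S3A implementation: the class name, the method signature, the endpoint regex and the use of an empty Optional to mean "hand off to the AWS SDK" are all assumptions made for illustration. It also assumes the endpoint is parsed before the empty-string case is considered, which follows the order of the list but is not stated explicitly in the message.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch of the region precedence; every name here is hypothetical. */
final class RegionResolutionSketch {

  /** Fallback region named in the commit message. */
  static final String DEFAULT_REGION = "us-east-2";

  /** Assumed shape of a standard endpoint such as s3.eu-west-1.amazonaws.com. */
  private static final Pattern STANDARD_ENDPOINT =
      Pattern.compile("^(?:https?://)?s3\\.([a-z0-9-]+)\\.amazonaws\\.com/?$");

  private RegionResolutionSketch() {
  }

  /**
   * @param accessPointArn access point ARN, or null if none is in use
   * @param configuredRegion value of fs.s3a.endpoint.region; null means unset
   * @param endpoint value of fs.s3a.endpoint, or null
   * @return the region, or empty when resolution is left to the AWS SDK
   */
  static Optional<String> resolveRegion(String accessPointArn,
      String configuredRegion, String endpoint) {
    // 1. An access point ARN carries its region in the fourth field.
    if (accessPointArn != null) {
      String[] parts = accessPointArn.split(":");
      if (parts.length >= 4 && !parts[3].isEmpty()) {
        return Optional.of(parts[3]);
      }
    }
    // 2. An explicit, non-empty region setting is used as-is.
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return Optional.of(configuredRegion);
    }
    // 3. Try to parse the region out of a standard endpoint (vpce URLs do not match).
    if (endpoint != null) {
      Matcher m = STANDARD_ENDPOINT.matcher(endpoint);
      if (m.matches()) {
        return Optional.of(m.group(1));
      }
    }
    // 4. Unset region and nothing parsed: fall back to the default.
    if (configuredRegion == null) {
      return Optional.of(DEFAULT_REGION);
    }
    // 5. Region explicitly set to "": leave it to the SDK's own resolution chain.
    return Optional.empty();
  }
}
```

A caller would treat an empty result as the signal to build the client without a region override, which is the case the commit message describes as brittle across deployments.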


4864 Commits
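
The ".0" problem described in HADOOP-18724 (e76c09ac3b) can be reproduced without Hadoop on the classpath. The Java below is a minimal sketch using a hypothetical stand-in builder, not the real FSBuilder: the class, its methods and the option key are invented for illustration. It shows how a long argument binds to a double overload and is stored as "1024.0", how a tolerant reader can downgrade to a default on a parse error, and why passing the number as a string avoids the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a builder with overloaded opt() methods. */
final class NumericOptionSketch {

  private final Map<String, String> options = new HashMap<>();

  /** String form: the value is stored exactly as supplied. */
  NumericOptionSketch opt(String key, String value) {
    options.put(key, value);
    return this;
  }

  /** Double overload: a long argument widens to double and gains a ".0" suffix. */
  NumericOptionSketch opt(String key, double value) {
    return opt(key, Double.toString(value));
  }

  /** Raw stored value, or null if the key was never set. */
  String get(String key) {
    return options.get(key);
  }

  /** Tolerant read: fall back to the default when the value is missing or unparseable. */
  long getLong(String key, long defaultValue) {
    String value = options.get(key);
    if (value == null) {
      return defaultValue;
    }
    try {
      return Long.parseLong(value.trim());
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }

  public static void main(String[] args) {
    String key = "example.read.length"; // hypothetical option key
    NumericOptionSketch risky = new NumericOptionSketch().opt(key, 1024L);
    NumericOptionSketch safe = new NumericOptionSketch().opt(key, Long.toString(1024L));
    System.out.println(risky.get(key) + " -> " + risky.getLong(key, -1L));
    System.out.println(safe.get(key) + " -> " + safe.getLong(key, -1L));
  }
}
```

Converting with Long.toString() before calling the String overload is the portable pattern the commit message recommends for code that must run against releases with and without the new optLong/mustLong methods.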