Commit Graph

27471 Commits

Author SHA1 Message Date
Steve Loughran
7999db55da
HADOOP-19330. S3A: Add LeakReporter; use in S3AInputStream (#7151)
If a file opened for reading through the S3A connector
is not closed, then when garbage collection takes place:

* An error message is reported at WARN, including the file name.
* A stack trace of where the stream was created is reported
  at INFO.
* A best-effort attempt is made to release any active HTTPS
  connection.
* The filesystem IOStatistic stream_leaks is incremented.

The intent is to make it easier to identify where streams
are being opened and not closed, as these consume resources,
often including HTTPS connections from a connection pool
of limited size.

It MUST NOT be relied on as a way to clean up open
files/streams automatically; some of the normal actions of
the close() method are omitted.

Instead: view the warning messages and IOStatistics as a
sign of a problem, the stack trace as a way of identifying
what application code/library needs to be investigated.

Contributed by Steve Loughran
2024-11-14 17:02:25 +00:00
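The leak-reporting behaviour described in HADOOP-19330 can be illustrated with a minimal sketch. This is not the Hadoop LeakReporter code: it swaps in java.lang.ref.Cleaner as the garbage-collection hook, and the names TrackedStream, LeakState and STREAM_LEAKS are hypothetical stand-ins for the stream, its reporter and the stream_leaks statistic.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stream wrapper that reports when it is collected unclosed. */
class TrackedStream implements AutoCloseable {
  private static final Cleaner CLEANER = Cleaner.create();

  /** Stand-in for the stream_leaks IOStatistic. */
  static final AtomicLong STREAM_LEAKS = new AtomicLong();

  /** State handed to the cleaner; it must not reference the stream itself. */
  static final class LeakState implements Runnable {
    private final String path;
    // Captured at construction so the report can show where the stream was opened.
    private final Throwable createdAt = new Throwable("stream created here");
    private final AtomicBoolean closed = new AtomicBoolean(false);

    LeakState(String path) {
      this.path = path;
    }

    void markClosed() {
      closed.set(true);
    }

    @Override
    public void run() {
      if (closed.get()) {
        return; // normal close(): nothing to report
      }
      STREAM_LEAKS.incrementAndGet();
      System.err.println("WARN: stream not closed: " + path);
      createdAt.printStackTrace(); // creation stack, to locate the caller
    }
  }

  private final LeakState state;
  private final Cleaner.Cleanable cleanable;

  TrackedStream(String path) {
    this.state = new LeakState(path);
    this.cleanable = CLEANER.register(this, state);
  }

  @Override
  public void close() {
    state.markClosed();
    cleanable.clean(); // runs the action at most once; a no-op report here
  }
}
```

As in the commit message, a report from such a hook is only a diagnostic: garbage collection timing is not predictable, so application code still has to call close() itself.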
Syed Shameerur Rahman
2273278d0b
HADOOP-18708: S3A: Support S3 Client Side Encryption(CSE) (#6884)
Add support for S3 client side encryption (CSE).

CSE can be configured in two modes:
- CSE-KMS where keys are provided by AWS KMS
- CSE-CUSTOM where custom keys are provided by implementing
  a custom keyring.

CSE requires an encryption library:

  amazon-s3-encryption-client-java.jar

This is _not_ included in the shaded bundle.jar
and is released separately.

The version used is currently 3.1.1

Contributed by Syed Shameerur Rahman.
2024-11-14 13:39:56 +00:00
Dominik Diedrich
9a743bd17f
HADOOP-19315. Upgrade Apache Avro to 1.11.4 (#7128)
* All field access is now via setter/getter methods
* To use Avro to marshal Serializable objects,
  the packages they are in must be declared in the system property
  "org.apache.avro.SERIALIZABLE_PACKAGES"
  
This is required to address
- CVE-2024-47561
- CVE-2023-39410  

This change is not backwards compatible.

Contributed by Dominik Diedrich
2024-11-11 15:46:36 +00:00
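A minimal sketch of declaring packages through the system property named in HADOOP-19315. Only the property name comes from the commit message. The comma-separated value format, the need to set it before any Avro class is initialized, and the package com.example.model are assumptions made for illustration; check them against the Avro release notes before relying on this.

```java
/** Hypothetical helper; only the property name is taken from the commit message. */
class AvroPackagesSketch {
  static final String PROP = "org.apache.avro.SERIALIZABLE_PACKAGES";

  /**
   * Joins the package names with commas (assumed format) and stores them
   * in the system property. Call this early in application startup.
   */
  static String declare(String... packages) {
    String value = String.join(",", packages);
    System.setProperty(PROP, value);
    return value;
  }

  public static void main(String[] args) {
    // com.example.model is a placeholder for an application package.
    System.out.println(declare("java.lang", "com.example.model"));
  }
}
```

The same value could instead be passed on the JVM command line as a -D option, which avoids any ordering concerns inside the application.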
Joseph Dell'Aringa
9657276492
HDFS-17642. Add target node list, exclude source node list, and exclude target node list parameters to balancer (#7127)
Co-authored-by: Joseph DellAringa <jdellari@linkedin.com>
2024-11-07 10:15:51 -08:00
Anuj Modi
487727a5d1
HADOOP-18960: [ABFS] Making Contract tests run in sequential and Other Test Fixes (#7104)
Contributed by: Anuj Modi
2024-11-05 16:12:03 -06:00
muskan1012
f7651e2f63
HADOOP-19243. Upgrade Mockito version to 4.11.0 (#6968)
Mockito is now at a JDK-17 compatible version.

Contributed by Muskan Mishra
2024-11-05 17:35:53 +00:00
Sebastian Klemke
51ebc3c20e
HADOOP-18583. Fix loading of OpenSSL 3.x symbols (#5256)
Contributed by Sebastian Klemke
2024-11-05 17:14:36 +00:00
yanmin
9ae01bdbe8
HADOOP-19143. Upgrade commons-cli to 1.9.0 (#7126) Contributed by Min Yan.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-11-05 10:39:49 +08:00
yanmin
df979e70de
HADOOP-19297. [JDK17] Upgrade maven.plugin-tools.version to 3.10.2 (#7125) Contributed by Min Yan.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-11-05 07:46:35 +08:00
Christos Bisias
66baf1eb51
HADOOP-18682. Move hadoop docker scripts under the main source code (#6483). Contributed by Christos Bisias.
2024-11-04 22:22:37 +05:30
Lei313
e4789a2fd3
HDFS-17607. Reduce the number of times conf is loaded when DataNode startUp (#7012). Contributed by lei w.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-11-04 20:10:08 +08:00
hfutatzhanghb
4f3abd2f48
HDFS-17654. Fix bugs in TestRouterMountTable (#7137). Contributed by farmmamba.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-11-02 17:27:06 +08:00
Zhaobo Huang
00cddf5bea
HDFS-17646. Add Option to limit Balancer overUtilized nodes num in each iteration. (#7120). Contributed by Zhaobo Huang.
Reviewed-by: Haiyang Hu <huhaiyang926@126.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-11-02 17:06:18 +08:00
slfan1989
7a7b346b0a
Revert "HADOOP-19298. [JDK17] Add a JDK17 profile. (#7085) Contributed by Shilun Fan." (#7132)
This reverts commit f931ede86b.
2024-10-28 09:39:16 +08:00
Lei Yang
eb1e30395b
HDFS-17644:Add log when a node selection is rejected by BPP UpgradeDomain (#7109)
2024-10-26 17:29:43 +08:00
Syed Shameerur Rahman
0b3755347c
HADOOP-19309: S3A: CopyFromLocalFile operation fails when the source file does not contain file scheme (#7113)
Contributed by Syed Shameerur Rahman
2024-10-25 11:11:52 +01:00
Gautham B A
d1ce965645
HDFS-17636. Don't add declspec for Windows (#7096)
* Windows doesn't want the
  macro _JNI_IMPORT_OR_EXPORT_
  to be defined in the function
  definition. It fails to compile with
  the following error -
  "definition of dllimport function
  not allowed".
* However, Linux needs it. Hence,
  we're going to add this macro
  based on the OS.
* Also, we'll be compiling the `hdfs`
  target as an object library so that
  we can avoid linking to `jvm`
  library for `get_jni_test` target.
2024-10-22 23:15:23 +05:30
Felix Nguyen
09b348753f
HDFS-17634. RBF: Fix web UI missing DN last block report (#7080)
2024-10-22 19:54:16 +08:00
slfan1989
f931ede86b
HADOOP-19298. [JDK17] Add a JDK17 profile. (#7085) Contributed by Shilun Fan.
Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: Attila Doroszlai <adoroszlai@apache.org>
Reviewed-by: Cheng Pan <chengpan@apache.org>
Reviewed-by: Min Yan <yaommen@gmail.com>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-18 17:16:33 +08:00
LiuGuH
6589d9f6aa
HDFS-17631. Fix RedundantEditLogInputStream.nextOp() state error when EditLogInputStream.skipUntil() throw IOException (#7066). Contributed by liuguanghua.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-10-16 21:16:18 +08:00
Tao Yang
c63aafd7d1
YARN-11732. Fix potential NPE when calling SchedulerNode#reservedContainer for CapacityScheduler (#7065). Contributed by Tao Yang.
Reviewed-by: Syed Shameerur Rahman <syedthameem1@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-10-16 21:11:31 +08:00
Davin Tjong
78a08b3b78
MAPREDUCE-7494. File stream leak when LineRecordReader is interrupted (#7117)
Contributed by Davin Tjong
2024-10-16 11:41:18 +01:00
Cheng Pan
9321e322d2
HADOOP-19310. Add JPMS options required by Java 17+ (#7114) Contributed by Cheng Pan.
Reviewed-by: Attila Doroszlai <adoroszlai@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-16 14:15:01 +08:00
Mukund Thakur
e4b070025b
HADOOP-19291. RawLocalFileSystem to allow overlapping ranges (#7101)
ChecksumFileSystem creates the chunked ranges based on the checksum chunk size and then calls
readVectored on Raw Local which may lead to overlapping ranges in some cases.

Contributed by: Mukund Thakur
2024-10-09 08:34:47 -05:00
Steve Loughran
dc56fc385a
HADOOP-19295. S3A: large uploads can timeout over slow links (#7089)
This gives data upload PUT/POST calls a different timeout from all
other requests, so that slow block uploads do not trigger timeouts
as rapidly as normal requests. This was always the behavior
in the V1 AWS SDK; for V2 we have to explicitly set it on the operations
we want to give extended timeouts. 

Option:  fs.s3a.connection.part.upload.timeout
Default: 15m

Contributed by Steve Loughran
2024-10-07 17:57:13 +01:00
Steve Loughran
50e6b49e05
HADOOP-19299. HttpReferrerAuditHeader resilience (#7095)
* HttpReferrerAuditHeader is thread safe, copying the lists/maps passed
  in and using synchronized methods when necessary.
* All exceptions raised when building referrer header are caught
  and swallowed.
* The first such error is logged at WARN.
* All errors, plus stack traces, are logged at DEBUG.

Contributed by Steve Loughran
2024-10-07 13:53:01 +01:00
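The resilience pattern listed for HADOOP-19299 (defensive copies, synchronized access, swallowed exceptions, first failure at WARN and every failure at DEBUG) might look roughly like the sketch below. It is not the HttpReferrerAuditHeader code: the class name, the audit.example.org URL, the use of java.util.logging, and returning an empty string on failure are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical header builder that never propagates a failure to its caller. */
class ResilientHeaderBuilder {
  private static final Logger LOG = Logger.getLogger("ResilientHeaderBuilder");
  // Ensures only the first failure is logged at warning level.
  private static final AtomicBoolean WARNED = new AtomicBoolean(false);

  private final Map<String, String> attributes;

  ResilientHeaderBuilder(Map<String, String> attributes) {
    // Defensive copy: later changes by the caller cannot affect this instance.
    this.attributes = new LinkedHashMap<>(attributes);
  }

  /** Builds the header, or returns "" (an assumed fallback) if anything fails. */
  synchronized String build() {
    try {
      StringBuilder sb = new StringBuilder("https://audit.example.org/?");
      for (Map.Entry<String, String> e : attributes.entrySet()) {
        // A null key or value raises NullPointerException here.
        sb.append(e.getKey().trim()).append('=')
            .append(e.getValue().trim()).append('&');
      }
      return sb.toString();
    } catch (RuntimeException ex) {
      if (WARNED.compareAndSet(false, true)) {
        LOG.warning("Failed to build referrer header: " + ex);
      }
      // Every failure, with its stack trace, goes to the debug-level log.
      LOG.log(Level.FINE, "Failed to build referrer header", ex);
      return "";
    }
  }
}
```

Swallowing the exception trades completeness of the audit header for availability: a malformed attribute degrades auditing but does not fail the request that carries it.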
zhtttylz
1f0d9df887
HDFS-17637. Fix spotbugs in HttpFSFileSystem#getXAttr (#7099) Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-06 09:16:00 +08:00
Syed Shameerur Rahman
5ea3a1bd0a
HADOOP-19286: S3A: Support cross region access when S3 region/endpoint is set (ADDENDUM) (#7098)
Contributed by Syed Shameerur Rahman
2024-10-04 14:58:53 +01:00
slfan1989
4e6432a0ab
HADOOP-19296. [JDK17] Upgrade maven-war-plugin to 3.4.0. (#7086) Contributed by Shilun Fan.
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Reviewed-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-03 22:32:11 +08:00
Susheel Gupta
1b5a2a7f65
YARN-11708: Setting maximum-application-lifetime using AQCv2 templates doesn't apply on the first submitted app (#7041)
2024-10-03 15:55:28 +02:00
zhtttylz
b781882020
YARN-11734. Fix spotbugs in ServiceScheduler#load (#7088) Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-03 06:15:17 +08:00
Cheng Pan
3f637efaa2
HADOOP-19219. Add JPMS options required by hadoop-common (#7084) Contributed by Cheng Pan.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-10-02 10:45:25 +08:00
Syed Shameerur Rahman
e9ed21c065
HADOOP-19286: S3A: Support cross region access when S3 region/endpoint is set (#7067)
Adds a new option
   s3a.cross.region.access.enabled
which is true by default.

This makes cross-region access a separate configuration option, which can be enabled or disabled irrespective of whether a region/endpoint is set.

Contributed by Syed Shameerur Rahman
2024-10-01 20:11:11 +01:00
cxzl25
4ff0dceebd
HADOOP-19288. hadoop-client-runtime to exclude dnsjava InetAddressResolverProvider (#7070)
Contributed by dzcxzl.
2024-10-01 14:48:48 +01:00
Steve Loughran
45b1c86fe5
HADOOP-19294. NPE on maven enforcer with -Pnative on arm mac (#7082)
Update maven-enforcer-plugin.version to 3.5.0

Contributed by Steve Loughran
2024-10-01 14:34:05 +01:00
Sammi Chen
6fd4fea748
HADOOP-19261. Support force close a DomainSocket for server service (#7057)
2024-09-30 10:06:07 -07:00
Manish Bhatt
9aca73481e
HADOOP-19280. [ABFS] Initialize client timer only if metric collection is enabled (#7061)
Contributed by Manish Bhatt
2024-09-30 16:56:18 +01:00
litao
a9b7913d56
HDFS-17626. Reduce lock contention at datanode startup (#7053). Contributed by Tao Li.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-09-29 10:58:13 +08:00
Ayush Saxena
3fda243419
HADOOP-19290. Operating on / in ChecksumFileSystem throws NPE. (#7074). Contributed by Ayush Saxena.
2024-09-28 19:35:32 +05:30
Sarveksha Yeshavantha Raju
01401d71ef
HADOOP-19281. MetricsSystemImpl should not print INFO message in CLI (#7071)
Replaced all LOG.info with LOG.debug

Contributed by Sarveksha Yeshavantha Raju
2024-09-27 14:20:11 +01:00
fuchaohong
3d81dde28b
HDFS-17624. Fix DFSNetworkTopology#chooseRandomWithStorageType() availableCount when excluded node is not in selected scope. (#7042). Contributed by fuchaohong.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2024-09-27 16:56:09 +08:00
Anuj Modi
21cdb450ef
HADOOP-19284: [ABFS] Allow "fs.azure.account.hns.enabled" to be set as Account Specific Config (#7062)
2024-09-27 10:16:28 +05:30
Sadanand Shenoy
49a495803a
HDFS-17381. Distcp of EC files should not be limited to DFS. (#6551)
Contributed by Sadanand Shenoy
2024-09-25 17:54:09 +01:00
Syed Shameerur Rahman
21ec686be3
YARN-11702: Fix Yarn over allocating containers (#6990) Contributed by Syed Shameerur Rahman.
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2024-09-25 12:10:15 +08:00
Nihal Jain
e602c601dd
HADOOP-15760. Upgrade commons-collections to commons-collections4 (#7006)
This moves Hadoop to Apache commons-collections4.

Apache commons-collections has been removed and is completely banned from the source code.

Contributed by Nihal Jain
2024-09-24 16:50:22 +01:00
Ayush Saxena
f90a703e48
HADOOP-19165. Drop protobuf 2.5.0 from the distribution (#7051). Contributed by Ayush Saxena.
2024-09-24 20:58:41 +05:30
Peter Szucs
b078f86d69
YARN-11733. Fix the order of updating CPU controls with cgroup v1 (#7069)
2024-09-24 17:13:28 +02:00
Attila Magyar
68315744f0
HDFS-17040. Namenode web UI should set content type to application/octet-stream when uploading a file. (#5721)
2024-09-23 12:21:38 -07:00
Steve Loughran
37a74f0692
HADOOP-19285. [ABFS] Restore ETAGS_AVAILABLE to abfs path capabilities (#7064)
Caused by HADOOP-19131  

Contributed by: Steve Loughran
2024-09-23 12:52:05 -05:00
Felix Nguyen
fccc268cde
HADOOP-19283. Move all DistCp execution logic to execute() (#7060)
Co-authored-by: Felix Nguyen <kokonguyen191@gmail.com>
2024-09-23 15:39:56 +08:00