25131 Commits

Author SHA1 Message Date
PJ Fanning
d88a6ee962
HADOOP-18512: upgrade woodstox-core to 5.4.0 for security fix (#5087). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-02 00:14:01 +05:30
sabertiger
1fdc6c5322
HADOOP-18233. Possible race condition with TemporaryAWSCredentialsProvider (#5024)
This fixes a race condition with the TemporaryAWSCredentialProvider
one which has existed for a long time but which only surfaced
(usually in Spark) when the bucket existence probe was disabled
by setting fs.s3a.bucket.probe to 0 -a performance speedup
which was made the default in HADOOP-17454.

Contributed by Jimmy Wong.
2022-10-31 17:50:49 +00:00
PJ Fanning
41e3c7edaf
HADOOP-18472. Upgrade to snakeyaml 1.33 (#4958)
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit d6a65a41809410cd7ee7a57f603a1b249ccfc161)

 Conflicts:
	LICENSE-binary
	hadoop-project/pom.xml
2022-10-30 02:32:44 +09:00
Chris Nauroth
33293d4ba4 YARN-11360: Add number of decommissioning/shutdown nodes to YARN cluster metrics. (#5060)
(cherry picked from commit bfb84cd7f66fb6fe98809cc5d6c59864995855b1)
2022-10-28 18:13:57 +00:00
Mehakmeet Singh
6accb7809f
HADOOP-18499. S3A to support HTTPS web proxies (#5083)
The option "fs.s3a.proxy.ssl.enabled" controls
whether the s3a connects to a proxy over HTTP (default) or HTTPS.
Set to "true" to use HTTPS.

Contributed by Mehakmeet Singh
2022-10-27 20:17:57 +05:30
M1eyu2018
cbac2c4875 HDFS-16716. Improve appendToFile command: support appending on file with new block (#4697)
Reviewed-by: xuzq <15040255127@163.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-10-27 19:11:51 +08:00
Takanobu Asanuma
53143409a8 HDFS-16822. HostRestrictingAuthorizationFilter should pass through requests if they don't access WebHDFS API. (#5079)
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Tao Li <tomscut@apache.org>
(cherry picked from commit 545a556883ab8f126f72efeeeed29265974feaf8)
2022-10-27 14:40:07 +09:00
PJ Fanning
bd276092b0 MAPREDUCE-7411: use secure XML parsers in mapreduce modules (#4980)
Lockdown of parsers in hadoop-mapreduce.

Follow-on to HADOOP-18469. Add secure XML parser factories to XMLUtils

Contributed by P J Fanning
2022-10-26 11:04:29 +01:00
Viraj Jasani
36a0e818ec HDFS-16016. BPServiceActor to provide new thread to handle IBR (#2998)
Contributed by Viraj Jasani

(cherry picked from commit c1bf3cb0daf0b6212aebb449c97b772af2133d98)
2022-10-24 15:16:38 +09:00
Takanobu Asanuma
198bc444de
HDFS-16566 Erasure Coding: Recovery may causes excess replicas when busy DN exsits (#4252) (#5059)
(cherry picked from commit 9376b659896e1e42bacc6fdeaac9ac3d8eb41c49)

Co-authored-by: RuinanGu <57645247+RuinanGu@users.noreply.github.com>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-10-22 13:14:04 +09:00
Steve Loughran
19f8e4f34d
YARN-11330. use secure XML parsers (#4981)
Move construction of XML parsers in YARN
modules to using the locked-down parser factory
of HADOOP-18469.

One exception: GpuDeviceInformationParser still supports DTD resolution;
all other features are disabled.

Contributed by P J Fanning
2022-10-21 14:16:22 +01:00
SevenAddSix
237814a9b3 HDFS-16480. Fix typo: indicies -> indices (#4020)
(cherry picked from commit 5eab9719cbf6b9bddbdb4454a5f8e1ae12495492)
2022-10-21 17:32:58 +09:00
Hui Fei
0c2234fd8e HDFS-15803. EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo. Contributed by huhaiyang
(cherry picked from commit 66ecee333e0aeeaba3467a425cedcae3ab790739)
2022-10-21 17:31:30 +09:00
FuzzingTeam
1f414ab847
HADOOP-18471. Fixed ArrayIndexOutOfBoundsException in DefaultStringifier (#4957)
Contributed by FuzzingTeam
2022-10-20 18:14:24 +01:00
Steve Loughran
75b04010a2
HDFS-16795. Use secure XML parsers (#4979)
Move construction of XML parsers in HDFS
modules to using the locked-down parser factory
of HADOOP-18469.

Contributed by P J Fanning
2022-10-20 17:48:58 +01:00
Steve Loughran
5b7cbe2075
HADOOP-18156. Address JavaDoc warnings in classes like MarkerTool, S3ObjectAttributes, etc (#4965)
Contributed by Ankit Saurabh
2022-10-20 17:46:46 +01:00
PJ Fanning
ea851c5e4a
HADOOP-15983. Use jersey-json that is built to use jackson2 ((#3988)
Moves from com.sun.jersey 1.19 to the artifact
com.github.pjfanning:jersey-json:1.20

This allows jackson 1 to be removed from the classpath.

Contains

* HADOOP-16908. Prune Jackson 1 from the codebase and restrict
   its usage for future
* HADOOP-18219. Fix shaded client test failure

These are needed for the HADOOP-15983 changes to build.

Contributed by PJ Fanning.
2022-10-20 17:37:56 +01:00
Daniel Carl Jones
2778bc8d90
HADOOP-18465. Fix S3A SSE test skip when encryption is disabled (#4925)
Contributed by Daniel Carl Jones
2022-10-19 16:02:36 +01:00
Daniel Carl Jones
c30b2f0b8c
HADOOP-18304. Improve user-facing S3A committers documentation (#4478)
Contributed by: Daniel Carl Jones
2022-10-19 13:08:27 +01:00
Steve Loughran
7a18ceb269
HADOOP-18476. Abfs and S3A FileContext bindings to close wrapped filesystems in finalizer (#4966)
This is to try and close the underlying filesystems when the FileContext APIs are used.
Without this, threads may be leaked

Contributed by Steve Loughran
2022-10-18 15:28:55 +01:00
Hexiaoqiao
84c7fd909b
HADOOP-18497. Upgrade commons-text version to 1.10.0 to fix CVE-2022-42889. (#5037).
Contributed by PJ Fanning.
2022-10-18 15:05:08 +01:00
slfan1989
2e3f91bdf5
HADOOP-18360. Update commons-csv from 1.0 to 1.9.0. (#4928). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-10-17 10:23:13 +05:30
PJ Fanning
96d4b9e6a7
HADOOP-18493: upgrade jackson-databind to 2.12.7.1 (#5011). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-10-17 10:04:21 +05:30
Steve Loughran
cd856b7195
HADOOP-17563. Upgrade BouncyCastle to 1.68 (#3980) (#5015)
Addresses CVE-2020-15522 and CVE-2020-26939.

This can break builds with older maven shade plugins or
other code using asm.jar which is not aware of recent java bytecodes
and/or multi-release JARs. fix: use a later version of asm.jar

Contributed by PJ Fanning
2022-10-15 15:09:05 +01:00
ahmarsuhail
08760fc4c1
HADOOP-18481. AWS v2 SDK upgrade log to not about standard AWS Credential Providers. (#4973)
The AWS SDKV2 upgrade log no longer warns about instantiation
of the v1 SDK credential providers which are commonly used in
s3a configurations:

* com.amazonaws.auth.EnvironmentVariableCredentialsProvider
* com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper
* com.amazonaws.auth.InstanceProfileCredentialsProvider

When the hadoop-aws module moves to the v2 SDK, references to these
credential providers will be rewritten to their v2 equivalents.

Follow-on to HADOOP-18382. "Upgrade AWS SDK to V2 - Prerequisites"

Contributed by Ahmar Suhail
2022-10-14 11:46:14 +01:00
ahmarsuhail
47c1c8eddc
HADOOP-18382. AWS SDK v2 upgrade prerequisites (#4698)
This patch prepares the hadoop-aws module for a future
migration to using the v2 AWS SDK (HADOOP-18073)

That upgrade will be incompatible; this patch prepares
for it:
-marks some credential providers and other
 classes and methods as @deprecated.
-updates site documentation
-reduces the visibility of the s3 client;
 other than for testing, it is kept private to
 the S3AFileSystem class.
-logs some warnings when deprecated APIs are used.

The warning messages are printed only once
per JVM's life. To disable them, set the
log level of org.apache.hadoop.fs.s3a.SDKV2Upgrade
to ERROR

Contributed by Ahmar Suhail
2022-10-14 11:45:43 +01:00
monthonk
52eca61a3e
HADOOP-18292. Fix s3 select tests when running against unsupported storage class (#4489)
Follow-on from HADOOP-12020.

Contributed by Monthon Klongklaew
2022-10-13 13:37:35 +01:00
belugabehr
6253bf72b6
HADOOP-17779: Lock File System Creator Semaphore Uninterruptibly (#3158)
Contributed by David Mollitor.
2022-10-11 13:07:42 +01:00
Xing Lin
760144f135
HDFS-16628. RBF: Correct target directory when move to trash for kerberos login user. (#4974)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-11 16:14:39 +09:00
Ashutosh Gupta
6847ec0647
HADOOP-11245. Update NFS gateway to use Netty4 (#2832) (#4997)
Reviewed-by: Tsz-Wo Nicholas Sze <szetszwo@apache.org>

Co-authored-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-10-11 05:27:43 +08:00
Mukund Thakur
77cb778a44
HADOOP-18460. checkIfVectoredIOStopped before populating the buffers (#4986)
Contributed by Mukund Thakur
2022-10-10 11:18:22 +01:00
Steve Loughran
80525615e5
HADOOP-18480. Upgrade aws sdk to 1.12.316 (#4972)
Contributed by Steve Loughran
2022-10-10 10:29:41 +01:00
Steve Loughran
e360e7620c
HADOOP-18468: Upgrade jettison to 1.5.1 to fix CVE-2022-40149 (#4937)
Contributed by PJ Fanning
2022-10-10 10:05:39 +01:00
Xing Lin
7d7f7a9e9b
HDFS-16024. RBF: Rename data to the Trash should be based on src location (#4962)
(cherry picked from commit e18d8062126951b01d04d14374cc04053167735b)

Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-10 00:33:48 +09:00
Steve Loughran
61e1603750
HADOOP-18401. No ARM binaries in branch-3.3.x releases. (#4953)
Fix the branch-3.3 docker image and create-release scripts to work on arm 64 and macbook m1

Contributed by Ayush Saxena and Steve Loughran
2022-10-07 15:58:51 +01:00
Steve Loughran
c70b8709cc
HADOOP-18442. Remove openstack support (#4855)
The swift:// connector for openstack support has been removed.
The hadoop-openstack jar remains, only now it is empty of code. 
This is to ensure that projects which declare the JAR a dependency
will still have successful builds.

Contributed by Steve Loughran
2022-10-07 12:03:08 +01:00
Steve Loughran
80781306dd
HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940)
Add to XMLUtils a set of methods to create secure XML Parsers/transformers,
locking down DTD, schema, XXE exposure.

Use these wherever XML parsers are created.

Contributed by PJ Fanning
2022-10-07 10:47:55 +01:00
Ashutosh Gupta
725cd90712 MAPREDUCE-7370. Parallelize MultipleOutputs#close call (#4248). Contributed by Ashutosh Gupta.
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit 062c50db6bebfabae38aec9e17be2483a11c3f7f)
2022-10-06 23:14:38 +00:00
Ashutosh Gupta
1c3bf42ad0
YARN-11303. Upgrade jquery ui to 1.13.2 to mitigate CVE-2022-31160 (#4895)
Contributed by Ashutosh Gupta
2022-10-05 12:09:11 +01:00
Mukund Thakur
0d772b353f HADOOP-18463. Add an integration test to process data asynchronously during vectored read. (#4921)
part of HADOOP-18103.

Contributed by: Mukund Thakur
2022-09-28 15:38:41 -05:00
Mukund Thakur
bbe841e601 HADOOP-18347. S3A Vectored IO to use bounded thread pool. (#4918)
part of HADOOP-18103.

Also introducing a config fs.s3a.vectored.active.ranged.reads
to configure the maximum number of number of range reads a
single input stream can have active (downloading, or queued)
to the central FileSystem instance's pool of queued operations.
This stops a single stream overloading the shared thread pool.

Contributed by: Mukund Thakur
 Conflicts:
	hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
	hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
2022-09-28 15:34:31 -05:00
Mehakmeet Singh
e5a566c91f
HADOOP-18416. fix ITestS3AIOStatisticsContext test failure (#4931)
Follow on to HADOOP-17461.

Contributed by: Mehakmeet Singh
2022-09-28 14:17:56 +05:30
Ashutosh Gupta
dea018ef23
HDFS-16766. XML External Entity (XXE) attacks can occur while processing XML received from an untrusted source (#4886)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit d9f435f6acabb28ab8a670a4a9081f0164008b1e)
2022-09-27 15:44:58 +09:00
Ashutosh Gupta
51605f9dcc
HADOOP-18443. Upgrade snakeyaml to 1.32 (#4873)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-25 23:50:46 +09:00
Xing Lin
f1c1ad52c5 HADOOP-18444 Add Support for localized trash for ViewFileSystem in Trash.moveToAppropriateTrash (#4869)
* HADOOP-18444 Add Support for localized trash for ViewFileSystem in Trash.moveToAppropriateTrash

Signed-off-by: Xing Lin <xinglin@linkedin.com>
2022-09-23 11:06:23 -07:00
Steve Loughran
af0a6d7987
HADOOP-18456. NullPointerException in ObjectListingIterator. (#4909)
This problem surfaced in impala integration tests
   IMPALA-11592. TestLocalCatalogRetries.test_fetch_metadata_retry fails in S3 build
after the change
  HADOOP-17461. Add thread-level IOStatistics Context
The actual GC race condition came with
 HADOOP-18091. S3A auditing leaks memory through ThreadLocal references

The fix for this is, if our hypothesis is correct, in WeakReferenceMap.create()
where a strong reference to the new value is kept in a local variable
*and referred to later* so that the JVM will not GC it.

Along with the fix, extra assertions ensure that if the problem is not fixed,
applications will fail faster/more meaningfully.

Contributed by Steve Loughran.
2022-09-23 09:57:49 +01:00
Kidd5368
ceec19e61a HDFS-16776 Erasure Coding: The length of targets should be checked when DN gets a reconstruction task (#4901)
(cherry picked from commit 9a29075f915173e24c77cf8aea2908da0aa328e3)
2022-09-23 12:29:39 +09:00
PJ Fanning
d66dea300e
HADOOP-18341: upgrade commons-configuration2 to 2.8.0 and commons-text to 1.9 (#4916) 2022-09-22 10:44:27 +09:00
Ashutosh Gupta
683fa264ee
HADOOP-16769. LocalDirAllocator to provide diagnostics when file creation fails (#4896)
The patch provides detailed diagnostics of file creation failure in LocalDirAllocator.

Contributed by: Ashutosh Gupta
2022-09-21 11:54:47 +05:30
Ashutosh Gupta
3af155ceeb HADOOP-18400. Fix file split duplicating records from a succeeding split when reading BZip2 text files (#4732)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 30c36ef25a335bc123fdae90b3366e582ad1b37a)
2022-09-19 13:45:47 +09:00