This update ensures that the timeout set in fs.s3a.connection.request.timeout is passed down
to calls to CreateSession made in the AWS SDK to get S3 Express session tokens.
Contributed by Steve Loughran
This addresses
- [sonatype-2021-4916] CWE-327: Use of a Broken or Risky Cryptographic Algorithm
- [sonatype-2019-0673] CWE-400: Uncontrolled Resource Consumption ('Resource Exhaustion')
Contributed by Murali Krishna
With this upgrade, it is possible to connect to an Amazon S3 Express One Zone bucket.
Some tests from the S3A test suite will currently fail against a one zone bucket, as one zone buckets
do not support some S3 standard features (eg: SSE-KMS), and certain operations behave slightly
differently (eg: listMPU will return a directory that has incomplete MPUs).
Contributed by Ahmar Suhail
Followup to the previous HADOOP-18487 patch: changes the scope of
protobuf-2.5 in hadoop-common and elsewhere from "compile" to "provided".
This means that protobuf-2.5 is
* No longer included in hadoop distributions
* No longer exported by hadoop common POM files
* No longer exported transitively by other hadoop modules.
* No longer listed in LICENSE-binary.
Contributed by Steve Loughran
v1 => 1.12.565
v2 => 2.20.160
Only the v2 one is distributed; v1 is needed in deployments only to support v1 credential providers
Contributed by Steve Loughran
Protobuf 2.5 JAR is no longer needed at runtime.
The option common.protobuf.scope defines whether the protobuf 2.5.0
dependency is marked as provided or not.
* New package org.apache.hadoop.ipc.internal for internal only protobuf classes
...with a ShadedProtobufHelper in there which has shaded protobuf refs
only, so guaranteed not to need protobuf-2.5 on the CP
* All uses of org.apache.hadoop.ipc.ProtobufHelper have
been replaced by uses of org.apache.hadoop.ipc.internal.ShadedProtobufHelper
* The scope of protobuf-2.5 is set by the option common.protobuf2.scope
In this patch is it is still "compile"
* There is explicit reference to it in modules where it may be needed.
* The maven scope of the dependency can be set with the common.protobuf2.scope
option. It can be set to "provided" in a build:
-Dcommon.protobuf2.scope=provided
* Add new ipc(callable) method to catch and convert shaded protobuf
exceptions raised during invocation of the supplied lambda expression
* This is adopted in the code where the migration is not traumatically
over-complex. RouterAdminProtocolTranslatorPB is left alone for this
reason.
Contributed by Steve Loughran
All uses of jersey-json in the yarn and other hadoop modules now
exclude the obsolete org.codehaus.jettison/jettison and so avoid
all security issues which can come from the library.
Contributed by PJ Fanning
This patch migrates the S3A connector to use the V2 AWS SDK.
This is a significant change at the source code level.
Any applications using the internal extension/override points in
the filesystem connector are likely to break.
This includes but is not limited to:
- Code invoking methods on the S3AFileSystem class
which used classes from the V1 SDK.
- The ability to define the factory for the `AmazonS3` client, and
to retrieve it from the S3AFileSystem. There is a new factory
API and a special interface S3AInternals to access a limited
set of internal classes and operations.
- Delegation token and auditing extensions.
- Classes trying to integrate with the AWS SDK.
All standard V1 credential providers listed in the option
fs.s3a.aws.credentials.provider will be automatically remapped to their
V2 equivalent.
Other V1 Credential Providers are supported, but only if the V1 SDK is
added back to the classpath.
The SDK Signing plugin has changed; all v1 signers are incompatible.
There is no support for the S3 "v2" signing algorithm.
Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2
equivalent, "bundle.jar", which is now exported by the hadoop-aws module.
Consult the document aws_sdk_upgrade for the full details.
Contributed by Ahmar Suhail + some bits by Steve Loughran
As well as the POM update, this patch moves to the (renamed) verify methods.
Backporting mockito test changes may now require cherrypicking this patch, otherwise
use the old method names.
Contributed by Anmol Asrani
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.
Contributed by Steve Loughran
Expands on the comments in cluster config to tell people
they shouldn't be running a cluster without a private VLAN
in cloud, that Knox is good here, and unsecured clusters
without a VLAN are just computation-as-a-service to crypto miners
Contributed by Steve Loughran
Addresses CVE-2021-37533, which *only* relates to FTP.
Applications not using the ftp:// filesystem, which, as
anyone who has used it will know is very minimal and
so rarely used, is not a critical part of the project.
Furthermore, the FTP-related issue is at worst information leakage
if someone connects to a malicious server.
This is a due diligence PR rather than an emergency fix.
Contributed by Steve Loughran
Updates okhttp3 and okio so their transitive dependency on Kotlin
stdlib is free from recent CVEs.
okhttp3:okhttp => 4.10.0
okio:okio => 3.2.0
kotlin stdlib => 1.6.20
kotlin CVEs fixed:
CVE-2022-24329
CVE-2020-29582
Contributed by PJ Fanning.
This addresses an issue where the plugin's default classpath for executing tests fails to include org.junit.platform.launcher.core.LauncherFactory.
Contributed by: Steve Vaughan Jr
Fixes CVE-2018-7489 in shaded jackson.
+Add more commands in testing.md
to the CLI tests needed when qualifying
a release
Contributed by Steve Loughran
This downgrades jackson from the version switched to in
HADOOP-18033 (2.13.0), to Jackson 2.12.7.
This removes the dependency on javax.ws.rs-api,
so avoiding runtime problems with applications using
jersey-core v1 and/or jsr311-api.
The 2.12.7 release still contains the fix for CVE-2020-36518.
Contributed by PJ Fanning
Update the dependencies of the LDAP libraries used for testing:
ldap-api.version = 2.0.0
apacheds.version = 2.0.0.AM26
Contributed by Colm O hEigeartaigh.
part of HADOOP-18103.
Add support for multiple ranged vectored read api in PositionedReadable.
The default iterates through the ranges to read each synchronously,
but the intent is that FSDataInputStream subclasses can make more
efficient readers especially in object stores implementation.
Also added implementation in S3A where smaller ranges are merged and
sliced byte buffers are returned to the readers. All the merged ranged are
fetched from S3 asynchronously.
Contributed By: Owen O'Malley and Mukund Thakur