This patch migrates the S3A connector to use the V2 AWS SDK.
This is a significant change at the source code level.
Any applications using the internal extension/override points in
the filesystem connector are likely to break.
This includes but is not limited to:
- Code invoking methods on the S3AFileSystem class
which used classes from the V1 SDK.
- The ability to define the factory for the `AmazonS3` client, and
to retrieve it from the S3AFileSystem. There is a new factory
API and a special interface S3AInternals to access a limited
set of internal classes and operations.
- Delegation token and auditing extensions.
- Classes trying to integrate with the AWS SDK.
All standard V1 credential providers listed in the option
fs.s3a.aws.credentials.provider will be automatically remapped to their
V2 equivalent.
Other V1 Credential Providers are supported, but only if the V1 SDK is
added back to the classpath.
The SDK Signing plugin has changed; all v1 signers are incompatible.
There is no support for the S3 "v2" signing algorithm.
Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2
equivalent, "bundle.jar", which is now exported by the hadoop-aws module.
Consult the document aws_sdk_upgrade for the full details.
Contributed by Ahmar Suhail + some bits by Steve Loughran
As well as the POM update, this patch moves to the (renamed) verify methods.
Backporting mockito test changes may now require cherrypicking this patch, otherwise
use the old method names.
Contributed by Anmol Asrani
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.
Contributed by Steve Loughran
Expands on the comments in cluster config to tell people
they shouldn't be running a cluster without a private VLAN
in cloud, that Knox is good here, and unsecured clusters
without a VLAN are just computation-as-a-service to crypto miners
Contributed by Steve Loughran
Addresses CVE-2021-37533, which *only* relates to FTP.
Applications not using the ftp:// filesystem, which, as
anyone who has used it will know is very minimal and
so rarely used, is not a critical part of the project.
Furthermore, the FTP-related issue is at worst information leakage
if someone connects to a malicious server.
This is a due diligence PR rather than an emergency fix.
Contributed by Steve Loughran
Updates okhttp3 and okio so their transitive dependency on Kotlin
stdlib is free from recent CVEs.
okhttp3:okhttp => 4.10.0
okio:okio => 3.2.0
kotlin stdlib => 1.6.20
kotlin CVEs fixed:
CVE-2022-24329
CVE-2020-29582
Contributed by PJ Fanning.