With this upgrade, it is possible to connect to an Amazon S3 Express One Zone bucket.
Some tests from the S3A test suite will currently fail against a one zone bucket, as one zone buckets
do not support some S3 standard features (eg: SSE-KMS), and certain operations behave slightly
differently (eg: listMPU will return a directory that has incomplete MPUs).
Contributed by Ahmar Suhail
Followup to the previous HADOOP-18487 patch: changes the scope of
protobuf-2.5 in hadoop-common and elsewhere from "compile" to "provided".
This means that protobuf-2.5 is
* No longer included in hadoop distributions
* No longer exported by hadoop common POM files
* No longer exported transitively by other hadoop modules.
* No longer listed in LICENSE-binary.
Contributed by Steve Loughran
v1 => 1.12.565
v2 => 2.20.160
Only the v2 one is distributed; v1 is needed in deployments only to support v1 credential providers
Contributed by Steve Loughran
Protobuf 2.5 JAR is no longer needed at runtime.
The option common.protobuf.scope defines whether the protobuf 2.5.0
dependency is marked as provided or not.
* New package org.apache.hadoop.ipc.internal for internal only protobuf classes
...with a ShadedProtobufHelper in there which has shaded protobuf refs
only, so guaranteed not to need protobuf-2.5 on the CP
* All uses of org.apache.hadoop.ipc.ProtobufHelper have
been replaced by uses of org.apache.hadoop.ipc.internal.ShadedProtobufHelper
* The scope of protobuf-2.5 is set by the option common.protobuf2.scope
In this patch is it is still "compile"
* There is explicit reference to it in modules where it may be needed.
* The maven scope of the dependency can be set with the common.protobuf2.scope
option. It can be set to "provided" in a build:
-Dcommon.protobuf2.scope=provided
* Add new ipc(callable) method to catch and convert shaded protobuf
exceptions raised during invocation of the supplied lambda expression
* This is adopted in the code where the migration is not traumatically
over-complex. RouterAdminProtocolTranslatorPB is left alone for this
reason.
Contributed by Steve Loughran
All uses of jersey-json in the yarn and other hadoop modules now
exclude the obsolete org.codehaus.jettison/jettison and so avoid
all security issues which can come from the library.
Contributed by PJ Fanning
This patch migrates the S3A connector to use the V2 AWS SDK.
This is a significant change at the source code level.
Any applications using the internal extension/override points in
the filesystem connector are likely to break.
This includes but is not limited to:
- Code invoking methods on the S3AFileSystem class
which used classes from the V1 SDK.
- The ability to define the factory for the `AmazonS3` client, and
to retrieve it from the S3AFileSystem. There is a new factory
API and a special interface S3AInternals to access a limited
set of internal classes and operations.
- Delegation token and auditing extensions.
- Classes trying to integrate with the AWS SDK.
All standard V1 credential providers listed in the option
fs.s3a.aws.credentials.provider will be automatically remapped to their
V2 equivalent.
Other V1 Credential Providers are supported, but only if the V1 SDK is
added back to the classpath.
The SDK Signing plugin has changed; all v1 signers are incompatible.
There is no support for the S3 "v2" signing algorithm.
Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2
equivalent, "bundle.jar", which is now exported by the hadoop-aws module.
Consult the document aws_sdk_upgrade for the full details.
Contributed by Ahmar Suhail + some bits by Steve Loughran
As well as the POM update, this patch moves to the (renamed) verify methods.
Backporting mockito test changes may now require cherrypicking this patch, otherwise
use the old method names.
Contributed by Anmol Asrani
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.
Contributed by Steve Loughran
Expands on the comments in cluster config to tell people
they shouldn't be running a cluster without a private VLAN
in cloud, that Knox is good here, and unsecured clusters
without a VLAN are just computation-as-a-service to crypto miners
Contributed by Steve Loughran