Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: Attila Doroszlai <adoroszlai@apache.org>
Reviewed-by: Cheng Pan <chengpan@apache.org>
Reviewed-by: Min Yan <yaommen@gmail.com>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
ChecksumFileSystem creates the chunked ranges based on the checksum chunk size and then calls
readVectored on the underlying RawLocalFileSystem, which may lead to overlapping ranges in some cases.
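For reference, a minimal sketch of the vectored read API involved, assuming an already-open FSDataInputStream and purely illustrative offsets (this is not the ChecksumFileSystem code itself):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileRange;

    /** Illustrative sketch of a vectored read with two non-overlapping ranges. */
    public final class VectoredReadExample {
      public static void readTwoRanges(FSDataInputStream in)
          throws IOException, InterruptedException, ExecutionException {
        List<FileRange> ranges = Arrays.asList(
            FileRange.createFileRange(0, 4096),
            FileRange.createFileRange(8192, 4096));
        // the issue above concerns overlapping ranges generated internally
        // by ChecksumFileSystem; these two ranges are deliberately disjoint
        in.readVectored(ranges, ByteBuffer::allocate);
        for (FileRange r : ranges) {
          ByteBuffer data = r.getData().get();  // wait for each range to complete
          // process data...
        }
      }
    }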
Contributed by: Mukund Thakur
This sets a different timeout for data upload PUT/POST calls than for all
other requests, so that slow block uploads do not trigger timeouts
as rapidly as normal requests. This was always the behavior
in the V1 AWS SDK; for V2 we have to explicitly set it on the operations
we want to give extended timeouts.
Option: fs.s3a.connection.part.upload.timeout
Default: 15m
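To lengthen the timeout, the option takes a duration value in the same style as the default above; a minimal sketch using the Hadoop Configuration API (the 30m value is just an example):

    import org.apache.hadoop.conf.Configuration;

    public final class S3APartUploadTimeoutExample {
      public static Configuration withLongerPartUploadTimeout() {
        Configuration conf = new Configuration();
        // give slow block uploads up to 30 minutes before the request times out
        conf.set("fs.s3a.connection.part.upload.timeout", "30m");
        return conf;
      }
    }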
Contributed by Steve Loughran
* HttpReferrerAuditHeader is thread safe, copying the lists/maps passed
in and using synchronized methods when necessary (see the sketch below).
* All exceptions raised when building the referrer header are caught
and swallowed.
* The first such error is logged at WARN; all errors plus their stacks
are logged at DEBUG.
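A minimal illustration of the defensive-copy plus synchronized pattern described in the first bullet; the class and methods here are hypothetical, not the actual HttpReferrerAuditHeader code:

    import java.util.HashMap;
    import java.util.Map;

    /** Hypothetical sketch of the thread-safety pattern: copy inputs, synchronize access. */
    public final class AuditHeaderExample {
      private final Map<String, String> attributes;

      public AuditHeaderExample(Map<String, String> attributes) {
        // copy the map passed in so later mutations by the caller are not visible here
        this.attributes = new HashMap<>(attributes);
      }

      public synchronized void set(String key, String value) {
        attributes.put(key, value);
      }

      public synchronized String buildHeader() {
        try {
          StringBuilder sb = new StringBuilder();
          for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('&');
          }
          return sb.toString();
        } catch (RuntimeException e) {
          // swallow: header construction must never fail the request itself
          return "";
        }
      }
    }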
Contributed by Steve Loughran
Adds new option
fs.s3a.cross.region.access.enabled
which is true by default.
This enables cross-region access as a separate config and enables/disables it irrespective of whether a region or endpoint is set.
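For example, to switch the feature off explicitly (a minimal sketch using the Hadoop Configuration API, assuming the full key fs.s3a.cross.region.access.enabled as above):

    import org.apache.hadoop.conf.Configuration;

    public final class S3ACrossRegionExample {
      public static Configuration withoutCrossRegionAccess() {
        Configuration conf = new Configuration();
        // cross-region access is on by default; turn it off explicitly here
        conf.setBoolean("fs.s3a.cross.region.access.enabled", false);
        return conf;
      }
    }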
Contributed by Syed Shameerur Rahman
This moves Hadoop to Apache commons-collections4.
Apache commons-collections has been removed and is completely banned from the source code.
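For code that previously imported the old package, the migration is typically just the import change; a minimal sketch, assuming a CollectionUtils caller:

    // Before (commons-collections 3.x, now banned from the source tree):
    // import org.apache.commons.collections.CollectionUtils;

    // After (commons-collections4):
    import org.apache.commons.collections4.CollectionUtils;

    import java.util.List;

    public final class CollectionsMigrationExample {
      public static boolean hasElements(List<String> values) {
        // isNotEmpty exists in both the old and the new CollectionUtils
        return CollectionUtils.isNotEmpty(values);
      }
    }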
Contributed by Nihal Jain
As part of the work done under HADOOP-19120 ([ABFS]: ApacheHttpClient adaptation as network library),
Apache HttpClient was introduced as an alternative network library that can be used with the ABFS driver. Previously the JDK HTTP client was the only supported network library.
Apache HttpClient was found to be more helpful in terms of the controls and knobs it provides to manage the network aspects of the driver, so it was made the default network client for the ABFS driver.
Recently, while running scale workloads, we observed a regression: unexpected wait times while establishing connections. A possible fix has been identified and is being worked on.
A possible NPE scenario was also identified in the new network client code.
Until the code fixes are done and the whole Apache client flow has been revalidated, the JDK client is made the default again. Apache HttpClient support remains, but it is disabled behind a config.
Contributed by: manika137
Disables all logging below ERROR in the AWS SDK Transfer Manager.
This is done in ClientManagerImpl construction, so it is automatically applied
during S3A FS initialization.
ITests verify that
* It is possible to restore the warning log. This verifies the validity of
the test suite, and will identify when an SDK update fixes this regression.
* Constructing an S3A FS instance will disable the logging.
The log manipulation code is lifted from Cloudstore, where it was used to
dynamically enable logging. It uses reflection to load the Log4J binding;
all uses of the API catch and swallow exceptions.
This is needed to avoid failures when running against different log backends.
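A minimal sketch of the reflection-based approach (illustrative only; the helper class and method names are hypothetical, not the actual Hadoop/Cloudstore code):

    import java.lang.reflect.Method;

    /** Hypothetical sketch: set a Log4J 1.x logger to ERROR via reflection,
     *  swallowing all failures so other log backends are unaffected. */
    public final class QuietLogger {
      public static void setErrorLevel(String logName) {
        try {
          Class<?> loggerClass = Class.forName("org.apache.log4j.Logger");
          Class<?> levelClass = Class.forName("org.apache.log4j.Level");
          Object logger = loggerClass.getMethod("getLogger", String.class)
              .invoke(null, logName);
          Object error = levelClass.getField("ERROR").get(null);
          Method setLevel = loggerClass.getMethod("setLevel", levelClass);
          setLevel.invoke(logger, error);
        } catch (Exception e) {
          // swallow: the Log4J binding may not be on the classpath at all
        }
      }
    }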
This is an emergency fix; we could come up with a better design for
the reflection-based code using the new DynMethods classes,
but this is based on working code, which is always good.
Contributed by Steve Loughran
This is a major change which handles 400 error responses when uploading
large files from the memory heap/buffer (or the staging committer) and the remote S3
store returns a 500 response from an upload of a block in a multipart upload.
The SDK's own streaming code seems unable to fully replay the upload;
it attempts to, but then blocks, and the S3 store returns a 400 response
"Your socket connection to the server was not read from or written to
within the timeout period. Idle connections will be closed.
(Service: S3, Status Code: 400...)"
There is an option to control whether or not the S3A client itself
attempts to retry on a 50x error other than 503 throttling events
(which are independently processed as before).
Option: fs.s3a.retry.http.5xx.errors
Default: true
500 errors are very rare from standard AWS S3, which has a five nines
SLA. It may be more common against S3 Express which has lower
guarantees.
Third party stores have unknown guarantees, and the exception may
indicate a bad server configuration. Consider setting
fs.s3a.retry.http.5xx.errors to false when working with
such stores.
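For example, when working with such a third-party store (a minimal sketch using the Hadoop Configuration API):

    import org.apache.hadoop.conf.Configuration;

    public final class DisableS3a5xxRetries {
      public static Configuration forThirdPartyStore() {
        Configuration conf = new Configuration();
        // disable S3A's own retries of 5xx errors; 503 throttling is still handled separately
        conf.setBoolean("fs.s3a.retry.http.5xx.errors", false);
        return conf;
      }
    }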
Significant code changes:
There is now a custom set of implementations of
software.amazon.awssdk.http.ContentStreamProvider in
the class org.apache.hadoop.fs.s3a.impl.UploadContentProviders.
These:
* Restart on failures
* Do not copy buffers/byte buffers into new private byte arrays,
so they avoid exacerbating memory problems.
There are new IOStatistics for specific HTTP error codes; these are collected
even when all recovery is performed within the SDK.
S3ABlockOutputStream has major changes, including handling of
Thread.interrupt() on the main thread, which now triggers and briefly
awaits cancellation of any ongoing uploads.
If the writing thread is interrupted in close(), it is mapped to
an InterruptedIOException. Applications like Hive and Spark must
catch these after cancelling a worker thread.
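A minimal sketch of what that application-side handling might look like (hypothetical worker code, not taken from Hive or Spark):

    import java.io.IOException;
    import java.io.InterruptedIOException;
    import org.apache.hadoop.fs.FSDataOutputStream;

    /** Hypothetical sketch: a worker writing to S3A that tolerates being cancelled. */
    public final class CancellableWriter {
      public static void writeAndClose(FSDataOutputStream out, byte[] data)
          throws IOException {
        try {
          out.write(data);
          out.close();   // may raise InterruptedIOException if the thread was interrupted
        } catch (InterruptedIOException e) {
          // the upload was cancelled; restore the interrupt flag and give up cleanly
          Thread.currentThread().interrupt();
        }
      }
    }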
Contributed by Steve Loughran