Co-authored-by: Wei-Chiu Chuang <weichiu@apache.org>
Includes HADOOP-18354. Upgrade reload4j to 1.22.2 due to XXE vulnerability (#4607).
Log4j 1.2.17 has been replaced by reloadj 1.22.2
SLF4J is at 1.7.36
The AWS landsat data previously used in some S3A tests is no
longer accessible
This PR moves to the new external file
s3a://noaa-cors-pds/raw/2024/001/akse/AKSE001x.24_.gz
* Large enough file for scale tests
* Bucket supports anonymous access
* Ends in .gz to keep codec tests happy
* No spaces in path to keep bucket-info happy
Test Code Changes
* Leaves the test key name alone: fs.s3a.scale.test.csvfile
* Rename all methods and fields move remove "csv" from their names and
move to "external file" we no longer require it to be CSV.
* Path definition and helper methods have been moved to PublicDatasetTestUtils
* Improve error reporting in ITestS3AInputStreamPerformance if the file
is too short
With S3 Select removed, there is no need for the file to be
a CSV file; there is a test which tries to unzip it; other
tests have a minimum file size.
Consult the JIRA for the settings to add to auth-keys.xml
to switch earlier builds to this same file.
Contributed by Steve Loughran
This is a followup to PR:
HADOOP-19045. S3A: Validate CreateSession Timeout Propagation (#6470)
Remove all declarations of fs.s3a.connection.request.timeout
in
- hadoop-common/src/main/resources/core-default.xml
- hadoop-aws/src/test/resources/core-site.xml
New test in TestAwsClientConfig to verify that the value
defined in fs.s3a.Constants class is used.
This is brittle to someone overriding it in their test setups,
but as this test is intended to verify that the option is not
explicitly set, there's no workaround.
Contributed by Steve Loughran
The option fs.s3a.classloader.isolation (default: true) can be set to false to disable s3a classloader isolation;
This can assist in using custom credential providers and other extension points.
Contributed by Antonio Murgia
* HADOOP-19061 - Capture exception from rpcRequestSender.start() in IPC.Connection.run() and proper cleaning is followed if an exception is thrown.
---------
Co-authored-by: Xing Lin <xinglin@linkedin.com>
Improves region handling in the S3A connector, including enabling cross-region support
when that is considered necessary.
Consult the documentation in connecting.md/connecting.html for the current
resolution process.
Contributed by Viraj Jasani
Cut out S3 Select
* leave public/unstable constants alone
* s3guard tool will fail with error
* s3afs. path capability will fail
* openFile() will fail with specific error
* s3 select doc updated
* Cut eventstream jar
* New test: ITestSelectUnsupported verifies new failure
handling above
Contributed by Steve Loughran
New test ITestCreateSessionTimeout to verify that the duration set
in fs.s3a.connection.request.timeout is passed all the way down.
This is done by adding a sleep() in a custom signer and verifying
that it is interrupted and that an AWSApiCallTimeoutException is
raised.
+ Fix testRequestTimeout()
* doesn't skip if considered cross-region
* sets a minimum duration of 0 before invocation
* resets the minimum afterwards
Contributed by Steve Loughran
Address JDK bug JDK-8314978 related to handling of HTTP 100
responses.
https://bugs.openjdk.org/browse/JDK-8314978
In the AbfsHttpOperation, after sendRequest() we call processResponse()
method from AbfsRestOperation.
Even if the conn.getOutputStream() fails due to expect-100 error,
we consume the exception and let the code go ahead.
This may call getHeaderField() / getHeaderFields() / getHeaderFieldLong() after
getOutputStream() has failed. These invocation all lead to server calls.
This commit aims to prevent this.
If connection.getOutputStream() fails due to an Expect-100 error,
the ABFS client does not invoke getHeaderField(), getHeaderFields(),
getHeaderFieldLong() or getInputStream().
getResponseCode() is safe as on the failure it sets the
responseCode variable in HttpUrlConnection object.
Contributed by Pranav Saxena
This update ensures that the timeout set in fs.s3a.connection.request.timeout is passed down
to calls to CreateSession made in the AWS SDK to get S3 Express session tokens.
Contributed by Steve Loughran