This adds broad support for Amazon S3 Express One Zone to the S3A connector,
in particular resilience of other parts of the codebase to LIST operations returning
paths under which only in-progress uploads are taking place.
hadoop-common and hadoop-mapreduce treewalking routines all cope with this;
distcp is left alone.
There are still some outstanding followup issues, and we expect more to surface
with extended use.
Contains HADOOP-18955. AWS SDK v2: add path capability probe "fs.s3a.capability.aws.v2"
* lets us probe for AWS SDK version
* bucket-info reports it
Contains HADOOP-18961 S3A: add s3guard command "bucket"
hadoop s3guard bucket -create -region us-west-2 -zone usw2-az2 \
s3a://stevel--usw2-az2--x-s3/
* requires -zone if bucket is zonal
* rejects it if not
* rejects zonal bucket suffixes if endpoint is not aws (safety feature)
* imperfect, but a functional starting point.
New path capability "fs.s3a.capability.zonal.storage"
* Used in tests to determine whether pending uploads manifest as paths
* cli tests can probe for this
* bucket-info reports it
* some tests disable/change assertions as appropriate
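A minimal sketch of how client code might probe the two capabilities above; the bucket name is illustrative and the exact capability strings are as quoted in this note:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ProbeS3ExpressCapabilities {
  public static void main(String[] args) throws Exception {
    // Hypothetical bucket name; any s3a URI will do.
    Path root = new Path("s3a://example--usw2-az2--x-s3/");
    FileSystem fs = FileSystem.get(root.toUri(), new Configuration());

    // True if the connector is built on the AWS v2 SDK.
    boolean v2Sdk = fs.hasPathCapability(root, "fs.s3a.capability.aws.v2");

    // True if the bucket is an S3 Express One Zone (zonal) store.
    boolean zonal = fs.hasPathCapability(root, "fs.s3a.capability.zonal.storage");

    System.out.println("v2 SDK: " + v2Sdk + ", zonal storage: " + zonal);
  }
}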
----
Shell commands fail on S3Express buckets if pending uploads are present.
New path capability in hadoop-common
"fs.capability.directory.listing.inconsistent"
1. S3AFS returns true on a S3 Express bucket
2. FileUtil.maybeIgnoreMissingDirectory(fs, path, fnfe)
decides whether to swallow the exception or not.
3. This is used in: Shell, FileInputFormat, LocatedFileStatusFetcher
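A sketch of the pattern those fixes follow; this replicates the decision with the capability probe rather than calling FileUtil.maybeIgnoreMissingDirectory() directly, as that helper's exact signature is not shown here:
import java.io.FileNotFoundException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public final class TreewalkExample {

  /** Recursively list files, tolerating listings which report
   *  directories under which only pending uploads exist. */
  static void walk(FileSystem fs, Path dir) throws Exception {
    try {
      RemoteIterator<LocatedFileStatus> it = fs.listFiles(dir, true);
      while (it.hasNext()) {
        System.out.println(it.next().getPath());
      }
    } catch (FileNotFoundException fnfe) {
      // Only swallow the failure if the store admits its listings
      // may include paths which do not really exist.
      if (!fs.hasPathCapability(dir,
          "fs.capability.directory.listing.inconsistent")) {
        throw fnfe;
      }
    }
  }
}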
Fixes with tests
* fs -ls -R
* fs -du
* fs -df
* fs -find
* S3AFS.getContentSummary() (maybe...should discuss)
* mapred LocatedFileStatusFetcher
* Globber, HADOOP-15478 already fixed that when dealing with
S3 inconsistencies
* FileInputFormat
S3Express CreateSession request is permitted outside audit spans.
S3 Bulk Delete calls request the store to return the list of deleted objects
if RequestFactoryImpl logging is set to TRACE:
log4j.logger.org.apache.hadoop.fs.s3a.impl.RequestFactoryImpl=TRACE
Test Changes
* ITestS3AMiscOperations removes all tests which require unencrypted
buckets. AWS S3 defaults to SSE-S3 everywhere.
* ITestBucketTool to test new tool without actually creating new
buckets.
* S3ATestUtils add methods to skip test suites/cases if store is/is not
S3Express
* Cutting down the "is this an S3Express bucket?" logic to checking for a trailing --x-s3 string,
rather than worrying about AZ naming logic. Commented out the relevant tests.
* ITestTreewalkProblems validated against standard and S3Express stores
Outstanding
* Distcp: tests show it fails. Proposed: release notes.
---
x-amz-checksum header not found when signing S3Express messages
This modifies the custom signer in ITestCustomSigner to be a subclass
of AwsS3V4Signer with a goal of preventing signing problems with
S3 Express stores.
----
RemoteFileChanged renaming multipart file
Maps 412 status code to RemoteFileChangedException
Modifies huge file tests
-Adds a check on etag match for stat vs list
-ITestS3AHugeFilesByteBufferBlocks renames parent dirs, rather than
files, to replicate distcp better.
----
S3Express custom Signing cannot handle bulk delete
Copy custom signer into production JAR, to enable downstream testing
Extend ITestCustomSigner to cover more filesystem operations
- PUT
- POST
- COPY
- LIST
- Bulk delete through delete() and rename()
- list + abort multipart uploads
Suite is parameterized on bulk delete enabled/disabled.
To use the new signer for a full test run:
<property>
<name>fs.s3a.custom.signers</name>
<value>CustomSdkSigner:org.apache.hadoop.fs.s3a.auth.CustomSdkSigner</value>
</property>
<property>
<name>fs.s3a.s3.signing-algorithm</name>
<value>CustomSdkSigner</value>
</property>
Increases existing pool sizes, as with server scale and vector
IO, larger pools are needed
fs.s3a.connection.maximum 200
fs.s3a.threads.max 96
Adds new configuration options for v2 sdk internal timeouts,
both with default of 60s:
fs.s3a.connection.acquisition.timeout
fs.s3a.connection.idle.time
All the pool/timeout options are covered in performance.md
Moves all timeout/duration options in the s3a FS to taking
temporal units (h, m, s, ms,...); retaining the previous default
unit (normally millisecond)
Adds a minimum duration for most of these, in order to recover from
deployments where a timeout has been set on the assumption the unit
was seconds, not millis.
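A sketch of how these options can now be set with explicit time units; the values shown are illustrative, not recommendations:
import org.apache.hadoop.conf.Configuration;

public class TimeoutTuning {
  public static Configuration tuned() {
    Configuration conf = new Configuration();
    // Larger pools for server scale and vector IO.
    conf.setInt("fs.s3a.connection.maximum", 200);
    conf.setInt("fs.s3a.threads.max", 96);
    // Duration options now accept temporal suffixes (h, m, s, ms, ...).
    conf.set("fs.s3a.connection.acquisition.timeout", "30s");
    conf.set("fs.s3a.connection.idle.time", "1m");
    // A bare number keeps the option's previous default unit
    // (normally milliseconds).
    return conf;
  }
}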
Uses java.time.Duration throughout the codebase;
retaining the older numeric constants in
org.apache.hadoop.fs.s3a.Constants for backwards compatibility;
these are now deprecated.
Adds new class AWSApiCallTimeoutException to be raised on
sdk-related methods and also gateway timeouts. This is a subclass
of org.apache.hadoop.net.ConnectTimeoutException to support
existing retry logic.
+ reverted default value of fs.s3a.create.performance to false;
inadvertently set to true during testing.
Contributed by Steve Loughran.
Add a new option:
fs.s3a.optimized.copy.from.local.enabled
This will enable (default) or disable the
optimized CopyFromLocalOperation upload operation
when copyFromLocalFile() is invoked.
When false the superclass implementation is used; duration
statistics are still collected, though audit span entries
in logs will be for the individual fs operations, not the
overall operation.
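An illustrative sketch of disabling the optimized upload before invoking copyFromLocalFile(); bucket and paths are hypothetical:
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFromLocalExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Fall back to the superclass copyFromLocalFile() implementation.
    conf.setBoolean("fs.s3a.optimized.copy.from.local.enabled", false);

    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf);
    fs.copyFromLocalFile(
        new Path("file:///tmp/local-data.csv"),   // hypothetical source
        new Path("s3a://example-bucket/data/"));  // hypothetical destination
  }
}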
Contributed by Steve Loughran
If fs.s3a.create.performance is set on a bucket
- All file overwrite checks are skipped, even if the caller says
otherwise.
- All directory existence checks are skipped.
- Marker deletion is *always* skipped.
This eliminates a HEAD and a LIST for every creation.
* New path capability "fs.s3a.create.performance.enabled" true
if the option is enabled.
* Parameterize ITestS3AContractCreate to expect the different
outcomes
* Parameterize ITestCreateFileCost similarly, with
changed cost assertions there.
* create(/) raises an IOE; an existing bug only noticed here.
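A sketch of probing for the capability before relying on the reduced-cost create path; the bucket name is hypothetical:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreatePerformanceProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("fs.s3a.create.performance", true);

    Path dest = new Path("s3a://example-bucket/output/part-0000");
    FileSystem fs = FileSystem.get(dest.toUri(), conf);

    if (fs.hasPathCapability(dest, "fs.s3a.create.performance.enabled")) {
      // Overwrite/directory checks are skipped: the caller takes on the
      // responsibility of not writing over existing files or directories.
      try (FSDataOutputStream out = fs.create(dest, false)) {
        out.writeUTF("example");
      }
    }
  }
}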
Contributed by Steve Loughran
S3A directory delete and rename will optionally abort all pending multipart uploads
under their to-be-deleted paths when
fs.s3a.directory.operations.purge.upload is true.
It is off by default.
The filesystem's hasPathCapability("fs.s3a.directory.operations.purge.upload")
probe will return true when this feature is enabled.
Multipart uploads may accrue from interrupted data writes, uncommitted
staging/magic committer jobs and other operations/applications. On AWS S3
lifecycle rules are the recommended way to clean these; this change improves
support for stores which lack these rules.
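A short sketch of opting in and then deleting a directory tree, so that pending uploads under the deleted path are aborted as part of the operation; the path is hypothetical and the option name is as quoted above:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PurgeUploadsOnDelete {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Off by default; opt in where lifecycle rules are unavailable.
    conf.setBoolean("fs.s3a.directory.operations.purge.upload", true);

    Path dir = new Path("s3a://example-bucket/tmp/job-output/");  // hypothetical
    FileSystem fs = FileSystem.get(dir.toUri(), conf);

    // Recursive delete; pending multipart uploads under the path
    // are aborted as part of the operation when the feature is enabled.
    fs.delete(dir, true);
  }
}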
Contributed by Steve Loughran
This restores asynchronous retrieval/refresh of any AWS credentials provided by the
EC2 instance/container in which the process is running.
Contributed by Steve Loughran
S3A region logic improved for better inference and
to be compatible with previous releases
1. If you are using an AWS S3 AccessPoint, its region is determined
from the ARN itself.
2. If fs.s3a.endpoint.region is set and non-empty, it is used.
3. If fs.s3a.endpoint is an s3.*.amazonaws.com url,
the region is determined by parsing the URL.
Note: vpce endpoints are not handled by this.
4. If fs.s3a.endpoint.region==null, and none could be determined
from the endpoint, use us-east-2 as default.
5. If fs.s3a.endpoint.region=="" then it is handed off to
the default AWS SDK resolution process.
Consult the AWS SDK documentation for the details on its resolution
process, knowing that it is complicated and may use environment variables,
entries in ~/.aws/config, IAM instance information within
EC2 deployments and possibly even JSON resources on the classpath.
Put differently: it is somewhat brittle across deployments.
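A sketch showing the two explicit ways to control the region choice under the rules above; the region value is an example:
import org.apache.hadoop.conf.Configuration;

public class RegionSettings {

  /** Pin the region explicitly: rule 2 applies, no inference needed. */
  public static Configuration pinnedRegion() {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint.region", "eu-west-1");
    return conf;
  }

  /** Hand resolution off to the AWS SDK's own chain: rule 5. */
  public static Configuration sdkResolved() {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint.region", "");
    return conf;
  }
}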
Contributed by Ahmar Suhail
Tune AWS v2 SDK changes based on testing with third party stores
including GCS.
Contains HADOOP-18889. S3A v2 SDK error translations and troubleshooting docs
* Changes needed to work with multiple third party stores
* New third_party_stores document on how to bind to and test
third party stores, including google gcs (which works!)
* Troubleshooting docs mostly updated for v2 SDK
Exception translation/resilience
* New AWSUnsupportedFeatureException for unsupported/unavailable errors
* Handle 501 method unimplemented as one of these
* Error codes > 500 mapped to the AWSStatus500Exception if no explicit
handler.
* Precondition errors handled a bit better
* GCS throttle exception also recognized.
* GCS raises 404 on a delete of a file which doesn't exist: swallow it.
* Error translation uses reflection to create an IOE of the right type.
All IOEs at the bottom of an AWS stack chain are regenerated:
a new exception of that specific type is created, with the top-level exception
as its cause. This is done to retain the whole stack chain.
* Reduce the number of retries within the AWS SDK
* And those of s3a code.
* S3ARetryPolicy explicitly declares SocketException as a connectivity failure,
but not its subclass BindException
* SocketTimeoutException is also considered a connectivity failure
* Log at debug whenever retry policies looked up
* Reorder exceptions to alphabetical order, with commentary
* Review use of the Invoke.retry() method
The reduction in retries is because it became clear, when trying to create a bucket
whose name doesn't resolve, that even an UnknownHostException takes over 90s to
eventually fail, at which point it hits the s3a retry code.
- Reducing the SDK retries means these escalate to our code better.
- Cutting back on our own retries makes it a bit more responsive for most real
deployments.
- maybeTranslateNetworkException() and s3a retry policy means that
unknown host exception is recognised and fails fast.
Contributed by Steve Loughran
The default value for fs.azure.data.blocks.buffer is changed from "disk" to "bytebuffer"
This will speed up writing to azure storage, at the risk of running out of memory
-especially if there are many threads writing to abfs at the same time and the
upload bandwidth is limited.
If jobs do run out of memory writing to abfs, change the option back to "disk"
Contributed by Anmol Asrani
Jobs which commit their work to S3 through the
magic committer now use a unique magic path
containing the job ID:
__magic_job-${jobid}
This allows multiple jobs to write
to the same destination simultaneously.
Contributed by Syed Shameerur Rahman
* The multipart flag fs.s3a.multipart.uploads.enabled is passed to the async client created
* The S3A connector bypasses the transfer manager entirely if multipart is disabled or for small files.
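A brief sketch of turning multipart uploads off, e.g. for stores which do not support them; the bucket name is illustrative:
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class DisableMultipart {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Uploads become single PUT requests; the transfer manager is bypassed.
    conf.setBoolean("fs.s3a.multipart.uploads.enabled", false);
    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf);
    System.out.println("Filesystem ready: " + fs.getUri());
  }
}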
Contributed by Steve Loughran
All uses of jersey-json in the yarn and other hadoop modules now
exclude the obsolete org.codehaus.jettison/jettison and so avoid
all security issues which can come from the library.
Contributed by PJ Fanning
This patch migrates the S3A connector to use the V2 AWS SDK.
This is a significant change at the source code level.
Any applications using the internal extension/override points in
the filesystem connector are likely to break.
This includes but is not limited to:
- Code invoking methods on the S3AFileSystem class
which used classes from the V1 SDK.
- The ability to define the factory for the `AmazonS3` client, and
to retrieve it from the S3AFileSystem. There is a new factory
API and a special interface S3AInternals to access a limited
set of internal classes and operations.
- Delegation token and auditing extensions.
- Classes trying to integrate with the AWS SDK.
All standard V1 credential providers listed in the option
fs.s3a.aws.credentials.provider will be automatically remapped to their
V2 equivalent.
Other V1 Credential Providers are supported, but only if the V1 SDK is
added back to the classpath.
The SDK Signing plugin has changed; all v1 signers are incompatible.
There is no support for the S3 "v2" signing algorithm.
Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2
equivalent, "bundle.jar", which is now exported by the hadoop-aws module.
Consult the document aws_sdk_upgrade for the full details.
Contributed by Ahmar Suhail + some bits by Steve Loughran
Adds a class DelegationBindingInfo which contains binding info
beyond just the AWS credential list.
The binding class can be expanded when needed. Until then, all existing
implementations will work, as the new method
DelegationBindingInfo deploy(AbstractS3ATokenIdentifier retrievedIdentifier)
falls back to the original methods.
To avoid the ABFS instance getting closed due to GC while the streams are working, attach the ABFS instance to a backReference opaque object and pass it down to the streams, so that there is a hard reference to the filesystem while the streams are working.
Contributed by: Mehakmeet Singh
This modifies the manifest committer so that the list of files
to rename is passed between stages as a file of
writeable entries on the local filesystem.
The map of directories to create is still passed in memory;
this map is built across all tasks, so even if many tasks
created files, if they all write into the same set of directories
the memory needed is O(directories) with the
task count not a factor.
The _SUCCESS file reports on heap size through gauges.
This should give a warning if there are problems.
Contributed by Steve Loughran
This
1. changes the default value of fs.s3a.directory.marker.retention
to "keep"
2. no longer prints a message when an S3A FS instance is
instantiated with any option other than delete.
Switching to marker retention improves performance
on any S3 bucket as there are no needless marker DELETE requests
-leading to a reduction in write IOPS and any delays waiting
for the DELETE call to finish.
There are *very* significant improvements on versioned buckets,
where tombstone markers slow down LIST operations: the more
tombstones there are, the worse query planning gets.
Having versioning enabled on production stores is the foundation
of any data protection strategy, so this has tangible benefits
in production.
It is *not* compatible with older hadoop releases; specifically
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0
Incompatible releases have no problems reading data in stores
where markers are retained, but can get confused when deleting
or renaming directories.
If you are still using older versions to write data, and cannot
yet upgrade, switch the option back to "delete".
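A sketch of explicitly choosing a marker policy, for deployments which still need to interoperate with the older releases listed above; "keep" is now the default:
import org.apache.hadoop.conf.Configuration;

public class MarkerPolicy {

  /** The new default: retain directory markers for lower IO load. */
  public static Configuration keepMarkers(Configuration conf) {
    conf.set("fs.s3a.directory.marker.retention", "keep");
    return conf;
  }

  /** Revert to the old behaviour while older clients still write to the store. */
  public static Configuration deleteMarkers(Configuration conf) {
    conf.set("fs.s3a.directory.marker.retention", "delete");
    return conf;
  }
}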
Contributed by Steve Loughran
This:
1. Adds optLong, optDouble, mustLong and mustDouble
methods to the FSBuilder interface to let callers explicitly
pass in long and double arguments.
2. The opt() and must() builder calls which take float/double values
now only set long values instead, so as to avoid problems
related to overloaded methods resulting in a ".0" being appended
to a long value.
3. All of the relevant opt/must calls in the hadoop codebase move to
the new methods
4. And the s3a code is resilient to parse errors in its numeric options
-it will downgrade to the default.
This is nominally incompatible, but the floating-point builder methods
were never used: nothing currently expects floating point numbers.
For anyone who wants to safely set numeric builder options across all compatible
releases, convert the number to a string and then use the opt(String, String)
and must(String, String) methods.
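A sketch contrasting the new typed setter with the portable string form recommended above; the "fs.option.openfile.length" key and the path are used purely as examples:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
import org.apache.hadoop.fs.Path;

public class BuilderNumericOptions {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("s3a://example-bucket/data/file.parquet"); // hypothetical
    FileSystem fs = FileSystem.get(path.toUri(), conf);

    FutureDataInputStreamBuilder builder = fs.openFile(path);

    // New in this change: explicitly typed long setter.
    builder.optLong("fs.option.openfile.length", 4_194_304L);

    // Portable across releases: pass the number as a string instead.
    builder.opt("fs.option.openfile.length", Long.toString(4_194_304L));

    try (FSDataInputStream in = builder.build().get()) {
      System.out.println("Opened " + path);
    }
  }
}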
Contributed by Steve Loughran
Explicitly sets the fs.s3a.endpoint.region to eu-west-1 so
the ARN-referenced fs creation fails with unknown store
rather than IllegalArgumentException.
Steve Loughran
To support recovery of network failures during rename, the abfs client
fetches the etag of the source file, and when recovering from a
failure, uses this tag to determine whether the rename succeeded
before the failure happened.
* This works for files, but not directories
* It adds the overhead of a HEAD request before each rename.
* The option can be disabled by setting "fs.azure.enable.rename.resilience"
to false
Contributed by Sree Bhattacharyya
This change lets the client react pre-emptively to server load without getting to 503 and the exponential backoff
which follows. This stops performance suffering so much as capacity limits are approached for an account.
Contributed by Anmol Asrani
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.
Contributed by Steve Loughran
Followup to the original HADOOP-18582.
Temporary path cleanup is re-enabled for -append jobs
as these will create temporary files when creating or overwriting files.
Contributed by Ayush Saxena
Adding toggleable support for modification time during distcp -update between two stores with incompatible checksum comparison.
Contributed by: Mehakmeet Singh <mehakmeet.singh.behl@gmail.com>
Fixes a javadoc error which came with
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)
Part of the HADOOP-18521 ABFS readahead fix; MUST be included.
Contributed by Steve Loughran
Followup patch to HADOOP-18456 as part of HADOOP-18521,
ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
Adds probes of the readahead fix to aid in checking the safety of the
hadoop ABFS client across different releases.
* ReadBufferManager constructor logs the fact it is safe at TRACE
* AbfsInputStream declares it is fixed in toString()
by including "fs.azure.capability.readahead.safe" in the
result.
The ABFS FileSystem hasPathCapability("fs.azure.capability.readahead.safe")
probe returns true to indicate the client's readahead manager has been fixed
to be safe when prefetching.
All Hadoop releases for which this probe returns false,
and for which the probe "fs.capability.etags.available"
returns true, are at risk of returning invalid data when reading
ADLS Gen2/Azure storage data.
Contributed by Steve Loughran.
This has triggered an OOM in a process which was churning through s3a fs
instances; the increased memory footprint of IOStatistics amplified what
must have been a long-standing issue with FS instances being created
but never closed.
* Makes sure instrumentation is closed when the FS is closed.
* Uses a weak reference from metrics to instrumentation, so even
if the FS wasn't closed (see HADOOP-18478), this back reference
would not cause the S3AInstrumentation reference to be retained.
* If S3AFileSystem is configured to log at TRACE it will log the
calling stack of initialize(), to help identify where the
instance is being created. This should help track down
the cause of instance leakage.
Contributed by Steve Loughran.
This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.
Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO
* Waits for all the executor pool shutdown to complete before
making any assertions
* Assertions that there are no in progress reads MUST be
cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
of a closed stream are brittle because if there was
an in progress stream which completed after stream.close()
then it will end up in the list.
Contributed by Steve Loughran
This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing
across concurrent HTTP requests" by not trying to cancel
in progress reads.
It supersedes HADOOP-18528, which disables the prefetching.
If that patch is applied *after* this one, prefetching
will be disabled.
As well as changing the default value in the code,
core-default.xml is updated to set
fs.azure.enable.readahead = true
As a result, if Configuration.get("fs.azure.enable.readahead")
returns a non-null value, then it can be inferred that
it was set either in core-default.xml (the fix is present)
or in core-site.xml (someone asked for it).
Contributed by Pranav Saxena.
This allows abfs request throttling to be shared across all
abfs connections talking to containers belonging to the same abfs storage
account -as that is the level at which IO throttling is applied.
The option is enabled/disabled in the configuration option
"fs.azure.account.throttling.enabled";
The default is "true"
Contributed by Anmol Asrani
This commit parses SAS Tokens and removes the unwanted prefix of '?' from them, if present.
At present, SAS Tokens are provided to the driver through customer implementations of the SASTokenProvider interface. The SAS token providers should not assume that the token will be the first query parameter in the URIs that communicate with the backend. However, it was observed that certain public interfaces provided by Storage to generate SAS can include the '?' as the first character of the SAS Token, which would ideally be the case when it is the first query parameter. Thus, tokens that contain this prefix will lead to an error in the driver due to a clash of query parameters.
To avoid failures for use of such SAS tokens, after receiving the SAS Token from the provider, the code checks for whether any ? prefix is present or not. If yes, it is removed before further usage of the token. This way, users would not have to manually remove the prefix before passing it on as a configuration.
Contributed by Sree Bhattacharya
Disables block prefetching on ABFS InputStreams, by setting
fs.azure.enable.readahead to false in core-default.xml and
the matching java constant.
This prevents
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests.
Once a fix for that is committed, this change can be reverted.
Contributed by Mehakmeet Singh.
* HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead
Adds new config option to turn off readahead
* also allows it to be passed in through openFile(),
* extends ITestAbfsReadWriteAndSeek to use the option, including one
replicated test...that shows that turning it off is slower.
Important: this does not address the critical data corruption issue
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
What it does do is provide a way to completely bypass the ReadBufferManager.
To mitigate the problem, either fs.azure.enable.readahead needs to be set to false,
or set "fs.azure.readaheadqueue.depth" to 0 -this still goes near the (broken)
ReadBufferManager code, but doesn't trigger the bug.
For safe reading of files through the ABFS connector, readahead MUST be disabled
or the followup fix to HADOOP-18521 applied
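A sketch of the two mitigations described above, as client-side configuration; either one suffices:
import org.apache.hadoop.conf.Configuration;

public class AbfsReadaheadMitigation {
  public static Configuration mitigate(Configuration conf) {
    // Option 1: bypass the ReadBufferManager entirely.
    conf.setBoolean("fs.azure.enable.readahead", false);
    // Option 2: keep readahead nominally on, but with a queue depth of 0
    // so the broken prefetch path is never exercised.
    conf.setInt("fs.azure.readaheadqueue.depth", 0);
    return conf;
  }
}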
Contributed by Steve Loughran
This fixes a race condition with the TemporaryAWSCredentialProvider,
one which has existed for a long time but which only surfaced
(usually in Spark) when the bucket existence probe was disabled
by setting fs.s3a.bucket.probe to 0 -a performance speedup
which was made the default in HADOOP-17454.
Contributed by Jimmy Wong.
The option "fs.s3a.proxy.ssl.enabled" controls
whether the s3a connector connects to a proxy over HTTP (default) or HTTPS.
Set to "true" to use HTTPS.
Contributed by Mehakmeet Singh
The AWS SDKV2 upgrade log no longer warns about instantiation
of the v1 SDK credential providers which are commonly used in
s3a configurations:
* com.amazonaws.auth.EnvironmentVariableCredentialsProvider
* com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper
* com.amazonaws.auth.InstanceProfileCredentialsProvider
When the hadoop-aws module moves to the v2 SDK, references to these
credential providers will be rewritten to their v2 equivalents.
Follow-on to HADOOP-18382. "Upgrade AWS SDK to V2 - Prerequisites"
Contributed by Ahmar Suhail
Add to XMLUtils a set of methods to create secure XML Parsers/transformers, locking down DTD, schema, XXE exposure.
Use these wherever XML parsers are created.
Contributed by PJ Fanning
Make S3APrefetchingInputStream.seek() completely lazy. Calls to seek() will not affect the current buffer nor interfere with prefetching, until read() is called.
This change allows various usage patterns to benefit from prefetching, e.g. when calling readFully(position, buffer) in a loop for contiguous positions the intermediate internal calls to seek() will be noops and prefetching will have the same performance as in a sequential read.
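A sketch of that access pattern, which the lazier seek() now allows to benefit from prefetching; the path and chunk size are illustrative:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ContiguousReadFully {
  public static void main(String[] args) throws Exception {
    Path path = new Path("s3a://example-bucket/data/large.bin"); // hypothetical
    FileSystem fs = FileSystem.get(path.toUri(), new Configuration());

    final int chunk = 8 * 1024 * 1024;
    byte[] buffer = new byte[chunk];
    long len = fs.getFileStatus(path).getLen();

    try (FSDataInputStream in = fs.open(path)) {
      // Contiguous positioned reads: the intermediate seeks are now no-ops,
      // so this performs like a sequential, prefetched read.
      for (long pos = 0; pos + chunk <= len; pos += chunk) {
        in.readFully(pos, buffer);
      }
    }
  }
}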
Contributed by Alessandro Passaro.
part of HADOOP-18103.
Also introducing a config fs.s3a.vectored.active.ranged.reads
to configure the maximum number of range reads a
single input stream can have active (downloading, or queued)
to the central FileSystem instance's pool of queued operations.
This stops a single stream overloading the shared thread pool.
Contributed by: Mukund Thakur
part of HADOOP-18103.
While merging the ranges in CheckSumFs, they are rounded up based on the
value of the checksum bytes size, which leads to some ranges crossing the EOF;
these need to be fixed, else an EOFException is raised during the actual reads.
Contributed By: Mukund Thakur
Follow-up to HADOOP-12020 Support configuration of different S3 storage classes;
S3 storage class is now set when buffering to heap/bytebuffers, and when
creating directory markers
Contributed by Monthon Klongklaew
HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.
This patch fixes the asynchronous stream draining to work and so
return the stream back to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool"
The root cause was that even though the fields passed in to drain() were
converted to references through the method parameters, in the lambda expression
passed in to submit() they remained direct references to the fields
  operation = client.submit(
      () -> drain(uri, streamStatistics,
          false, reason, remaining,
          object, wrappedStream)); /* here */
Those fields were only read during the async execution, at which
point they could have been set to null (or even changed by a subsequent read).
A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.
The class is used in both the classic and prefetching s3a input streams.
Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.
Contributed by Steve Loughran.
This patch prepares the hadoop-aws module for a future
migration to using the v2 AWS SDK (HADOOP-18073)
That upgrade will be incompatible; this patch prepares
for it:
-marks some credential providers and other
classes and methods as @deprecated.
-updates site documentation
-reduces the visibility of the s3 client;
other than for testing, it is kept private to
the S3AFileSystem class.
-logs some warnings when deprecated APIs are used.
The warning messages are printed only once
per JVM's life. To disable them, set the
log level of org.apache.hadoop.fs.s3a.SDKV2Upgrade
to ERROR
Contributed by Ahmar Suhail
This is the preview release of the HADOOP-18028 S3A performance input stream.
It is still stabilizing, but ready to test.
Contains
HADOOP-18028. High performance S3A input stream (#4109)
Contributed by Bhalchandra Pandit.
HADOOP-18180. Replace use of twitter util-core with java futures (#4115)
Contributed by PJ Fanning.
HADOOP-18177. Document prefetching architecture. (#4205)
Contributed by Ahmar Suhail
HADOOP-18175. fix test failures with prefetching s3a input stream (#4212)
Contributed by Monthon Klongklaew
HADOOP-18231. S3A prefetching: fix failing tests & drain stream async. (#4386)
* adds in new test for prefetching input stream
* creates streamStats before opening stream
* updates numBlocks calculation method
* fixes ITestS3AOpenCost.testOpenFileLongerLength
* drains stream async
* fixes failing unit test
Contributed by Ahmar Suhail
HADOOP-18254. Disable S3A prefetching by default. (#4469)
Contributed by Ahmar Suhail
HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458)
This adds IOStatistics collection to the S3PrefetchingInputStream class, with
new statistic names in StreamStatistics.
This stream is not (yet) IOStatisticsContext aware.
Contributed by Ahmar Suhail
HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk
HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums.
HADOOP-18318. Update class names to be clear they belong to S3A prefetching
Contributed by Steve Loughran
JobID.toString() and TaskID.toString() to only be called
when the IDs are not null.
This doesn't surface in MapReduce, but Spark SQL can trigger it
in job abort, where it may invoke abortJob() with an
incomplete TaskContext.
This patch MUST be applied to branches containing
HADOOP-17833. "Improve Magic Committer Performance."
Contributed by Steve Loughran.
jobId.toString() to only be called when the ID isn't null.
This doesn't surface in MR, but Spark seems to manage it.
The name of the option to enable/disable thread level statistics is
"fs.iostatistics.thread.level.enabled";
There is also an enabled() probe in IOStatisticsContext which can
be used to see if the thread level statistics is active.
Contributed by Viraj Jasani
Fixes CVE-2018-7489 in shaded jackson.
+Add more commands in testing.md
to the CLI tests needed when qualifying
a release
Contributed by Steve Loughran
This adds a thread-level collector of IOStatistics, IOStatisticsContext,
which can be:
* Retrieved for a thread and cached for access from other
threads.
* reset() to record new statistics.
* Queried for live statistics through the
IOStatisticsSource.getIOStatistics() method.
* Queried for a statistics aggregator for use in instrumented
classes.
* Asked to create a serializable copy in snapshot()
The goal is to make it possible for applications with multiple
threads performing different work items simultaneously
to be able to collect statistics on the individual threads,
and so generate aggregate reports on the total work performed
for a specific job, query or similar unit of work.
Some changes in IOStatistics-gathering classes are needed for
this feature
* Caching the active context's aggregator in the object's
constructor
* Updating it in close()
Slightly more work is needed in multithreaded code,
such as the S3A committers, which collect statistics across
all threads used in task and job commit operations.
Currently the IOStatisticsContext-aware classes are:
* The S3A input stream, output stream and list iterators.
* RawLocalFileSystem's input and output streams.
* The S3A committers.
* The TaskPool class in hadoop-common, which propagates
the active context into scheduled worker threads.
Collection of statistics in the IOStatisticsContext
is disabled process-wide by default until the feature
is considered stable.
To enable the collection, set the option
fs.thread.level.iostatistics.enabled
to "true" in core-site.xml;
Contributed by Mehakmeet Singh and Steve Loughran
ABFS rename fails intermittently when the Storage-blob tracking
metadata is in an incomplete state. This surfaces as the error code
404 and an error message of "RenameDestinationParentPathNotFound"
To mitigate this issue, when a request fails with this response,
the ABFS client issues a HEAD call on the source file
and then retries the rename operation again.
ABFS filesystem statistics track when this occurs with new counters
rename_recovery
metadata_incomplete_rename_failures
rename_path_attempts
This is a very rare occurrence and appears to be triggered under certain
heavy load conditions, just as with HADOOP-18163.
Contributed by Mehakmeet Singh.
part of HADOOP-18103.
Handling memory fragmentation in S3A vectored IO implementation by
allocating smaller user range requested size buffers and directly
filling them from the remote S3 stream and skipping undesired
data in between ranges.
This patch also adds aborting active vectored reads when stream is
closed or unbuffer() is called.
Contributed By: Mukund Thakur