This adds a thread-level collector of IOStatistics, IOStatisticsContext,
which can be:
* Retrieved for a thread and cached for access from other
threads.
* reset() to record new statistics.
* Queried for live statistics through the
IOStatisticsSource.getIOStatistics() method.
* Queried for a statistics aggregator for use in instrumented
classes.
* Asked to create a serializable copy through snapshot().
The goal is to let applications with multiple threads
performing different work items simultaneously
collect statistics on the individual threads,
and so generate aggregate reports on the total work performed
for a specific job, query or similar unit of work.
Some changes in IOStatistics-gathering classes are needed for
this feature:
* Caching the active context's aggregator in the object's
constructor
* Updating it in close()
Slightly more work is needed in multithreaded code,
such as the S3A committers, which collect statistics across
all threads used in task and job commit operations.
Currently the IOStatisticsContext-aware classes are:
* The S3A input stream, output stream and list iterators.
* RawLocalFileSystem's input and output streams.
* The S3A committers.
* The TaskPool class in hadoop-common, which propagates
the active context into scheduled worker threads.
Collection of statistics in the IOStatisticsContext
is disabled process-wide by default until the feature
is considered stable.
To enable the collection, set the option
fs.thread.level.iostatistics.enabled
to "true" in core-site.xml.
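As a sketch, the option named above would be set with a standard
Hadoop property entry. It assumes the entry is placed inside the
existing <configuration> element of core-site.xml; only the
property name and value come from this change.

```xml
<property>
  <name>fs.thread.level.iostatistics.enabled</name>
  <value>true</value>
</property>
```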
Contributed by Mehakmeet Singh and Steve Loughran
* This PR ensures that the Protobuf generated headers
are always included first, even when these headers
are included transitively.
* This problem is specific to Windows.
* The library target hdfspp_test_shim_static is
built using the following sources, which
causes duplicate symbols to be defined -
- hdfs_shim.c
- ${LIBHDFSPP_BINDING_C}/hdfs.cc
* ${LIBHDFSPP_BINDING_C}/hdfs.cc is redundant
and removing this fixes the issue.
* This PR passes the necessary CMake args in the
pom.xml needed for building HDFS native client
on Windows.
* These arguments are exposed as maven options
and can be passed from the command-line.
This downgrades Jackson from 2.13.0, the version switched to in
HADOOP-18033, to Jackson 2.12.7.
This removes the dependency on javax.ws.rs-api,
so avoiding runtime problems with applications using
jersey-core v1 and/or jsr311-api.
The 2.12.7 release still contains the fix for CVE-2020-36518.
Contributed by PJ Fanning
* The check_c_source_compiles check fails on Windows
at link time with an "unable to resolve
external symbol" error.
* This PR links the OpenSSL library for this check to
fix this issue.
Reduce the scope of the ExitUtil synchronized blocks so that
System.exit and Runtime.halt calls fall outside them
and ExitUtil wrappers do not block each other.
Widen the catch clauses to cover all Throwables (not just Exceptions).
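The narrower lock scope can be illustrated with a minimal sketch.
This is not the ExitUtil code: ExitGuardSketch, its methods and the
IntConsumer exit hook are hypothetical names chosen for the example,
and it only covers the lock scoping, not the wider catch clauses.

```java
import java.util.function.IntConsumer;

/** Hypothetical stand-in for an exit wrapper; not the real ExitUtil. */
final class ExitGuardSketch {
  private static final Object LOCK = new Object();
  private static String firstExitMessage; // guarded by LOCK

  private ExitGuardSketch() {
  }

  /**
   * Record the first exit request under the lock, then invoke the
   * terminating action after the lock has been released.
   */
  static void terminate(int status, String message, IntConsumer exitAction) {
    // Hold the lock only long enough to update the shared state.
    synchronized (LOCK) {
      if (firstExitMessage == null) {
        firstExitMessage = message;
      }
    }
    // The terminating call runs outside the synchronized block, so a
    // slow or blocking exit cannot stall other callers on the lock.
    exitAction.accept(status);
  }

  /** Return the message of the first exit request, or null if none. */
  static String firstExitMessage() {
    synchronized (LOCK) {
      return firstExitMessage;
    }
  }
}
```

A caller would pass System::exit as the exit action in production
and a recording lambda in tests.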
Contributed by Remi Catherinot
* HDFS-16466. Implement Linux permission flags on Windows
* statinfo.cc uses POSIX permission flags.
These flags aren't available for Windows.
* This PR implements the equivalent flags
on Windows to make this code cross-platform
compatible.
* HADOOP-18321. Fix when to read an additional record from a BZip2 text file split
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> and Reviewed by Akira Ajisaka.
* YARN-10287. Update scheduler-conf corrupts the CS configuration when removing queue which is referred in queue mapping
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>