* Windows doesn't want the
macro _JNI_IMPORT_OR_EXPORT_
to be defined in the function
definition. It fails to compile with
the following error -
"definition of dllimport function
not allowed".
* However, Linux needs it. Hence,
we're going to add this macro
based on the OS.
* Also, we'll be compiling the `hdfs`
target as an object library so that
we can avoid linking to `jvm`
library for `get_jni_test` target.
This moves Hadoop to Apache commons-collections4.
Apache commons-collections has been removed and is completely banned from the source code.
Contributed by Nihal Jain
To support YARN deployments in clusters without HDFS,
some changes have been made in packaging:
* new hadoop-common class org.apache.hadoop.fs.HdfsCommonConstants
* hdfs class org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair moved
from hdfs-client to hadoop-common
* YARN handlers for DSQuotaExceededException replaced by use of superclass
ClusterStorageCapacityExceededException.
Contributed by Syed Shameerur Rahman
1. The class WrappedIO has been extended with more filesystem operations
- openFile()
- PathCapabilities
- StreamCapabilities
- ByteBufferPositionedReadable
All these static methods raise UncheckedIOExceptions rather than
checked ones.
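
As an illustration of the unchecked-exception convention described above, here is a minimal sketch using only the JDK. The names `UncheckedIOSketch`, `IOCall` and `uncheckIOExceptions` are hypothetical and are not taken from WrappedIO; the sketch only shows the general pattern of rethrowing a checked `IOException` as `java.io.UncheckedIOException`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class UncheckedIOSketch {

  /** An operation which returns a value and may raise a checked IOException. */
  @FunctionalInterface
  interface IOCall<T> {
    T apply() throws IOException;
  }

  /**
   * Invoke the call, rethrowing any IOException as an
   * UncheckedIOException with the original exception as its cause.
   */
  static <T> T uncheckIOExceptions(IOCall<T> call) {
    try {
      return call.apply();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private UncheckedIOSketch() {
  }
}
```

A caller that binds to such methods through reflection then only needs to handle runtime exceptions, and can recover the original failure with `getCause()`.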
2. The adjacent class org.apache.hadoop.io.wrappedio.WrappedStatistics
provides similar access to IOStatistics/IOStatisticsContext classes
and operations.
Allows callers to:
* Get a serializable IOStatisticsSnapshot from an IOStatisticsSource or
IOStatistics instance
* Save an IOStatisticsSnapshot to file
* Convert an IOStatisticsSnapshot to JSON
* Given an object which may be an IOStatisticsSource, return an object
whose toString() value is a dynamically generated, human readable summary.
This is for logging.
* Separate getters to the different sections of IOStatistics.
* Mean values are returned as a Map.Entry<Long, Long> of (samples, sum)
from which means may be calculated.
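
A sketch of how a caller might turn such a (samples, sum) pair into a mean, assuming the pair is exposed as a `java.util.Map.Entry<Long, Long>` with the sample count as the key and the sum as the value. The class and method names are hypothetical, and returning 0.0 for an empty sample set is a choice made for this example only.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class MeanSketch {

  /**
   * Compute the arithmetic mean from a (samples, sum) pair.
   * Returns 0.0 when there are no samples, to avoid dividing by zero.
   */
  static double mean(Map.Entry<Long, Long> samplesAndSum) {
    long samples = samplesAndSum.getKey();
    long sum = samplesAndSum.getValue();
    return samples > 0 ? (double) sum / samples : 0.0;
  }

  private MeanSketch() {
  }
}
```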
There are examples of the dynamic bindings to these classes in:
org.apache.hadoop.io.wrappedio.impl.DynamicWrappedIO
org.apache.hadoop.io.wrappedio.impl.DynamicWrappedStatistics
These use DynMethods and other classes in the package
org.apache.hadoop.util.dynamic which are based on the
Apache Parquet equivalents.
This makes re-implementing these straightforward in that library and in others
which have their own fork of the classes (example: Apache Iceberg).
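
The general idea of dynamic binding can be sketched with plain `java.lang.reflect`. This is a stand-in, not the DynMethods API: the class `DynamicBindingSketch` and its method `findStatic` are hypothetical, and the sketch only shows how a library could probe for a static method at runtime and degrade gracefully when the class or method is absent.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;

/** Hypothetical helper for illustration; not the DynMethods API. */
final class DynamicBindingSketch {

  /**
   * Look up a public static method by class name, method name and
   * argument types. Returns empty if the class cannot be loaded,
   * the method does not exist, or the method is not static.
   */
  static Optional<Method> findStatic(
      String className, String methodName, Class<?>... argTypes) {
    try {
      Method m = Class.forName(className).getMethod(methodName, argTypes);
      return Modifier.isStatic(m.getModifiers())
          ? Optional.of(m)
          : Optional.empty();
    } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
      // The class or method is not available on this classpath.
      return Optional.empty();
    }
  }

  private DynamicBindingSketch() {
  }
}
```

A caller would check `isPresent()` once at startup and fall back to an older code path when the lookup is empty, so the same binary can run against releases with and without the wrapped classes.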
3. The openFile() option "fs.option.openfile.read.policy" now
accepts specific file format policies for the core filetypes
* avro
* columnar
* csv
* hbase
* json
* orc
* parquet
S3A chooses the appropriate sequential/random policy for each format.
A policy `parquet, columnar, vector, random, adaptive` will use the parquet policy for
any filesystem aware of it. Other filesystems fall back to the first entry in the list which
the specific version of the filesystem recognizes.
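
The fallback rule for a policy list can be sketched as follows. This is an illustration of the behaviour described above, not the code of S3A or any other filesystem: `ReadPolicySketch`, `choosePolicy` and the `recognized` set are hypothetical names, and the sketch assumes the option value is a comma-separated list with optional whitespace around each entry.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class ReadPolicySketch {

  /**
   * Pick the first policy in a comma-separated list which the
   * filesystem recognizes. Returns empty if none is recognized.
   *
   * @param option     value of the read policy option
   * @param recognized policies this filesystem version understands
   */
  static Optional<String> choosePolicy(String option, Set<String> recognized) {
    return Arrays.stream(option.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .filter(recognized::contains)
        .findFirst();
  }

  private ReadPolicySketch() {
  }
}
```

With the list `parquet, columnar, vector, random, adaptive`, a filesystem whose recognized set contains `parquet` selects it, while one that only knows `random` and `adaptive` selects `random`.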
4. New Path capability fs.capability.virtual.block.locations
Indicates that locations are generated client side
and don't refer to real hosts.
Contributed by Steve Loughran