Adds the option fs.s3a.requester.pays.enabled, which, if set to true, allows
the client to access S3 buckets where the requester is billed for the IO.
Contributed by Daniel Carl Jones
* The source files for hdfs_ls
uses getopt for parsing the
command line arguments.
* getopt is available only on
Linux and thus, isn't cross
platform.
* Thus, we need to replace
getopt with
boost::program_options to
make this tool cross platform.
Optimize the scan for s3 by performing a deep tree listing,
inferring directory counts from the paths returned.
Contributed by Ahmar Suhail.
Change-Id: I26ffa8c6f65fd11c68a88d6e2243b0eac6ffd024
RBF proxy. There is a new configuration knob dfs.namenode.ip-proxy-users that configures
the list of users than can set their client ip address using the client context.
Fixes#4081
* The source files for hdfs_find uses
getopt for parsing the command
line arguments. getopt is available
only on Linux and thus, isn't cross
platform.
* Thus, we need to replace getopt
with boost::program_options to
make hdfs_find cross platform.
Follow-on patch to MAPREDUCE-7341, adding ABFS support and tests
* resilient rename
* tests for job commit through the manifest committer.
contains
- HADOOP-17976. ABFS etag extraction inconsistent between LIST and HEAD calls
- HADOOP-16204. ABFS tests to include terasort
Contributed by Steve Loughran.
Change-Id: I0a7d4043bdf19bcb00c033fc389730109b93b77f
This is a mapreduce/spark output committer optimized for
performance and correctness on Azure ADLS Gen 2 storage
(via the abfs connector) and Google Cloud Storage
(via the external gcs connector library).
* It is safe to use with HDFS, however it has not been optimized
for that use.
* It is *not* safe for use with S3, and will fail if an attempt
is made to do so.
Contributed by Steve Loughran
Change-Id: I6f3502e79c578b9fd1a8c1485f826784b5421fca
* New statistic names in StoreStatisticNames
(for joint use with s3a committers)
* Improvements to IOStatistics implementation classes
* RateLimiting wrapper to guava RateLimiter
* S3A committer Tasks moved over as TaskPool and
added support for RemoteIterator
* JsonSerialization.load() to fail fast if source does not exist
+ tests.
This commit is a prerequisite for the main MAPREDUCE-7341 Manifest Committer
patch.
Contributed by Steve Loughran
Change-Id: Ia92e2ab5083ac3d8d3d713a4d9cb3e9e0278f654
To get the new behavior, define fs.viewfs.trash.force-inside-mount-point to be true.
If the trash root for path p is in the same mount point as path p,
and one of:
* The mount point isn't at the top of the target fs.
* The resolved path of path is root (eg it is the fallback FS).
* The trash root isn't in user's target fs home directory.
get the corresponding viewFS path for the trash root and return it.
Otherwise, use <mnt>/.Trash/<user>.
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
Multi object delete of size more than 1000 is not supported by S3 and
fails with MalformedXML error. So implementing paging of requests to
reduce the number of keys in a single request. Page size can be configured
using "fs.s3a.bulk.delete.page.size"
Contributed By: Mukund Thakur
* Centos 8 has reached its
End-of-Life and thus its
packages are no longer
accessible from
mirror.centos.org.
* This PR switches the baseurl
to vault.centos.org where
the packages are archived.