1585 Commits

Author SHA1 Message Date
Steve Loughran
5b7f68ac76
HADOOP-17771. S3AFS creation fails "Unable to find a region via the region provider chain." (#3133)
This addresses the regression in Hadoop 3.3.1 where if no S3 endpoint
is set in fs.s3a.endpoint, S3A filesystem creation may fail on
non-EC2 deployments, depending on the local host environment setup.

* If fs.s3a.endpoint is empty/null, and fs.s3a.endpoint.region
  is null, the region is set to "us-east-1".
* If fs.s3a.endpoint.region is explicitly set to "" then the client
  falls back to the SDK region resolution chain; this works on EC2
* Details in troubleshooting.md, including a workaround for Hadoop-3.3.1+
* Also contains some minor restructuring of troubleshooting.md

Contributed by Steve Loughran.
2021-06-24 16:37:27 +01:00
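The region selection described in the HADOOP-17771 entry above can be sketched as a small decision function. This is a minimal Python model of the rules in the commit message, not the connector's code: the function name, the `DEFAULT_REGION` constant, and the use of `None` to mean "defer to the SDK region resolution chain" are all assumptions made for illustration. The commit message does not say what happens when an endpoint is set but no region is given, so that branch is modelled as "defer" and marked as a guess.

```python
from typing import Optional

DEFAULT_REGION = "us-east-1"


def resolve_region(endpoint: Optional[str], region: Optional[str]) -> Optional[str]:
    """Return the region to use, or None to defer to the SDK resolution chain.

    `endpoint` models fs.s3a.endpoint and `region` models fs.s3a.endpoint.region.
    """
    if region is not None:
        # An explicit empty string means "use the SDK chain";
        # any other explicit value is used as the region.
        return region if region else None
    if not endpoint:
        # No endpoint and no region configured: fall back to the default.
        return DEFAULT_REGION
    # Assumption (not stated in the commit message): with an endpoint but
    # no region, leave region selection to the client.
    return None


print(resolve_region(None, None))
print(resolve_region(None, ""))
```

The first call models an unconfigured client and yields the default region; the second models the explicit empty-string opt-in to the SDK chain.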
Takanobu Asanuma
9e7c7ad129
HADOOP-17760. Delete hadoop.ssl.enabled and dfs.https.enable from docs and core-default.xml (#3099)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2021-06-17 09:58:47 +09:00
snehavarma
35e4c31fff
HADOOP-17714 ABFS: testBlobBackCompatibility, testRandomRead & WasbAbfsCompatibility tests fail when triggered with default configs (#3035) 2021-06-13 23:52:29 +05:30
Anoop Sam John
5970c632d4
HADOOP-17645 Fix test failures in org.apache.hadoop.fs.azure.ITestOutputStreamSemantics. (#2926) 2021-06-13 23:07:10 +05:30
Petre Bogdan Stolojan
de9ca9f155
HADOOP-17547 Magic committer to downgrade abort in cleanup if list uploads fails with access denied (#3051)
Contributed by Bogdan Stolojan
2021-06-12 17:45:12 +01:00
Anoop Sam John
2cf952baf4
HADOOP-17643 WASB : Make metadata checks case insensitive (#2972) 2021-06-12 15:25:03 +05:30
Viraj Jasani
4ef27a596f
HADOOP-17753. Keep restrict-imports-enforcer-rule for Guava Lists in top level hadoop-main pom (#3087) 2021-06-11 12:15:52 +09:00
snehavarma
4c039fafeb
HADOOP-17715 ABFS: Append blob tests with non HNS accounts fail (#3028) 2021-06-09 10:54:10 +05:30
Viraj Jasani
00d372b663
HADOOP-17725. Improve error message for token providers in ABFS (#3041)
Contributed by Viraj Jasani.
2021-06-08 22:03:03 +01:00
Akira Ajisaka
57a3613e5d
HDFS-16050. Some dynamometer tests fail. (#3079)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-06-07 14:37:30 +09:00
Viraj Jasani
f4b24c68e7
HADOOP-17743. Replace Guava Lists usage by Hadoop's own Lists in hadoop-common, hadoop-tools and cloud-storage projects (#3072) 2021-06-07 13:24:09 +09:00
sumangala-patki
76d92eb2a2
HADOOP-17596. ABFS: Change default Readahead Queue Depth from num(processors) to const (#2795)
Contributed by Sumangala Patki.
2021-06-03 14:26:15 +05:30
Akira Ajisaka
9983ab8a99
HDFS-16046. TestBalancerProcedureScheduler and TestDistCpProcedure timeouts. (#3060)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
2021-05-29 23:04:48 +09:00
zhengchenyu
d5ad181684
MAPREDUCE-7287. Distcp will delete existing file, if we use "-delete and -update" options and distcp file. (#2852)
Contributed by zhengchenyu
2021-05-28 20:21:37 +01:00
Viraj Jasani
986d0a4f1d
HADOOP-17732. Keep restrict-imports-enforcer-rule for Guava Sets in hadoop-main pom (#3049)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-05-26 17:14:31 +09:00
Steve Loughran
832a3c6a89
HADOOP-17511. Add audit/telemetry logging to S3A connector (#2807)
The S3A connector supports
"an auditor", a plugin which is invoked
at the start of every filesystem API call,
and whose issued "audit span" provides a context
for all REST operations against the S3 object store.

The standard auditor sets the HTTP Referrer header
on the requests with information about the API call,
such as process ID, operation name, path,
and even job ID.

If the S3 bucket is configured to log requests, this
information will be preserved there and so can be used
to analyze and troubleshoot storage IO.

Contributed by Steve Loughran.
2021-05-25 10:25:41 +01:00
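To illustrate the idea in the HADOOP-17511 entry above, the sketch below packs call context into a referrer-style URL and recovers it again, as one might when analyzing request logs. Everything specific here is hypothetical: the host `audit.example.org`, the parameter names `op`, `path`, `pid` and `job`, and both function names are invented for this example and are not the format the standard auditor actually emits.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_referrer(op: str, path: str, process_id: str,
                   job_id: Optional[str] = None) -> str:
    """Encode API-call context as query parameters of a referrer URL."""
    params = {"op": op, "path": path, "pid": process_id}
    if job_id:
        params["job"] = job_id
    # Hypothetical host; only the query string carries information.
    return "https://audit.example.org/?" + urlencode(params)


def parse_referrer(referrer: str) -> Dict[str, str]:
    """Recover the context from a referrer value found in a request log."""
    query = urlsplit(referrer).query
    return {key: values[0] for key, values in parse_qs(query).items()}


header = build_referrer("op_rename", "/data/file 1.csv", "1234", job_id="job_42")
print(parse_referrer(header))
```

Keeping the context in the query string means any standard URL parser can extract it from a log line.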
Mehakmeet Singh
5f400032b6
HADOOP-17705. S3A to add Config to set AWS region (#3020)
The option `fs.s3a.endpoint.region` can be used
to explicitly set the AWS region of a bucket.

This is needed when using AWS Private Link, as
the region cannot be automatically determined.

Contributed by Mehakmeet Singh
2021-05-24 13:08:45 +01:00
Mehakmeet Singh
c665ab02ed
HADOOP-17670. S3AFS and ABFS to log IOStats at DEBUG mode or optionally at INFO level in close() (#2963)
When the S3A and ABFS filesystems are closed,
their IOStatistics are logged at debug in the log:

org.apache.hadoop.fs.statistics.IOStatisticsLogging

Set `fs.iostatistics.logging.level` to `info` for the statistics
to be logged at info (or to `warn` or `error` for even higher
log levels).

Contributed by: Mehakmeet Singh
2021-05-24 13:02:11 +01:00
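The option in the HADOOP-17670 entry above amounts to mapping a configured name onto a log level, with debug as the default. A minimal Python sketch, assuming the configuration is a plain dict; the function name is hypothetical, and the fallback to debug for unrecognized values is an assumption rather than documented behaviour.

```python
import logging
from typing import Dict

_LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
}


def iostatistics_log_level(conf: Dict[str, str]) -> int:
    """Pick the level at which statistics are logged on close()."""
    name = conf.get("fs.iostatistics.logging.level", "debug").strip().lower()
    # Assumption: unknown names fall back to debug.
    return _LEVELS.get(name, logging.DEBUG)


level = iostatistics_log_level({"fs.iostatistics.logging.level": "info"})
print(logging.getLevelName(level))
```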
Viraj Jasani
e4062ad027
HADOOP-17115. Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools (#2985)
Signed-off-by: Sean Busbey <busbey@apache.org>
2021-05-20 10:47:04 -05:00
Takanobu Asanuma
207210263a
HADOOP-17375. Fix the error of TestDynamometerInfra. (#2471)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-05-07 13:52:17 +09:00
Steve Loughran
68425eb469
HADOOP-16742. NullPointerException in S3A MultiObjectDeleteSupport
Contributed by Tor Arvid Lund.

Change-Id: Iadfe9b2f355cf373031075bfbe681705a2c65bdc
2021-05-04 11:23:01 +01:00
bilaharith
f54e7646cf
HADOOP-17536. ABFS: Supporting customer provided encryption key (#2707)
Contributed by bilahari t h
2021-04-27 11:15:52 +01:00
Steve Loughran
88a550bc3a
HADOOP-17112. S3A committers can't handle whitespace in paths. (#2953)
Contributed by Krzysztof Adamski.
2021-04-25 18:33:55 +01:00
Steve Loughran
027c8fb257
HADOOP-17597. Optionally downgrade on S3A Syncable calls (#2801)
Followup to HADOOP-13327, which changed S3A output stream hsync/hflush calls
to raise an exception.

Adds a new option fs.s3a.downgrade.syncable.exceptions

When true, calls to Syncable hsync/hflush on S3A output streams will
log once at warn (for entire process life, not just the stream), then
increment IOStats with the relevant operation counter

With the downgrade option false (default)
* IOStats are incremented
* The UnsupportedOperationException currently raised includes a link to the
  JIRA.

Contributed by Steve Loughran.
2021-04-23 18:44:41 +01:00
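The two behaviours in the HADOOP-17597 entry above can be modelled in a few lines. This is a Python sketch only: the class name, the `op_hsync` counter name and the message texts are hypothetical, and Python's `io.UnsupportedOperation` stands in for Java's `UnsupportedOperationException`.

```python
import io
import logging
from collections import Counter

LOG = logging.getLogger("s3a.syncable.sketch")


class SketchOutputStream:
    """Stand-in for an output stream that cannot honour hsync()."""

    _warned = False  # class-level flag: warn once per process, not per stream

    def __init__(self, downgrade: bool, stats: Counter):
        self.downgrade = downgrade
        self.stats = stats

    def hsync(self) -> None:
        # The counter is incremented in both modes.
        self.stats["op_hsync"] += 1
        if not self.downgrade:
            raise io.UnsupportedOperation("hsync is not supported by this stream")
        if not SketchOutputStream._warned:
            SketchOutputStream._warned = True
            LOG.warning("hsync() is downgraded to a no-op on this stream type")


stats = Counter()
stream = SketchOutputStream(downgrade=True, stats=stats)
stream.hsync()
stream.hsync()  # counted again, but no second warning
print(stats["op_hsync"])
```

Holding the flag on the class rather than the instance is what gives the "once for the entire process life" behaviour.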
Ayush Saxena
6800b21e3b
HADOOP-17620. DistCp: Use Iterator for listing target directory as well. (#2861). Contributed by Ayush Saxena.
Signed-off-by: Vinayakumar B <vinayakumarb@apache.org>
2021-04-23 22:48:15 +05:30
Mehakmeet Singh
6085f09db5
HADOOP-17471. ABFS to collect IOStatistics (#2731)
The ABFS Filesystem and its input and output streams now implement
the IOStatisticSource interface and provide IOStatistics on
their interactions with Azure Storage.

This includes the min/max/mean durations of all REST API calls.

Contributed by Mehakmeet Singh <mehakmeet.singh@cloudera.com>
2021-04-23 10:28:31 +01:00
Steve Loughran
5221322b96
HADOOP-17535. ABFS: ITestAzureBlobFileSystemCheckAccess test failure if no oauth key. (#2920)
Contributed by Steve Loughran.
2021-04-21 16:06:06 +01:00
Steve Loughran
2dd1e04010
HADOOP-17641. ITestWasbUriAndConfiguration failing. (#2937)
This moves the mock account name --which is required to never exist-- from
"mockAccount" to an account name containing a static UUID.

Contributed by Steve Loughran.
2021-04-20 15:32:01 +01:00
billierinaldi
c1fde4fe94
HADOOP-16948. Support infinite lease dirs. (#1925)
* HADOOP-16948. Support single writer dirs.

* HADOOP-16948. Fix findbugs and checkstyle problems.

* HADOOP-16948. Fix remaining checkstyle problems.

* HADOOP-16948. Add DurationInfo, retry policy for acquiring lease, and javadocs

* HADOOP-16948. Convert ABFS client to use an executor for lease ops

* HADOOP-16948. Fix ABFS lease test for non-HNS

* HADOOP-16948. Fix checkstyle and javadoc

* HADOOP-16948. Address review comments

* HADOOP-16948. Use daemon threads for ABFS lease ops

* HADOOP-16948. Make lease duration configurable

* HADOOP-16948. Add error messages to test assertions

* HADOOP-16948. Remove extra isSingleWriterKey call

* HADOOP-16948. Use only infinite lease duration due to cost of renewal ops

* HADOOP-16948. Remove acquire/renew/release lease methods

* HADOOP-16948. Rename single writer dirs to infinite lease dirs

* HADOOP-16948. Fix checkstyle

* HADOOP-16948. Wait for acquire lease future

* HADOOP-16948. Add unit test for acquire lease failure
2021-04-12 19:47:59 -04:00
sumangala-patki
6f640abbaf
HADOOP-17576. ABFS: Disable throttling update for auth failures (#2761)
Contributed by Sumangala Patki
2021-04-09 09:31:23 +05:30
Viraj Jasani
3f2682b92b
HADOOP-17622. Avoid usage of deprecated IOUtils#cleanup API. (#2862)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-04-06 13:39:10 +09:00
Steve Loughran
85d3bba555
HADOOP-17476. ITestAssumeRole.testAssumeRoleBadInnerAuth failure. (#2777)
Contributed by Steve Loughran.
2021-03-24 16:47:55 +00:00
Steve Loughran
04880f076d
HADOOP-13551. AWS metrics wire-up (#2778)
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.
2021-03-24 13:32:54 +00:00
Ayush Saxena
03cfc85279
HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2732). Contributed by Ayush Saxena.
Signed-off-by: Steve Loughran <stevel@apache.org>
2021-03-24 02:36:26 +05:30
Jack Jiang
d8ec8ab965
HADOOP-17599. Remove NULL checks before instanceof (#2804) 2021-03-23 08:46:11 -07:00
Ayush Saxena
4781761dc2
HADOOP-17594. DistCp: Expose the JobId for applications executing through run method (#2786). Contributed by Ayush Saxena.
Signed-off-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Steve Loughran <stevel@apache.org>
2021-03-19 14:19:49 +05:30
Chao Sun
9b2f812996 HADOOP-17532. Yarn Job execution get failed when LZ4 Compression Codec is used. Contributed by Bhavik Patel. 2021-03-14 21:15:08 -07:00
sumangala-patki
fe633d4739
HADOOP-17548. ABFS: Toggle Store Mkdirs request overwrite parameter (#2729)
Contributed by Sumangala Patki.
2021-03-14 13:35:02 +05:30
Steve Loughran
bcd9c67082
HADOOP-16721. Improve S3A rename resilience (#2742)
The S3A connector's rename() operation now raises FileNotFoundException if
the source doesn't exist; a FileAlreadyExistsException if the destination
exists and is unsuitable for the source file/directory.

When renaming to a path which does not exist, the connector no longer checks
for the destination parent directory existing -instead it simply verifies
that there is no file immediately above the destination path.
This is needed to avoid race conditions with delete() and rename()
calls working on adjacent subdirectories.

Contributed by Steve Loughran.
2021-03-11 12:47:39 +00:00
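The precondition checks in the HADOOP-16721 entry above can be sketched against an in-memory model of a store. This is an illustrative Python model, not the connector's logic: the dict-based store, the function name, the choice to treat an existing destination file as "unsuitable", and the use of `NotADirectoryError` for a file directly above the destination are all assumptions.

```python
import posixpath
from typing import Dict


def check_rename(store: Dict[str, str], src: str, dest: str) -> None:
    """Validate a rename; `store` maps each path to "file" or "dir"."""
    if src not in store:
        raise FileNotFoundError(src)
    if dest in store:
        # Assumption: only an existing file counts as unsuitable here.
        if store[dest] == "file":
            raise FileExistsError(dest)
        return
    # Destination does not exist: do not require the parent directory to
    # exist, only verify there is no file immediately above the destination.
    parent = posixpath.dirname(dest)
    if store.get(parent) == "file":
        raise NotADirectoryError(parent)


store = {"/a": "dir", "/a/f": "file", "/b": "file"}
check_rename(store, "/a/f", "/new/parent/f")  # passes: parent need not exist
```

Skipping the parent-existence probe is what removes the window in which a concurrent delete() or rename() on a sibling directory could make the check fail spuriously.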
Akira Ajisaka
23b343aed1
HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2753)
Removed findbugs from the hadoop build images and added spotbugs instead.
Upgraded SpotBugs to 4.2.2 and spotbugs-maven-plugin to 4.2.0.

Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
2021-03-11 10:56:07 +09:00
Pierrick Hymbert
ebfba0b6fa
[HADOOP-17567] typo in MagicCommitTracker (#2749)
Contributed by Pierrick Hymbert
2021-03-10 15:39:55 +00:00
Chao Sun
176bd88890
HADOOP-16080. hadoop-aws does not work with hadoop-client-api. (#2522)
Contributed by Chao Sun.

(Cherry-picked via PR #2575)
2021-03-09 20:01:29 +00:00
Peter Bacsko
c3aa413ee3 YARN-10679. Better logging of uncaught exceptions throughout SLS. Contributed by Szilard Nemeth. 2021-03-09 14:02:12 +01:00
Peter Bacsko
099f58f8f4 YARN-10681. Fix assertion failure message in BaseSLSRunnerTest. Contributed by Szilard Nemeth. 2021-03-09 13:22:48 +01:00
Peter Bacsko
7f522c92fa YARN-10677. Logger of SLSFairScheduler is provided with the wrong class. Contributed by Szilard Nemeth. 2021-03-09 12:53:32 +01:00
Peter Bacsko
ea90cd3556 YARN-10678. Try blocks without catch blocks in SLS scheduler classes can swallow other exceptions. Contributed by Szilard Nemeth. 2021-03-09 12:03:53 +01:00
Ahmed Hussein
e04bcb3a06
MAPREDUCE-7320. organize test directories for ClusterMapReduceTestCase (#2722). Contributed by Ahmed Hussein 2021-02-26 13:42:33 -06:00
sumangala-patki
7f64030314
HADOOP-17537. ABFS: Correct assertion reversed in HADOOP-13327
Contributed by Sumangala Patki.
2021-02-22 11:45:58 +00:00
Akira Ajisaka
9a298d180d
Revert "HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2454)"
This reverts commit 4cf3531583.
2021-02-19 11:09:10 +09:00
Akira Ajisaka
4cf3531583
HADOOP-16870. Use spotbugs-maven-plugin instead of findbugs-maven-plugin (#2454)
Use spotbugs instead of findbugs. Removed findbugs from the hadoop build images,
and added spotbugs in the images instead.

Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
2021-02-17 10:38:20 +09:00
Anoop Sam John
1bb4101b59
HADOOP-17038 Support disabling buffered reads in ABFS positional reads. (#2646)
- Contributed by @anoopsjohn
2021-02-16 22:27:52 +05:30
Steve Loughran
78905d7e3f
HADOOP-16906. Abortable (#2684)
Adds an Abortable.abort() interface for streams to enable output streams to be terminated; this
is implemented by the S3A connector's output stream. It allows for commit protocols
to be implemented which commit/abort work by writing to the final destination and
using the abort() call to cancel any write which is not intended to be committed.
Consult the specification document for information about the interface and its use.

Contributed by Jungtaek Lim and Steve Loughran.
2021-02-11 17:37:20 +00:00
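The commit pattern described in the HADOOP-16906 entry above, where data only becomes visible on close() and abort() cancels the write, can be sketched as follows. This is a Python model with a dict standing in for the object store; the class name and its buffering strategy are hypothetical and say nothing about how the S3A stream is implemented.

```python
from typing import Dict


class AbortableUpload:
    """Buffers writes; the object appears in the store only on close()."""

    def __init__(self, store: Dict[str, bytes], key: str):
        self._store = store
        self._key = key
        self._buffer = bytearray()
        self._finished = False

    def write(self, data: bytes) -> None:
        if self._finished:
            raise ValueError("stream is closed or aborted")
        self._buffer.extend(data)

    def abort(self) -> None:
        # Discard buffered data; a later close() becomes a no-op.
        self._buffer.clear()
        self._finished = True

    def close(self) -> None:
        if not self._finished:
            self._store[self._key] = bytes(self._buffer)
            self._finished = True


store: Dict[str, bytes] = {}
task = AbortableUpload(store, "dest/part-0000")
task.write(b"uncommitted work")
task.abort()
task.close()
print("dest/part-0000" in store)
```

A committer built on this pattern can write straight to the final destination and rely on abort() for any task attempt that should not be committed.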
Steve Loughran
798df6d699
HADOOP-13327 Output Stream Specification. (#2587)
This defines what output streams and especially those which implement
Syncable are meant to do, and documents where implementations (HDFS; S3)
don't. With tests.

The file:// FileSystem now supports Syncable if an application calls
FileSystem.setWriteChecksum(false) before creating a file -checksumming
and Syncable.hsync() are incompatible.

Contributed by Steve Loughran.
2021-02-10 10:28:59 +00:00
bilaharith
5f34271bb1
HADOOP-17475. ABFS : add high performance listStatusIterator (#2548)
The ABFS connector now implements listStatusIterator() with
asynchronous prefetching of the next page(s) of results.
For listing large directories this can provide tangible speedups.

If for any reason this needs to be disabled, set
fs.azure.enable.abfslistiterator to false.

Contributed by Bilahari T H.
2021-02-04 13:36:19 +00:00
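The prefetching idea in the HADOOP-17475 entry above can be sketched with a background thread that requests the next page while the caller consumes the current one. This is a generic Python illustration: the `fetch_page` callback and its continuation-token convention are assumptions, and the sketch prefetches only one page ahead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (entries, continuation token); a token of None marks the last page.
Page = Tuple[List[str], Optional[str]]


def list_status_iterator(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield directory entries, fetching the next page asynchronously."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch_page, None)
        while future is not None:
            entries, token = future.result()
            # Start the next request before handing this page to the caller.
            future = pool.submit(fetch_page, token) if token is not None else None
            yield from entries


pages = {None: (["a", "b"], "t1"), "t1": (["c"], None)}
print(list(list_status_iterator(lambda token: pages[token])))
```

The speedup comes from overlapping the caller's per-entry work with the latency of the next list request.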
Steve Loughran
26b9d480e8
HADOOP-17337. S3A NetworkBinding has a runtime dependency on shaded httpclient. (#2599)
Contributed by Steve Loughran.
2021-02-03 14:29:56 +00:00
Steve Loughran
f37bf65199
HADOOP-15710. ABFS checkException to map 403 to AccessDeniedException. (#2648)
When 403 is returned from an ABFS HTTP call, an AccessDeniedException is raised.
The exception text is unchanged, for any application string matching on the getMessage() contents.

Contributed by Steve Loughran.
2021-02-02 18:13:41 +00:00
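The mapping in the HADOOP-15710 entry above is a small translation step: the status code picks the exception type while the message text is passed through untouched. A Python sketch, using `PermissionError` as a stand-in for Java's `AccessDeniedException`; the function name is hypothetical.

```python
def translate_http_failure(status: int, message: str) -> OSError:
    """Map an HTTP failure to an exception, preserving the message text."""
    if status == 403:
        return PermissionError(message)
    return OSError(message)


error = translate_http_failure(403, "AuthorizationPermissionMismatch")
print(type(error).__name__, str(error))
```

Because the text is unchanged, callers that match on the message keep working, while new callers can catch the more specific type.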
Steve Loughran
0bb52a42e5
HADOOP-17483. Magic committer is enabled by default. (#2656)
* core-default.xml updated so that fs.s3a.committer.magic.enabled = true
* CommitConstants updated to match
* All tests which previously enabled the magic committer now rely on
  default settings. This helps make sure it is enabled.
* Docs cover the switch, mention it's enabled and explain why you may
  want to disable it.
Note: this doesn't switch to using the committer -it just enables the path
rewriting magic which it depends on.

Contributed by Steve Loughran.
2021-01-27 19:04:22 +00:00
Steve Loughran
28cc912a5c
HADOOP-17493. Revert name of DELEGATION_TOKENS_ISSUED constant/statistic (#2649)
Follow-on to HADOOP-16830/HADOOP-17271.

Contributed by Steve Loughran.
2021-01-27 16:39:29 +00:00
Steve Loughran
80c7404b51
HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark (#2530)
This needs SPARK-33739 in the matching spark branch in order to work

Contributed by Steve Loughran.
2021-01-26 19:30:51 +00:00
Ayush Saxena
e40f99f6d5 HDFS-15767. RBF: Router federation rename of directory. Contributed by Jinglun. 2021-01-26 14:25:27 +05:30
Steve Loughran
06a5d3437f
HADOOP-17480. Document that AWS S3 is consistent and that S3Guard is not needed (#2636)
Contributed by Steve Loughran.
2021-01-25 13:21:34 +00:00
Maksim Bober
e2f8503ebd
HADOOP-17484. Typo in hadop-aws index.md (#2634)
Contributed by Maksim Bober.
2021-01-21 17:30:58 +00:00
Steve Loughran
68bc721841
HADOOP-17433. Skipping network I/O in S3A getFileStatus(/) breaks ITestAssumeRole. (#2600)
Contributed by Steve Loughran.
2021-01-19 17:19:27 +00:00
Szilard Nemeth
6cd540e964 YARN-7200. SLS generates a realtimetrack.json file but that file is missing the closing ']'. Contributed by Agshin Kazimli 2021-01-15 22:32:30 +01:00
Steve Loughran
724edb0354
HADOOP-17451. IOStatistics test failures in S3A code. (#2594)
Caused by HADOOP-16830 and HADOOP-17271.

Fixes tests which fail intermittently based on configs and
in the case of the HugeFile tests, bulk runs with existing
FS instances meant statistic probes sometimes ended up probing those
of a previous FS.

Contributed by Steve Loughran.

Change-Id: I65ba3f44444e59d298df25ac5c8dc5a8781dfb7d
2021-01-12 17:30:32 +00:00
Steve Loughran
05c9c2ed02 Revert "HADOOP-17451. IOStatistics test failures in S3A code. (#2594)"
This reverts commit d3014e01f3.
(fixing commit text before it is frozen)
2021-01-12 17:29:59 +00:00
Steve Loughran
d3014e01f3
HADOOP-17451. IOStatistics test failures in S3A code. (#2594)
Caused by HADOOP-16380 and HADOOP-17271.

Fixes tests which fail intermittently based on configs and
in the case of the HugeFile tests, bulk runs with existing
FS instances meant statistic probes sometimes ended up probing those
of a previous FS.

Contributed by Steve Loughran.
2021-01-12 17:25:14 +00:00
Mehakmeet Singh
0a6ddfa145
HADOOP-17272. ABFS Streams to support IOStatistics API (#2604)
Contributed by Mehakmeet Singh.
2021-01-12 15:48:09 +00:00
bilaharith
612330661b
HADOOP-17459. ADLS Gen1: Fixes for rename contract tests #2607
Contributed by Bilaharith
2021-01-12 14:00:48 +00:00
Sneha Vijayarajan
b612c310c2
HADOOP-17404. ABFS: Small write - Merge append and flush
- Contributed by Sneha Vijayarajan
2021-01-06 10:43:37 -08:00
bilaharith
d21c1c6576
HADOOP-17444. ADLS Gen1: Update adls SDK to 2.3.9 (#2551)
Contributed by bilaharith
2021-01-06 14:32:13 +00:00
Gabor Bota
42eb9ff68e
HADOOP-17454. [s3a] Disable bucket existence check - set fs.s3a.bucket.probe to 0 (#2593)
Also fixes HADOOP-16995. ITestS3AConfiguration proxy tests failures when bucket probes == 0
The improvement should include the fix, because the test would fail by default otherwise.

Change-Id: I9a7e4b5e6d4391ebba096c15e84461c038a2ec59
2021-01-05 15:43:01 +01:00
Ayush Saxena
77299ae992 HDFS-15748. RBF: Move the router related part from hadoop-federation-balance module to hadoop-hdfs-rbf. Contributed by Jinglun. 2021-01-05 00:05:03 +05:30
bilaharith
1448add08f
HADOOP-17347. ABFS: Read optimizations
- Contributed by Bilahari T H
2021-01-02 10:37:10 -08:00
Sneha Vijayarajan
5ca1ea89b3
HADOOP-17407. ABFS: Fix NPE on delete idempotency flow
- Contributed by Sneha Vijayarajan
2021-01-02 10:22:10 -08:00
Steve Loughran
617af28e80
HADOOP-17271. S3A connector to support IOStatistics. (#2580)
S3A connector to support the IOStatistics API of HADOOP-16830.

This is a major rework of the S3A Statistics collection to:

* Embrace the IOStatistics APIs
* Move from direct references of S3AInstrumentation statistics
  collectors to interface/implementation classes in new packages.
* Ubiquitous support of IOStatistics, including:
  S3AFileSystem, input and output streams, RemoteIterator instances
  provided in list calls.
* Adoption of new statistic names from hadoop-common

Regarding statistic collection, as well as all existing
statistics, the connector now records min/max/mean durations
of HTTP GET and HEAD requests, and those of LIST operations.

Contributed by Steve Loughran.
2020-12-31 21:55:39 +00:00
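The min/max/mean duration statistics mentioned in the HADOOP-17271 entry above can be kept with constant memory by tracking a count, a sum and the two extremes. A minimal Python sketch; the class and attribute names are hypothetical and durations are taken as integer milliseconds for simplicity.

```python
from typing import Optional


class DurationStats:
    """Running min/max/mean of operation durations, in milliseconds."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0
        self.min: Optional[int] = None
        self.max: Optional[int] = None

    def add(self, millis: int) -> None:
        self.count += 1
        self.total += millis
        self.min = millis if self.min is None else min(self.min, millis)
        self.max = millis if self.max is None else max(self.max, millis)

    @property
    def mean(self) -> float:
        # Derived from sum and count, so no samples need to be stored.
        return self.total / self.count if self.count else 0.0


get_requests = DurationStats()
for duration in (200, 100, 300):
    get_requests.add(duration)
print(get_requests.min, get_requests.max, get_requests.mean)
```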
Sumangala
a35fc3871b
HADOOP-17422: ABFS: Set default ListMaxResults to max server limit (#2535)
Contributed by Sumangala Patki

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
Tests run: 462, Failures: 0, Errors: 0, Skipped: 24
Tests run: 208, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
Tests run: 462, Failures: 0, Errors: 0, Skipped: 70
Tests run: 208, Failures: 0, Errors: 0, Skipped: 141
2020-12-21 06:40:36 +00:00
yzhangal
3d2193cd64
HADOOP-17338. Intermittent S3AInputStream failures: Premature end of Content-Length delimited message body etc (#2497)
Yongjun Zhang <yongjunzhang@pinterest.com>
2020-12-18 19:08:10 +00:00
bilaharith
4c033bafa0
HADOOP-17191. ABFS: Run the tests with various combinations of configurations and publish a consolidated results
- Contributed by Bilahari T H
2020-12-16 10:34:59 -08:00
Sneha Vijayarajan
5bf977e6b1
HADOOP-17413. Release elastic byte buffer pool at close
- Contributed by Sneha Vijayarajan
2020-12-14 20:45:37 -08:00
Ankit Kumar
aaf9e3d320
YARN-10491. Fix deprecation warnings in SLSWebApp.java (#2519)
Signed-off-by: Akira Ajisaka <ajisaka@apache.org>
2020-12-09 10:52:31 +09:00
Thomas Marquardt
717b835068
HADOOP-17397: ABFS: SAS Test updates for version and permission update
DETAILS:

    The previous commit for HADOOP-17397 was not the correct fix.  DelegationSASGenerator.getDelegationSAS
    should return sp=p for the set-permission and set-acl operations.  The tests have also been updated as
    follows:

    1. When saoid and suoid are not specified, skoid must have an RBAC role assignment which grants
       Microsoft.Storage/storageAccounts/blobServices/containers/blobs/modifyPermissions/action and sp=p
       to set permissions or set ACL.

    2. When saoid or suoid is specified, same as 1) but furthermore the saoid or suoid must be an owner of
       the file or directory in order for the operation to succeed.

    3. When saoid or suoid is specified, the ownership check is bypassed by also including 'o' (ownership)
       in the SAS permission (for example, sp=op).  Note that 'o' grants the saoid or suoid the ability to
       change the file or directory owner to themselves, and they can also change the owning group. Generally
       speaking, if a trusted authorizer would like to give a user the ability to change the permissions or
       ACL, then that user should be the file or directory owner.

TEST RESULTS:

    namespace.enabled=true
    auth.type=SharedKey
    -------------------
    $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
    Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
    Tests run: 462, Failures: 0, Errors: 0, Skipped: 24
    Tests run: 208, Failures: 0, Errors: 0, Skipped: 24

    namespace.enabled=true
    auth.type=OAuth
    -------------------
    $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
    Tests run: 90, Failures: 0, Errors: 0, Skipped: 0
    Tests run: 462, Failures: 0, Errors: 0, Skipped: 70
    Tests run: 208, Failures: 0, Errors: 0, Skipped: 141
2020-12-03 13:11:17 +00:00
Sneha Vijayarajan
142941b96e
HADOOP-17296. ABFS: Force reads to be always of buffer size.
Contributed by Sneha Vijayarajan.
2020-11-27 14:22:34 +00:00
Mukund Thakur
03b4e98971
HADOOP-17398. Skipping network I/O in S3A getFileStatus(/) breaks some tests (#2493)
Follow-on to HADOOP-17323.

Contributed by Mukund Thakur.
2020-11-26 20:25:32 +00:00
Steve Loughran
67dc0928c1
HADOOP-17385. ITestS3ADeleteCost.testDirMarkersFileCreation failure (#2473). Contributed by Steve Loughran
The addition of deprecated S3A configuration options in HADOOP-17318
triggered a reload of default (xml resource) configurations, which breaks
tests which fail if there's a per-bucket setting inconsistent with test
setup.

Creating an S3AFS instance before creating the Configuration() instance
for test runs gets that reload out of the way before test setup takes
place.

Along with the fix, extra changes in the failing test suite to fail
fast when marker policy isn't as expected, and to log FS state better.

Rather than create and discard an instance, add a new static method
to S3AFS and invoke it in test setup. This forces the load.

Change-Id: Id52b1c46912c6fedd2ae270e2b1eb2222a360329
2020-11-26 13:50:33 +01:00
Sneha Vijayarajan
cf43a7eaae
HADOOP-17397. ABFS: SAS Test updates for version and permission update (#2492)
Contributed by Sneha Vijayarajan.
2020-11-26 10:21:01 +00:00
Sneha Vijayarajan
009ce4f02a
HADOOP-17396. ABFS: testRenameFileOverExistingFile fails (#2491)
Contributed by Sneha Vijayarajan.
2020-11-26 10:11:25 +00:00
Steve Loughran
ac7045b75f
HADOOP-17313. FileSystem.get to support slow-to-instantiate FS clients. (#2396)
This adds a semaphore to throttle the number of FileSystem instances which
can be created simultaneously, set in "fs.creation.parallel.count".

This is designed to reduce the impact of many threads in an application calling
FileSystem.get() on a filesystem which takes time to instantiate -for example
to an object store where HTTPS connections are set up during initialization.
Many threads trying to do this may create spurious delays by conflicting
for access to synchronized blocks, when simply limiting the parallelism
diminishes the conflict, so speeds up all threads trying to access
the store.

The default value, 64, is larger than is likely to deliver any speedup -but
it does mean that there should be no adverse effects from the change.

If a service appears to be blocking on all threads initializing connections to
abfs, s3a or another store, try a smaller (possibly significantly smaller) value.

Contributed by Steve Loughran.
2020-11-25 14:31:02 +00:00
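The throttling described in the HADOOP-17313 entry above is a semaphore wrapped around a slow constructor. The Python sketch below shows the technique in isolation, under stated assumptions: `ThrottledFactory` and `measure_peak` are hypothetical names, the sleep stands in for connection setup, and the instance caching that a real FileSystem.get() performs is not modelled.

```python
import threading
import time


class ThrottledFactory:
    """Limit how many slow creations may run at the same time."""

    def __init__(self, create, permits: int = 64):
        self._create = create
        self._permits = threading.Semaphore(permits)

    def get(self, uri: str):
        with self._permits:  # blocks while `permits` creations are in flight
            return self._create(uri)


def measure_peak(permits: int, callers: int) -> int:
    """Run `callers` threads through a slow factory; return peak concurrency."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def slow_create(uri: str):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for setting up HTTPS connections
        with lock:
            state["active"] -= 1
        return object()

    factory = ThrottledFactory(slow_create, permits)
    threads = [threading.Thread(target=factory.get, args=(f"store://bucket{i}/",))
               for i in range(callers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]


print(measure_peak(permits=2, callers=8))
```

With two permits the measured peak never exceeds two, however many threads call get() at once.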
bilaharith
3193d8c793
HADOOP-17311. ABFS: Logs should redact SAS signature (#2422)
Contributed by bilaharith.
2020-11-25 14:22:10 +00:00
Mukund Thakur
5fee95076b
HADOOP-17323. S3A getFileStatus("/") to skip IO (#2479)
Contributed by Mukund Thakur.
2020-11-24 11:06:56 +00:00
Steve Loughran
9b4faf2b51
HADOOP-17332. S3A MarkerTool -min and -max are inverted. (#2425)
This patch
* fixes the inversion
* adds a precondition check
* if the commands are supplied inverted, swaps them with a warning.
  This is to stop breaking any tests written to cope with the existing
  behavior.

Contributed by Steve Loughran
2020-11-23 20:49:42 +00:00
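The handling in the HADOOP-17332 entry above, a precondition check followed by a tolerant swap, can be sketched as a small normalizing function. This is a Python illustration: the function name and the specific precondition (limits must be non-negative) are assumptions, since the commit message does not state what is checked.

```python
import logging
from typing import Tuple

LOG = logging.getLogger("markertool.sketch")


def normalize_limits(min_markers: int, max_markers: int) -> Tuple[int, int]:
    """Validate the limits and return them in (min, max) order."""
    # Assumed precondition: neither limit may be negative.
    if min_markers < 0 or max_markers < 0:
        raise ValueError("limits must be non-negative")
    if min_markers > max_markers:
        # Tolerate callers written against the old, inverted behaviour.
        LOG.warning("-min %d is greater than -max %d; swapping them",
                    min_markers, max_markers)
        return max_markers, min_markers
    return min_markers, max_markers


print(normalize_limits(5, 2))
```

Swapping with a warning, rather than failing, keeps scripts that compensated for the old inversion working.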
Steve Loughran
07b7d07388
HADOOP-17325. WASB Test Failures
Contributed by Ayush Saxena and Steve Loughran

Change-Id: I4bb76815bc1d11d1804dc67bafde68b6a995b974
2020-11-23 17:22:13 +00:00
Steve Loughran
fb79be932c
HADOOP-17343. Upgrade AWS SDK to 1.11.901 (#2468)
Contributed by Steve Loughran.
2020-11-23 14:08:12 +00:00
Jungtaek Lim
f3c629c27e
HADOOP-17388. AbstractS3ATokenIdentifier to issue date in UTC. (#2477)
Followup to HADOOP-17379.

Contributed by Jungtaek Lim.
2020-11-20 10:38:42 +00:00
Ahmed Hussein
07050339e0
HADOOP-17367. Add InetAddress api to ProxyUsers.authorize (#2449). Contributed by Daryn Sharp and Ahmed Hussein 2020-11-19 14:37:14 -06:00
Steve Loughran
ce7827c82a
HADOOP-17318. Support concurrent S3A commit jobs with same app attempt ID. (#2399)
See also [SPARK-33402]: Jobs launched in same second have duplicate MapReduce JobIDs

Contributed by Steve Loughran.

Change-Id: Iae65333cddc84692997aae5d902ad8765b45772a
2020-11-18 13:34:51 +00:00
Steve Loughran
e3c08f285a
HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2310)
This fixes the S3Guard/Directory Marker Retention integration so that when
fs.s3a.directory.marker.retention=keep, failures during multipart delete
are handled correctly, as are incremental deletes during
directory tree operations.

In both cases, when a directory marker with children is deleted from
S3, the directory entry in S3Guard is not deleted, because it is still
critical to representing the structure of the store.

Contributed by Steve Loughran.

Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
2020-11-18 12:18:11 +00:00
Jungtaek Lim
a7b923c80c
HADOOP-17379. AbstractS3ATokenIdentifier to set issue date == now. (#2466)
Unless you explicitly set it, the issue date of a delegation token identifier is 0, which confuses spark renewal (SPARK-33440). This patch makes sure that all S3A DT identifiers have the current time as issue date, fixing the problem as far as S3A tokens are concerned.

Contributed by Jungtaek Lim.
2020-11-17 14:43:29 +00:00
Doroszlai, Attila
dd85a90da6
HADOOP-17376. ITestS3AContractRename failing against stricter tests. (#2462)
Contributed by Attila Doroszlai.
2020-11-16 11:24:00 +00:00
jianghuazhu
375900049c
HDFS-15608. Reset the DistCp#CLEANUP variable definition. (#2351). Contributed by JiangHua Zhu.
Co-authored-by: zhujianghua <zhujianghua@zhujianghuadeMacBook-Pro.local>
2020-11-10 13:02:29 -08:00