186 Commits

Author SHA1 Message Date
Steve Loughran
cf4efcab3b
HADOOP-16118. S3Guard to support on-demand DDB tables.
This is the first step toward on-demand operations: the code and the tests now recognize when they are using on-demand tables.

Contributed by Steve Loughran.
2019-04-11 17:12:12 -07:00
Steve Loughran
215ffc792e HADOOP-16197 S3AUtils.translateException to map CredentialInitializationException to AccessDeniedException
Contributed by Steve Loughran.

Change-Id: Ie98ca5210bf0009f297edbcacf1fc6dfe5ea70cd.
2019-04-04 21:14:18 +01:00
Steve Loughran
366186d999
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix confusion in how job.addCache() handles S3A paths and all their parent dirs: the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus sets the superclass's encrypted flag to "true" during construction.

Tests:
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduced-privilege downloads.
* Extend ITestDelegatedMRJob to test a completely different bucket (OpenStreetMap), to verify that cached resources do get their tokens picked up.

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible-to-debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
2019-04-03 21:23:40 +01:00
Steve Loughran
df578c07ec HADOOP-16195 MarshalledCredentials toString
Change-Id: I4f1bdd2be0d5760c5501dce6edb6122499108b53
2019-03-28 17:01:57 +00:00
Gabor Bota
b5db238383
HADOOP-15999. S3Guard: Better support for out-of-band operations.
Author:    Gabor Bota
2019-03-28 15:59:25 +00:00
Gabor Bota
cfb0186903
HADOOP-16186. S3Guard: NPE in DynamoDBMetadataStore.lambda$listChildren.
Author:    Gabor Bota
2019-03-28 15:49:56 +00:00
Lokesh Jain
ae2eb2dd42 HADOOP-16201: S3AFileSystem#innerMkdirs builds needless lists (#636) 2019-03-22 11:42:00 +00:00
Ben Roling
6fa229891e
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-13 20:37:11 +00:00
Steve Loughran
0cbe9ad8c2
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-09 16:00:34 +00:00
Abhishek Modi
52b2eab575
HADOOP-16093. Move DurationInfo from hadoop-aws to hadoop-common org.apache.hadoop.util.
Contributed by Abhishek Modi
2019-02-26 17:10:41 +00:00
Adam Antal
1e0ae6ed15
HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found.
Contributed by Adam Antal.

(Revised patch applied after stevel committed the wrong one; that has been reverted)
2019-02-19 11:33:02 +00:00
Steve Loughran
920a89627d
Revert "HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found."
This reverts commit c4a00d1ad3.
2019-02-18 14:57:22 +00:00
Masatake Iwasaki
6c999fe4b0 HADOOP-16098. Fix javadoc warnings in hadoop-aws. Contributed by Masatake Iwasaki. 2019-02-12 06:07:47 +09:00
Steve Loughran
f365957c63
HADOOP-15229. Add FileSystem builder-based openFile() API to match createFile();
S3A to implement S3 Select through this API.

The new openFile() API is asynchronous, and implemented across FileSystem and FileContext.

The MapReduce V2 inputs are moved to this API, and you can actually set must/may
options to pass in.

This is more useful for setting things like s3a seek policy than for S3 select,
as the existing input format/record readers can't handle S3 select output where
the stream is shorter than the file length, and splitting plain text is suboptimal.
Future work is needed there.

In the meantime, any/all filesystem connectors are now free to add their own filesystem-specific
configuration parameters which can be set in jobs and used to set filesystem input stream
options (seek policy, retry, encryption secrets, etc).

Contributed by Steve Loughran
2019-02-05 11:51:02 +00:00
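
The must/may option handling described in HADOOP-15229 above can be illustrated with a small, self-contained sketch. This is not the Hadoop implementation: the class OpenFileSketch, its OpenBuilder, the option keys and the map it returns are all hypothetical stand-ins, and the behaviour shown (an unrecognised mandatory option fails the open, an unrecognised optional one is silently ignored, the result is delivered asynchronously) is an assumption drawn from the commit message rather than a copy of the real API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical, simplified model of a builder-based open with must/may options. */
public class OpenFileSketch {

    /** Hypothetical builder; NOT the real Hadoop class. */
    static final class OpenBuilder {
        private final String path;
        private final Set<String> supportedKeys;
        private final Map<String, String> optional = new HashMap<>();
        private final Map<String, String> mandatory = new HashMap<>();

        OpenBuilder(String path, Set<String> supportedKeys) {
            this.path = path;
            this.supportedKeys = supportedKeys;
        }

        /** "May" option: a hint the store is free to ignore. */
        OpenBuilder opt(String key, String value) {
            optional.put(key, value);
            return this;
        }

        /** "Must" option: the open is rejected if the store does not know the key. */
        OpenBuilder must(String key, String value) {
            mandatory.put(key, value);
            return this;
        }

        /** Validates mandatory keys, then completes asynchronously with the options in effect. */
        CompletableFuture<Map<String, String>> build() {
            for (String key : mandatory.keySet()) {
                if (!supportedKeys.contains(key)) {
                    throw new IllegalArgumentException(
                        "Unsupported mandatory option for " + path + ": " + key);
                }
            }
            Map<String, String> effective = new HashMap<>(mandatory);
            // Optional keys are applied only when the store recognises them.
            optional.forEach((k, v) -> {
                if (supportedKeys.contains(k)) {
                    effective.putIfAbsent(k, v);
                }
            });
            // A real connector would return an open stream here, not a map.
            return CompletableFuture.supplyAsync(() -> effective);
        }
    }

    public static void main(String[] args) {
        // Hypothetical key; a real connector defines its own option names.
        Set<String> known = Set.of("example.seek.policy");
        Map<String, String> options =
            new OpenBuilder("s3a://bucket/data.csv", known)
                .must("example.seek.policy", "random")
                .opt("example.unknown.hint", "ignored")
                .build()
                .join();
        System.out.println(options);
    }
}
```

In this sketch a job can pass connector-specific hints through opt() without breaking on stores that do not understand them, while must() is reserved for options whose absence would make the read incorrect.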
Steve Loughran
6d0bffe17e
HADOOP-14556. S3A to support Delegation Tokens.
Contributed by Steve Loughran and Daryn Sharp.
2019-01-14 17:59:27 +00:00
Adam Antal
c4a00d1ad3
HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found.
Contributed by Adam Antal.
2019-01-14 17:27:00 +00:00
Akira Ajisaka
7f78397036
Revert "HADOOP-14556. S3A to support Delegation Tokens."
This reverts commit d7152332b3.
2019-01-08 14:51:30 +09:00
Steve Loughran
d7152332b3
HADOOP-14556. S3A to support Delegation Tokens.
Contributed by Steve Loughran.
2019-01-07 13:18:03 +00:00
Sean Mackrory
82b798581d HADOOP-15988. DynamoDBMetadataStore#innerGet should support empty directory flag when using authoritative listings. Contributed by Gabor Bota. 2018-12-12 09:30:13 -07:00
Sean Mackrory
1a25bbe9ec HADOOP-15845. Require explicit URI on CLI for s3guard init and destroy. Contributed by Gabor Bota. 2018-12-11 08:33:13 -07:00
Sean Mackrory
3ff8580f22 HADOOP-15428. s3guard bucket-info will create s3guard table if FS is set to do this automatically. (Contributed by Gabor Bota) 2018-12-10 14:03:08 -07:00
Sean Mackrory
7eb0d3a324 HADOOP-14927. ITestS3GuardTool failures in testDestroyNoBucket(). Contributed by Gabor Bota. 2018-11-29 09:36:39 -07:00
Ewan Higgs
c1d24f8483
HDFS-13713. Add specification of Multipart Upload API to FS specification, with contract tests.
Contributed by Ewan Higgs and Steve Loughran.
2018-11-29 15:12:17 +00:00
Sean Mackrory
085f10e75d HADOOP-15947. Fix ITestDynamoDBMetadataStore test error issues. Contributed by Gabor Bota. 2018-11-28 10:45:09 -07:00
Sean Mackrory
e148c3ff09 HADOOP-15798. LocalMetadataStore put() does not retain isDeleted in parent listing. Contributed by Gabor Bota. 2018-11-28 10:45:09 -07:00
Sean Mackrory
5d96b74f33 HADOOP-15370. S3A log message on rm s3a://bucket/ not intuitive. Contributed by Gabor Bota. 2018-11-28 10:45:09 -07:00
Steve Loughran
4c106fca0c
HADOOP-15932. Oozie unable to create sharelib in s3a filesystem.
Contributed by Steve Loughran.
2018-11-27 20:39:54 +00:00
Steve Loughran
d59ca43bff
HADOOP-15826. @Retries annotation of putObject() call & uses wrong.
Contributed by Steve Loughran and Ewan Higgs.
2018-10-16 20:02:54 +01:00
Steve Loughran
ee816f1fd7
HADOOP-15837. DynamoDB table Update can fail S3A FS init.
Contributed by Steve Loughran.
2018-10-11 14:57:38 +01:00
Steve Loughran
7ba1cfdea7
HADOOP-15827. NPE in DynamoDBMetadataStore.lambda$listChildren for root + auth S3Guard.
Contributed by Gabor Bota
2018-10-09 10:46:41 +01:00
Aaron Fabbri
4f752d442b
HADOOP-15621 2/2 S3Guard: Implement time-based (TTL) expiry for Authoritative Directory Listing. Contributed by Gabor Bota 2018-10-03 00:24:29 -07:00
Aaron Fabbri
046b8768af
HADOOP-15621 S3Guard: Implement time-based (TTL) expiry for Authoritative Directory Listing. Contributed by Gabor Bota 2018-10-02 21:22:49 -07:00
Sunil G
d060cbea48 HDFS-13937. Multipart Uploader APIs to be marked as private/unstable in 3.2.0. Contributed by Steve Loughran. 2018-09-24 21:19:47 +05:30
Steve Loughran
d7c0a08a1c
HADOOP-15426 Make S3Guard client resilient to DDB throttle events and network failures (Contributed by Steve Loughran) 2018-09-12 21:04:49 -07:00
Aaron Fabbri
d32a8d5d58
HADOOP-14734 add option to tag DDB table(s) created. (Contributed by Gabor Bota and Abe Fine) 2018-09-12 16:36:01 -07:00
Sean Mackrory
47b72c87eb HADOOP-15635. s3guard set-capacity command to fail fast if bucket is unguarded.
Contributed by Gabor Bota.
2018-09-12 09:12:38 -06:00
Mingliang Liu
87f63b6479 HADOOP-14833. Remove s3a user:secret authentication. Contributed by Steve Loughran 2018-09-11 17:18:42 -07:00
Gabor Bota
36c7c78260
HADOOP-15709 Move S3Guard LocalMetadataStore constants to org.apache.hadoop.fs.s3a.Constants (Contributed by Gabor Bota) 2018-09-07 10:25:20 -07:00
Steve Loughran
5a0babf765
HADOOP-15107. Stabilize/tune S3A committers; review correctness & docs.
Contributed by Steve Loughran.
2018-08-30 14:49:53 +01:00
Steve Loughran
2e6c1109dc
HADOOP-15667. FileSystemMultipartUploader should verify that UploadHandle has non-0 length.
Contributed by Ewan Higgs
2018-08-30 14:33:16 +01:00
Aaron Fabbri
d7232857d8
HADOOP-14154 Persist isAuthoritative bit in DynamoDBMetaStore (Contributed by Gabor Bota) 2018-08-17 10:15:39 -07:00
Steve Loughran
da9a39eed1
HADOOP-15583. Stabilize S3A Assumed Role support.
Contributed by Steve Loughran.
2018-08-08 22:57:24 -07:00
Ewan Higgs
2ec97abb2e HADOOP-15576. S3A Multipart Uploader to work with S3Guard and encryption Originally contributed by Ewan Higgs with refinements by Steve Loughran. 2018-08-08 13:50:23 +02:00
Sean Mackrory
a08812a1b1 HADOOP-15349. S3Guard DDB retryBackoff to be more informative on limits exceeded. Contributed by Gabor Bota. 2018-07-12 17:24:01 +02:00
Sean Mackrory
d503f65b66 HADOOP-15541. [s3a] Shouldn't try to drain stream before aborting connection in case of timeout.
2018-07-10 17:52:57 +02:00
Aaron Fabbri
93ac01cb59
HADOOP-15215 s3guard set-capacity command to fail on read/write of 0 (Gabor Bota) 2018-07-03 13:50:11 -07:00
Akira Ajisaka
2b2399d623
HADOOP-15495. Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools. Contributed by Takanobu Asanuma. 2018-06-28 14:37:22 +09:00
Sean Mackrory
c687a6617d HADOOP-15423. Merge fileCache and dirCache into one single cache in LocalMetadataStore. Contributed by Gabor Bota. 2018-06-25 14:59:41 -06:00
Sean Mackrory
55fad6a3de HADOOP-15416. Clear error message in S3Guard diff if source not found. Contributed by Gabor Bota. 2018-06-22 11:36:56 -06:00
Sean Mackrory
b089a06793 HADOOP-14918. Remove the Local Dynamo DB test option. Contributed by Gabor Bota. 2018-06-20 16:45:08 -06:00