Commit Graph

171 Commits

Author SHA1 Message Date
Sahil Takiar
2382f63fc0
HADOOP-14747. S3AInputStream to implement CanUnbuffer.
Author:    Sahil Takiar <stakiar@cloudera.com>
2019-04-12 18:12:02 -07:00
Steve Loughran
cf4efcab3b
HADOOP-16118. S3Guard to support on-demand DDB tables.
This is the first step for on-demand operations: things recognize when they are using on-demand tables,
as do the tests.

Contributed by Steve Loughran.
2019-04-11 17:12:12 -07:00
Steve Loughran
366186d999
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion about caching of job.addCache() handling of S3A paths; all parent dirs -the files are downloaded by the NM without  using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus Adds "true" to the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduce-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
2019-04-03 21:23:40 +01:00
Gabor Bota
b5db238383
HADOOP-15999. S3Guard: Better support for out-of-band operations.
Author:    Gabor Bota
2019-03-28 15:59:25 +00:00
Gabor Bota
cfb0186903
HADOOP-16186. S3Guard: NPE in DynamoDBMetadataStore.lambda$listChildren.
Author:    Gabor Bota
2019-03-28 15:49:56 +00:00
Steve Loughran
9f1c017f44
HADOOP-16058. S3A tests to include Terasort.
Contributed by Steve Loughran.

This includes
 - HADOOP-15890. Some S3A committer tests don't match ITest* pattern; don't run in maven
 - MAPREDUCE-7090. BigMapOutput example doesn't work with paths off cluster fs
 - MAPREDUCE-7091. Terasort on S3A to switch to new committers
 - MAPREDUCE-7092. MR examples to work better against cloud stores
2019-03-21 11:15:37 +00:00
Ben Roling
6fa229891e
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-13 20:37:11 +00:00
Steve Loughran
0cbe9ad8c2
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-09 16:00:34 +00:00
Abhishek Modi
52b2eab575
HADOOP-16093. Move DurationInfo from hadoop-aws to hadoop-common org.apache.hadoop.util.
Contributed by Abhishek Modi
2019-02-26 17:10:41 +00:00
Adam Antal
1e0ae6ed15
HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found.
Contributed by Adam Antal.

(Revised patch applied after stevel committed the wrong one; that has been reverted)
2019-02-19 11:33:02 +00:00
Steve Loughran
920a89627d
Revert "HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found."
This reverts commit c4a00d1ad3.
2019-02-18 14:57:22 +00:00
Andrew Olson
de804e53b9
HADOOP-15281. Distcp to add no-rename copy option.
Contributed by Andrew Olson.
2019-02-07 10:07:22 +00:00
Steve Loughran
f365957c63
HADOOP-15229. Add FileSystem builder-based openFile() API to match createFile();
S3A to implement S3 Select through this API.

The new openFile() API is asynchronous, and implemented across FileSystem and FileContext.

The MapReduce V2 inputs are moved to this API, and you can actually set must/may
options to pass in.

This is more useful for setting things like s3a seek policy than for S3 select,
as the existing input format/record readers can't handle S3 select output where
the stream is shorter than the file length, and splitting plain text is suboptimal.
Future work is needed there.

In the meantime, any/all filesystem connectors are now free to add their own filesystem-specific
configuration parameters which can be set in jobs and used to set filesystem input stream
options (seek policy, retry, encryption secrets, etc).

Contributed by Steve Loughran
2019-02-05 11:51:02 +00:00
Akira Ajisaka
1129288cf5
HADOOP-14178. Move Mockito up to version 2.23.4. Contributed by Akira Ajisaka and Masatake Iwasaki. 2019-01-29 18:29:56 -08:00
Steve Loughran
6d0bffe17e
HADOOP-14556. S3A to support Delegation Tokens.
Contributed by Steve Loughran and Daryn Sharp.
2019-01-14 17:59:27 +00:00
Adam Antal
c4a00d1ad3
HADOOP-15843. s3guard bucket-info command to not print a stack trace on bucket-not-found.
Contributed by Adam Antal.
2019-01-14 17:27:00 +00:00
Gabor Bota
04fcbef9c9
HADOOP-16043. NPE in ITestDynamoDBMetadataStore when fs.s3a.s3guard.ddb.table is not set.
Contributed by Gabor Bota.
2019-01-14 13:12:05 +00:00
Akira Ajisaka
7f78397036
Revert "HADOOP-14556. S3A to support Delegation Tokens."
This reverts commit d7152332b3.
2019-01-08 14:51:30 +09:00
Steve Loughran
d7152332b3
HADOOP-14556. S3A to support Delegation Tokens.
Contributed by Steve Loughran.
2019-01-07 13:18:03 +00:00
Sean Mackrory
d8f670ff28 HADOOP-15819. FileSystem cache misused in S3A integration tests. Contributed by Adam Antal. 2018-12-27 08:19:25 -07:00
Sean Mackrory
82b798581d HADOOP-15988. DynamoDBMetadataStore#innerGet should support empty directory flag when using authoritative listings. Contributed by Gabor Bota. 2018-12-12 09:30:13 -07:00
Sean Mackrory
1a25bbe9ec HADOOP-15845. Require explicit URI on CLI for s3guard init and destroy. Contributed by Gabor Bota. 2018-12-11 08:33:13 -07:00
Sean Mackrory
c35de95a22 HADOOP-15987. ITestDynamoDBMetadataStore should check if table configured properly. Contributed by Gabor Bota. 2018-12-11 08:29:39 -07:00
Sean Mackrory
3ff8580f22 HADOOP-15428. s3guard bucket-info will create s3guard table if FS is set to do this automatically. (Contributed by Gabor Bota) 2018-12-10 14:03:08 -07:00
Ewan Higgs
c1d24f8483
HDFS-13713. Add specification of Multipart Upload API to FS specification, with contract tests.
Contributed by Ewan Higgs and Steve Loughran.
2018-11-29 15:12:17 +00:00
Sean Mackrory
085f10e75d HADOOP-15947. Fix ITestDynamoDBMetadataStore test error issues. Contributed by Gabor Bota. 2018-11-28 10:45:09 -07:00
Sean Mackrory
e148c3ff09 HADOOP-15798. LocalMetadataStore put() does not retain isDeleted in parent listing. Contributed by Gabor Bota. 2018-11-28 10:45:09 -07:00
Steve Loughran
4c106fca0c
HADOOP-15932. Oozie unable to create sharelib in s3a filesystem.
Contributed by Steve Loughran.
2018-11-27 20:39:54 +00:00
Steve Loughran
ee816f1fd7
HADOOP-15837. DynamoDB table Update can fail S3A FS init.
Contributed by Steve Loughran.
2018-10-11 14:57:38 +01:00
Aaron Fabbri
4f752d442b
HADOOP-15621 2/2 S3Guard: Implement time-based (TTL) expiry for Authoritative Directory Listing. Contributed by Gabor Bota 2018-10-03 00:24:29 -07:00
Aaron Fabbri
046b8768af
HADOOP-15621 S3Guard: Implement time-based (TTL) expiry for Authoritative Directory Listing. Contributed by Gabor Bota 2018-10-02 21:22:49 -07:00
Mingliang Liu
c07715e378 HADOOP-15781 S3A assumed role tests failing due to changed error text in AWS exceptions. Contributed by Steve Loughran 2018-09-24 12:53:21 -07:00
Steve Loughran
26d0c63a1e
HADOOP-15754. s3guard: testDynamoTableTagging should clear existing config.
Contributed by Gabor Bota.
2018-09-17 22:40:08 +01:00
Steve Loughran
d7c0a08a1c
HADOOP-15426 Make S3guard client resilient to DDB throttle events and network failures (Contributed by Steve Loughran) 2018-09-12 21:04:49 -07:00
Aaron Fabbri
d32a8d5d58
HADOOP-14734 add option to tag DDB table(s) created. (Contributed by Gabor Bota and Abe Fine) 2018-09-12 16:36:01 -07:00
Mingliang Liu
1f6c4545cf HADOOP-15750. Remove obsolete S3A test ITestS3ACredentialsInURL. Contributed by Steve Loughran 2018-09-12 10:58:39 -07:00
Sean Mackrory
47b72c87eb HADOOP-15635. s3guard set-capacity command to fail fast if bucket is unguarded.
Contributed by Gabor Bota.
2018-09-12 09:12:38 -06:00
Mingliang Liu
87f63b6479 HADOOP-14833. Remove s3a user:secret authentication. Contributed by Steve Loughran 2018-09-11 17:18:42 -07:00
Steve Loughran
5a0babf765
HADOOP-15107. Stabilize/tune S3A committers; review correctness & docs.
Contributed by Steve Loughran.
2018-08-30 14:49:53 +01:00
Aaron Fabbri
d7232857d8
HADOOP-14154 Persist isAuthoritative bit in DynamoDBMetaStore (Contributed by Gabor Bota) 2018-08-17 10:15:39 -07:00
Akira Ajisaka
3e3963b035
HADOOP-15552. Move logging APIs over to slf4j in hadoop-tools - Part2. Contributed by Ian Pickering. 2018-08-16 00:31:59 +09:00
Ewan Higgs
a13929ddcb HADOOP-15645. ITestS3GuardToolLocal.testDiffCommand fails if bucket has per-bucket binding to DDB. Contributed by Steve Loughran. 2018-08-13 12:57:45 +02:00
Steve Loughran
da9a39eed1
HADOOP-15583. Stabilize S3A Assumed Role support.
Contributed by Steve Loughran.
2018-08-08 22:57:24 -07:00
Ewan Higgs
2ec97abb2e HADOOP-15576. S3A Multipart Uploader to work with S3Guard and encryption Originally contributed by Ewan Higgs with refinements by Steve Loughran. 2018-08-08 13:50:23 +02:00
Steve Loughran
48673bc2a8
HADOOP-15626. FileContextMainOperationsBaseTest.testBuilderCreateAppendExistingFile fails on filesystems without append.
Contributed by Steve Loughran.
2018-08-03 16:06:00 -07:00
Sean Mackrory
59adeb8d7f HADOOP-15636. Follow-up from HADOOP-14918; restoring test under new name. Contributed by Gabor Bota. 2018-07-27 18:23:29 -06:00
Aaron Fabbri
93ac01cb59
HADOOP-15215 s3guard set-capacity command to fail on read/write of 0 (Gabor Bota) 2018-07-03 13:50:11 -07:00
Akira Ajisaka
2b2399d623
HADOOP-15495. Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools. Contributed by Takanobu Asanuma. 2018-06-28 14:37:22 +09:00
Sean Mackrory
c687a6617d HADOOP-15423. Merge fileCache and dirCache into ine single cache in LocalMetadataStore. Contributed by Gabor Bota. 2018-06-25 14:59:41 -06:00
Sean Mackrory
b089a06793 HADOOP-14918. Remove the Local Dynamo DB test option. Contributed by Gabor Bota. 2018-06-20 16:45:08 -06:00