HADOOP-17386. Change default fs.s3a.buffer.dir to be under Yarn container path on yarn applications (#3908)

Co-authored-by: Monthon Klongklaew <monthonk@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
monthonk 2022-02-22 04:50:27 +00:00 committed by GitHub
parent e363f51ffb
commit 1f157f802d
3 changed files with 16 additions and 7 deletions

@@ -1617,9 +1617,12 @@
 <property>
 <name>fs.s3a.buffer.dir</name>
-<value>${hadoop.tmp.dir}/s3a</value>
+<value>${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a</value>
 <description>Comma separated list of directories that will be used to buffer file
-uploads to.</description>
+uploads to.
+Yarn container path will be used as default value on yarn applications,
+otherwise fall back to hadoop.tmp.dir
+</description>
 </property>
 <property>
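
The new default nests one reference inside another: `LOCAL_DIRS` from the process environment is preferred, and `hadoop.tmp.dir` is used only when that variable is unset or empty. The sketch below imitates that resolution order in plain Python to show what the value expands to in each case. It is not Hadoop's `Configuration` code; the `resolve` helper, the `DEFAULT_BUFFER_DIR` name, and the example paths are all assumptions made for illustration.

```python
import re

# The new default value taken from the diff above.
DEFAULT_BUFFER_DIR = "${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a"

# Matches an innermost ${...} reference (nothing nested inside it).
_VAR = re.compile(r"\$\{([^${}]+)\}")


def resolve(value, props, env, max_passes=20):
    """Expand ${name} and ${env.NAME:-fallback} references, innermost first.

    Simplified, hypothetical stand-in for configuration variable expansion.
    """

    def expand(match):
        ref = match.group(1)
        if ref.startswith("env."):
            # "NAME:-fallback": use the fallback when NAME is unset or empty.
            name, sep, fallback = ref[len("env."):].partition(":-")
            current = env.get(name, "")
            if current:
                return current
            return fallback if sep else match.group(0)
        # Plain property reference; unknown names are left untouched.
        return props.get(ref, match.group(0))

    # Repeat until nothing changes, so outer references see expanded inner ones.
    for _ in range(max_passes):
        expanded = _VAR.sub(expand, value)
        if expanded == value:
            break
        value = expanded
    return value


# Hypothetical values, chosen only for the example.
props = {"hadoop.tmp.dir": "/tmp/hadoop-alice"}
outside_yarn = resolve(DEFAULT_BUFFER_DIR, props, env={})
inside_yarn = resolve(
    DEFAULT_BUFFER_DIR, props, env={"LOCAL_DIRS": "/yarn/local/appcache/app_1"}
)
print(outside_yarn)  # /tmp/hadoop-alice/s3a
print(inside_yarn)   # /yarn/local/appcache/app_1/s3a
```

Setting `fs.s3a.buffer.dir` explicitly in a site configuration still overrides this default in either environment.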

@@ -546,7 +546,7 @@ Conflict management is left to the execution engine itself.
 | Option | Meaning | Default |
 |--------|---------|---------|
 | `mapreduce.fileoutputcommitter.marksuccessfuljobs` | Write a `_SUCCESS` file on the successful completion of the job. | `true` |
-| `fs.s3a.buffer.dir` | Local filesystem directory for data being written and/or staged. | `${hadoop.tmp.dir}/s3a` |
+| `fs.s3a.buffer.dir` | Local filesystem directory for data being written and/or staged. | `${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a` |
 | `fs.s3a.committer.magic.enabled` | Enable "magic committer" support in the filesystem. | `true` |
 | `fs.s3a.committer.abort.pending.uploads` | list and abort all pending uploads under the destination path when the job is committed or aborted. | `true` |
 | `fs.s3a.committer.threads` | Number of threads in committers for parallel operations on files. | 8 |

@@ -967,9 +967,12 @@ options are covered in [Testing](./testing.md).
 <property>
 <name>fs.s3a.buffer.dir</name>
-<value>${hadoop.tmp.dir}/s3a</value>
+<value>${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a</value>
 <description>Comma separated list of directories that will be used to buffer file
-uploads to.</description>
+uploads to.
+Yarn container path will be used as default value on yarn applications,
+otherwise fall back to hadoop.tmp.dir
+</description>
 </property>
 <property>
@@ -1746,9 +1749,12 @@ consumed, and so eliminates heap size as the limiting factor in queued uploads
 <property>
 <name>fs.s3a.buffer.dir</name>
-<value>${hadoop.tmp.dir}/s3a</value>
+<value>${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a</value>
 <description>Comma separated list of directories that will be used to buffer file
-uploads to.</description>
+uploads to.
+Yarn container path will be used as default value on yarn applications,
+otherwise fall back to hadoop.tmp.dir
+</description>
 </property>
 ```