36 Commits

Author SHA1 Message Date
Steve Loughran
e123de9f19
HADOOP-16202. Enhanced openFile(): mapreduce and YARN changes. (#2584/2)
These changes ensure that sequential files are opened with the
right read policy, and split start/end is passed in.

As well as offering opportunities for filesystem clients to
choose fetch/cache/seek policies, the settings ensure that
processing text files on an s3 bucket where the default policy
is "random" will still be processed efficiently.

This commit depends on the associated hadoop-common patch,
which must be committed first.

Contributed by Steve Loughran.

Change-Id: Ic6713fd752441cf42ebe8739d05c2293a5db9f94
2022-04-27 19:23:25 +01:00
Steve Loughran
f365957c63
HADOOP-15229. Add FileSystem builder-based openFile() API to match createFile();
S3A to implement S3 Select through this API.

The new openFile() API is asynchronous, and implemented across FileSystem and FileContext.

The MapReduce V2 inputs are moved to this API, and you can actually set must/may
options to pass in.

This is more useful for setting things like s3a seek policy than for S3 select,
as the existing input format/record readers can't handle S3 select output where
the stream is shorter than the file length, and splitting plain text is suboptimal.
Future work is needed there.

In the meantime, any/all filesystem connectors are now free to add their own filesystem-specific
configuration parameters which can be set in jobs and used to set filesystem input stream
options (seek policy, retry, encryption secrets, etc).
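
The must/may distinction described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not Hadoop's implementation: the class OpenOptionsSketch, its Builder, the resolve() method and the "example.*" option keys are all hypothetical names invented for illustration. It shows the intended contract: an optional ("may") setting that a connector does not understand is ignored, while a mandatory ("must") setting that it does not understand makes the open fail.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of must/may open options; not Hadoop's actual classes. */
public class OpenOptionsSketch {

    /** Collects optional ("may") and mandatory ("must") options before opening a file. */
    static final class Builder {
        private final Map<String, String> optional = new LinkedHashMap<>();
        private final Map<String, String> mandatory = new LinkedHashMap<>();

        /** Set an option the connector is free to ignore. */
        Builder opt(String key, String value) {
            mandatory.remove(key);
            optional.put(key, value);
            return this;
        }

        /** Set an option the connector has to understand. */
        Builder must(String key, String value) {
            optional.remove(key);
            mandatory.put(key, value);
            return this;
        }

        /**
         * Resolve the options against the keys a (hypothetical) connector supports.
         * Unknown mandatory keys are rejected; unknown optional keys are dropped.
         */
        Map<String, String> resolve(Set<String> supportedKeys) {
            Map<String, String> accepted = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : mandatory.entrySet()) {
                if (!supportedKeys.contains(e.getKey())) {
                    throw new IllegalArgumentException(
                        "Unsupported mandatory option: " + e.getKey());
                }
                accepted.put(e.getKey(), e.getValue());
            }
            for (Map.Entry<String, String> e : optional.entrySet()) {
                if (supportedKeys.contains(e.getKey())) {
                    accepted.put(e.getKey(), e.getValue());
                }
            }
            return accepted;
        }
    }

    public static void main(String[] args) {
        // Assumption: this connector only understands one illustrative key.
        Set<String> supported = Set.of("example.seek.policy");

        Map<String, String> accepted = new Builder()
            .opt("example.seek.policy", "sequential") // understood, so kept
            .opt("example.unknown.hint", "x")         // not understood, so ignored
            .resolve(supported);
        System.out.println(accepted);
    }
}
```

In this model a job can pass connector-specific hints with opt() without breaking other filesystems, and reserve must() for settings whose absence would make the read incorrect.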

Contributed by Steve Loughran
2019-02-05 11:51:02 +00:00
Akira Ajisaka
3e3963b035
HADOOP-15552. Move logging APIs over to slf4j in hadoop-tools - Part2. Contributed by Ian Pickering. 2018-08-16 00:31:59 +09:00
Johan Gustavsson
d14e26b31f
HADOOP-15477. Make unjar in RunJar overrideable
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2018-05-28 17:29:59 +09:00
Arpit Agarwal
2fa7963c3d HADOOP-15254. Correct the wrong word spelling 'intialize'. Contributed by fang zhenyi. 2018-02-24 14:41:55 -08:00
Allen Wittenauer
4222c97108
HADOOP-10392. Use FileSystem#makeQualified(Path) instead of Path#makeQualified(FileSystem) (ajisakaa via aw) 2017-08-11 09:25:56 -07:00
Sean Mackrory
1a1bf6b7d0 HADOOP-13595. Rework hadoop_usage to be broken up by clients/daemons/etc. Contributed by Allen Wittenauer. 2017-08-02 12:25:05 -06:00
Chris Douglas
6eba79232f HADOOP-14271. Correct spelling of 'occurred' and variants. Contributed by Yeliang Cang 2017-04-03 20:13:14 -07:00
Akira Ajisaka
490abfb10f HADOOP-14057. Fix package.html to compile with Java 9. 2017-03-04 00:25:22 +09:00
Allen Wittenauer
58ed4fa544 HADOOP-13341. Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS
This commit includes the following changes:

	HADOOP-13356. Add a function to handle command_subcommand_OPTS
	HADOOP-13355. Handle HADOOP_CLIENT_OPTS in a function
	HADOOP-13554. Add an equivalent of hadoop_subcmd_opts for secure opts
	HADOOP-13562. Change hadoop_subcommand_opts to use only uppercase
	HADOOP-13358. Modify HDFS to use hadoop_subcommand_opts
	HADOOP-13357. Modify common to use hadoop_subcommand_opts
	HADOOP-13359. Modify YARN to use hadoop_subcommand_opts
	HADOOP-13361. Modify hadoop_verify_user to be consistent with hadoop_subcommand_opts (ie more granularity)
	HADOOP-13564. modify mapred to use hadoop_subcommand_opts
	HADOOP-13563. hadoop_subcommand_opts should print name not actual content during debug
	HADOOP-13360. Documentation for HADOOP_subcommand_OPTS

This closes apache/hadoop#126
2016-09-12 11:10:00 -07:00
Allen Wittenauer
730bc746f9 HADOOP-12930. Dynamic subcommands for hadoop shell scripts (aw)
This commit contains the following JIRA issues:

    HADOOP-12931. bin/hadoop work for dynamic subcommands
    HADOOP-12932. bin/yarn work for dynamic subcommands
    HADOOP-12933. bin/hdfs work for dynamic subcommands
    HADOOP-12934. bin/mapred work for dynamic subcommands
    HADOOP-12935. API documentation for dynamic subcommands
    HADOOP-12936. modify hadoop-tools to take advantage of dynamic subcommands
    HADOOP-13086. enable daemonization of dynamic commands
    HADOOP-13087. env var doc update for dynamic commands
    HADOOP-13088. fix shellprofiles in hadoop-tools to allow replacement
    HADOOP-13089. hadoop distcp adds client opts twice when dynamic
    HADOOP-13094. hadoop-common unit tests for dynamic commands
    HADOOP-13095. hadoop-hdfs unit tests for dynamic commands
    HADOOP-13107. clean up how rumen is executed
    HADOOP-13108. dynamic subcommands need a way to manipulate arguments
    HADOOP-13110. add a streaming subcommand to mapred
    HADOOP-13111. convert hadoop gridmix to be dynamic
    HADOOP-13115. dynamic subcommand docs should talk about exit vs. continue program flow
    HADOOP-13117. clarify daemonization and security vars for dynamic commands
    HADOOP-13120. add a --debug message when dynamic commands have been used
    HADOOP-13121. rename sub-project shellprofiles to match the rest of Hadoop
    HADOOP-13129. fix typo in dynamic subcommand docs
    HADOOP-13151. Underscores should be escaped in dynamic subcommands document
    HADOOP-13153. fix typo in debug statement for dynamic subcommands
2016-05-16 17:54:45 -07:00
Allen Wittenauer
0a74610d1c HADOOP-11393. Revert HADOOP_PREFIX, go back to HADOOP_HOME (aw) 2016-03-31 07:51:05 -07:00
Haohui Mai
dc46c46b91 HADOOP-10465. Fix use of generics within SortedMapWritable. Contributed by Bertrand Dechoux. 2015-11-22 18:10:08 -08:00
Robert Kanter
cc70df98e7 MAPREDUCE-5965. Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high" (wilfreds via rkanter) 2015-06-03 18:41:45 -07:00
Colin Patrick Mccabe
7dba7005b7 HADOOP-11969. ThreadLocal initialization in several classes is not thread safe (Sean Busbey via Colin P. McCabe) 2015-05-26 12:15:46 -07:00
Tsuyoshi Ozawa
ef9946cd52 HADOOP-11720. [JDK8] Fix javadoc errors caused by incorrect or illegal tags in hadoop-tools. Contributed by Akira AJISAKA. 2015-03-17 16:09:21 +09:00
Tsuyoshi Ozawa
d1c6accb6f HADOOP-11602. Fix toUpperCase/toLowerCase to use Locale.ENGLISH. (ozawa) 2015-03-03 14:17:52 +09:00
Tsuyoshi Ozawa
9cedad11d8 Revert "HADOOP-11602. Fix toUpperCase/toLowerCase to use Locale.ENGLISH. (ozawa)"
This reverts commit 946456c6d8.

Conflicts:
	hadoop-common-project/hadoop-common/CHANGES.txt
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/QuotaByStorageTypeEntry.java
2015-02-25 00:32:04 +09:00
Tsuyoshi Ozawa
946456c6d8 HADOOP-11602. Fix toUpperCase/toLowerCase to use Locale.ENGLISH. (ozawa) 2015-02-19 13:06:53 +09:00
Haohui Mai
7bceb13ba9 HADOOP-11367. Fix warnings from findbugs 3.0 in hadoop-streaming. Contributed by Li Lu. 2014-12-09 10:41:35 -08:00
Kihwal Lee
b056048114 MAPREDUCE-6022. map_input_file is missing from streaming job
environment. Contributed by Jason Lowe.
2014-10-29 12:29:07 -05:00
Allen Wittenauer
9f03a7c018 HADOOP-10946. Fix a bunch of typos in log messages (Ray Chiang via aw) 2014-09-19 11:33:07 -07:00
Haohui Mai
0862ee6520 HADOOP-10485. Remove dead classes in hadoop-streaming. Contributed by Haohui Mai.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1586059 13f79535-47bb-0310-9956-ffa450edef68
2014-04-09 18:10:36 +00:00
Haohui Mai
8ca32df08e HADOOP-10474. Move o.a.h.record to hadoop-streaming. Contributed by Haohui Mai.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1585886 13f79535-47bb-0310-9956-ffa450edef68
2014-04-09 04:02:46 +00:00
Sanford Ryza
7d637a3a99 MAPREDUCE-5457. Add a KeyOnlyTextOutputReader to enable streaming to write out text files without separators (Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1533624 13f79535-47bb-0310-9956-ffa450edef68
2013-10-18 20:43:53 +00:00
Suresh Srinivas
8f7ce62085 MAPREDUCE-5177. Use common utils FileUtil#setReadable/Writable/Executable & FileUtil#canRead/Write/Execute. Contributed by Ivan Mitic.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1477403 13f79535-47bb-0310-9956-ffa450edef68
2013-04-29 23:00:39 +00:00
Bikas Saha
41c4cd08a0 MAPREDUCE-4885. Streaming tests have multiple failures on Windows. (Chris Nauroth via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1467158 13f79535-47bb-0310-9956-ffa450edef68
2013-04-12 03:00:29 +00:00
Alejandro Abdelnur
806073867e MAPREDUCE-5113. Streaming input/output types are ignored with java mapper/reducer. (sandyr via tucu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1463307 13f79535-47bb-0310-9956-ffa450edef68
2013-04-01 21:42:12 +00:00
Jason Darrell Lowe
40c3b7f0b2 MAPREDUCE-4793. Problem with adding resources when using both -files and -file to hadoop streaming. Contributed by Jason Lowe
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1425177 13f79535-47bb-0310-9956-ffa450edef68
2012-12-21 23:05:54 +00:00
Thomas Graves
735b50e8bd MAPREDUCE-4493. Distibuted Cache Compatability Issues (Robert Evans via tgraves)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1367713 13f79535-47bb-0310-9956-ffa450edef68
2012-07-31 19:20:03 +00:00
Robert Joseph Evans
9c87911c4a HADOOP-8521. Port StreamInputFormat to new Map Reduce API (madhukara phatak via bobby)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1360238 13f79535-47bb-0310-9956-ffa450edef68
2012-07-11 15:44:43 +00:00
Robert Joseph Evans
a9808de0d9 HADOOP-8341. Fix or filter findbugs issues in hadoop-tools (bobby)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1335505 13f79535-47bb-0310-9956-ffa450edef68
2012-05-08 13:20:56 +00:00
Robert Joseph Evans
858c6d2b1f MAPREDUCE-3790 Broken pipe on streaming job can lead to truncated output for a successful job (Jason Lowe via bobby)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1294743 13f79535-47bb-0310-9956-ffa450edef68
2012-02-28 17:43:08 +00:00
Arun Murthy
7afb9aca70 MAPREDUCE-3521. Fixed streaming to ensure it doesn't silently ignore unknown arguments. Contributed by Robert Evans.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1225471 13f79535-47bb-0310-9956-ffa450edef68
2011-12-29 08:24:09 +00:00
Arun Murthy
919f56c3d4 MAPREDUCE-3604. Fixed streaming to use new mapreduce.framework.name to check for local mode.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1225259 13f79535-47bb-0310-9956-ffa450edef68
2011-12-28 18:23:12 +00:00
Alejandro Abdelnur
26447229ba HADOOP-7590. Mavenize streaming and MR examples. (tucu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1203941 13f79535-47bb-0310-9956-ffa450edef68
2011-11-19 01:24:32 +00:00