Commit Graph

48 Commits

Author SHA1 Message Date
Steve Loughran
c9270600b7
MAPREDUCE-7474. Improve Manifest committer resilience (#6716)
Improve task commit resilience everywhere
and add an option to reduce delete IO requests on
job cleanup (relevant for ABFS and HDFS).

Task Commit Resilience
----------------------

Task manifest saving is re-attempted on failure; the number of 
attempts made is configurable with the option:

  mapreduce.manifest.committer.manifest.save.attempts

* The default is 5.
* The minimum is 1; asking for less is ignored.
* A retry policy adds 500ms of sleep per attempt.
* Move from classic rename() to commitFile() to rename the file,
  after calling getFileStatus() to get its length and possibly etag.
  This still becomes a rename() on gcs/hdfs, but on abfs it reaches
  the ResilientCommitByRename callbacks, which report the outcome
  to the caller, where it is logged at WARN.
* New statistic task_stage_save_summary_file to distinguish this save
  from other saving operations (job success/report file).
  This is only saved to the manifest on task commit retries, and
  provides statistics on all previous unsuccessful attempts to save
  the manifests.
* Test changes to match the codepath changes, including improvements
  in fault injection.
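
The retry rules listed above can be illustrated with a small sketch. It is
not the committer's Java implementation. The names effective_attempts and
save_with_retries are hypothetical, values below 1 are assumed to be raised
to 1, and the sleep is assumed to grow by 500ms with each failed attempt,
which is only one reading of "adds 500ms of sleep per attempt".

```python
import time

DEFAULT_SAVE_ATTEMPTS = 5   # default stated in the commit message
MIN_SAVE_ATTEMPTS = 1       # minimum stated in the commit message
SLEEP_PER_ATTEMPT_S = 0.5   # 500 ms


def effective_attempts(configured=None):
    """Return the attempt count to use (assumption: values below 1 become 1)."""
    if configured is None:
        return DEFAULT_SAVE_ATTEMPTS
    return max(MIN_SAVE_ATTEMPTS, configured)


def save_with_retries(save, configured=None, sleep=time.sleep):
    """Call save() until it succeeds or the attempts are exhausted.

    Returns (result, failures), where failures holds the exceptions
    raised by the earlier unsuccessful attempts.
    """
    attempts = effective_attempts(configured)
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            return save(), failures
        except OSError as e:
            failures.append(e)
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            # Assumed proportional backoff: 0.5 s, then 1.0 s, then 1.5 s, ...
            sleep(SLEEP_PER_ATTEMPT_S * attempt)
```

Passing a recording function as sleep makes the backoff observable without
waiting, which is how the sketch can be exercised in a unit test.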

Directory size for deletion
---------------------------

New option

  mapreduce.manifest.committer.cleanup.parallel.delete.base.first

This makes an initial attempt to delete the base dir, falling back to
parallel deletes only if that attempt times out.

This option is disabled by default; consider enabling it for abfs to
reduce IO load. Consult the documentation for more details.
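
A rough sketch of the fallback logic follows, under stated assumptions:
delete_dir and list_children are hypothetical callables standing in for
filesystem operations, a timeout is modelled as Python's TimeoutError, and
the base dir is assumed to be deleted last once its children are gone. The
real committer is Java code and may differ in detail.

```python
from concurrent.futures import ThreadPoolExecutor


def cleanup(base_dir, delete_dir, list_children, base_first=False, workers=4):
    """Delete base_dir, optionally trying a single delete of the base first.

    delete_dir(path) and list_children(path) are hypothetical callables.
    Returns "base" if the single delete succeeded, else "parallel".
    """
    if base_first:
        try:
            delete_dir(base_dir)
            return "base"
        except TimeoutError:
            pass  # fall back to parallel deletes of the children

    children = list_children(base_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception
        list(pool.map(delete_dir, children))
    delete_dir(base_dir)
    return "parallel"
```

With base_first left at False the sketch goes straight to the parallel path,
mirroring the option being disabled by default.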

Success file printing
---------------------

The command to print a JSON _SUCCESS file from this committer and
any S3A committer can now be invoked from the mapred command:

  mapred successfile <path to file>

Contributed by Steve Loughran
2024-05-13 21:12:34 +01:00
Masatake Iwasaki
8fd0fdf889
MAPREDUCE-7281. Fix NoClassDefFoundError on 'mapred minicluster'. (#2077) 2020-06-20 07:37:55 +09:00
Vrushali C
c191538ed1 HADOOP-15166 CLI MiniCluster fails with ClassNotFoundException o.a.h.yarn.server.timelineservice.collector.TimelineCollectorManager. Contributed by Gera Shegalov 2018-01-19 16:15:55 -08:00
Arpit Agarwal
e00c7f78c1 HADOOP-14976. Set HADOOP_SHELL_EXECNAME explicitly in scripts. 2017-12-04 21:02:04 -08:00
Robert Kanter
3b78607a02 MAPREDUCE-6994. Uploader tool for Distributed Cache Deploy code changes (miklos.szegedi@cloudera.com via rkanter) 2017-12-01 12:12:15 -08:00
Sean Mackrory
1a1bf6b7d0 HADOOP-13595. Rework hadoop_usage to be broken up by clients/daemons/etc. Contributed by Allen Wittenauer. 2017-08-02 12:25:05 -06:00
Allen Wittenauer
96cbb4fce2 HADOOP-14202. fix jsvc/secure user var inconsistencies
Signed-off-by: John Zhuge <jzhuge@apache.org>
2017-04-07 08:59:21 -07:00
Allen Wittenauer
0eb4b513b7 HADOOP-13673. Update scripts to be smarter when running with privilege
Signed-off-by: Andrew Wang <wang@apache.org>
Signed-off-by: Ravi Prakash <raviprak@apache.org>
2017-01-18 14:39:05 -08:00
Allen Wittenauer
58ed4fa544 HADOOP-13341. Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS
This commit includes the following changes:

	HADOOP-13356. Add a function to handle command_subcommand_OPTS
	HADOOP-13355. Handle HADOOP_CLIENT_OPTS in a function
	HADOOP-13554. Add an equivalent of hadoop_subcmd_opts for secure opts
	HADOOP-13562. Change hadoop_subcommand_opts to use only uppercase
	HADOOP-13358. Modify HDFS to use hadoop_subcommand_opts
	HADOOP-13357. Modify common to use hadoop_subcommand_opts
	HADOOP-13359. Modify YARN to use hadoop_subcommand_opts
	HADOOP-13361. Modify hadoop_verify_user to be consistent with hadoop_subcommand_opts (ie more granularity)
	HADOOP-13564. modify mapred to use hadoop_subcommand_opts
	HADOOP-13563. hadoop_subcommand_opts should print name not actual content during debug
	HADOOP-13360. Documentation for HADOOP_subcommand_OPTS

This closes apache/hadoop#126
2016-09-12 11:10:00 -07:00
Allen Wittenauer
730bc746f9 HADOOP-12930. Dynamic subcommands for hadoop shell scripts (aw)
This commit contains the following JIRA issues:

    HADOOP-12931. bin/hadoop work for dynamic subcommands
    HADOOP-12932. bin/yarn work for dynamic subcommands
    HADOOP-12933. bin/hdfs work for dynamic subcommands
    HADOOP-12934. bin/mapred work for dynamic subcommands
    HADOOP-12935. API documentation for dynamic subcommands
    HADOOP-12936. modify hadoop-tools to take advantage of dynamic subcommands
    HADOOP-13086. enable daemonization of dynamic commands
    HADOOP-13087. env var doc update for dynamic commands
    HADOOP-13088. fix shellprofiles in hadoop-tools to allow replacement
    HADOOP-13089. hadoop distcp adds client opts twice when dynamic
    HADOOP-13094. hadoop-common unit tests for dynamic commands
    HADOOP-13095. hadoop-hdfs unit tests for dynamic commands
    HADOOP-13107. clean up how rumen is executed
    HADOOP-13108. dynamic subcommands need a way to manipulate arguments
    HADOOP-13110. add a streaming subcommand to mapred
    HADOOP-13111. convert hadoop gridmix to be dynamic
    HADOOP-13115. dynamic subcommand docs should talk about exit vs. continue program flow
    HADOOP-13117. clarify daemonization and security vars for dynamic commands
    HADOOP-13120. add a --debug message when dynamic commands have been used
    HADOOP-13121. rename sub-project shellprofiles to match the rest of Hadoop
    HADOOP-13129. fix typo in dynamic subcommand docs
    HADOOP-13151. Underscores should be escaped in dynamic subcommands document
    HADOOP-13153. fix typo in debug statement for dynamic subcommands
2016-05-16 17:54:45 -07:00
Allen Wittenauer
0a74610d1c HADOOP-11393. Revert HADOOP_PREFIX, go back to HADOOP_HOME (aw) 2016-03-31 07:51:05 -07:00
Allen Wittenauer
738155063e HADOOP-12857. rework hadoop-tools (aw) 2016-03-23 13:46:38 -07:00
Allen Wittenauer
b76b0ce51e HADOOP-12366. expose calculated paths (aw) 2015-11-07 08:32:56 -08:00
Varun Vasudev
73b9c7b82b HADOOP-10787. Rename/remove non-HADOOP_*, etc from the shell scripts. Contributed by Allen Wittenauer. 2015-11-04 15:56:17 +05:30
Karthik Kambatla
119cc75e7e MAPREDUCE-6415. Create a tool to combine aggregated logs into HAR files. (Robert Kanter via kasha) 2015-09-09 17:45:19 -07:00
Allen Wittenauer
666cafca8d HADOOP-12249. pull argument parsing into a function (aw) 2015-07-31 14:32:21 -07:00
Allen Wittenauer
ee36f4f9b8 HADOOP-10979. Auto-entries in hadoop_usage (aw) 2015-07-16 16:58:11 -07:00
cnauroth
4528eb9fb2 HADOOP-11524. hadoop_do_classpath_subcommand throws a shellcheck warning. Contributed by Chris Nauroth. 2015-03-25 22:36:09 -07:00
Allen Wittenauer
93b941c637 HADOOP-11565. Add --slaves shell option (aw) 2015-02-12 18:01:28 -08:00
cnauroth
0742591335 MAPREDUCE-3283. mapred classpath CLI does not display the complete classpath. Contributed by Varun Saxena. 2015-01-21 13:50:39 -08:00
Allen Wittenauer
c536142699 HADOOP-6590. Add a username check for hadoop sub-commands (John Smith via aw) 2014-12-10 13:41:28 -08:00
Allen Wittenauer
a7c6c710b2 HADOOP-10950. rework heap management vars (John Smith via aw) 2014-12-10 13:37:32 -08:00
Allen Wittenauer
72c141ba96 HADOOP-11208. Replace "daemon" with better name in script subcommands (aw) 2014-11-19 14:49:29 -08:00
Jason Lowe
3baaa42945 MAPREDUCE-6161. mapred hsadmin command missing from trunk. Contributed by Allen Wittenauer 2014-11-14 21:10:06 +00:00
cnauroth
0abb973f09 HADOOP-7984. Add hadoop --loglevel option to change log level. Contributed by Akira AJISAKA. 2014-11-12 21:41:19 -08:00
Allen Wittenauer
3dc28e2052 HADOOP-11092. hadoop shell commands should print usage if not given a class (aw) 2014-09-23 12:24:23 -07:00
Allen Wittenauer
d8774cc577 HADOOP-11013. CLASSPATH handling should be consolidated, debuggable (aw) 2014-08-28 18:09:25 -07:00
Allen Wittenauer
ef32d09030 [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1619201 13f79535-47bb-0310-9956-ffa450edef68
2014-08-20 18:47:10 +00:00
Allen Wittenauer
31467453ec HADOOP-9902. Shell script rewrite (aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618847 13f79535-47bb-0310-9956-ffa450edef68
2014-08-19 12:11:17 +00:00
Devarajulu K
a202855af5 MAPREDUCE-5164. mapred job and queue commands omit HADOOP_CLIENT_OPTS. Contributed by Nemon Lou.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1522595 13f79535-47bb-0310-9956-ffa450edef68
2013-09-12 14:32:20 +00:00
Jason Darrell Lowe
cc536fe4da MAPREDUCE-5265. History server admin service to refresh user and superuser group mappings. Contributed by Ashwin Shankar
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1504645 13f79535-47bb-0310-9956-ffa450edef68
2013-07-18 20:41:14 +00:00
Jason Darrell Lowe
865d902bd1 MAPREDUCE-5380. Invalid mapred command should return non-zero exit code. Contributed by Stephen Chu
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1503957 13f79535-47bb-0310-9956-ffa450edef68
2013-07-17 00:29:50 +00:00
Suresh Srinivas
638801cce1 HADOOP-8952. Enhancements to support Hadoop on Windows Server and Windows Azure environments. Contributed by Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1453486 13f79535-47bb-0310-9956-ffa450edef68
2013-03-06 19:15:18 +00:00
Aaron Myers
42e987f11e MAPREDUCE-5033. mapred shell script should respect usage flags (--help -help -h). Contributed by Andrew Wang.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1450584 13f79535-47bb-0310-9956-ffa450edef68
2013-02-27 02:42:38 +00:00
Siddharth Seth
803828b996 MAPREDUCE-4123. Remove the 'mapred groups' command, which is no longer supported. (Contributed by Devaraj K)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1393775 13f79535-47bb-0310-9956-ffa450edef68
2012-10-03 21:24:55 +00:00
Harsh J
34becbb019 HADOOP-8386. hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu). Contributed by Christopher Berner and Andy Isaacson. (harsh)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1391780 13f79535-47bb-0310-9956-ffa450edef68
2012-09-29 10:57:53 +00:00
Arun Murthy
1eedee177e MAPREDUCE-4649. Ensure MapReduce JobHistory Daemon doesn't assume HADOOP_YARN_HOME and HADOOP_MAPRED_HOME are the same. Contributed by Vinod K V.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1390224 13f79535-47bb-0310-9956-ffa450edef68
2012-09-25 23:49:09 +00:00
Vinod Kumar Vavilapalli
7c1e857176 MAPREDUCE-3954. Added new envs to separate heap size for different daemons started via bin scripts. Contributed by Robert Joseph Evans.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1297174 13f79535-47bb-0310-9956-ffa450edef68
2012-03-05 19:03:56 +00:00
Robert Joseph Evans
4e64d7b447 MAPREDUCE-3918 proc_historyserver no longer in command line arguments for HistoryServer (Jon Eagles via bobby)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1293469 13f79535-47bb-0310-9956-ffa450edef68
2012-02-24 23:00:17 +00:00
Arun Murthy
6e376a39a0 MAPREDUCE-3817. Fixed bin/mapred to allow running of distcp and archive jobs. Contributed by Arpit Gupta.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1241323 13f79535-47bb-0310-9956-ffa450edef68
2012-02-07 01:49:50 +00:00
Arun Murthy
4f6839f23d MAPREDUCE-3354. Changed scripts so that jobhistory server is started by bin/mapred instead of bin/yarn. Contributed by Jonathan Eagles.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1241250 13f79535-47bb-0310-9956-ffa450edef68
2012-02-06 23:11:26 +00:00
Alejandro Abdelnur
8a234f394e HADOOP-7939. Improve Hadoop subcomponent integration in Hadoop 0.23. (rvs via tucu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1236929 13f79535-47bb-0310-9956-ffa450edef68
2012-01-27 23:53:35 +00:00
Robert Joseph Evans
1149d9a13d MAPREDUCE-3194. "mapred mradmin" command is broken in mrv2 (Jason Lowe via bobby)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1235956 13f79535-47bb-0310-9956-ffa450edef68
2012-01-25 21:22:33 +00:00
Thomas White
bd2e2aaf99 MAPREDUCE-3373. Hadoop scripts unconditionally source "$bin"/../libexec/hadoop-config.sh. Contributed by Bruno Mahé
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1203455 13f79535-47bb-0310-9956-ffa450edef68
2011-11-18 00:57:08 +00:00
Eric Yang
da1db28e93 HADOOP-7740. Fixed security audit logger configuration. (Arpit Gupta via Eric Yang)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1190452 13f79535-47bb-0310-9956-ffa450edef68
2011-10-28 17:08:09 +00:00
Arun Murthy
caf07897a7 MAPREDUCE-2736. Resurrecting bin/mapred and bin/mapred-config.sh.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1190128 13f79535-47bb-0310-9956-ffa450edef68
2011-10-28 01:56:57 +00:00
Eli Collins
1ad8415b72 MAPREDUCE-2736. Remove unused contrib components dependent on MR1. Contributed by Eli Collins
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1189982 13f79535-47bb-0310-9956-ffa450edef68
2011-10-27 20:06:26 +00:00
Arun Murthy
cd7157784e HADOOP-7560. Change src layout to be hierarchical. Contributed by Alejandro Abdelnur.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1161332 13f79535-47bb-0310-9956-ffa450edef68
2011-08-25 00:14:24 +00:00