65 Commits

Author SHA1 Message Date
Steve Loughran
c9270600b7
MAPREDUCE-7474. Improve Manifest committer resilience (#6716)
Improve task commit resilience everywhere
and add an option to reduce delete IO requests on
job cleanup (relevant for ABFS and HDFS).

Task Commit Resilience
----------------------

Task manifest saving is re-attempted on failure; the number of 
attempts made is configurable with the option:

  mapreduce.manifest.committer.manifest.save.attempts

* The default is 5.
* The minimum is 1; asking for less is ignored.
* A retry policy adds 500ms of sleep per attempt.
* Move from classic rename() to commitFile() to rename the file,
  after calling getFileStatus() to get its length and possibly etag.
  This becomes a rename() on gcs/hdfs anyway, but on abfs it reaches
  the ResilientCommitByRename callbacks, which report the outcome
  to the caller, where it is logged at WARN.
* New statistic task_stage_save_summary_file to distinguish from
  other saving operations (job success/report file).
  This is only saved to the manifest on task commit retries, and
  provides statistics on all previous unsuccessful attempts to save
  the manifest.
+ test changes to match the codepath changes, including improvements
  in fault injection.
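
The retry behaviour described above can be sketched roughly as follows. This is an illustration only, not the committer's code: the function names are hypothetical, and it assumes that "adds 500ms of sleep per attempt" means the pause grows by 500ms with each failed attempt and that an attempt count below 1 is treated as 1.

```python
import time

DEFAULT_SAVE_ATTEMPTS = 5   # default quoted in the commit message
SLEEP_PER_ATTEMPT_S = 0.5   # 500ms, as quoted in the commit message


def effective_attempts(requested):
    """Clamp the attempt count to the minimum of 1 (assumed meaning of 'ignored')."""
    return max(1, requested)


def save_with_retries(save, attempts=DEFAULT_SAVE_ATTEMPTS, sleep=time.sleep):
    """Call save() until it succeeds or the attempts run out.

    Returns (result, failures), where failures lists the exceptions from
    the earlier unsuccessful attempts. Hypothetical helper for illustration.
    """
    attempts = effective_attempts(attempts)
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            return save(), failures
        except OSError as e:
            failures.append(e)
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            # Assumption: the sleep is proportional to the attempt number.
            sleep(SLEEP_PER_ATTEMPT_S * attempt)
```

Collecting the earlier failures mirrors the idea of recording statistics about unsuccessful save attempts, although the real statistic is tracked differently.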

Directory size for deletion
---------------------------

New option

  mapreduce.manifest.committer.cleanup.parallel.delete.base.first

This makes an initial attempt at deleting the base dir, only falling
back to parallel deletes if there's a timeout.

This option is disabled by default; consider enabling it for abfs to
reduce IO load. Consult the documentation for more details.
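
The control flow this option describes might look like the sketch below. It is a simplified model under stated assumptions, not the committer's implementation: `cleanup`, `delete`, and the path arguments are hypothetical, and a timeout is represented by Python's built-in `TimeoutError`.

```python
from concurrent.futures import ThreadPoolExecutor


def cleanup(base, children, delete, base_first=False, workers=4):
    """Delete a job's base directory, optionally trying one bulk delete first.

    delete(path) is a caller-supplied function (hypothetical here) that
    removes a directory tree and may raise TimeoutError on a large tree.
    """
    if base_first:
        try:
            delete(base)            # a single request if it succeeds
            return "base-first"
        except TimeoutError:
            pass                    # fall back to parallel deletes
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Delete the child directories in parallel, then the base itself.
        list(pool.map(delete, children))
    delete(base)
    return "parallel"
```

When the initial delete succeeds, only one request is issued, which is where the reduction in IO load would come from.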

Success file printing
---------------------

The command to print a JSON _SUCCESS file from this committer and
any S3A committer is now something which can be invoked from
the mapred command:

  mapred successfile <path to file>

Contributed by Steve Loughran
2024-05-13 21:12:34 +01:00
Masatake Iwasaki
8fd0fdf889
MAPREDUCE-7281. Fix NoClassDefFoundError on 'mapred minicluster'. (#2077) 2020-06-20 07:37:55 +09:00
Vrushali C
c191538ed1 HADOOP-15166 CLI MiniCluster fails with ClassNotFoundException o.a.h.yarn.server.timelineservice.collector.TimelineCollectorManager. Contributed by Gera Shegalov 2018-01-19 16:15:55 -08:00
Arpit Agarwal
e00c7f78c1 HADOOP-14976. Set HADOOP_SHELL_EXECNAME explicitly in scripts. 2017-12-04 21:02:04 -08:00
Robert Kanter
3b78607a02 MAPREDUCE-6994. Uploader tool for Distributed Cache Deploy code changes (miklos.szegedi@cloudera.com via rkanter) 2017-12-01 12:12:15 -08:00
Sean Mackrory
1a1bf6b7d0 HADOOP-13595. Rework hadoop_usage to be broken up by clients/daemons/etc. Contributed by Allen Wittenauer. 2017-08-02 12:25:05 -06:00
Robert Kanter
2b87faf166 MAPREDUCE-6904. HADOOP_JOB_HISTORY_OPTS should be HADOOP_JOB_HISTORYSERVER_OPTS in mapred-config.sh (rkanter) 2017-06-26 17:35:55 -07:00
Allen Wittenauer
96cbb4fce2 HADOOP-14202. fix jsvc/secure user var inconsistencies
Signed-off-by: John Zhuge <jzhuge@apache.org>
2017-04-07 08:59:21 -07:00
Allen Wittenauer
0eb4b513b7 HADOOP-13673. Update scripts to be smarter when running with privilege
Signed-off-by: Andrew Wang <wang@apache.org>
Signed-off-by: Ravi Prakash <raviprak@apache.org>
2017-01-18 14:39:05 -08:00
Allen Wittenauer
58ed4fa544 HADOOP-13341. Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS
This commit includes the following changes:

	HADOOP-13356. Add a function to handle command_subcommand_OPTS
	HADOOP-13355. Handle HADOOP_CLIENT_OPTS in a function
	HADOOP-13554. Add an equivalent of hadoop_subcmd_opts for secure opts
	HADOOP-13562. Change hadoop_subcommand_opts to use only uppercase
	HADOOP-13358. Modify HDFS to use hadoop_subcommand_opts
	HADOOP-13357. Modify common to use hadoop_subcommand_opts
	HADOOP-13359. Modify YARN to use hadoop_subcommand_opts
	HADOOP-13361. Modify hadoop_verify_user to be consistent with hadoop_subcommand_opts (ie more granularity)
	HADOOP-13564. modify mapred to use hadoop_subcommand_opts
	HADOOP-13563. hadoop_subcommand_opts should print name not actual content during debug
	HADOOP-13360. Documentation for HADOOP_subcommand_OPTS

This closes apache/hadoop#126
2016-09-12 11:10:00 -07:00
Allen Wittenauer
730bc746f9 HADOOP-12930. Dynamic subcommands for hadoop shell scripts (aw)
This commit contains the following JIRA issues:

    HADOOP-12931. bin/hadoop work for dynamic subcommands
    HADOOP-12932. bin/yarn work for dynamic subcommands
    HADOOP-12933. bin/hdfs work for dynamic subcommands
    HADOOP-12934. bin/mapred work for dynamic subcommands
    HADOOP-12935. API documentation for dynamic subcommands
    HADOOP-12936. modify hadoop-tools to take advantage of dynamic subcommands
    HADOOP-13086. enable daemonization of dynamic commands
    HADOOP-13087. env var doc update for dynamic commands
    HADOOP-13088. fix shellprofiles in hadoop-tools to allow replacement
    HADOOP-13089. hadoop distcp adds client opts twice when dynamic
    HADOOP-13094. hadoop-common unit tests for dynamic commands
    HADOOP-13095. hadoop-hdfs unit tests for dynamic commands
    HADOOP-13107. clean up how rumen is executed
    HADOOP-13108. dynamic subcommands need a way to manipulate arguments
    HADOOP-13110. add a streaming subcommand to mapred
    HADOOP-13111. convert hadoop gridmix to be dynamic
    HADOOP-13115. dynamic subcommand docs should talk about exit vs. continue program flow
    HADOOP-13117. clarify daemonization and security vars for dynamic commands
    HADOOP-13120. add a --debug message when dynamic commands have been used
    HADOOP-13121. rename sub-project shellprofiles to match the rest of Hadoop
    HADOOP-13129. fix typo in dynamic subcommand docs
    HADOOP-13151. Underscores should be escaped in dynamic subcommands document
    HADOOP-13153. fix typo in debug statement for dynamic subcommands
2016-05-16 17:54:45 -07:00
Allen Wittenauer
0a74610d1c HADOOP-11393. Revert HADOOP_PREFIX, go back to HADOOP_HOME (aw) 2016-03-31 07:51:05 -07:00
Allen Wittenauer
738155063e HADOOP-12857. rework hadoop-tools (aw) 2016-03-23 13:46:38 -07:00
Allen Wittenauer
b76b0ce51e HADOOP-12366. expose calculated paths (aw) 2015-11-07 08:32:56 -08:00
Varun Vasudev
73b9c7b82b HADOOP-10787. Rename/remove non-HADOOP_*, etc from the shell scripts. Contributed by Allen Wittenauer. 2015-11-04 15:56:17 +05:30
Karthik Kambatla
119cc75e7e MAPREDUCE-6415. Create a tool to combine aggregated logs into HAR files. (Robert Kanter via kasha) 2015-09-09 17:45:19 -07:00
Allen Wittenauer
666cafca8d HADOOP-12249. pull argument parsing into a function (aw) 2015-07-31 14:32:21 -07:00
Allen Wittenauer
ee36f4f9b8 HADOOP-10979. Auto-entries in hadoop_usage (aw) 2015-07-16 16:58:11 -07:00
cnauroth
4528eb9fb2 HADOOP-11524. hadoop_do_classpath_subcommand throws a shellcheck warning. Contributed by Chris Nauroth. 2015-03-25 22:36:09 -07:00
Allen Wittenauer
93b941c637 HADOOP-11565. Add --slaves shell option (aw) 2015-02-12 18:01:28 -08:00
Allen Wittenauer
6f5290b030 MAPREDUCE-6250. deprecate sbin/mr-jobhistory-daemon.sh (aw) 2015-02-12 13:48:47 -08:00
Allen Wittenauer
43d5caef5e HADOOP-11460. Deprecate shell vars (John Smith via aw) 2015-02-04 16:35:50 -08:00
cnauroth
0742591335 MAPREDUCE-3283. mapred classpath CLI does not display the complete classpath. Contributed by Varun Saxena. 2015-01-21 13:50:39 -08:00
Allen Wittenauer
c536142699 HADOOP-6590. Add a username check for hadoop sub-commands (John Smith via aw) 2014-12-10 13:41:28 -08:00
Allen Wittenauer
a7c6c710b2 HADOOP-10950. rework heap management vars (John Smith via aw) 2014-12-10 13:37:32 -08:00
Allen Wittenauer
72c141ba96 HADOOP-11208. Replace "daemon" with better name in script subcommands (aw) 2014-11-19 14:49:29 -08:00
Jason Lowe
3baaa42945 MAPREDUCE-6161. mapred hsadmin command missing from trunk. Contributed by Allen Wittenauer 2014-11-14 21:10:06 +00:00
cnauroth
0abb973f09 HADOOP-7984. Add hadoop --loglevel option to change log level. Contributed by Akira AJISAKA. 2014-11-12 21:41:19 -08:00
Allen Wittenauer
3dc28e2052 HADOOP-11092. hadoop shell commands should print usage if not given a class (aw) 2014-09-23 12:24:23 -07:00
Allen Wittenauer
7971c97ec1 HADOOP-11022. User replaced functions get lost 2-3 levels deep (e.g., sbin) (aw) 2014-09-16 16:06:23 -07:00
Allen Wittenauer
d8774cc577 HADOOP-11013. CLASSPATH handling should be consolidated, debuggable (aw) 2014-08-28 18:09:25 -07:00
Allen Wittenauer
9ec4a930f5 HADOOP-10996. Stop violence in the *_HOME (aw) 2014-08-27 07:00:31 -07:00
Allen Wittenauer
ef32d09030 [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1619201 13f79535-47bb-0310-9956-ffa450edef68
2014-08-20 18:47:10 +00:00
Allen Wittenauer
31467453ec HADOOP-9902. Shell script rewrite (aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618847 13f79535-47bb-0310-9956-ffa450edef68
2014-08-19 12:11:17 +00:00
Vinayakumar B
dc31d66f8a HADOOP-9921. daemon scripts should remove pid file on stop call after stop or process is found not running (Contributed by Vinayakumar B)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1610964 13f79535-47bb-0310-9956-ffa450edef68
2014-07-16 10:48:15 +00:00
Zhijie Shen
f9df4d7377 MAPREDUCE-5818. Added "hsadmin" command into mapred.cmd. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1584378 13f79535-47bb-0310-9956-ffa450edef68
2014-04-03 20:23:51 +00:00
Chris Nauroth
ed45d97ed7 MAPREDUCE-5546. mapred.cmd on Windows set HADOOP_OPTS incorrectly. Contributed by Chuan Liu.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1532016 13f79535-47bb-0310-9956-ffa450edef68
2013-10-14 18:36:34 +00:00
Chris Nauroth
11716cf390 HADOOP-10040. Revert svn propset for CRLF line endings on Windows files. Contributed by Chris Nauroth.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1531977 13f79535-47bb-0310-9956-ffa450edef68
2013-10-14 17:01:33 +00:00
Chris Nauroth
bbac0cf05e HADOOP-10040. hadoop.cmd in UNIX format and would not run by default on Windows. Contributed by Chris Nauroth.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1531491 13f79535-47bb-0310-9956-ffa450edef68
2013-10-12 03:14:24 +00:00
Devarajulu K
a202855af5 MAPREDUCE-5164. mapred job and queue commands omit HADOOP_CLIENT_OPTS. Contributed by Nemon Lou.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1522595 13f79535-47bb-0310-9956-ffa450edef68
2013-09-12 14:32:20 +00:00
Jason Darrell Lowe
cc536fe4da MAPREDUCE-5265. History server admin service to refresh user and superuser group mappings. Contributed by Ashwin Shankar
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1504645 13f79535-47bb-0310-9956-ffa450edef68
2013-07-18 20:41:14 +00:00
Jason Darrell Lowe
865d902bd1 MAPREDUCE-5380. Invalid mapred command should return non-zero exit code. Contributed by Stephen Chu
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1503957 13f79535-47bb-0310-9956-ffa450edef68
2013-07-17 00:29:50 +00:00
Chris Nauroth
e4e0499fc9 MAPREDUCE-5187. Create mapreduce command scripts on Windows. Contributed by Chuan Liu.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1500929 13f79535-47bb-0310-9956-ffa450edef68
2013-07-08 20:27:12 +00:00
Suresh Srinivas
638801cce1 HADOOP-8952. Enhancements to support Hadoop on Windows Server and Windows Azure environments. Contributed by Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, Ramya Bharathi Nimmagadda.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1453486 13f79535-47bb-0310-9956-ffa450edef68
2013-03-06 19:15:18 +00:00
Aaron Myers
42e987f11e MAPREDUCE-5033. mapred shell script should respect usage flags (--help -help -h). Contributed by Andrew Wang.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1450584 13f79535-47bb-0310-9956-ffa450edef68
2013-02-27 02:42:38 +00:00
Thomas Graves
a047fa9b23 MAPREDUCE-4712. mr-jobhistory-daemon.sh doesn't accept --config (Vinod Kumar Vavilapalli via tgraves)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1395724 13f79535-47bb-0310-9956-ffa450edef68
2012-10-08 19:03:43 +00:00
Siddharth Seth
803828b996 MAPREDUCE-4123. Remove the 'mapred groups' command, which is no longer supported. (Contributed by Devaraj K)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1393775 13f79535-47bb-0310-9956-ffa450edef68
2012-10-03 21:24:55 +00:00
Harsh J
34becbb019 HADOOP-8386. hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu). Contributed by Christopher Berner and Andy Isaacson. (harsh)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1391780 13f79535-47bb-0310-9956-ffa450edef68
2012-09-29 10:57:53 +00:00
Arun Murthy
1eedee177e MAPREDUCE-4649. Ensure MapReduce JobHistory Daemon doesn't assume HADOOP_YARN_HOME and HADOOP_MAPRED_HOME are the same. Contributed by Vinod K V.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1390224 13f79535-47bb-0310-9956-ffa450edef68
2012-09-25 23:49:09 +00:00
Aaron Myers
cc9c6bdce2 HADOOP-8353. hadoop-daemon.sh and yarn-daemon.sh can be misleading on stop. Contributed by Roman Shaposhnik.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1337251 13f79535-47bb-0310-9956-ffa450edef68
2012-05-11 16:15:18 +00:00