Jason Lowe
fdf02d1f26
YARN-3619. ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException. Contributed by Zhihai Xu
2015-10-02 20:20:31 +00:00
Jason Lowe
854d25b0c3
YARN-3727. For better error recovery, check if the directory exists before using it for localization. Contributed by Zhihai Xu
2015-09-30 14:59:44 +00:00
Rohith Sharma K S
8ed0d4b744
YARN-4152. NodeManager crash with NPE when LogAggregationService#stopContainer called for absent container. (Bibin A Chundatt via rohithsharmaks)
2015-09-24 11:24:14 +05:30
Jian He
c57eac5dfe
YARN-3868. Recovery support for container resizing. Contributed by Meng Ding
2015-09-23 13:29:38 -07:00
Jian He
c3dc1af072
YARN-1644. RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing. Contributed by Meng Ding
2015-09-23 13:29:37 -07:00
Jian He
c59ae4eeb1
YARN-1643. Make ContainersMonitor support changing monitoring size of an allocated container. Contributed by Meng Ding and Wangda Tan
2015-09-23 13:29:37 -07:00
Jian He
5f5a968d65
YARN-3867. ContainerImpl changes to support container resizing. Contributed by Meng Ding
2015-09-23 13:29:37 -07:00
Jian He
ffd820c27a
YARN-1645. ContainerManager implementation to support container resizing. Contributed by Meng Ding & Wangda Tan
2015-09-23 13:29:37 -07:00
Jian He
83a18add10
YARN-1449. AM-NM protocol changes to support container resizing. Contributed by Meng Ding & Wangda Tan
2015-09-23 13:29:36 -07:00
Jason Lowe
c890c51a91
YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu
2015-09-23 15:42:01 +00:00
Xuan
34ef1a092b
YARN-4149. yarn logs -am should provide an option to fetch all the log files. Contributed by Varun Vasudev
2015-09-15 14:36:30 -07:00
Jason Lowe
8c1cdb17a0
YARN-4158. Remove duplicate close for LogWriter in AppLogAggregatorImpl#uploadLogsForContainers. Contributed by Zhihai Xu
2015-09-15 20:21:33 +00:00
Varun Vasudev
486d5cb803
YARN-4136. LinuxContainerExecutor loses info when forwarding ResourceHandlerException. Contributed by Bibin A Chundatt.
2015-09-11 14:37:48 +05:30
Wangda Tan
77666105b4
YARN-4106. NodeLabels for NM in distributed mode is not updated even after clusterNodelabel addition in RM. (Bibin A Chundatt via wangda)
2015-09-10 09:30:09 -07:00
Zhihai Xu
16b9037dc1
YARN-4096. App local logs are leaked if log aggregation fails to initialize for the app. Contributed by Jason Lowe.
2015-09-08 12:29:54 -07:00
Jian He
6f72f1e600
YARN-2884. Added a proxy service in NM to proxy the communication between AM and RM. Contributed by Kishore Chaliparambil
2015-09-08 09:35:46 +08:00
Varun Vasudev
1dbd8e34a7
YARN-3591. Resource localization on a bad disk causes subsequent containers failure. Contributed by Lavkesh Lahngir.
2015-09-07 11:32:12 +05:30
Rohith Sharma K S
095ab9ab5f
YARN-4073. Removed unused ApplicationACLsManager in ContainerManagerImpl constructor. (Naganarasimha G R via rohithsharmaks)
2015-09-02 14:13:33 +05:30
Xuan
b71c6006f5
YARN-221. Addendum patch to compilation issue which is caused by missing AllContainerLogAggregationPolicy. Contributed by Xuan Gong
2015-08-23 16:46:30 -07:00
Xuan
37e1c3d82a
YARN-221. NM should provide a way for AM to tell it not to aggregate logs. Contributed by Ming Ma
2015-08-22 16:25:24 -07:00
Wangda Tan
fc07464d1a
YARN-2923. Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup. (Naganarasimha G R)
2015-08-20 11:51:03 -07:00
Zhihai Xu
14215c8ef8
YARN-4057. If ContainersMonitor is not enabled, only print related log info one time. Contributed by Jun Gong.
2015-08-18 11:36:02 -07:00
Karthik Kambatla
13604bd5f1
YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha)
2015-08-16 15:08:53 -07:00
Karthik Kambatla
def12933b3
YARN-3534. Collect memory/cpu usage on the node. (Inigo Goiri via kasha)
2015-08-16 06:24:16 -07:00
Jian He
38aed1a94e
YARN-4005. Completed container whose app is finished is possibly not removed from NMStateStore. Contributed by Jun Gong
2015-08-13 14:46:08 -07:00
Junping Du
cfee02b3bd
YARN-4019. Add JvmPauseMonitor to ResourceManager and NodeManager. Contributed by Robert Kanter.
2015-08-06 06:49:45 -07:00
Xuan
c3364ca8e7
YARN-4004. container-executor should print output of docker logs if the docker container exits with non-0 exit status. Contributed by Varun Vasudev
2015-08-03 18:10:11 -07:00
Jason Lowe
469cfcd695
YARN-3965. Add startup timestamp to nodemanager UI. Contributed by Hong Zhiguo
2015-08-03 15:53:32 +00:00
Xuan
f170934215
YARN-3982. container-executor parsing of container-executor.cfg broken in trunk and branch-2. Contributed by Varun Vasudev
2015-07-27 23:45:58 -07:00
Varun Vasudev
3e6fce91a4
YARN-3853. Add docker container runtime support to LinuxContainerExecutor. Contributed by Sidharta Seethana.
2015-07-27 11:57:40 -07:00
Varun Vasudev
f36835ff9b
YARN-3852. Add docker container support to container-executor. Contributed by Abin Shahab.
2015-07-27 10:14:51 -07:00
Jason Lowe
ff9c13e0a7
YARN-3925. ContainerLogsUtils#getContainerLogFile fails to read container log files from full disks. Contributed by zhihai xu
2015-07-24 22:14:39 +00:00
Colin Patrick Mccabe
419c51d233
YARN-3844. Make hadoop-yarn-project Native code -Wall-clean (Alan Burlison via Colin P. McCabe)
2015-07-17 11:38:59 -07:00
Akira Ajisaka
19295b36d9
YARN-3381. Fix typo InvalidStateTransitonException. Contributed by Brahma Reddy Battula.
2015-07-13 17:52:13 +09:00
Zhijie Shen
1ea36299a4
YARN-3116. RM notifies NM whether a container is an AM container or normal task container. Contributed by Giovanni Matteo Fumarola.
2015-07-10 18:58:10 -07:00
Karthik Kambatla
527c40e4d6
YARN-1012. Report NM aggregated container resource utilization in heartbeat. (Inigo Goiri via kasha)
2015-07-09 09:35:14 -07:00
Varun Vasudev
c40bdb56a7
YARN-2194. Fix bug causing CGroups functionality to fail on RHEL7. Contributed by Wei Yan.
2015-07-07 16:59:29 +05:30
Jason Lowe
b5cdf78e8e
YARN-3793. Several NPEs when deleting local files on NM recovery. Contributed by Varun Saxena
2015-07-01 21:13:32 +00:00
Jason Lowe
40b256949a
YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena
2015-06-26 15:47:07 +00:00
Jason Lowe
8d58512d6e
YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula
2015-06-24 16:37:39 +00:00
Xuan
6c7a9d502a
YARN-3834. Scrub debug logging of tokens during resource localization. Contributed by Chris Nauroth
2015-06-21 17:13:44 -07:00
Junping Du
d7e7f6aa03
YARN-41. The RM should handle the graceful shutdown of the NM. Contributed by Devaraj K.
2015-06-04 04:59:27 -07:00
Jason Lowe
e13b671aa5
YARN-3585. NodeManager cannot exit when SHUTDOWN event is triggered and NM recovery is enabled. Contributed by Rohith Sharmaks
2015-06-03 19:44:07 +00:00
Robert Kanter
6aec13cb33
YARN-3713. Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition (zxu via rkanter)
2015-05-29 15:34:37 -07:00
cnauroth
4102e5882e
YARN-3626. On Windows localized resources are not moved to the front of the classpath when they should be. Contributed by Craig Welch.
2015-05-27 14:31:49 -07:00
Vinod Kumar Vavilapalli
500a1d9c76
YARN-160. Enhanced NodeManager to automatically obtain cpu/memory values from underlying OS when configured to do so. Contributed by Varun Vasudev.
2015-05-26 11:38:35 -07:00
Junping Du
132d909d4a
YARN-3594. WintuilsProcessStubExecutor.startStreamReader leaks streams. Contributed by Lars Francke.
2015-05-22 04:23:25 -07:00
Vinod Kumar Vavilapalli
53fafcf061
YARN-3684. Changed ContainerExecutor's primary lifecycle methods to use a more extensible mechanism of context objects. Contributed by Sidharta Seethana.
2015-05-21 15:50:23 -07:00
Jian He
6329bd00fa
YARN-3654. ContainerLogsPage web UI should not have meta-refresh. Contributed by Xuan Gong
2015-05-20 17:20:21 -07:00
Wangda Tan
b37da52a1c
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String. (Naganarasimha G R via wangda)
2015-05-19 16:34:17 -07:00
Colin Patrick Mccabe
470c87dbc6
HADOOP-11970. Replace uses of ThreadLocal<Random> with JDK7 ThreadLocalRandom (Sean Busbey via Colin P. McCabe)
2015-05-19 10:50:15 -07:00
Junping Du
03a293aed6
YARN-3505 addendum: fix an issue in previous patch.
2015-05-15 06:39:39 -07:00
Ravi Prakash
53fe4eff09
YARN-1519. Check in container-executor if sysconf is implemented before using it (Radim Kolar and Eric Payne via raviprak)
2015-05-14 15:55:37 -07:00
Junping Du
15ccd967ee
YARN-3505. Node's Log Aggregation Report with SUCCEED should not be cached in RMApps. Contributed by Xuan Gong.
2015-05-14 10:58:12 -07:00
Jason Lowe
711d77cc54
YARN-3641. NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in stopping NM's sub-services. Contributed by Junping Du
2015-05-13 21:06:47 +00:00
Xuan
0f95921447
YARN-3626. On Windows localized resources are not moved to the front of the classpath when they should be. Contributed by Craig Welch
2015-05-13 13:10:53 -07:00
Devaraj K
5c2f05cd9b
YARN-3629. NodeID is always printed as "null" in node manager initialization log. Contributed by nijel.
2015-05-12 22:20:25 +05:30
Devaraj K
8badd82ce2
YARN-3513. Remove unused variables in ContainersMonitorImpl and add debug log for overall resource usage by all containers. Contributed by Naganarasimha G R.
2015-05-12 16:54:38 +05:30
Xuan
6471d18bc7
YARN-1912. ResourceLocalizer started without any jvm memory control. Contributed by Masatake Iwasaki
2015-05-08 20:01:21 -07:00
Jason Lowe
25e2b02122
YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith
2015-05-08 22:45:52 +00:00
Xuan
088156de43
YARN-2331. Distinguish shutdown during supervision vs. shutdown for rolling upgrade. Contributed by Jason Lowe
2015-05-08 15:10:43 -07:00
Robert (Bobby) Evans
bcf2890502
YARN-644: Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer
2015-05-08 11:11:01 -05:00
Robert Kanter
b72507810a
YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter)
2015-05-06 14:19:06 -07:00
Junping Du
3810242062
YARN-3396. Handle URISyntaxException in ResourceLocalizationService. (Contributed by Brahma Reddy Battula)
2015-05-05 10:18:23 -07:00
Wangda Tan
71f4de220c
YARN-3375. NodeHealthScriptRunner.shouldRun() check is performed 3 times for starting NodeHealthScriptRunner (Devaraj K via wangda)
2015-05-04 15:49:19 -07:00
Jason Lowe
8f65c793f2
YARN-3097. Logging of resource recovery on NM restart has redundancies. Contributed by Eric Payne
2015-05-04 15:31:15 +00:00
Robert Kanter
ac7d152901
YARN-3363. Add localization and container launch time to ContainerMetrics at NM to show this timing information for each active container. (zxu via rkanter)
2015-05-01 16:39:21 -07:00
Vinod Kumar Vavilapalli
1b3b9e5c31
YARN-2619. Added NodeManager support for disk io isolation through cgroups. Contributed by Varun Vasudev and Wei Yan.
2015-04-30 21:41:07 -07:00
Karthik Kambatla
47279c3228
YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resources. (Zhihai Xu via kasha)
2015-04-26 09:13:46 -07:00
Jason Lowe
5e093f0d40
YARN-3537. NPE when NodeManager.serviceInit fails and stopRecoveryStore invoked. Contributed by Brahma Reddy Battula
2015-04-24 22:02:53 +00:00
Xuan
0b3f8957a8
YARN-3516. killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. Contributed by zhihai xu
2015-04-23 16:40:40 -07:00
Vinod Kumar Vavilapalli
a100be685c
YARN-3366. Enhanced NodeManager to support classifying/shaping outgoing network bandwidth traffic originating from YARN containers. Contributed by Sidharta Seethana.
2015-04-22 17:26:13 -07:00
Jian He
674c7ef649
YARN-3503. Expose disk utilization percentage and bad local and log dir counts in NM metrics. Contributed by Varun Vasudev
2015-04-21 20:57:02 -07:00
Junping Du
1db355a875
YARN-1402. Update related Web UI and CLI with exposing client API to check log aggregation status. Contributed by Xuan Gong.
2015-04-17 13:18:59 -07:00
Jian He
1b89a3e173
YARN-3354. Add node label expression in ContainerTokenIdentifier to support RM recovery. Contributed by Wangda Tan
2015-04-15 13:57:06 -07:00
Junping Du
838b06ac87
YARN-3443. Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM. Contributed by Sidharta Seethana.
2015-04-13 18:35:56 -07:00
Junping Du
92431c9617
YARN-1376. NM needs to notify the log aggregation status to RM through Node heartbeat. Contributed by Xuan Gong.
2015-04-10 08:56:18 -07:00
Karthik Kambatla
6495940eae
YARN-3465. Use LinkedHashMap to preserve order of resource requests. (Zhihai Xu via kasha)
2015-04-09 00:07:49 -07:00
Tsuyoshi Ozawa
dd852f5b8c
YARN-3457. NPE when NodeManager.serviceInit fails and stopRecoveryStore called. Contributed by Bibin A Chundatt.
2015-04-08 15:56:18 +09:00
Wangda Tan
bad070fe15
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev via wangda)
2015-04-02 17:23:20 -07:00
Vinod Kumar Vavilapalli
b21c72777a
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via container-executor for outbound network traffic control. Contributed by Sidharta Seethana.
2015-04-02 16:53:59 -07:00
Tsuyoshi Ozawa
c69ba81497
YARN-3424. Change logs for ContainerMonitorImpl's resource monitoring from info to debug. Contributed by Anubhav Dhoot.
2015-04-01 17:44:25 +09:00
Karthik Kambatla
2daa478a64
YARN-3428. Debug log resources to be localized for a container. (kasha)
2015-03-31 17:34:47 -07:00
Wangda Tan
2a945d24f7
YARN-2495. Allow admin to specify labels from each NM (Distributed configuration for node label). (Naganarasimha G R via wangda)
2015-03-30 12:05:21 -07:00
Vinod Kumar Vavilapalli
c358368f51
YARN-3304. Cleaning up ResourceCalculatorProcessTree APIs for public use and removing inconsistencies in the default values. Contributed by Junping Du and Karthik Kambatla.
2015-03-30 10:09:40 -07:00
Ravi Prakash
e0ccea33c9
YARN-3288. Document and fix indentation in the DockerContainerExecutor code
2015-03-28 08:00:41 -07:00
Junping Du
d81109e588
YARN-3269. Yarn.nodemanager.remote-app-log-dir could not be configured to fully qualified path. Contributed by Xuan Gong
2015-03-20 13:41:22 -07:00
Karthik Kambatla
20b49224eb
YARN-3351. AppMaster tracking URL is broken in HA. (Anubhav Dhoot via kasha)
2015-03-18 16:30:33 -07:00
Tsuyoshi Ozawa
3da9a97cfb
YARN-1453. [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments. Contributed by Akira AJISAKA, Andrew Purtell, and Allen Wittenauer.
2015-03-16 23:19:05 +09:00
Vinod Kumar Vavilapalli
863079bb87
YARN-3154. Added additional APIs in LogAggregationContext to avoid aggregating running logs of application when rolling is enabled. Contributed by Xuan Gong.
2015-03-12 13:32:29 -07:00
Jian He
21101c01f2
YARN-2190. Added CPU and memory limit options to the default container executor for Windows containers. Contributed by Chuan Liu
2015-03-06 14:18:11 -08:00
Karthik Kambatla
53947f37c7
YARN-3122. Metrics for container's actual CPU usage. (Anubhav Dhoot via kasha)
2015-03-04 17:33:30 -08:00
Konstantin V Shvachko
8ca0d957c4
YARN-3255. RM, NM, JobHistoryServer, and WebAppProxyServer's main() should support generic options. Contributed by Konstantin Shvachko.
2015-02-26 17:12:19 -08:00
Allen Wittenauer
d4ac6822e1
YARN-2980. Move health check script related functionality to hadoop-common (Varun Saxena via aw)
2015-02-24 11:25:26 -08:00
Xuan
f56c65bb3e
YARN-3237. AppLogAggregatorImpl fails to log error cause. Contributed by Rushabh S Shah
2015-02-20 14:02:40 -08:00
Tsuyoshi Ozawa
447bd7b5a6
YARN-3203. Correct a log message in AuxServices. Contributed by Brahma Reddy Battula.
2015-02-16 23:55:58 +09:00
Junping Du
ab0b958a52
YARN-2749. Fix some test cases from TestLogAggregationService that fail in trunk. (Contributed by Xuan Gong)
2015-02-15 06:46:32 -08:00
Jason Lowe
1a0f508b63
YARN-2847. Linux native container executor segfaults if default banned user detected. Contributed by Olaf Flebbe
2015-02-13 20:20:55 +00:00
Akira Ajisaka
6a49e58cb8
YARN-3191. Log object should be initialized with its own class. Contributed by Rohith.
2015-02-12 17:58:54 -08:00
Junping Du
04f5ef18f7
YARN-2079. Recover NonAggregatingLogHandler state upon nodemanager restart. (Contributed by Jason Lowe)
2015-02-12 11:46:47 -08:00
Jason Lowe
b379972ab3
YARN-3074. Nodemanager dies when localizer runner tries to write to a full disk. Contributed by Varun Saxena
2015-02-11 16:33:43 +00:00
Jason Lowe
3f5431a22f
YARN-2809. Implement workaround for linux kernel panic when removing cgroup. Contributed by Nathan Roberts
2015-02-10 17:27:21 +00:00
Jason Lowe
4eb5f7fa32
YARN-3090. DeletionService can silently ignore deletion task failures. Contributed by Varun Saxena
2015-02-10 16:54:21 +00:00
Jason Lowe
4c484320b4
YARN-3089. LinuxContainerExecutor does not handle file arguments to deleteAsUser. Contributed by Eric Payne
2015-02-06 20:39:01 +00:00
Robert Kanter
f7a77819a1
YARN-3022. Expose Container resource information from NodeManager for monitoring (adhoot via rkanter)
2015-02-03 10:39:41 -08:00
Akira Ajisaka
342efa110a
HADOOP-9907. Webapp http://hostname:port/metrics link is not working. (aajisaka)
2015-01-30 02:49:10 +09:00
Allen Wittenauer
9dd0b7a2ab
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw)
2015-01-29 09:30:26 -08:00
Jian He
4e15fc0841
YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService might lead NM to crash. Contributed by Varun Saxena
2015-01-27 13:31:22 -08:00
Jason Lowe
902c6ea7e4
YARN-3088. LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error. Contributed by Eric Payne
2015-01-26 15:40:21 +00:00
Xuan
0d6bd62102
YARN-3024. LocalizerRunner should give DIE action when all resources are localized. Contributed by Chengbing Liu
2015-01-25 19:37:57 -08:00
Karthik Kambatla
84198564ba
YARN-2984. Metrics for container's actual memory usage. (kasha)
2015-01-17 05:44:04 +05:30
Jian He
cc2a745f7e
YARN-2997. Fixed NodeStatusUpdater to not send already-sent completed container statuses on heartbeat. Contributed by Chengbing Liu
2015-01-08 11:12:54 -08:00
Zhijie Shen
41a548a916
YARN-2937. Fixed new findbugs warnings in hadoop-yarn-nodemanager. Contributed by Varun Saxena.
2014-12-23 20:32:36 -08:00
Jian He
808cba3821
YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks
2014-12-19 16:56:30 -08:00
Karthik Kambatla
954fb8581e
YARN-2675. containersKilled metrics is not updated when the container is killed during localization. (Zhihai Xu via kasha)
2014-12-19 16:02:20 -08:00
cnauroth
e996a1bfd4
HADOOP-11321. copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions. Contributed by Chris Nauroth.
2014-12-16 15:29:22 -08:00
Karthik Kambatla
db73cc9124
YARN-2931. PublicLocalizer may fail until directory is initialized by LocalizeRunner. (Anubhav Dhoot via kasha)
2014-12-08 22:26:18 -08:00
Harsh J
a31e016491
YARN-2891. Failed Container Executor does not provide a clear error message. Contributed by Dustin Cote. (harsh)
2014-12-04 03:17:15 +05:30
Jason Lowe
03ab24aa01
MAPREDUCE-5932. Provide an option to use a dedicated reduce-side shuffle log. Contributed by Gera Shegalov
2014-12-03 17:02:14 +00:00
Junping Du
e65b7c5ff6
YARN-1156. Enhance NodeManager AllocatedGB and AvailableGB metrics for aggregation of decimal values. (Contributed by Tsuyoshi OZAWA)
2014-12-03 04:11:18 -08:00
Karthik Kambatla
233b61e495
YARN-2679. Add metric for container launch duration. (Zhihai Xu via kasha)
2014-11-21 14:22:21 -08:00
Jason Lowe
49c38898b0
YARN-2816. NM fails to start with NPE during container recovery. Contributed by Zhihai Xu
2014-11-14 21:25:59 +00:00
Jason Lowe
33ea5ae92b
YARN-2846. Incorrectly persisted exit code for running containers in reacquireContainer() that is interrupted by NodeManager restart. Contributed by Junping Du
2014-11-13 16:11:04 +00:00
Zhijie Shen
be7bf956e9
YARN-2794. Fixed log messages about distributing system-credentials. Contributed by Jian He.
2014-11-12 11:07:57 -08:00
Karthik Kambatla
a04143039e
YARN-2236. [YARN-1492] Shared Cache uploader service on the Node Manager. (Chris Trezzo and Sanjin Lee via kasha)
2014-11-12 09:31:05 -08:00
Ravi Prakash
53f64ee516
YARN-1964. Create Docker analog of the LinuxContainerExecutor in YARN
2014-11-11 21:28:11 -08:00
Karthik Kambatla
061bc293c8
YARN-2735. diskUtilizationPercentageCutoff and diskUtilizationSpaceCutoff are initialized twice in DirectoryCollection. (Zhihai Xu via kasha)
2014-11-11 10:31:39 -08:00
Jason Lowe
c3d475070a
YARN-2825. Container leak on NM. Contributed by Jian He
2014-11-07 23:16:37 +00:00
cnauroth
06b797947c
YARN-2803. MR distributed cache not working correctly on Windows after NodeManager privileged account changes. Contributed by Craig Welch.
2014-11-07 12:29:39 -08:00
Vinod Kumar Vavilapalli
c5a46d4c8c
YARN-1922. Fixed NodeManager to kill process-trees correctly in the presence of races between the launch and the stop-container call and when root processes crash. Contributed by Billie Rinaldi.
2014-11-03 16:38:55 -08:00
Jason Lowe
6157ace547
YARN-2730. DefaultContainerExecutor runs only one localizer at a time. Contributed by Siqi Li
2014-11-03 20:37:47 +00:00
Vinod Kumar Vavilapalli
5c0381c96a
YARN-2790. Fixed a NodeManager bug that was causing log-aggregation to fail beyond HDFS delegation-token expiry even when RM is a proxy-user (YARN-2704). Contributed by Jian He.
2014-11-01 16:32:35 -07:00
Xuan
86ff28dea0
YARN-2701. Addendum patch. Potential race condition in startLocalizer when using LinuxContainerExecutor. Contributed by Xuan Gong
2014-10-31 14:36:25 -07:00
Jason Lowe
73e626ad91
YARN-2755. NM fails to clean up usercache_DEL_<timestamp> dirs after YARN-661. Contributed by Siqi Li
2014-10-30 15:10:27 +00:00
Zhijie Shen
8984e9b177
YARN-2741. Made NM web UI serve logs on the drive other than C: on Windows. Contributed by Craig Welch.
2014-10-28 14:11:19 -07:00
Vinod Kumar Vavilapalli
a16d022ca4
YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the sake of localization and log-aggregation for long-running services. Contributed by Jian He.
2014-10-27 15:49:47 -07:00
Jian He
3b12fd6cfb
YARN-2198. Remove the need to run NodeManager as privileged account for Windows Secure Container Executor. Contributed by Remus Rusanu
2014-10-22 15:57:46 -07:00
cnauroth
6637e3cf95
YARN-2720. Windows: Wildcard classpath variables not expanded against resources contained in archives. Contributed by Craig Welch.
2014-10-21 12:33:21 -07:00
Jason Lowe
6f2028bd15
YARN-90. NodeManager should identify failed disks becoming good again. Contributed by Varun Vasudev
2014-10-21 17:31:13 +00:00
Jian He
2839365f23
YARN-2701. Potential race condition in startLocalizer when using LinuxContainerExecutor. Contributed by Xuan Gong
2014-10-20 18:45:47 -07:00
Jian He
0fd0ebae64
YARN-2682. Updated WindowsSecureContainerExecutor to not use DefaultContainerExecutor#getFirstApplicationDir and use getWorkingDir() instead. Contributed by Zhihai Xu
2014-10-16 18:14:34 -07:00
Jian He
0af1a2b5bc
YARN-2312. Deprecated old ContainerId#getId API and updated MapReduce to use ContainerId#getContainerId instead. Contributed by Tsuyoshi OZAWA
2014-10-15 15:22:07 -07:00
Karthik Kambatla
cc93e7e683
YARN-2566. DefaultContainerExecutor should pick a working directory randomly. (Zhihai Xu via kasha)
2014-10-13 16:32:01 -07:00
Jason Lowe
a56ea01002
YARN-2377. Localization exception stack traces are not passed as diagnostic info. Contributed by Gera Shegalov
2014-10-13 18:31:16 +00:00
Zhijie Shen
4aed2d8e91
YARN-2651. Spun off LogRollingInterval from LogAggregationContext. Contributed by Xuan Gong.
2014-10-13 10:54:09 -07:00
Zhijie Shen
cb81bac002
YARN-2583. Modified AggregatedLogDeletionService to be able to delete rolling aggregated logs. Contributed by Xuan Gong.
2014-10-10 00:11:30 -07:00
Vinod Kumar Vavilapalli
34cdcaad71
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for use by long running services. Contributed by Xuan Gong.
2014-10-03 12:15:40 -07:00
Jason Lowe
29f520052e
YARN-2624. Resource Localization fails on a cluster due to existing cache directories. Contributed by Anubhav Dhoot
2014-10-02 17:39:34 +00:00
Jian He
3ef1cf187f
YARN-2617. Fixed NM to not send duplicate container status whose app is not running. Contributed by Jun Gong
2014-10-02 10:04:09 -07:00
Zhijie Shen
52bbe0f11b
YARN-2630. Prevented previous AM container status from being acquired by the current restarted AM. Contributed by Jian He.
2014-10-01 15:38:11 -07:00
Vinod Kumar Vavilapalli
ba7f31c2ee
YARN-1972. Added a secure container-executor for Windows. Contributed by Remus Rusanu.
2014-10-01 10:14:41 -07:00
Jian He
5391919b09
YARN-668. Changed NMTokenIdentifier/AMRMTokenIdentifier/ContainerTokenIdentifier to use protobuf object as the payload. Contributed by Junping Du.
2014-09-26 17:48:41 -07:00
Zhijie Shen
c86674a3a4
YARN-2581. Passed LogAggregationContext to NM via ContainerTokenIdentifier. Contributed by Xuan Gong.
2014-09-24 17:50:26 -07:00
Allen Wittenauer
034df0e2eb
YARN-2161. Fix build on macosx: YARN parts (Binglin Chang via aw)
2014-09-24 08:47:55 -07:00
junping_du
a9a55db065
YARN-2584. TestContainerManagerSecurity fails on trunk. (Contributed by Jian He)
2014-09-22 22:45:06 -07:00
Jian He
0a641496c7
YARN-1372. Ensure all completed containers are reported to the AMs across RM restart. Contributed by Anubhav Dhoot
2014-09-22 10:30:53 -07:00
Vinod Kumar Vavilapalli
9f6891d9ef
YARN-2531. Added a configuration for admins to be able to override app-configs and enforce/not-enforce strict control of per-container cpu usage. Contributed by Varun Vasudev.
2014-09-16 10:14:46 -07:00
Vinod Kumar Vavilapalli
4be95175cd
YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all containers to a preconfigured limit. Contributed by Varun Vasudev.
2014-09-10 19:22:52 -07:00
Jason Lowe
3fa5f728c4
YARN-2431. NM restart: cgroup is not removed for reacquired containers. Contributed by Jason Lowe
2014-09-04 21:11:27 +00:00
Hitesh Shah
3de66011c2
YARN-2450. Fix typos in log messages. Contributed by Ray Chiang.
2014-08-29 11:16:36 -07:00
Allen Wittenauer
7e75226e68
YARN-2424. LCE should support non-cgroups, non-secure mode (Chris Douglas via aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1619421 13f79535-47bb-0310-9956-ffa450edef68
2014-08-21 14:57:11 +00:00
Junping Du
c2febdcbaa
YARN-1337. Recover containers upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617448 13f79535-47bb-0310-9956-ffa450edef68
2014-08-12 10:56:13 +00:00
Junping Du
b8f151231b
YARN-1354. Recover applications upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1615550 13f79535-47bb-0310-9956-ffa450edef68
2014-08-04 13:25:37 +00:00
Jian He
a41c314373
YARN-2343. Improve NMToken expire exception message. Contributed by Li Lu
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1615270 13f79535-47bb-0310-9956-ffa450edef68
2014-08-01 23:44:48 +00:00
Xuan Gong
e52f67e389
YARN-1994. Expose YARN/MR endpoints on multiple interfaces. Contributed by Craig Welch, Milan Potocnik, and Arpit Agarwal
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1614981 13f79535-47bb-0310-9956-ffa450edef68
2014-07-31 20:06:02 +00:00
Zhijie Shen
1d6e178144
YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1614838 13f79535-47bb-0310-9956-ffa450edef68
2014-07-31 09:27:43 +00:00
Aaron Myers
5d4677b57b
YARN-1796. container-executor shouldn't require o-r permissions. Contributed by Aaron T. Myers.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1613548 13f79535-47bb-0310-9956-ffa450edef68
2014-07-26 01:51:35 +00:00
Devarajulu K
2050e0dad6
YARN-1342. Recover container tokens upon nodemanager restart. Contributed by Jason Lowe.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612995 13f79535-47bb-0310-9956-ffa450edef68
2014-07-24 05:02:00 +00:00
Junping Du
537c361f5b
YARN-2013. The diagnostics is always the ExitCodeException stack when the container crashes. (Contributed by Tsuyoshi OZAWA)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612449 13f79535-47bb-0310-9956-ffa450edef68
2014-07-22 03:01:58 +00:00
Jason Darrell Lowe
1ad2d7b405
YARN-2321. NodeManager web UI can incorrectly report Pmem enforcement. Contributed by Leitao Guo
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612411 13f79535-47bb-0310-9956-ffa450edef68
2014-07-21 21:55:06 +00:00
Jason Darrell Lowe
8a87085820
YARN-2045. Data persisted in NM should be versioned. Contributed by Junping Du
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612285 13f79535-47bb-0310-9956-ffa450edef68
2014-07-21 14:43:59 +00:00
Junping Du
403ec8ea80
YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1611512 13f79535-47bb-0310-9956-ffa450edef68
2014-07-17 23:33:22 +00:00
Jian He
6d7dbd4fed
YARN-1367. Changed NM to not kill containers on NM resync if RM work-preserving restart is enabled. Contributed by Anubhav Dhoot
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1608334 13f79535-47bb-0310-9956-ffa450edef68
2014-07-07 04:37:59 +00:00
Steve Loughran
d1f54f4f4b
YARN-2065 AM cannot create new containers after restart
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607441 13f79535-47bb-0310-9956-ffa450edef68
2014-07-02 18:35:10 +00:00
Vinod Kumar Vavilapalli
e285b98f0f
YARN-2152. Added missing information into ContainerTokenIdentifier so that NodeManagers can report the same to RM when RM restarts. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605205 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 21:43:22 +00:00
Thomas Graves
1f9a0fd927
YARN-2072. RM/NM UIs and webservices are missing vcore information. (Nathan Roberts via tgraves)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605162 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 19:34:34 +00:00
Jason Darrell Lowe
98238a8d4a
YARN-2167. LeveldbIterator should get closed in NMLeveldbStateStoreService#loadLocalizationState() within finally block. Contributed by Junping Du
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1603039 13f79535-47bb-0310-9956-ffa450edef68
2014-06-17 02:12:03 +00:00
Junping Du
072360d128
YARN-1339. Recover DeletionService state upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1603036 13f79535-47bb-0310-9956-ffa450edef68
2014-06-17 01:02:16 +00:00
Jian He
95897ca14b
YARN-1885. Fixed a bug that RM may not send application-clean-up signal to NMs where the completed applications previously ran in case of RM restart. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1603028 13f79535-47bb-0310-9956-ffa450edef68
2014-06-16 23:56:12 +00:00
Bikas Saha
ecfd43a2f1
YARN-2091. Add more values to ContainerExitStatus and pass it from NM to RM and then to app masters (Tsuyoshi OZAWA via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601762 13f79535-47bb-0310-9956-ffa450edef68
2014-06-10 20:08:33 +00:00
Vinod Kumar Vavilapalli
23c325ad47
YARN-2115. Replaced RegisterNodeManagerRequest's ContainerStatus with a new NMContainerStatus which has more information that is needed for work-preserving RM-restart. Contributed by Jian He.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598790 13f79535-47bb-0310-9956-ffa450edef68
2014-05-31 00:20:50 +00:00
Junping Du
66598697a6
YARN-1338. Recover localized resource cache state upon nodemanager restart (Contributed by Jason Lowe)
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598640 13f79535-47bb-0310-9956-ffa450edef68
2014-05-30 15:37:27 +00:00
Junping Du
b29434a5c8
YARN-1362. Distinguish between nodemanager shutdown for decommission vs shutdown for restart. (Contributed by Jason Lowe)
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1594421 13f79535-47bb-0310-9956-ffa450edef68
2014-05-14 00:20:53 +00:00
Ivan Mitic
4810e2b849
YARN-1865. ShellScriptBuilder does not check for some error conditions. Contributed by Remus Rusanu.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1588693 13f79535-47bb-0310-9956-ffa450edef68
2014-04-19 18:55:07 +00:00
Jason Darrell Lowe
cda8646cfa
YARN-1940. deleteAsUser() terminates early without deleting more files on error. Contributed by Rushabh S Shah
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1588546 13f79535-47bb-0310-9956-ffa450edef68
2014-04-18 19:24:13 +00:00
Zhijie Shen
44b6261bfa
YARN-1892. Improved some logs in the scheduler. Contributed by Jian He.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1587717 13f79535-47bb-0310-9956-ffa450edef68
2014-04-15 20:37:44 +00:00
Jian He
ed78328d50
YARN-1903. Set exit code and diagnostics when container is killed at NEW/LOCALIZING state. Contributed by Zhijie Shen
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1586522 13f79535-47bb-0310-9956-ffa450edef68
2014-04-11 01:26:36 +00:00
Karthik Kambatla
245012a9d9
YARN-1757. NM Recovery. Auxiliary service support. (Jason Lowe via kasha)
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1585783 13f79535-47bb-0310-9956-ffa450edef68
2014-04-08 17:15:58 +00:00
Jian He
6a89e57b8d
YARN-1206. Fixed AM container log to show on NM web page after application finishes if log-aggregation is disabled. Contributed by Rohith Sharmaks
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1578614 13f79535-47bb-0310-9956-ffa450edef68
2014-03-17 21:49:06 +00:00
Vinod Kumar Vavilapalli
96e0ca2d27
YARN-1824. Improved NodeManager and clients to be able to handle cross platform application submissions. Contributed by Jian He.
...
MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to handle cross platform application submissions. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1578135 13f79535-47bb-0310-9956-ffa450edef68
2014-03-16 18:32:05 +00:00
Christopher Douglas
53790d3300
YARN-1771. Reduce the number of NameNode operations during localization of public resources using a cache. Contributed by Sangjin Lee
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1577391 13f79535-47bb-0310-9956-ffa450edef68
2014-03-14 00:30:35 +00:00
Vinod Kumar Vavilapalli
8aab8533a1
YARN-1800. Fixed NodeManager to gracefully handle RejectedExecutionException in the public-localizer thread-pool. Contributed by Varun Vasudev.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1576545 13f79535-47bb-0310-9956-ffa450edef68
2014-03-11 23:33:56 +00:00
Vinod Kumar Vavilapalli
0b1304d098
YARN-1781. Modified NodeManagers to allow admins to specify max disk utilization for local disks so as to be able to offline full disks. Contributed by Varun Vasudev.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1575463 13f79535-47bb-0310-9956-ffa450edef68
2014-03-08 00:52:06 +00:00
Vinod Kumar Vavilapalli
1c4047b0e4
YARN-1783. Fixed a bug in NodeManager's status-updater that was losing completed container statuses when NodeManager is forced to resync by the ResourceManager. Contributed by Jian He.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1575437 13f79535-47bb-0310-9956-ffa450edef68
2014-03-07 22:36:47 +00:00
Vinod Kumar Vavilapalli
d07f855892
YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1571474 13f79535-47bb-0310-9956-ffa450edef68
2014-02-24 22:41:24 +00:00
Vinod Kumar Vavilapalli
990cffdcfa
YARN-1553. Modified YARN and MR to stop using HttpConfig.isSecure() and instead rely on the http policy framework. And also fix some bugs related to https handling in YARN web-apps. Contributed by Haohui Mai.
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1568501 13f79535-47bb-0310-9956-ffa450edef68
2014-02-14 20:01:02 +00:00
Sanford Ryza
9024ad4aa0
YARN-1697. NodeManager reports negative running containers (Sandy Ryza)
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1567356 13f79535-47bb-0310-9956-ffa450edef68
2014-02-11 20:14:30 +00:00
Karthik Kambatla
d57c6e0fe7
YARN-1672. YarnConfiguration is missing a default for yarn.nodemanager.log.retain-seconds (Naren Koneru via kasha)
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1565866 13f79535-47bb-0310-9956-ffa450edef68
2014-02-08 01:55:33 +00:00
Jason Darrell Lowe
3497e76e19
YARN-1575. Public localizer crashes with "Localized unkown resource". Contributed by Jason Lowe
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1561110 13f79535-47bb-0310-9956-ffa450edef68
2014-01-24 18:54:48 +00:00
Jason Darrell Lowe
a6ea460a91
MAPREDUCE-5672. Provide optional RollingFileAppender for container log4j (syslog). Contributed by Gera Shegalov
...
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1558948 13f79535-47bb-0310-9956-ffa450edef68
2014-01-16 22:56:09 +00:00