Commit Graph

345 Commits

Author SHA1 Message Date
Robert Kanter
9d4d30243b Remove parent's env vars from child processes 2016-04-29 09:25:51 -07:00
Varun Vasudev
0f25a1bb52 YARN-3998. Add support in the NodeManager to re-launch containers. Contributed by Jun Gong. 2016-04-29 16:09:07 +05:30
Jian He
4a8508501b YARN-5009. NMLeveldbStateStoreService database can grow substantially leading to longer recovery times. Contributed by Jason Lowe 2016-04-28 21:54:11 -07:00
Karthik Kambatla
1a3f1482e2 YARN-4795. ContainerMetrics drops records. (Daniel Templeton via kasha) 2016-04-26 06:15:36 -07:00
Arun Suresh
c282a08f38 YARN-2885. Create AMRMProxy request interceptor and ContainerAllocator to distribute OPPORTUNISTIC containers to appropriate Nodes (asuresh) (cherry picked from commit 2bf025278a318b0452fdc9ece4427b4c42124e39) 2016-04-24 22:38:33 -07:00
Jing Zhao
63e5412f1a HDFS-9427. HDFS should not default to ephemeral ports. Contributed by Xiaobing Zhou. 2016-04-22 15:14:40 -07:00
Karthik Kambatla
c8172f5f14 YARN-2883. Queuing of container requests in the NM. (Konstantinos Karanasos and Arun Suresh via kasha) 2016-04-20 09:55:50 -07:00
Jason Lowe
3150ae8108 YARN-4924. NM recovery race can lead to container not cleaned up. Contributed by sandflee 2016-04-14 19:17:14 +00:00
Naganarasimha
437e9d6475 YARN-4810. NM applicationpage cause internal error 500. Contributed by Bibin A Chundatt. 2016-04-12 17:59:46 +05:30
Vinod Kumar Vavilapalli
44bbc50d91 YARN-4168. Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull. Contributed by Takashi Ohnishi. 2016-04-11 12:11:14 -07:00
Karthik Kambatla
e82f961a39 YARN-4756. Unnecessary wait in Node Status Updater during reboot. (Eric Badger via kasha) 2016-04-07 17:05:29 -07:00
Varun Vasudev
b41e65e5bc YARN-4906. Capture container start/finish time in container metrics. Contributed by Jian He. 2016-04-06 13:41:33 +05:30
Junping Du
0005816743 YARN-4916. TestNMProxy.tesNMProxyRPCRetry fails. Contributed by Tibor Kiss. 2016-04-05 09:01:08 -07:00
naganarasimha
5092c94195 YARN-4746. yarn web services should convert parse failures of appId, appAttemptId and containerId to 400. Contributed by Bibin A Chundatt 2016-04-04 16:25:03 +05:30
Jian He
0dd9bcab97 YARN-4811. Generate histograms in ContainerMetrics for actual container resource usage 2016-03-31 14:28:13 -07:00
Allen Wittenauer
0a74610d1c HADOOP-11393. Revert HADOOP_PREFIX, go back to HADOOP_HOME (aw) 2016-03-31 07:51:05 -07:00
Jason Lowe
948b758070 YARN-4773. Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled. Contributed by Jun Gong 2016-03-28 23:00:56 +00:00
Robert Kanter
22ca176dfe TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup (templedf via rkanter) 2016-03-15 10:05:10 -07:00
Vinod Kumar Vavilapalli
b2661765a5 YARN-4762. Fixed CgroupHandler's creation and usage to avoid NodeManagers crashing when LinuxContainerExecutor is enabled. (Sidharta Seethana via vinodkv) 2016-03-07 11:08:17 -08:00
Jason Lowe
059caf9989 YARN-4744. Too many signal to container failure in case of LCE. Contributed by Sidharta Seethana 2016-03-07 15:40:01 +00:00
Haohui Mai
0fa54d45b1 HADOOP-12813. Migrate TestRPC and related codes to rebase on ProtobufRpcEngine. Contributed by Kai Zheng. 2016-02-29 11:41:00 -08:00
Ming Ma
7f3139e54d YARN-4720. Skip unnecessary NN operations in log aggregation. (Jun Gong via mingma) 2016-02-26 08:40:05 -08:00
Robert Kanter
d7fdec1e6b YARN-4579. Allow DefaultContainerExecutor container log directory permissions to be configurable (rchiang via rkanter) 2016-02-25 16:36:38 -08:00
Robert Kanter
954dd57043 YARN-4697. NM aggregation thread pool is not bound by limits (haibochen via rkanter) 2016-02-24 15:00:24 -08:00
Jason Lowe
d284e187b8 YARN-2046. Out of band heartbeats are sent only on container kill and possibly too early. Contributed by Ming Ma 2016-02-23 20:49:09 +00:00
Varun Vasudev
140cb5d745 YARN-4709. NMWebServices produces incorrect JSON for containers. Contributed by Varun Saxena. 2016-02-23 12:29:25 +05:30
Varun Vasudev
fa00d3e205 YARN-4655. Log uncaught exceptions/errors in various thread pools in YARN. Contributed by Sidharta Seethana. 2016-02-11 12:06:42 +05:30
Wangda Tan
9875325d5c YARN-4340. Add list API to reservation system. (Sean Po via wangda) 2016-02-02 10:17:33 +08:00
Rohith Sharma K S
ac68666803 YARN-4543. Fix random test failure in TestNodeStatusUpdater.testStopReentrant. (Akihiro Suda via rohithsharmaks) 2016-01-29 12:29:54 +05:30
Jason Lowe
61382ff8fa YARN-4643. Container recovery is broken with delegating container runtime. Contributed by Sidharta Seethana 2016-01-28 18:59:35 +00:00
Vinod Kumar Vavilapalli (I am also known as @tshooter.)
2085e60a96 YARN-3542. Refactored existing CPU cgroups support to use the newer and integrated ResourceHandler mechanism, and also deprecated the old LCEResourceHandler inteface hierarchy. Contributed by Varun Vasudev. 2016-01-25 16:19:36 -08:00
Xuan
618bfd6ac2 YARN-4496. Improve HA ResourceManager Failover detection on the client. Contributed by Jian He 2016-01-22 18:20:38 -08:00
Varun Vasudev
b41a7e89d1 YARN-4578. Directories that are mounted in docker containers need to be more restrictive/container-specific. Contributed by Sidharta Seethana. 2016-01-22 14:43:14 +05:30
Wangda Tan
89d1fd5dac HADOOP-12356. Fix computing CPU usage statistics on Windows. (Inigo Goiri via wangda) 2016-01-19 21:27:38 +08:00
Varun Vasudev
3ddb92bd30 YARN-4553. Add cgroups support for docker containers. Contributed by Sidharta Seethana. 2016-01-14 14:29:29 +05:30
Jason Lowe
13de8359a1 YARN-4414. Nodemanager connection errors are retried at multiple levels. Contributed by Chang Li 2016-01-12 15:56:15 +00:00
Steve Loughran
07d1cb612c YARN-4550. Some tests in TestContainerLanch fails on non-english locale environment. (Takashi Ohnishi via stevel) 2016-01-07 14:30:20 +00:00
rohithsharmaks
791c1639ae YARN-4393. Fix intermittent test failure for TestResourceLocalizationService#testFailedDirsResourceRelease (Varun Saxana via rohithsharmaks) 2016-01-07 09:38:47 +05:30
Gera Shegalov
2c17b81569 YARN-2934. Improve handling of container's stderr. (Naganarasimha G R via gera) 2015-12-24 23:48:05 -08:00
Uma Mahesh
0f82b5d878 YARN-4480. Clean up some inappropriate imports. (Kai Zheng via umamahesh) 2015-12-19 23:10:13 -08:00
Vinod Kumar Vavilapalli (I am also known as @tshooter.)
4e7d32c0db YARN-1856. Added cgroups based memory monitoring for containers as another alternative to custom memory-monitoring. Contributed by Varun Vasudev. 2015-12-17 12:13:03 -08:00
Jian He
915cd6c3f4 YARN-4402. TestNodeManagerShutdown And TestNodeManagerResync fails with bind exception. Contributed by Brahma Reddy Battula 2015-12-14 14:59:01 -08:00
Wangda Tan
dfcbbddb09 YARN-4309. Add container launch related debug information to container logs when a container fails. (Varun Vasudev via wangda) 2015-12-14 11:13:22 -08:00
Junping Du
62e9348bc1 YARN-4408. Fix issue that NodeManager still reports negative running containers. Contributed by Robert Kanter. 2015-12-03 06:36:37 -08:00
Tsuyoshi Ozawa
0656d2dc83 YARN-4380. TestResourceLocalizationService.testDownloadingResourcesOnContainerKill fails intermittently. Contributed by Varun Saxena. 2015-11-26 01:10:02 +09:00
Jason Lowe
4ac6799d4a YARN-4132. Separate configs for nodemanager to resourcemanager connection timeout and retries. Contributed by Chang Li 2015-11-24 22:35:37 +00:00
Junping Du
855d52927b YARN-4354. Public resource localization fails with NPE. Contributed by Jason Lowe. 2015-11-15 04:43:57 -08:00
Jason Lowe
e2267de207 YARN-2902. Killing a container that is localizing can orphan resources in the DOWNLOADING state. Contributed by Varun Saxena 2015-10-29 16:34:25 +00:00
Wangda Tan
6f606214e7 YARN-4169. Fix racing condition of TestNodeStatusUpdaterForLabels. (Naganarasimha G R via wangda) 2015-10-26 16:36:34 -07:00
Rohith Sharma K S
5acdde4744 YARN-2729. Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup. (Naganarasimha G R via rohithsharmaks) 2015-10-26 15:42:42 +05:30