Commit Graph

1827 Commits

Author SHA1 Message Date
Daniel Templeton
3b9d3acd20 YARN-5890. FairScheduler should log information about AM-resource-usage and max-AM-share for queues (Contributed by Yufei Gu via Daniel Templeton) 2016-11-29 12:46:05 -08:00
Daniel Templeton
25f9872be6 YARN-5774. MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set yarn.scheduler.minimum-allocation-mb to 0. (Contributed by Yufei Gu via Daniel Templeton) 2016-11-29 09:40:49 -08:00
Kai Zheng
5d5614f847 HDFS-10994. Support an XOR policy XOR-2-1-64k in HDFS. Contributed by Sammi Chen 2016-11-28 14:34:44 +08:00
Sunil
eb0a483ed0 YARN-4206. Add Application timeouts in Application report and CLI. Contributed by Rohith Sharma K S. 2016-11-24 18:18:42 +05:30
Rohith Sharma K S
e15c20edba YARN-5920. Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang. Contributed by Varun Saxena. 2016-11-24 12:18:38 +05:30
Daniel Templeton
10468529a9 YARN-4752. Improved preemption in FairScheduler. (kasha)
Contains:
YARN-5605. Preempt containers (all on one node) to meet the requirement of starved applications
YARN-5821. Drop left-over preemption-related code and clean up method visibilities in the Schedulable hierarchy
YARN-5783. Verify identification of starved applications.
YARN-5819. Verify fairshare and minshare preemption
YARN-5885. Cleanup YARN-4752 branch for merge

Change-Id: Iee0962377d019dd64dc69a020725d2eaf360858c
2016-11-23 19:48:59 -10:00
Jian He
1f12867a69 YARN-5649. Add REST endpoints for updating application timeouts. Contributed by Rohith Sharma K S 2016-11-23 16:25:39 -08:00
Arun Suresh
005850b28f YARN-5918. Handle Opportunistic scheduling allocate request failure when NM is lost. (Bibin A Chundatt via asuresh) 2016-11-23 09:53:31 -08:00
Rohith Sharma K S
a926f895c1 YARN-5865. Retrospect updateApplicationPriority api to handle state store exception in align with YARN-5611. Contributed by Sunil G. 2016-11-22 14:49:15 +05:30
Rohith Sharma K S
d65603517e YARN-5375. invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures. Contributed by sandflee. 2016-11-16 15:14:00 +05:30
Xiao Chen
f121d0b036 YARN-5875. TestTokenClientRMService#testTokenRenewalWrongUser fails. Contributed by Gergely Novák. 2016-11-15 13:58:11 -08:00
Rohith Sharma K S
b7070f3308 YARN-5874. RM -format-state-store and -remove-application-from-state-store commands fail with NPE. Contributed by Varun Saxena. 2016-11-15 10:58:25 +05:30
Jian He
fad9609d13 YARN-5825. ProportionalPreemptionalPolicy should use readLock over LeafQueue instead of synchronized block. Contributed by Sunil G 2016-11-11 15:16:21 -08:00
Naganarasimha
503e73e849 YARN-5545. Fix issues related to Max App in capacity scheduler. Contributed by Bibin A Chundatt 2016-11-11 20:48:31 +05:30
Eric Payne
93eeb13164 YARN-4218. Metric for resource*time that was preempted. Contributed by Chang Li. 2016-11-10 22:35:12 +00:00
Karthik Kambatla
86ac1ad9fd YARN-5453. FairScheduler#update may skip update demand resource of child queue/app if current demand reached maxResource. (sandflee via kasha) 2016-11-09 23:44:02 -08:00
Jian He
bcc15c6290 YARN-5611. Provide an API to update lifetime of an application. Contributed by Rohith Sharma K S 2016-11-09 16:08:05 -08:00
Naganarasimha
edbee9e609 YARN-4498. Application level node labels stats to be available in REST (addendum patch). Contributed by Bibin A Chundatt. 2016-11-10 05:00:05 +05:30
Daniel Templeton
59ee8b7a88 YARN-4329. [YARN-5437] Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler (Contributed by Yufei Gu) 2016-11-09 13:11:37 -08:00
Jason Lowe
3f93ac0733 YARN-5356. NodeManager should communicate physical resource capability to ResourceManager. Contributed by Inigo Goiri 2016-11-08 22:01:26 +00:00
Jian He
de3b4aac56 YARN-5716. Add global scheduler interface definition and update CapacityScheduler to use it. Contributed by Wangda Tan 2016-11-07 10:14:39 -08:00
Jason Lowe
6bb741ff0e YARN-5837. NPE when getting node status of a decommissioned node after an RM restart. Contributed by Robert Kanter 2016-11-04 22:20:21 +00:00
Arun Suresh
0aafc122d4 YARN-2995. Enhance UI to show cluster resource utilization of various container Execution types. (Konstantinos Karanasos via asuresh) 2016-11-04 07:31:54 -07:00
Sunil
19b3779ae7 YARN-5802. updateApplicationPriority api in scheduler should ensure to re-insert app to correct ordering policy. Contributed by Bibin A Chundatt 2016-11-04 16:07:28 +05:30
Jason Lowe
352cbaa7a5 YARN-4862. Handle duplicate completed containers in RMNodeImpl. Contributed by Rohith Sharma K S 2016-11-03 13:54:31 +00:00
Varun Saxena
377919010b YARN-5815. Random failure of TestApplicationPriority.testOrderOfActivatingThePriorityApplicationOnRMRestart (Bibin A Chundatt via Varun Saxena) 2016-11-03 00:37:09 +05:30
Varun Saxena
7d2d8d25ba YARN-5788. Apps not activated and AM limit resource in UI and REST not updated after -replaceLabelsOnNode (Bibin A Chundatt via Varun Saxena) 2016-11-01 15:32:04 +05:30
Wangda Tan
90dd3a8148 YARN-2009. CapacityScheduler: Add intra-queue preemption for app priority support. (Sunil G via wangda) 2016-10-31 15:18:31 -07:00
Daniel Templeton
cc2c993a8a YARN-4907. Make all MockRM#waitForState consistent. (Contributed by Yufei Gu via Daniel Templeton) 2016-10-31 13:20:56 -07:00
Naganarasimha
e0bebbbcdd YARN-4498. Application level node labels stats to be available in REST. Contributed by Bibin A Chundatt 2016-10-31 04:38:20 +05:30
Arun Suresh
aa3cab1eb2 YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http Address. (asuresh) 2016-10-29 02:03:57 -07:00
Varun Saxena
1c8ab41e8b YARN-5773. RM recovery too slow due to LeafQueue#activateApplications (Bibin A Chundatt via Varun Saxena) 2016-10-29 13:47:39 +05:30
Jason Lowe
1eae719bce YARN-4963. capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable. Contributed by Nathan Roberts 2016-10-28 17:30:15 +00:00
Karthik Kambatla
4df8ed63ed YARN-4743. FairSharePolicy breaks TimSort assumption. (Zephyr Guo and Yufei Gu via kasha) 2016-10-27 17:45:48 -07:00
Subru Krishnan
b2c4f24c31 YARN-2306. Add test for leakage of reservation metrics in fair scheduler. (Hong Zhiguo and Yufei Gu via subru). 2016-10-27 17:43:13 -07:00
Robert Kanter
5877f20f9c HADOOP-10075. Update jetty dependency to version 9 (rkanter) 2016-10-27 16:09:00 -07:00
Subru Krishnan
79ae78dcbe YARN-3568. TestAMRMTokens should use some random port. (Takashi Ohnishi via Subru). 2016-10-27 15:11:12 -07:00
Varun Saxena
79aeddc88f YARN-5308. FairScheduler: Move continuous scheduling related tests to TestContinuousScheduling (Kai Sasaki via Varun Saxena) 2016-10-28 00:34:50 +05:30
Naganarasimha
b3c15e4ef7 YARN-5420. Delete org.apache.hadoop.yarn.server.resourcemanager.resource.Priority as its not necessary. Contributed by Sunil G. 2016-10-27 18:22:07 +05:30
Naganarasimha
6c8830992c YARN-3848. TestNodeLabelContainerAllocation is timing out. Contributed by Varun Saxena 2016-10-27 17:10:02 +05:30
Rohith Sharma K S
e29cba61a0 YARN-4363. In TestFairScheduler, testcase should not create FairScheduler redundantly. Contributed by Tao Jie. 2016-10-27 11:57:17 +05:30
Akira Ajisaka
d3bb69a667 YARN-5575. Many classes use bare yarn. properties instead of the defined constants. Contributed by Daniel Templeton. 2016-10-26 15:32:07 +09:00
Karthik Kambatla
754cb4e30f YARN-5047. Refactor nodeUpdate across schedulers. (Ray Chiang via kasha) 2016-10-20 21:17:48 -07:00
Karthik Kambatla
a064865abf YARN-4911. Bad placement policy in FairScheduler causes the RM to crash 2016-10-20 20:57:04 -07:00
Xuan
b733a6f862 YARN-5718. TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior. Contributed by Junping Du. 2016-10-18 11:06:47 -07:00
Varun Saxena
b154d3edce YARN-5743. [Atsv2] Publish queue name and RMAppMetrics to ATS (Rohith Sharma K S via Varun Saxena) 2016-10-18 23:32:52 +05:30
Sangjin Lee
1f304b0c7f YARN-5699. Retrospect yarn entity fields which are publishing in events info fields. Contributed by Rohith Sharma K S. 2016-10-15 13:54:40 -07:00
Karthik Kambatla
6476934ae5 YARN-5677. RM should transition to standby when connection is lost for an extended period. (Daniel Templeton via kasha) 2016-10-11 22:07:10 -07:00
Naganarasimha
0773ffd0f8 YARN-5057. Resourcemanager.security.TestDelegationTokenRenewer fails in trunk. Contributed by Jason Lowe. 2016-10-10 18:04:47 -04:00
Karthik Kambatla
736d33cddd YARN-4767. Network issues can cause persistent RM UI outage. (Daniel Templeton via kasha) 2016-10-03 14:35:57 -07:00
Naganarasimha
6e130c308c YARN-4855. Should check if node exists when replace nodelabels. Contributed by Tao Jie 2016-10-03 02:02:26 -04:00
Subru Krishnan
3a3697deab YARN-5384. Expose priority in ReservationSystem submission APIs. (Sean Po via Subru). 2016-09-30 19:41:43 -07:00
Arun Suresh
10be45986c YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests. (Konstantinos Karanasos via asuresh) 2016-09-29 15:11:41 -07:00
Jian He
2ae5a3a5bf YARN-4205. Add a service for monitoring application life time out. Contributed by Rohith Sharma K S 2016-09-29 22:00:31 +08:00
Varun Saxena
9b0fd01d2e YARN-5599. Publish AM launch command to ATS (Rohith Sharma K S via Varun Saxena) 2016-09-28 16:10:10 +05:30
Arun Suresh
4815d024c5 YARN-5609. Expose upgrade and restart API in ContainerManagementProtocol. Contributed by Arun Suresh 2016-09-26 08:46:54 -07:00
Arun Suresh
2f163cd5cf Revert "YARN-5609. Expose upgrade and restart API in ContainerManagementProtocol. Contributed by Arun Suresh"
This reverts commit fe644bafe7.
2016-09-26 08:36:59 -07:00
Jian He
fe644bafe7 YARN-5609. Expose upgrade and restart API in ContainerManagementProtocol. Contributed by Arun Suresh 2016-09-26 22:41:16 +08:00
Naganarasimha
d0372dc613 YARN-3692. Allow REST API to set a user generated message when killing an application. Contributed by Rohith Sharma K S 2016-09-23 06:30:49 +05:30
Arun Suresh
9f03b403ec YARN-5656. Fix ReservationACLsTestBase. (Sean Po via asuresh) 2016-09-20 12:27:17 -07:00
Jian He
2b66d9ec5b YARN-3140. Improve locks in AbstractCSQueue/LeafQueue/ParentQueue. Contributed by Wangda Tan 2016-09-20 15:03:31 +08:00
Jason Lowe
7558dbbb48 YARN-5540. Scheduler spends too much time looking at empty priorities. Contributed by Jason Lowe 2016-09-19 20:31:35 +00:00
Kai Zheng
58bae35447 YARN-5163. Migrate TestClientToAMTokens and TestClientRMTokens tests from the old RPC engine. Contributed by Wei Zhou and Kai Zheng 2016-09-18 08:43:36 +08:00
Karthik Kambatla
f6ea9be547 YARN-5264. Store all queue-specific information in FSQueue. (Yufei Gu via kasha) 2016-09-02 14:56:29 -07:00
Varun Vasudev
05f5c0f631 YARN-5555. Scheduler UI: "% of Queue" is inaccurate if leaf queue is hierarchically nested. Contributed by Eric Payne. 2016-09-02 16:02:01 +05:30
Karthik Kambatla
74f4bae455 YARN-5566. Client-side NM graceful decom is not triggered when jobs finish. (Robert Kanter via kasha) 2016-09-01 14:44:01 -07:00
Arun Suresh
d6d9cff21b YARN-5221. Expose UpdateResourceRequest API to allow AM to request for change in container properties. (asuresh) 2016-08-30 15:52:29 -07:00
Subru Krishnan
b930dc3ec0 YARN-5327. API changes required to support recurring reservations in the YARN ReservationSystem. (Sangeetha Abdu Jyothi via Subru). 2016-08-26 16:58:47 -07:00
Junping Du
9ef632f3b0 YARN-5557. Add localize API to the ContainerManagementProtocol. Contributed by Jian He. 2016-08-26 09:04:44 -07:00
Naganarasimha
46e02ab719 YARN-3940. Application moveToQueue should check NodeLabel permission. Contributed by Bibin A Chundatt 2016-08-26 20:19:11 +05:30
Naganarasimha
27c3b86252 YARN-5564. Fix typo in RM_SCHEDULER_RESERVATION_THRESHOLD_INCREMENT_MULTIPLE. Contributed by Ray Chiang 2016-08-26 08:47:21 +05:30
Rohith Sharma K S
0d5997d2b9 YARN-5544. TestNodeBlacklistingOnAMFailures fails on trunk. Contributed by Sunil G. 2016-08-23 14:37:39 +05:30
Wangda Tan
444b2ea7af YARN-3388. Allocation in LeafQueue could get stuck because DRF calculator isn't well supported when computing user-limit. (Nathan Roberts via wangda) 2016-08-19 16:28:32 -07:00
Varun Saxena
091dd19e86 YARN-5533. JMX AM Used metrics for queue wrong when app submited to nodelabel partition (Bibin A Chundatt via Varun Saxena) 2016-08-19 17:30:17 +05:30
Varun Saxena
8aed374182 Revert "YARN-5533. JMX AM Used metrics for queue wrong when app submited to nodelabel partition (Bibin A Chundatt via Varun Saxena)"
This reverts commit 59557e85a4.
2016-08-19 16:14:16 +05:30
Varun Saxena
59557e85a4 YARN-5533. JMX AM Used metrics for queue wrong when app submited to nodelabel partition (Bibin A Chundatt via Varun Saxena) 2016-08-19 15:01:48 +05:30
Junping Du
0da69c324d YARN-4676. Automatic and Asynchronous Decommissioning Nodes Status Tracking. Contributed by Diniel Zhi.
(cherry picked from commit d464483bf7f0b3e3be3ba32cd6c3eee546747ab5)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
2016-08-18 07:27:23 -07:00
Karthik Kambatla
20f0eb871c YARN-4702. FairScheduler: Allow setting maxResources for ad hoc queues. (Daniel Templeton via kasha) 2016-08-17 17:40:20 -07:00
Varun Saxena
24249115bf YARN-5521. Fix random failure of TestCapacityScheduler#testKillAllAppsInQueue (sandflee via Varun Saxena) 2016-08-16 00:03:29 +05:30
Varun Saxena
d677b68c25 YARN-5491. Fix random failure of TestCapacityScheduler#testCSQueueBlocked (Bibin A Chundatt via Varun Saxena) 2016-08-15 03:31:21 +05:30
Varun Saxena
23c6e3c4e4 YARN-5476. Non existent application reported as ACCEPTED by YarnClientImpl (Junping Du via Varun Saxena) 2016-08-12 20:37:58 +05:30
Naganarasimha
874577a67d YARN-4833. For Queue AccessControlException client retries multiple times on both RM. Contributed by Bibin A Chundatt 2016-08-12 01:09:41 +05:30
Rohith Sharma K S
5199db387d YARN-5492. TestSubmitApplicationWithRMHA is failing sporadically during precommit builds. Contributed by Vrushali C. 2016-08-11 11:50:46 +05:30
Jason Lowe
5c95bb315b YARN-5382. RM does not audit log kill request for active applications. Contributed by Vrushali C 2016-08-10 18:25:54 +00:00
Karthik Kambatla
7992c0b42c YARN-5343. TestContinuousScheduling#testSortedNodes fails intermittently. (Yufei Gu via kasha) 2016-08-09 16:51:03 -07:00
Arun Suresh
82c9e06101 YARN-5457. Refactor DistributedScheduling framework to pull out common functionality. (asuresh) 2016-08-09 00:42:29 -07:00
Wangda Tan
3f100d76ff YARN-4888. Changes in scheduler to identify resource-requests explicitly by allocation-id. (Subru Krishnan via wangda) 2016-08-05 10:43:35 -07:00
Wangda Tan
e0d131f055 YARN-4091. Add REST API to retrieve scheduler activity. (Chen Ge and Sunil G via wangda) 2016-08-05 10:27:34 -07:00
Rohith Sharma K S
d9a354c2f3 YARN-5333. Some recovered apps are put into default queue when RM HA. Contributed by Jun Gong. 2016-08-05 21:35:49 +05:30
Jason Lowe
4d92aefd35 YARN-4280. CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster. Contributed by Kuhu Shukla 2016-08-03 18:53:14 +00:00
Arun Suresh
e5766b1dbe YARN-5113. Refactoring and other clean-up for distributed scheduling. (Konstantinos Karanasos via asuresh) 2016-07-31 11:48:25 -07:00
Subru Krishnan
4e756d7271 YARN-5203. Return ResourceRequest JAXB object in ResourceManager Cluster Applications REST API. Contributed by Ellen Hui. 2016-07-28 16:03:24 -07:00
Wangda Tan
d62e121ffc YARN-5195. RM intermittently crashed with NPE while handling APP_ATTEMPT_REMOVED event when async-scheduling enabled in CapacityScheduler. (sandflee via wangda) 2016-07-26 21:22:59 -07:00
Wangda Tan
49969b16cd YARN-5342. Improve non-exclusive node partition resource allocation in Capacity Scheduler. (Sunil G via wangda) 2016-07-26 18:14:09 -07:00
Arun Suresh
5aace38b74 YARN-5392. Replace use of Priority in the Scheduling infrastructure with an opaque SchedulerRequestKey. (asuresh and subru) 2016-07-26 14:54:03 -07:00
Chris Douglas
d383bfdcd4 YARN-5164. Use plan RLE to improve CapacityOverTimePolicy efficiency 2016-07-25 16:37:50 -07:00
Rohith Sharma K S
557a245d83 YARN-5092. TestRMDelegationTokens fails intermittently. Contributed by Jason Lowe. 2016-07-21 12:47:27 +05:30
Arun Suresh
cda0a280dd YARN-5181. ClusterNodeTracker: add method to get list of nodes matching a specific resourceName. (kasha via asuresh) 2016-07-19 10:43:37 -07:00
Arun Suresh
5f2d33a551 Revert "YARN=5181. ClusterNodeTracker: add method to get list of nodes matching a specific resourceName. (kasha via asuresh)"
This reverts commit e905a42a2c.
2016-07-19 10:43:19 -07:00
Varun Saxena
fe20494a72 YARN-4996. Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic (Kai Sasaki via Varun Saxena) 2016-07-19 16:03:28 +05:30
Ray Chiang
f5f1c81e7d YARN-5272. Handle queue names consistently in FairScheduler. (Wilfred Spiegelenburg via rchiang) 2016-07-15 14:38:50 -07:00
Arun Suresh
e905a42a2c YARN=5181. ClusterNodeTracker: add method to get list of nodes matching a specific resourceName. (kasha via asuresh) 2016-07-15 14:35:12 -07:00
Wangda Tan
24db9167f1 YARN-4484. Available Resource calculation for a queue is not correct when used with labels. (Sunil G via wangda) 2016-07-15 11:40:12 -07:00
Rohith Sharma K S
d6d41e820a YARN-5362. TestRMRestart#testFinishedAppRemovalAfterRMRestart can fail. Contributed by sandflee. 2016-07-13 19:12:35 +05:30
Varun Saxena
06c56ff79b YARN-5353. ResourceManager can leak delegation tokens when they are shared across apps. (Jason Lowe via Varun Saxena). 2016-07-13 07:55:34 +05:30
Jason Lowe
10b704c594 YARN-5317. testAMRestartNotLostContainerCompleteMsg may fail. Contributed by sandflee 2016-07-12 20:27:41 +00:00
Jian He
819224dcf9 YARN-5270. Solve miscellaneous issues caused by YARN-4844. Contributed by Wangda Tan 2016-07-11 22:36:20 -07:00
Varun Saxena
0fd3980a1f YARN-5037. Fix random failure of TestRMRestart#testQueueMetricsOnRMRestart (sandflee via Varun Saxena). 2016-07-10 21:28:52 +05:30
Sangjin Lee
6cf6ab7b78 Made a number of miscellaneous fixes for javac, javadoc, and checkstyle warnings. 2016-07-10 08:46:05 -07:00
Li Lu
0a9b085f05 YARN-5189. Make HBaseTimeline[Reader|Writer]Impl default and move FileSystemTimeline*Impl. (Joep Rottinghuis and Sangjin Lee via gtcarrera9) 2016-07-10 08:46:01 -07:00
Sangjin Lee
702236129b YARN-5095. flow activities and flow runs are populated with wrong timestamp when RM restarts w/ recovery enabled (Varun Saxena via sjlee) 2016-07-10 08:46:00 -07:00
Li Lu
c2055a97d5 YARN-3150. Documenting the timeline service v2. (Sangjin Lee and Vrushali C via gtcarrera9) 2016-07-10 08:45:57 -07:00
Varun Saxena
a3cf40e532 YARN-3461. Consolidate flow name/version/run defaults. (Sangjin Lee via Varun Saxena) 2016-07-10 08:45:55 -07:00
Naganarasimha
6934b05c71 YARN-4238. createdTime and modifiedTime is not reported while publishing entities to ATSv2. (Varun Saxena via Naganarasimha G R) 2016-07-10 08:45:52 -07:00
Li Lu
34f02f07d5 Rebase to latest trunk 2016-07-10 08:45:51 -07:00
Varun Saxena
829cceebc0 YARN-3586. RM to only get back addresses of Collectors that NM needs to know. (Junping Du via Varun Saxena). 2016-07-10 08:45:50 -07:00
Li Lu
8ef546c1ee YARN-4445. Unify the term flowId and flowName in timeline v2 codebase. Contributed by Zhan Zhang. 2016-07-10 08:45:49 -07:00
Varun Saxena
c4d7bbda5c YARN-4460. [Bug fix] RM fails to start when SMP is enabled. (Li Lu via Varun Saxena) 2016-07-10 08:45:49 -07:00
Xuan
2e2dbf59d1 YARN-4392. ApplicationCreatedEvent event time resets after RM restart/failover. Contributed by Naganarasimha G R and Xuan Gong

(cherry picked from commit 4546c7582b)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
2016-07-10 08:45:49 -07:00
Li Lu
89e5c44f9e YARN-4356. Ensure the timeline service v.2 is disabled cleanly and has no impact when it's turned off. Contributed by Sangjin Lee. 2016-07-10 08:45:48 -07:00
Sangjin Lee
10ec5586fb YARN-4129. Refactor the SystemMetricPublisher in RM to better support newer events (Naganarasimha G R via sjlee) 2016-07-10 08:45:46 -07:00
Sangjin Lee
8d9476ec5f YARN-4058. Miscellaneous issues in NodeManager project (Naganarasimha G R via sjlee) 2016-07-10 08:45:43 -07:00
Zhijie Shen
f3c661e8dd YARN-3044. Made RM write app, attempt and optional container lifecycle events to timeline service v2. Contributed by Naganarasimha G R. 2016-07-10 08:45:37 -07:00
Sangjin Lee
dc1f306fdc YARN-3562. unit tests failures and issues found from findbug from earlier ATS checkins (Naganarasimha G R via sjlee) 2016-07-10 08:45:35 -07:00
Junping Du
2188a07e5b YARN-3333. Rename TimelineAggregator etc. to TimelineCollector. Contributed by Sangjin Lee 2016-07-10 08:45:31 -07:00
Zhijie Shen
9b56364080 YARN-3039. Implemented the app-level timeline aggregator discovery service. Contributed by Junping Du. 2016-07-10 08:45:31 -07:00
Varun Saxena
c04c5ec501 YARN-5318. Fix intermittent test failure of TestRMAdminService#testRefreshNodesResourceWithFileSystemBasedConfigurationProvider. (Jun Gong via Varun Saxena). 2016-07-09 01:13:18 +05:30
Junping Du
30ee57ceb1 YARN-4939. The decommissioning Node should keep alive during NM restart. Contributed by sandflee. 2016-07-08 04:14:53 -07:00
Wangda Tan
04f6ebb66a YARN-5294. Pass remote ip address down to YarnAuthorizationProvider. (Jian He via wangda) 2016-07-06 10:36:48 -07:00
Varun Saxena
8e672e3c71 YARN-5286. Add RPC port info in RM web service's response when getting app status. (Jun Gong via Varun Saxena). 2016-07-05 22:56:07 +05:30
Jian He
c35a5a7a8d YARN-5023. TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry fails. Contributed by sandflee 2016-07-01 14:29:03 -07:00
Varun Saxena
abe7fc22c1 YARN-5182. MockNodes.newNodes creates one more node per rack than requested. (Karthik Kambatla via Varun Saxena). 2016-06-30 00:13:28 +05:30
Rohith Sharma K S
26b5e6116f YARN-5262. Optimize sending RMNodeFinishedContainersPulledByAMEvent for every AM heartbeat. 2016-06-29 10:08:30 +05:30
Akira Ajisaka
a8a48c9125 YARN-5278. Remove unused argument in TestRMWebServicesForCSWithPartitions#setupQueueConfiguration. Contributed by Tao Jie. 2016-06-23 14:28:12 +09:00
Arun Suresh
99e5dd68d0 YARN-5171. Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node. (Inigo Goiri via asuresh) 2016-06-22 19:04:54 -07:00
Tsuyoshi Ozawa
5d58858bb6 HADOOP-9613. [JDK8] Update jersey version to latest 1.x release. 2016-06-21 08:05:32 +09:00
Karthik Kambatla
20f2799938 YARN-5077. Fix FSLeafQueue#getFairShare() for queues with zero fairshare. (Yufei Gu via kasha) 2016-06-17 22:24:42 -07:00
Karthik Kambatla
fbbe0bb627 YARN-5082. Limit ContainerId increase in fair scheduler if the num of node app reserved reached the limit. Addendum to fix javac warning. (Arun Suresh via kasha) 2016-06-17 22:12:50 -07:00
Wangda Tan
c77a1095dc YARN-1942. Deprecate toString/fromString methods from ConverterUtils and move them to records classes like ContainerId/ApplicationId, etc. (wangda) 2016-06-14 15:06:38 -07:00
Rohith Sharma K S
28b66ae919 YARN-4989. TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails intermittently. Contributed by Ajith S. 2016-06-13 11:09:32 +05:30
Arun Suresh
5279af7cd4 YARN-5082. Limit ContainerId increase in fair scheduler if the num of node app reserved reached the limit (sandflee via asuresh) 2016-06-10 22:33:42 -07:00
Rohith Sharma K S
e0f4620cc7 YARN-5197. RM leaks containers if running container disappears from node update. Contributed by Jason Lowe. 2016-06-11 10:22:27 +05:30
Wangda Tan
244506f9c8 YARN-5208. Run TestAMRMClient TestNMClient TestYarnClient TestClientRMTokens TestAMAuthorization tests with hadoop.security.token.service.use_ip enabled. (Rohith Sharma K S via wangda) 2016-06-10 09:34:32 -07:00
Wangda Tan
620325e816 YARN-4837. User facing aspects of 'AM blacklisting' feature need fixing. (vinodkv via wangda) 2016-06-07 15:06:42 -07:00
Arun Suresh
3a154f75ed YARN-4525. Fix bug in RLESparseResourceAllocation.getRangeOverlapping(). (Ishai Menache and Carlo Curino via asuresh) 2016-06-06 21:18:32 -07:00
Ming Ma
4a1cedc010 MAPREDUCE-5044. Have AM trigger jstack on task attempts that timeout before killing them. (Eric Payne and Gera Shegalov via mingma) 2016-06-06 14:30:51 -07:00
Arun Suresh
db54670e83 YARN-5165. Fix NoOvercommitPolicy to take advantage of RLE representation of plan. (Carlo Curino via asuresh) 2016-06-03 14:49:32 -07:00
Vinod Kumar Vavilapalli
f10ebc67f5 YARN-5098. Fixed ResourceManager's DelegationTokenRenewer to replace expiring system-tokens if RM stops and only restarts after a long time. Contributed by Jian He. 2016-06-03 13:00:07 -07:00
Jian He
097baaaeba YARN-1815. Work preserving recovery of Unmanaged AMs. Contributed by Subru Krishnan 2016-06-03 10:49:30 -07:00
Arun Suresh
dc26601d8f YARN-5180. Allow ResourceRequest to specify an enforceExecutionType flag. (asuresh) 2016-06-02 09:01:02 -07:00
Varun Vasudev
42f90ab885 YARN-4844. Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource. Contributed by Wangda Tan. 2016-05-29 21:24:16 +05:30
Arun Suresh
aa975bc781 YARN-5127. Expose ExecutionType in Container api record. (Hitesh Sharma via asuresh) 2016-05-27 14:06:32 -07:00
Kai Zheng
916140604f HADOOP-12911. Upgrade Hadoop MiniKDC with Kerby. Contributed by Jiajia Li 2016-05-28 14:23:39 +08:00
Rohith Sharma K S
0a544f8a3e YARN-5005. TestRMWebServices#testDumpingSchedulerLogs fails randomly. Contributed by Bibin A Chundatt. 2016-05-27 10:44:35 +05:30
Arun Suresh
5b41b288d0 YARN-5162. Fix Exceptions thrown during in registerAM call when Distributed Scheduling is Enabled (Hitesh Sharma via asuresh) 2016-05-26 14:56:37 -07:00
Karthik Kambatla
04ded558b0 YARN-5035. FairScheduler: Adjust maxAssign dynamically when assignMultiple is turned on. (kasha) 2016-05-26 14:41:07 -07:00
Karthik Kambatla
4f513a4a8e YARN-4866. FairScheduler: AMs can consume all vcores leading to a livelock when using FAIR policy. (Yufei Gu via kasha) 2016-05-25 22:13:27 -07:00
Carlo Curino
013532a95e YARN-4957. Add getNewReservation in ApplicationClientProtocol (Sean Po via curino) 2016-05-25 16:55:49 -07:00
Rohith Sharma K S
28bd63e92b YARN-5024. TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers random failure. Contributed by Bibin A Chundatt 2016-05-25 10:15:50 +05:30
Naganarasimha
edd716e99c YARN-5114. Add additional tests in TestRMWebServicesApps and rectify testInvalidAppAttempts failure in 2.8. Contributed by Bibin A Chundatt 2016-05-25 06:11:38 +08:00
Karthik Kambatla
f979d779e1 YARN-4878. Expose scheduling policy and max running apps over JMX for Yarn queues. (Yufei Gu via kasha) 2016-05-24 10:54:11 -07:00
Naganarasimha
b4078bd17b YARN-3971. Skip RMNodeLabelsManager#checkRemoveFromClusterNodeLabelsOfQueue on nodelabel recovery. (addendum patch). Contributed by Bibin A chundatt 2016-05-24 08:06:53 +08:00
Karthik Kambatla
6d043aa4cf YARN-4979. FSAppAttempt demand calculation considers demands at multiple locality levels different. (Zhihai Xu via kasha) 2016-05-23 14:29:28 -07:00
Jason Lowe
ac954486c5 YARN-5055. max apps per user can be larger than max per queue. Contributed by Eric Badger 2016-05-23 15:54:42 +00:00
Junping Du
22fcd819f0 YARN-5076. YARN web interfaces lack XFS protection. Contributed by Jonathan Maron.
(cherry picked from commit 2703ec68712279494d67b0d76b7ac10e7a1628be)
2016-05-19 14:15:21 -07:00
Arun Suresh
8a9ecb7584 YARN-5090. Add End-to-End test-cases for DistributedScheduling using MiniYarnCluster. (asuresh) 2016-05-17 19:01:29 -07:00
Jian He
fa3bc3405d YARN-4832. NM side resource value should get updated if change applied in RM side. Contributed by Junping Du 2016-05-17 12:52:19 -07:00
Eric Payne
1217c8f6b4 YARN-5069. TestFifoScheduler.testResourceOverCommit race condition. Contributed by Eric Badger. 2016-05-16 20:28:04 +00:00
Arun Suresh
f0ac18d001 YARN-2888. Corrective mechanisms for rebalancing NM container queues. (asuresh) 2016-05-13 13:38:36 -07:00
Rohith Sharma K S
b7ac85259c YARN-5068. Expose scheduler queue to application master. (Harish Jaiprakash via rohithsharmaks) 2016-05-12 15:17:49 +05:30
Junping Du
39f2bac38b YARN-5029. RM needs to send update event with YarnApplicationState as Running to ATS/AHS. Contributed by Xuan Gong. 2016-05-11 09:28:35 -07:00
Naganarasimha
2750fb900f YARN-4926. Change nodelabel rest API invalid response status to 400. Contributed by Bibin A Chundatt 2016-05-08 22:49:25 +05:30
Jason Lowe
b2ed6ae731 YARN-4747. AHS error 500 due to NPE when container start event is missing. Contributed by Varun Saxena 2016-05-06 22:59:39 +00:00
Wangda Tan
23248f63aa getApplicationReport call may raise NPE for removed queues. (Jian He via wangda) 2016-05-06 15:30:45 -07:00
Jian He
bb62e05925 YARN-4390. Do surgical preemption based on reserved container in CapacityScheduler. Contributed by Wangda Tan 2016-05-05 12:56:21 -07:00
Jason Lowe
d0da13229c YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla 2016-05-05 14:07:54 +00:00
Rohith Sharma K S
75e0450593 YARN-4947. Test timeout is happening for TestRMWebServicesNodes. Contributed by Bibin A Chundatt 2016-05-04 09:58:26 +05:30
Jason Lowe
ed54f5f1ff YARN-5003. Add container resource to RM audit log. Contributed by Nathan Roberts 2016-05-03 20:03:41 +00:00
Jian He
dd80042c42 YARN-5008. LeveldbRMStateStore database can grow substantially leading to long recovery times. Contributed by Jason Lowe 2016-04-28 21:27:25 -07:00
Karthik Kambatla
185c3d4de1 YARN-4807. MockAM#waitForState sleep duration is too long. (Yufei Gu via kasha) 2016-04-27 09:43:23 -07:00
Jian He
4beff01354 YARN-4983. JVM and UGI metrics disappear after RM transitioned to standby mode 2016-04-26 21:00:17 -07:00
Arun Suresh
341888a0aa YARN-4412. Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers. (asuresh) 2016-04-26 20:12:12 -07:00
Arun Suresh
c282a08f38 YARN-2885. Create AMRMProxy request interceptor and ContainerAllocator to distribute OPPORTUNISTIC containers to appropriate Nodes (asuresh)
(cherry picked from commit 2bf025278a318b0452fdc9ece4427b4c42124e39)
2016-04-24 22:38:33 -07:00
Wangda Tan
7cb3a3da96 YARN-4846. Fix random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers. (Bibin A Chundatt via wangda) 2016-04-22 11:40:32 -07:00
Eric Payne
3dce486d88 YARN-4556. TestFifoScheduler.testResourceOverCommit fails. Contributed by Akihiro Suda 2016-04-21 21:16:47 +00:00
Li Lu
7c6339f66a YARN-4968. A couple of AM retry unit tests need to wait SchedulerApplicationAttempt stopped. (Wangda Tan via gtcarrera9) 2016-04-21 13:25:33 -07:00
Karthik Kambatla
170c4fd4cd YARN-4784. Fairscheduler: defaultQueueSchedulingPolicy should not accept FIFO. (Yufei Gu via kasha) 2016-04-20 23:58:12 -07:00
Wangda Tan
33fd95a99c YARN-4890. Unit test intermittent failure: TestNodeLabelContainerAllocation#testQueueUsedCapacitiesUpdate. (Sunil G via wangda) 2016-04-20 17:37:38 -07:00
Wangda Tan
fdc46bfb37 YARN-4934. Reserved Resource for QueueMetrics needs to be handled correctly in few cases. (Sunil G via wangda) 2016-04-16 22:47:41 -07:00
Jason Lowe
69f3d428d5 YARN-4940. yarn node -list -all failed if RM start with decommissioned node. Contributed by sandflee 2016-04-15 20:36:45 +00:00
Jason Lowe
2a5da97f81 Revert "YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla"
This reverts commit 1cbcd4a491.
2016-04-11 15:51:01 +00:00
Akira Ajisaka
1ff27f9d12 YARN-4630. Remove useless boxing/unboxing code. Contributed by Kousuke Saruta. 2016-04-11 14:55:03 +09:00
Karthik Kambatla
ff95fd547b YARN-4927. TestRMHA#testTransitionedToActiveRefreshFail fails with FairScheduler. (Bibin A Chundatt via kasha) 2016-04-09 10:31:02 -07:00
Wangda Tan
ec06957941 YARN-3215. Respect labels in CapacityScheduler when computing headroom. (Naganarasimha G R via wangda) 2016-04-08 15:33:04 -07:00
Jian He
9cb0c963d2 YARN-4740. AM may not receive the container complete msg when it restarts. Contributed by Jun Gong 2016-04-08 11:20:35 -07:00
Wangda Tan
21eb428448 YARN-4699. Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to change label of a node. (Sunil G via wangda) 2016-04-05 16:24:11 -07:00
Junping Du
6be28bcc46 YARN-4893. Fix some intermittent test failures in TestRMAdminService. Contributed by Brahma Reddy Battula. 2016-04-05 06:57:54 -07:00
Jason Lowe
1cbcd4a491 YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla 2016-04-05 13:40:19 +00:00
Rohith Sharma K S
776b549e2a YARN-4609. RM Nodes list page takes too much time to load. Contributed by Bibin A Chundatt 2016-04-05 14:47:25 +05:30
Rohith Sharma K S
552237d4a3 YARN-4880. Running TestZKRMStateStorePerf with real zookeeper cluster throws NPE. Contributed by Sunil G 2016-04-05 14:26:19 +05:30
naganarasimha
5092c94195 YARN-4746. yarn web services should convert parse failures of appId, appAttemptId and containerId to 400. Contributed by Bibin A Chundatt 2016-04-04 16:25:03 +05:30
Robert Kanter
7a021471c3 YARN-4639. Remove dead code in TestDelegationTokenRenewer added in YARN-3055 (templedf via rkanter) 2016-03-31 13:09:09 -07:00
Jian He
60e4116bf1 YARN-4822. Refactor existing Preemption Policy of CS for easier adding new approach to select preemption candidates. Contributed by Wangda Tan 2016-03-30 12:43:52 -07:00
Wangda Tan
fc055a3cbe YARN-4865. Track Reserved resources in ResourceUsage and QueueCapacities. (Sunil G via wangda) 2016-03-29 17:07:55 -07:00
Jian He
524bc3c33a YARN-998. Keep NM resource updated through dynamic resource config for RM/NM restart. Contributed by Junping Du 2016-03-28 11:12:33 -07:00
Karthik Kambatla
49ff54c860 YARN-4805. Don't go through all schedulers in ParameterizedTestBase. (kasha) 2016-03-26 21:45:13 -07:00
Arun Suresh
00bebb7e58 YARN-4823. Refactor the nested reservation id field in listReservation to simple string field. (subru via asuresh) 2016-03-25 15:54:38 -07:00
Allen Wittenauer
b1394d6307 YARN-4850. test-fair-scheduler.xml isn't valid xml (Yufei Gu via aw) 2016-03-24 08:15:58 -07:00
Junping Du
ca8106d2dd YARN-4785. inconsistent value type of the type field for LeafQueueInfo in response of RM REST API. 2016-03-17 09:04:41 -07:00
Karthik Kambatla
f84af8bd58 YARN-4812. TestFairScheduler#testContinuousScheduling fails intermittently. (kasha) 2016-03-17 05:54:06 -07:00
Wangda Tan
ae14e5d07f YARN-4108. CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request. (Wangda Tan)
(cherry picked from commit 7e8c9beb41)
2016-03-16 17:02:33 -07:00
Wangda Tan
fa7a43529d Revert "CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request. (Wangda Tan)"
This reverts commit 7e8c9beb41.
2016-03-16 17:02:10 -07:00
Wangda Tan
7e8c9beb41 CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request. (Wangda Tan) 2016-03-16 16:59:59 -07:00
Karthik Kambatla
20d389ce61 YARN-4719. Add a helper library to maintain node state and allows common queries. (kasha) 2016-03-14 14:19:05 -07:00
Wangda Tan
0233d4e0ee YARN-4465. SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled. (Bibin A Chundatt via wangda) 2016-03-08 14:27:03 -08:00
Jian He
3c33158d1c YARN-4764. Application submission fails when submitted queue is not available in scheduler xml. Contributed by Bibin A Chundatt 2016-03-08 13:07:57 -08:00
Varun Vasudev
e51a8c1056 YARN-4737. Add CSRF filter support in YARN. Contributed by Jonathan Maron. 2016-03-07 15:26:44 +05:30
Zhihai Xu
e1ccc9622b YARN-4761. NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations on fair scheduler. Contributed by Sangjin Lee 2016-03-06 19:46:09 -08:00
Jian He
5c465df904 YARN-4671. There is no need to acquire CS lock when completing a container. Contributed by Meng Ding 2016-03-01 13:14:12 -08:00
Karthik Kambatla
9dafaaaf0d YARN-4704. TestResourceManager#testResourceAllocation() fails when using FairScheduler. (Yufei Gu via kasha) 2016-02-29 16:10:12 -08:00
Haohui Mai
0fa54d45b1 HADOOP-12813. Migrate TestRPC and related codes to rebase on ProtobufRpcEngine. Contributed by Kai Zheng. 2016-02-29 11:41:00 -08:00
Karthik Kambatla
f9692770a5 YARN-4718. Rename variables in SchedulerNode to reduce ambiguity post YARN-1011. (Inigo Goiri via kasha) 2016-02-28 09:35:59 -08:00
Jason Lowe
6b0f813e89 YARN-4723. NodesListManager$UnknownNodeId ClassCastException. Contributed by Kuhu Shukla 2016-02-26 20:24:50 +00:00
Junping Du
9ed17f181d YARN-3223. Resource update during NM graceful decommission. Contributed by Brook Zhou. 2016-02-23 03:30:26 -08:00
Tsuyoshi Ozawa
0e12114c9c YARN-4648. Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption. Contributed by Kai Sasaki. 2016-02-23 19:50:08 +09:00
Junping Du
3fab88540f YARN-4386. refreshNodesGracefully() should send recommission event to active RMNodes only. Contributed by Kuhu Shukla. 2016-02-22 07:04:19 -08:00
Arun Suresh
23f937e3b7 YARN-2575. Create separate ACLs for Reservation create/update/delete/list ops (Sean Po via asuresh) 2016-02-11 10:47:43 -08:00
Varun Vasudev
fa00d3e205 YARN-4655. Log uncaught exceptions/errors in various thread pools in YARN. Contributed by Sidharta Seethana. 2016-02-11 12:06:42 +05:30
Jian He
d16b17b4d2 YARN-4138. Roll back container resource allocation after resource increase token expires. Contributed by Meng Ding 2016-02-11 10:06:27 +08:00
=
b706cbc1bc YARN-4420. Add REST API for List Reservations (Sean Po via curino) 2016-02-10 10:19:26 -08:00
Arun Suresh
5cf5c41a89 YARN-4360. Improve GreedyReservationAgent to support "early" allocations, and performance improvements (curino via asuresh) 2016-02-10 09:11:15 -08:00
Devaraj K
565af873d5 YARN-4667. RM Admin CLI for refreshNodesResources throws NPE when nothing
is configured. Contributed by Naganarasimha G R.
2016-02-08 15:01:54 +05:30
Varun Vasudev
22a2b2231d YARN-4669. Fix logging statements in resource manager's Application class. Contributed by Sidharta Seethana. 2016-02-04 13:51:25 +05:30
Varun Vasudev
308d63f382 YARN-4307. Display blacklisted nodes for AM container in the RM web UI. Contributed by Naganarasimha G R. 2016-02-04 13:32:54 +05:30
Varun Vasudev
1adb64e09b YARN-4625. Make ApplicationSubmissionContext and ApplicationSubmissionContextInfo more consistent. Contributed by Xuan Gong. 2016-02-03 16:26:28 +05:30
Wangda Tan
9875325d5c YARN-4340. Add list API to reservation system. (Sean Po via wangda) 2016-02-02 10:17:33 +08:00
Jason Lowe
ed55950164 YARN-3102. Decommisioned Nodes not listed in Web UI. Contributed by Kuhu Shukla 2016-02-01 23:15:26 +00:00
Rohith Sharma K S
2673cbaf55 YARN-4615. Fix random test failure in TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt. (Sunil G via rohithsharmaks) 2016-02-01 10:43:56 +05:30
Jason Lowe
772ea7b41b YARN-4428. Redirect RM page to AHS page when AHS turned on and RM page is not available. Contributed by Chang Li 2016-01-29 21:48:54 +00:00
Jian He
f4a57d4a53 YARN-4617. LeafQueue#pendingOrderingPolicy should always use fixed ordering policy instead of using same as active applications ordering policy. Contributed by Rohith Sharma K S 2016-01-29 12:22:23 -08:00
Devaraj K
a277bdc9ed YARN-4411. RMAppAttemptImpl#createApplicationAttemptReport throws
IllegalArgumentException. Contributed by Bibin A Chundatt and yarntime.
2016-01-29 13:51:37 +05:30
Jian He
7f46636495 YARN-4519. Potential deadlock of CapacityScheduler between decrease container and assign containers. Contributed by Meng Ding 2016-01-28 14:51:00 -08:00
Rohith Sharma K S
ef343be82b YARN-4633. Fix random test failure in TestRMRestart#testRMRestartAfterPreemption. (Bibin A Chundatt via rohithsharmaks) 2016-01-28 21:53:45 +05:30
Karthik Kambatla
fb238d7e5d YARN-4462. FairScheduler: Disallow preemption from a queue. (Tao Jie via kasha) 2016-01-27 12:29:06 -08:00
Rohith Sharma K S
c01bee0108 YARN-4573. Fix test failure in TestRMAppTransitions#testAppRunningKill and testAppKilledKilled. (Takashi Ohnishi via rohithsharmaks) 2016-01-27 08:23:02 +05:30
rohithsharmaks
10dc2c0493 YARN-4613. Fix test failure in TestClientRMService#testGetClusterNodes. (Takashi Ohnishi via rohithsharmaks) 2016-01-24 23:36:15 +05:30
rohithsharmaks
99829eb221 YARN-4614. Fix random failure in TestApplicationPriority#testApplicationPriorityAllocationWithChangeInPriority. (Sunil G via rohithsharmaks) 2016-01-23 07:56:57 +05:30
rohithsharmaks
d6258b33a7 YARN-4497. RM might fail to restart when recovering apps whose attempts are missing. (Jun Gong via rohithsharmaks) 2016-01-22 20:27:38 +05:30
Akira Ajisaka
8f58f742ae YARN-4605. Spelling mistake in the help message of "yarn applicationattempt" command. Contributed by Weiwei Yang. 2016-01-22 19:43:06 +09:00
Rohith Sharma K S
e30668106d YARN-4584. RM startup failure when AM attempts greater than max-attempts. (Bibin A Chundatt via rohithsharmaks) 2016-01-22 10:14:46 +05:30
Jason Lowe
468a53b22f YARN-4610. Reservations continue looking for one app causes other apps to starve. Contributed by Jason Lowe 2016-01-21 18:31:29 +00:00
Wangda Tan
5ff5f67332 YARN-4557. Fix improper Queues sorting in PartitionedQueueComparator when accessible-node-labels=*. (Naganarasimha G R via wangda) 2016-01-21 11:21:06 +08:00
Xuan
890a2ebd1a YARN-4559. Make leader elector and zk store share the same curator
client. Contributed by Jian He
2016-01-20 14:48:10 -08:00
Jian He
edc43a9097 YARN-4565. Fix a bug that leads to AM resource limit not hornored when sizeBasedWeight enabled for FairOrderingPolicy. Contributed by Wangda Tan 2016-01-18 21:04:36 -08:00
Wangda Tan
a44ce3f14f YARN-4502. Fix two AM containers get allocated when AM restart. (Vinod Kumar Vavilapalli via wangda) 2016-01-19 09:30:04 +08:00
Wangda Tan
150f5ae034 Revert "YARN-4502. Fix two AM containers get allocated when AM restart. (Vinod Kumar Vavilapalli via wangda)"
This reverts commit 3fe5728563.

Conflicts:
	hadoop-yarn-project/CHANGES.txt
2016-01-19 09:27:36 +08:00
Karthik Kambatla
d40859fab1 YARN-4526. Make SystemClock singleton so AppSchedulingInfo could use it. (kasha) 2016-01-18 10:58:14 +01:00
Wangda Tan
3fe5728563 YARN-4502. Fix two AM containers get allocated when AM restart. (Vinod Kumar Vavilapalli via wangda)
(cherry picked from commit 805a9ed85e)
2016-01-18 17:06:05 +08:00
Wangda Tan
adf260a728 Revert "YARN-4502. Fix two AM containers get allocated when AM restart. (Vinod Kumar Vavilapalli via wangda)"
This reverts commit 805a9ed85e.
2016-01-18 16:50:45 +08:00
Wangda Tan
b08ecf5c75 YARN-4304. AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics. (Sunil G via wangda) 2016-01-18 11:11:32 +08:00
Wangda Tan
805a9ed85e YARN-4502. Fix two AM containers get allocated when AM restart. (Vinod Kumar Vavilapalli via wangda) 2016-01-18 11:04:25 +08:00
Wangda Tan
9523648d57 YARN-4538. QueueMetrics pending cores and memory metrics wrong. (Bibin A Chundatt via wangda) 2016-01-18 10:57:14 +08:00
rohithsharmaks
f7736f464f YARN-4389. Allow application to enable or disable am blacklisting. (Sunil G via rohithsharmaks) 2016-01-15 21:38:26 +05:30
Karthik Kambatla
9d04f26d4c YARN-3446. FairScheduler headroom calculation should exclude nodes in the blacklist. (Zhihai Xu via kasha) 2016-01-14 08:33:23 -08:00
Wangda Tan
c0537bcd2c YARN-4571. Make app id/name available to the yarn authorizer provider for better auditing. (Jian He via wangda) 2016-01-13 13:18:31 +08:00
Wangda Tan
9e792da014 YARN-4582. Label-related invalid resource request exception should be able to properly handled by application. (Bibin A Chundatt via wangda) 2016-01-12 12:53:31 +08:00
Jian He
b8942be888 YARN-4537. Pull out priority comparison from fifocomparator and use compound comparator for FifoOrdering policy. Contributed by Rohith Sharma K S 2016-01-11 16:44:28 -08:00
Jian He
109e528ef5 YARN-4479. Change CS LeafQueue pendingOrderingPolicy to hornor recovered apps. Contributed by Rohith Sharma K S 2016-01-08 15:51:10 -08:00
Xuan
89022f8d4b YARN-4438. Implement RM leader election with curator. Contributed by Jian He 2016-01-07 14:33:06 -08:00
Junping Du
c1462a67ff YARN-4546. ResourceManager crash due to scheduling opportunity overflow. Contributed by Jason Lowe. 2016-01-06 05:49:24 -08:00
Wangda Tan
8310b2e9ff YARN-4522. Queue acl can be checked at app submission. (Jian He via wangda) 2015-12-30 15:30:12 -08:00
Jian He
5273413411 YARN-3480. Remove attempts that are beyond max-attempt limit from state store. Contributed by Jun Gong 2015-12-29 15:58:39 -08:00
Wangda Tan
561abb9fee YARN-4315. NaN in Queue percentage for cluster apps page. (Bibin A Chundatt via wangda) 2015-12-29 13:28:00 -08:00
Jian He
d0a22bae9b YARN-4417. Make RM and Timeline-server REST APIs more consistent. Contributed by Wangda Tan 2015-12-28 15:52:45 -08:00
Karthik Kambatla
0af492b4bd YARN-4156. TestAMRestart#testAMBlacklistPreventsRestartOnSameNode assumes CapacityScheduler. (Anubhav Dhoot via kasha) 2015-12-23 17:52:36 -08:00
Arun Suresh
e88422df45 YARN-4477. FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling. (Tao Jie via asuresh) 2015-12-21 22:41:09 -08:00
Wangda Tan
bc038b382c YARN-4454. NM to nodelabel mapping going wrong after RM restart. (Bibin A Chundatt via wangda) 2015-12-21 11:30:13 -08:00
Jian He
85c2466048 YARN-4164. Changed updateApplicationPriority API to return the updated application priority. Contributed by Rohith Sharma K S 2015-12-18 14:13:48 -08:00
Junping Du
1de56b0448 YARN-3226. UI changes for decommissioning node. Contributed by Sunil G. 2015-12-17 15:20:17 -08:00
Wangda Tan
7faa406f27 YARN-4225. Add preemption status to yarn queue -status for capacity scheduler. (Eric Payne via wangda) 2015-12-16 13:19:40 -08:00
Wangda Tan
79c41b1d83 YARN-4293. ResourceUtilization should be a part of yarn node CLI. (Sunil G via wangda) 2015-12-16 13:18:19 -08:00
Junping Du
50bd067e1d YARN-4452. NPE when submit Unmanaged application. Contributed by Naganarasimha G R. 2015-12-16 10:57:39 -08:00
Zhihai Xu
2aaed10327 YARN-4440. FSAppAttempt#getAllowedLocalityLevelByTime should init the lastScheduler time. Contributed by Lin Yiqun 2015-12-15 00:17:21 -08:00
Wangda Tan
07b0fb996a YARN-4418. AM Resource Limit per partition can be updated to ResourceUsage as well. (Sunil G via wangda) 2015-12-14 11:24:30 -08:00
Wangda Tan
6cb0af3c39 YARN-3946. Update exact reason as to why a submitted app is in ACCEPTED state to app's diagnostic message. (Naganarasimha G R via wangda) 2015-12-14 10:52:46 -08:00
=
c25a635459 YARN-4248. REST API for submit/update/delete Reservations. (curino) 2015-12-07 13:33:28 -08:00
Xuan
4546c7582b YARN-4392. ApplicationCreatedEvent event time resets after RM
restart/failover. Contributed by Naganarasimha G R and Xuan Gong
2015-12-07 12:24:55 -08:00
Arun Suresh
742632e346 YARN-4358. Reservation System: Improve relationship between SharingPolicy and ReservationAgent. (Carlo Curino via asuresh) 2015-12-05 21:26:16 -08:00
Jian He
755dda8dd8 YARN-4405. Support node label store in non-appendable file system. Contributed by Wangda Tan 2015-12-03 17:45:31 -08:00
Wangda Tan
a2c3bfc8c1 YARN-4292. ResourceUtilization should be a part of NodeInfo REST API. (Sunil G via wangda) 2015-12-03 14:28:32 -08:00
Jian He
9f77ccad73 YARN-3840. Resource Manager web ui issue when sorting application by id (with application having id > 9999). Contributed by Mohammad Shahid Khan and Varun Saxena 2015-12-03 12:48:50 -08:00
Karthik Kambatla
52948bb20b YARN-3980. Plumb resource-utilization info in node heartbeat through to the scheduler. (Inigo Goiri via kasha) 2015-11-24 13:47:17 +05:30
Jian He
8676a118a1 YARN-4349. Support CallerContext in YARN. Contributed by Wangda Tan 2015-11-23 17:19:48 -08:00
Jason Lowe
d36b6e045f YARN-4344. NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations. Contributed by Varun Vasudev 2015-11-23 20:30:26 +00:00
Arun Suresh
da1016365a YARN-3454. Add efficient merge operation to RLESparseResourceAllocation (Carlo Curino via asuresh) 2015-11-21 09:59:41 -08:00
Wangda Tan
2346fa3141 YARN-3769. Consider user limit when calculating total pending resource for preemption policy in Capacity Scheduler. (Eric Payne via wangda) 2015-11-20 15:55:50 -08:00
Arun Suresh
6a61928fb7 YARN-4184. Remove update reservation state api from state store as its not used by ReservationSystem (Sean Po via asuresh) 2015-11-17 15:50:34 -08:00
Jian He
fcd7888029 Revert "YARN-3840. Resource Manager web ui issue when sorting application by id (with application having id > 9999) Contributed by Mohammad Shahid Khan"
This reverts commit 8fbea531d7.

Conflicts:
	hadoop-yarn-project/CHANGES.txt
2015-11-16 20:18:44 -08:00
Wangda Tan
7f55a18071 YARN-4347. Resource manager fails with Null pointer exception. (Jian He via wangda) 2015-11-12 11:23:40 -08:00
Wangda Tan
796638d9bc YARN-4287. Capacity Scheduler: Rack Locality improvement (Nathan Roberts via wangda) 2015-11-12 11:09:37 -08:00
Vinod Kumar Vavilapalli (I am also known as @tshooter.)
6351d3fa63 YARN-4183. Reverting the patch to fix behaviour change.
Revert "YARN-4183. Enabling generic application history forces every job to get a timeline service delegation token (jeagles)"

This reverts commit c293c58954.
2015-11-11 10:40:43 -08:00
Jian He
8fbea531d7 YARN-3840. Resource Manager web ui issue when sorting application by id (with application having id > 9999) Contributed by Mohammad Shahid Khan 2015-11-09 10:43:45 -08:00
Jian He
e5b1733e04 YARN-4127. RM fail with noAuth error if switched from failover to non-failover. Contributed by Varun Saxena 2015-10-29 15:42:57 -07:00
Jonathan Eagles
c293c58954 YARN-4183. Enabling generic application history forces every job to get a timeline service delegation token (jeagles) 2015-10-29 16:41:10 -05:00
Wangda Tan
56e4f6237a YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil G via wangda) 2015-10-26 16:44:39 -07:00
Wangda Tan
6f606214e7 YARN-4169. Fix racing condition of TestNodeStatusUpdaterForLabels. (Naganarasimha G R via wangda) 2015-10-26 16:36:34 -07:00
Wangda Tan
3cc73773eb YARN-4285. Display resource usage as percentage of queue and cluster in the RM UI (Varun Vasudev via wangda) 2015-10-26 13:07:39 -07:00
Jason Lowe
33a03af3c3 YARN-4284. condition for AM blacklisting is too narrow. Contributed by Sangjin Lee 2015-10-26 19:53:03 +00:00
Arun Suresh
ab8eb8770c YARN-3738. Add support for recovery of reserved apps running under dynamic queues (subru via asuresh) 2015-10-24 22:53:10 -07:00
Jason Lowe
d3a34a4f38 YARN-4041. Slow delegation token renewal can severely prolong RM recovery. Contributed by Sunil G 2015-10-23 20:57:01 +00:00
Ming Ma
934d96a334 YARN-2913. Fair scheduler should have ability to set MaxResourceDefault for each queue. (Siqi Li via mingma) 2015-10-23 08:36:33 -07:00
Zhihai Xu
960201b79b YARN-4256. YARN fair scheduler vcores with decimal values. Contributed by Jun Gong 2015-10-22 12:28:03 -07:00
Anubhav Dhoot
2798723a54 YARN-3739. Add reservation system recovery to RM recovery process. Contributed by Subru Krishnan. 2015-10-22 06:51:00 -07:00
Arun Suresh
506d1b1dbc YARN-3985. Make ReservationSystem persist state using RMStateStore reservation APIs. (adhoot via asuresh) 2015-10-20 16:46:14 -07:00
Arun Suresh
7e2837f830 YARN-4270. Limit application resource reservation on nodes for non-node/rack specific requests (asuresh) 2015-10-19 20:00:38 -07:00
Jian He
f9da5cdb2b YARN-4170. AM need to be notified with priority in AllocateResponse. Contributed by Sunil G 2015-10-16 15:26:27 -07:00
Wangda Tan
4337b263aa YARN-4162. CapacityScheduler: Add resource usage by partition and queue capacity by partition to REST API. (Naganarasimha G R via wangda) 2015-10-16 15:06:28 -07:00
Jian He
cf23f2c2b5 YARN-4000. RM crashes with NPE if leaf queue becomes parent queue during restart. Contributed by Varun Saxena 2015-10-15 17:12:46 -07:00
Jian He
9849c8b386 YARN-4230. RM crashes with NPE when increasing container resource if there is no headroom left. Contributed by Meng Ding 2015-10-12 11:51:33 -07:00
Zhihai Xu
049c6e8dc0 YARN-4201. AMBlacklist does not work for minicluster. Contributed by Jun Gong. 2015-10-12 00:14:25 -07:00
Devaraj K
db93047881 YARN-3964. Support NodeLabelsProvider at Resource Manager side.
Contributed by Dian Fu.
2015-10-11 11:21:29 +05:30
Wangda Tan
def374e666 YARN-4140. RM container allocation delayed incase of app submitted to Nodelabel partition. (Bibin A Chundatt via wangda) 2015-10-09 16:38:59 -07:00
Jason Lowe
a0bca2b5ad YARN-261. Ability to fail AM attempts. Contributed by Andrey Klochkov and Rohith Sharma K S 2015-10-09 14:17:38 +00:00
Rohith Sharma K S
8f195387a4 YARN-4235. FairScheduler PrimaryGroup does not handle empty groups returned for a user. (Anubhav Dhoot via rohithsharmaks) 2015-10-09 10:09:26 +05:30
Rohith Sharma K S
9156fc60c6 YARN-4209. RMStateStore FENCED state doesn’t work due to updateFencedState called by stateMachine.doTransition. (Zhihai Xu via rohithsharmaks) 2015-10-07 09:34:59 +05:30
Wangda Tan
29a582ada0 YARN-4215. RMNodeLabels Manager Need to verify and replace node labels for the only modified Node Label Mappings in the request. (Naganarasimha G R via wangda) 2015-10-06 11:56:04 -07:00
Xuan
8f08532bde YARN-1897. CLI and core support for signal container functionality. Contributed by Ming Ma 2015-10-02 18:50:47 -07:00
Anubhav Dhoot
9735afe967 YARN-4180. AMLauncher does not retry on failures when talking to NM. (adhoot) 2015-09-28 16:13:41 -07:00
Jason Lowe
9f53a95ff6 YARN-4141. Runtime Application Priority change should not throw exception for applications at finishing states. Contributed by Sunil G 2015-09-28 22:55:20 +00:00
Anubhav Dhoot
fb2e525c07 YARN-4204. ConcurrentModificationException in FairSchedulerQueueInfo. (adhoot) 2015-09-28 09:05:45 -07:00
Rohith Sharma K S
a9aafad12b YARN-4044. Running applications information changes such as movequeue is not published to TimeLine server. (Sunil G via rohithsharmaks) 2015-09-24 12:13:22 +05:30
Jian He
89cab1ba5f YARN-1651. CapacityScheduler side changes to support container resize. Contributed by Wangda Tan 2015-09-23 13:29:38 -07:00
Jian He
5f5a968d65 YARN-3867. ContainerImpl changes to support container resizing. Contributed by Meng Ding 2015-09-23 13:29:37 -07:00
Jian He
83a18add10 YARN-1449. AM-NM protocol changes to support container resizing. Contributed by Meng Ding & Wangda Tan) 2015-09-23 13:29:36 -07:00
Arun Suresh
94dec5a916 YARN-3920. FairScheduler container reservation on a node should be configurable to limit it to large containers (adhoot via asuresh) 2015-09-18 14:02:55 -07:00
Wangda Tan
9bc913a35c YARN-3212. RMNode State Transition Update with DECOMMISSIONING state. (Junping Du via wangda) 2015-09-18 10:04:17 -07:00
Rohith Sharma K S
723c31d45b YARN-4135. Improve the assertion message in MockRM while failing after waiting for the state.(Nijel S F via rohithsharmaks) 2015-09-18 08:44:10 +05:30
Jian He
6c6e734f0b YARN-4034. Render cluster Max Priority in scheduler metrics in RM web UI. Contributed by Rohith Sharma K S 2015-09-17 14:55:50 +08:00
Jian He
452079af8b YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. Contributed by Naganarasimha G R 2015-09-16 14:59:20 +08:00
Wangda Tan
ae5308fe1d YARN-3717. Expose app/am/queue's node-label-expression to RM web UI / CLI / REST-API. (Naganarasimha G R via wangda) 2015-09-15 11:40:50 -07:00
Junping Du
73e3a49eb0 YARN-313. Add Admin API for supporting node resource configuration in command line. (Contributed by Inigo Goiri, Kenji Kikushima and Junping Du) 2015-09-15 07:56:47 -07:00
Jian He
5468baa80a YARN-3635. Refactored current queue mapping implementation in CapacityScheduler to use a generic PlacementManager framework. Contributed by Wangda Tan 2015-09-15 15:39:20 +08:00
Jian He
e1b1d7e4ae YARN-4126. RM should not issue delegation tokens in unsecure mode. Contributed by Bibin A Chundatt 2015-09-14 14:09:19 +08:00
Karthik Kambatla
332b520a48 YARN-3697. FairScheduler: ContinuousSchedulingThread can fail to shutdown. (Zhihai Xu via kasha) 2015-09-13 18:07:43 -07:00
Karthik Kambatla
81df7b586a YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) 2015-09-13 17:03:15 -07:00
Robert Kanter
ea4bb2749f YARN-4145. Make RMHATestBase abstract so its not run when running all tests under that namespace (adhoot via rkanter) 2015-09-11 11:46:10 -07:00
Wangda Tan
bcc85e3bab YARN-4024. YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat. (Hong Zhiguo via wangda) 2015-09-04 15:13:53 -07:00
Jason Lowe
6eaca2e363 YARN-4105. Capacity Scheduler headroom for DRF is wrong. Contributed by Chang Li 2015-09-04 15:30:53 +00:00
Varun Vasudev
40d222e862 YARN-4103. RM WebServices missing scheme for appattempts logLinks. Contributed by Jonathan Eagles. 2015-09-04 14:31:51 +05:30
Varun Vasudev
b469ac531a YARN-3970. Add REST api support for Application Priority. Contributed by Naganarasimha G R. 2015-09-03 16:40:10 +05:30
Jian He
09c64ba1ba YARN-4101. RM should print alert messages if Zookeeper and Resourcemanager gets connection issue. Contributed by Xuan Gong 2015-09-02 17:45:23 -07:00
Rohith Sharma K S
7d6687fe76 YARN-3893. Both RM in active state when Admin#transitionToActive failure from refeshAll() (Bibin A Chundatt via rohithsharmaks) 2015-09-02 15:22:48 +05:30
Varun Vasudev
bf669b6d9f YARN-4082. Container shouldn't be killed when node's label updated. Contributed by Wangda Tan. 2015-09-01 14:19:11 +05:30
Junping Du
beb65c9465 YARN-1556. NPE getting application report with a null appId. Contributed by Weiwei Yang. 2015-08-28 05:57:34 -07:00
Jian He
a9c8ea71aa YARN-3250. Support admin cli interface in for Application Priority. Contributed by Rohith Sharma K S 2015-08-27 13:25:53 -07:00
Jian He
57c7ae1aff YARN-4014. Support user cli interface in for Application Priority. Contributed by Rohith Sharma K S 2015-08-24 20:36:44 -07:00
Rohith Sharma K S
feaf034994 YARN-3896. RMNode transitioned from RUNNING to REBOOTED because its response id has not been reset synchronously. (Jun Gong via rohithsharmaks) 2015-08-24 11:25:07 +05:30
Xuan
37e1c3d82a YARN-221. NM should provide a way for AM to tell it not to aggregate
logs. Contributed by Ming Ma
2015-08-22 16:25:24 -07:00
Rohith Sharma K S
22de7c1dca YARN-3986. getTransferredContainers in AbstractYarnScheduler should be present in YarnScheduler interface 2015-08-21 10:51:11 +05:30
Zhihai Xu
3a76a010b8 YARN-3857: Memory leak in ResourceManager with SIMPLE mode. Contributed by mujunchao. 2015-08-18 10:36:40 -07:00
Jian He
e5003be907 YARN-4026. Refactored ContainerAllocator to accept a list of priorites rather than a single priority. Contributed by Wangda Tan 2015-08-12 15:07:50 -07:00
rohithsharmaks
1c12adb71f YARN-4023. Publish Application Priority to TimelineServer. (Sunil G via rohithsharmaks) 2015-08-12 14:45:41 +05:30
Xuan
3ae716fa69 YARN-3999. RM hangs on draing events. Contributed by Jian He 2015-08-11 18:25:11 -07:00
Jian He
fa1d84ae27 YARN-3887. Support changing Application priority during runtime. Contributed by Sunil G 2015-08-10 20:51:54 -07:00
Wangda Tan
cf9d3c9256 YARN-3873. PendingApplications in LeafQueue should also use OrderingPolicy. (Sunil G via wangda) 2015-08-10 14:54:55 -07:00
Rohith Sharma K S
b6265d39c5 YARN-3948. Display Application Priority in RM Web UI.(Sunil G via rohithsharmaks) 2015-08-07 10:43:41 +05:30
Carlo Curino
8572a5a14b YARN-3974. Refactor the reservation system test cases to use parameterized base test. (subru via curino) 2015-08-02 01:55:31 -07:00
Arun Suresh
154c9d2e42 YARN-3961. Expose pending, running and reserved containers of a queue in REST api and yarn top (adhoot via asuresh) 2015-08-05 23:14:14 -07:00
rohithsharmaks
df9e7280db YARN-3992. TestApplicationPriority.testApplicationPriorityAllocation fails intermittently. (Contributed by Sunil G) 2015-08-06 10:43:37 +05:30
Jian He
ba2313d614 YARN-3983. Refactored CapacityScheduleri#FiCaSchedulerApp to easier extend container allocation logic. Contributed by Wangda Tan 2015-08-05 13:47:40 -07:00
Arun Suresh
f271d37735 YARN-3736. Add RMStateStore apis to store and load accepted reservations for failover (adhoot via asuresh) 2015-08-05 12:57:12 -07:00
Xuan
0306d902f5 YARN-3543. ApplicationReport should be able to tell whether the
Application is AM managed or not. Contributed by Rohith Sharma K S
2015-08-03 15:46:00 -07:00
Jonathan Eagles
3cd02b9522 YARN-3978. Configurably turn off the saving of container info in Generic AHS (Eric Payne via jeagles) 2015-08-03 10:38:05 -05:00
Jason Lowe
32e490b6c0 YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt 2015-07-31 17:37:24 +00:00
Zhihai Xu
ab80e27703 YARN-433. When RM is catching up with node updates then it should not expire acquired containers. Contributed by Xuan Gong 2015-07-30 21:57:11 -07:00
Wangda Tan
91b42e7d6e YARN-3971. Skip RMNodeLabelsManager#checkRemoveFromClusterNodeLabelsOfQueue on nodelabel recovery. (Bibin A Chundatt via wangda) 2015-07-30 10:00:31 -07:00
ccurino
156f24ead0 YARN-3656. LowCost: A Cost-Based Placement Agent for YARN Reservations. (Jonathan Yaniv and Ishai Menache via curino) 2015-07-25 07:39:47 -07:00
Jian He
83fe34ac08 YARN-3026. Move application-specific container allocation logic from LeafQueue to FiCaSchedulerApp. Contributed by Wangda Tan 2015-07-24 14:00:25 -07:00
Karthik Kambatla
d19d187753 YARN-3957. FairScheduler NPE In FairSchedulerQueueInfo causing scheduler page to return 500. (Anubhav Dhoot via kasha) 2015-07-24 11:44:37 -07:00
carlo curino
0fcb4a8cf2 YARN-3969. Allow jobs to be submitted to reservation that is active but does not have any allocations. (subru via curino) 2015-07-23 19:33:59 -07:00
Robert Kanter
1d3026e7b3 YARN-3900. Protobuf layout of yarn_security_token causes errors in other protos that include it (adhoot via rkanter) 2015-07-23 14:46:54 -07:00
Wangda Tan
3bba180051 YARN-3941. Proportional Preemption policy should try to avoid sending duplicate PREEMPT_CONTAINER event to scheduler. (Sunil G via wangda) 2015-07-23 10:07:57 -07:00
Wangda Tan
76ec26de80 YARN-3932. SchedulerApplicationAttempt#getResourceUsageReport and UserInfo should based on total-used-resources. (Bibin A Chundatt via wangda) 2015-07-22 11:54:02 -07:00
Wangda Tan
c39ca541f4 YARN-2003. Support for Application priority : Changes in RM and Capacity Scheduler. (Sunil G via wangda) 2015-07-21 09:57:23 -07:00
Arun Suresh
9b272ccae7 YARN-3535. Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED (rohithsharma and peng.zhang via asuresh) 2015-07-17 04:31:34 -07:00
Wangda Tan
3540d5fe4b YARN-3885. ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level. (Ajith S via wangda) 2015-07-16 16:13:32 -07:00
Arun Suresh
ac94ba3e18 YARN-3453. Ensure preemption logic in FairScheduler uses DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. (asuresh) 2015-07-14 00:23:55 -07:00
Wangda Tan
5ed1fead6b YARN-3894. RM startup should fail for wrong CS xml NodeLabel capacity configuration. (Bibin A Chundatt via wangda) 2015-07-12 21:52:11 -07:00
Wangda Tan
1df39c1efc YARN-3849. Too much of preemption activity causing continuous killing of containers across queues. (Sunil G via wangda) 2015-07-11 10:26:46 -07:00
Zhijie Shen
1ea36299a4 YARN-3116. RM notifies NM whether a container is an AM container or normal task container. Contributed by Giovanni Matteo Fumarola. 2015-07-10 18:58:10 -07:00
Ming Ma
08244264c0 YARN-3445. Cache runningApps in RMNode for getting running apps on given NodeId. (Junping Du via mingma) 2015-07-10 08:30:10 -07:00
carlo curino
0e602fa3a1 YARN-3800. Reduce storage footprint for ReservationAllocation. Contributed by Anubhav Dhoot. 2015-07-09 16:51:59 -07:00
Wangda Tan
0e4b06690f YARN-3508. Prevent processing preemption events on the main RM dispatcher. (Varun Saxena via wangda) 2015-07-01 17:32:22 -07:00
Devaraj K
80a68d6056 YARN-3830. AbstractYarnScheduler.createReleaseCache may try to clean a
null attempt. Contributed by nijel.
2015-07-01 19:03:44 +05:30
Xuan
fe6c1bd73a YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails
in trunk. Contributed by zhihai xu
2015-06-26 19:43:59 -07:00
Xuan
5b5bb8dcdc YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes
after NM is reconnected. Contributed by zhihai xu
2015-06-18 14:37:49 -07:00
Xuan
a826d432f9 YARN-3804. Both RM are on standBy state when kerberos user not in yarn.admin.acl. Contributed by Varun Saxena 2015-06-17 16:23:27 -07:00
Devaraj K
d8dcfa98e3 YARN-3794. TestRMEmbeddedElector fails because of ambiguous LOG reference.
Contributed by Chengbing Liu.
2015-06-12 13:42:49 +05:30
Xuan
5583f88bf7 YARN-3785. Support for Resource as an argument during submitApp call in
MockRM test class. Contributed by Sunil G
2015-06-10 21:40:48 -07:00
Jian He
960b8f19ca YARN-2716. Refactor ZKRMStateStore retry code with Apache Curator. Contributed by Karthik Kambatla 2015-06-08 14:50:58 -07:00
Karthik Kambatla
bd69ea408f YARN-3655. FairScheduler: potential livelock due to maxAMShare limitation and container reservation. (Zhihai Xu via kasha) 2015-06-07 11:37:52 -07:00
Xuan
3e000a919f YARN-1462. AHS API and other AHS changes to handle tags for completed MR jobs. Contributed by Xuan Gong 2015-06-05 12:48:52 -07:00
Karthik Kambatla
75885852cc YARN-3259. FairScheduler: Trigger fairShare updates on node events. (Anubhav Dhoot via kasha) 2015-06-05 09:39:41 -07:00
Jian He
6ad4e59cfc YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan 2015-06-04 10:52:59 -07:00
Wangda Tan
ebd797c48f YARN-3733. Fix DominantRC#compare() does not work as expected if cluster resource is empty. (Rohith Sharmaks via wangda) 2015-06-04 10:22:57 -07:00
Junping Du
d7e7f6aa03 YARN-41. The RM should handle the graceful shutdown of the NM. Contributed by Devaraj K. 2015-06-04 04:59:27 -07:00
Xuan
5766a04428 YARN-3749. We should make a copy of configuration when init
MiniYARNCluster with multiple RMs. Contributed by Chun Chen
2015-06-03 17:20:15 -07:00
Zhijie Shen
bc85959edd Revert "YARN-1462. Made RM write application tags to timeline server and exposed them to users via generic history web UI and REST API. Contributed by Xuan Gong."
This reverts commit 4a9ec1a824.
2015-06-03 14:15:56 -07:00
Zhijie Shen
4a9ec1a824 YARN-1462. Made RM write application tags to timeline server and exposed them to users via generic history web UI and REST API. Contributed by Xuan Gong. 2015-05-30 21:05:36 -07:00
Vinod Kumar Vavilapalli
9acd24fec4 Fixed more FilesSystemRMStateStore issues. Contributed by Vinod Kumar Vavilapalli. 2015-05-28 15:25:56 -07:00
Allen Wittenauer
d6e3164d4a YARN-2355. MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container (Darrell Taylor via aw) 2015-05-27 16:40:56 -07:00
Wangda Tan
ec0a852a37 YARN-3647. RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object. (Sunil G via wangda) 2015-05-27 12:51:20 -07:00
Wangda Tan
cdbd66be11 YARN-3686. CapacityScheduler should trim default_node_label_expression. (Sunil G via wangda) 2015-05-26 15:58:47 -07:00
Jian He
10732d515f YARN-3632. Ordering policy should be allowed to reorder an application when demand changes. Contributed by Craig Welch 2015-05-26 12:00:51 -07:00
Tsuyoshi Ozawa
9a3d617b63 YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. 2015-05-26 19:07:40 +09:00
Karthik Kambatla
4513761869 YARN-3675. FairScheduler: RM quits when node removal races with continuous-scheduling on the same node. (Anubhav Dhoot via kasha) 2015-05-21 13:44:42 -07:00
Jian He
8966d42179 YARN-3609. Load node labels from storage inside RM serviceStart. Contributed by Wangda Tan 2015-05-20 16:30:07 -07:00
Wangda Tan
563eb1ad2a YARN-3583. Support of NodeLabel object instead of plain String in YarnClient side. (Sunil G via wangda) 2015-05-19 16:54:38 -07:00
Wangda Tan
b37da52a1c YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String. (Naganarasimha G R via wangda) 2015-05-19 16:34:17 -07:00
Jason Lowe
f7e051c431 YARN-2421. RM still allocates containers to an app in the FINISHING state. Contributed by Chang Li 2015-05-15 22:09:30 +00:00
Vinod Kumar Vavilapalli
9a2a9553ee Fixing HDFS state-store. Contributed by Arun Suresh. 2015-05-14 16:13:51 -07:00
Junping Du
15ccd967ee YARN-3505. Node's Log Aggregation Report with SUCCEED should not cached in RMApps. Contributed by Xuan Gong. 2015-05-14 10:58:12 -07:00
Wangda Tan
0e85044e26 YARN-3362. Add node label usage in RM CapacityScheduler web UI. (Naganarasimha G R via wangda) 2015-05-13 17:00:36 -07:00
Wangda Tan
7f19e7a254 YARN-3521. Support return structured NodeLabel objects in REST API (Sunil G via wangda) 2015-05-13 13:43:17 -07:00
Wangda Tan
341a476812 YARN-2921. Fix MockRM/MockAM#waitForState sleep too long. (Tsuyoshi Ozawa via wangda) 2015-05-13 13:06:07 -07:00
Karthik Kambatla
a60f78e98e YARN-3395. FairScheduler: Trim whitespaces when using username for queuename. (Zhihai Xu via kasha) 2015-05-09 15:41:20 -07:00
Karthik Kambatla
70fb37cd79 YARN-1287. Consolidate MockClocks. (Sebastian Wong and Anubhav Dhoot via kasha) 2015-05-09 14:34:54 -07:00
Karthik Kambatla
2fb44c8aaf YARN-3271. FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler to TestAppRunnability. (nijel via kasha) 2015-05-08 16:39:10 -07:00
Jian He
f489a4ec96 YARN-2918. RM should not fail on startup if queue's configured labels do not exist in cluster-node-labels. Contributed by Wangda Tan 2015-05-07 17:35:41 -07:00
Akira Ajisaka
918af8efff YARN-3577. Misspelling of threshold in log4j.properties for tests. Contributed by Brahma Reddy Battula. 2015-05-07 13:33:03 +09:00
Vinod Kumar Vavilapalli
4c7b9b6abe YARN-3385. Fixed a race-condition in ResourceManager's ZooKeeper based state-store to avoid crashing on duplicate deletes. Contributed by Zhihai Xu. 2015-05-06 17:51:17 -07:00
Junping Du
31b627b2a8 YARN-3580. [JDK8] TestClientRMService.testGetLabelsToNodes fails. Contributed by Robert Kanter. 2015-05-06 16:51:05 -07:00
Jian He
e4c3b52c89 YARN-3343. Increased TestCapacitySchedulerNodeLabelUpdate#testNodeUpdate timeout. Contributed by Rohith Sharmaks 2015-05-05 11:33:47 -07:00
Jian He
d701acc9c6 YARN-2725. Added test cases of retrying creating znode in ZKRMStateStore. Contributed by Tsuyoshi Ozawa 2015-05-04 16:13:29 -07:00
Gera Shegalov
f8204e241d YARN-2893. AMLauncher: sporadic job failures due to EOFException in readTokenStorageStream. (Zhihai Xu via gera) 2015-05-01 18:18:55 -07:00
Wangda Tan
e2e8f77118 YARN-3564. Fix TestContainerAllocation.testAMContainerAllocationWhenDNSUnavailable fails randomly. (Jian He via wangda) 2015-04-30 11:03:19 -07:00
Jian He
4c1af156ae YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot 2015-04-29 14:50:01 -07:00
tgraves
2e215484bd YARN-3517. RM web ui for dumping scheduler logs should be for admins only (Varun Vasudev via tgraves) 2015-04-29 21:25:42 +00:00
Karthik Kambatla
8f82970e0c YARN-3485. FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies. (kasha) 2015-04-28 21:00:35 -07:00
Wangda Tan
db1b674b50 YARN-2740. Fix NodeLabelsManager to properly handle node label modifications when distributed node label configuration enabled. (Naganarasimha G R via wangda) 2015-04-27 16:24:38 -07:00
Jian He
d497f6ea2b YARN-2498. Respect labels in preemption policy of capacity scheduler for inter-queue preemption. Contributed by Wangda Tan 2015-04-24 17:03:13 -07:00
Jian He
d03dcb9635 YARN-3387. Previous AM's container completed status couldn't pass to current AM if AM and RM restarted during the same time. Contributed by Sandflee 2015-04-24 12:13:29 -07:00
Vinod Kumar Vavilapalli
f5fe35e297 YARN-3413. Changed Nodelabel attributes (like exclusivity) to be settable only via addToClusterNodeLabels but not changeable at runtime. (Wangda Tan via vinodkv) 2015-04-23 11:19:55 -07:00
Wangda Tan
395205444e YARN-3319. Implement a FairOrderingPolicy. (Craig Welch via wangda) 2015-04-23 10:47:15 -07:00
tgraves
189a63a719 YARN-3434. Interaction between reservations and userlimit can result in significant ULF violation 2015-04-23 14:39:25 +00:00
Junping Du
fad9d7e85b New parameter of CLI for decommissioning node gracefully in RMAdmin CLI. Contributed by Devaraj K 2015-04-22 10:07:20 -07:00
Jian He
bdd90110e6 YARN-3494. Expose AM resource limit and usage in CS QueueMetrics. Contributed by Rohith Sharmaks 2015-04-21 20:06:20 -07:00
Wangda Tan
e71d0d87d9 YARN-3410. YARN admin should be able to remove individual application records from RMStateStore. (Rohith Sharmaks via wangda) 2015-04-21 17:51:22 -07:00
Wangda Tan
44872b76fc YARN-3463. Integrate OrderingPolicy Framework with CapacityScheduler. (Craig Welch via wangda) 2015-04-20 17:12:32 -07:00
Wangda Tan
f65eeb412d YARN-3493. RM fails to come up with error "Failed to load/recover state" when mem settings are changed. (Jian He via wangda) 2015-04-17 17:11:22 -07:00
Jian He
d573f09fb9 YARN-2696. Queue sorting in CapacityScheduler should consider node label. Contributed by Wangda Tan 2015-04-17 13:36:59 -07:00
Junping Du
1db355a875 YARN-1402. Update related Web UI and CLI with exposing client API to check log aggregation status. Contributed by Xuan Gong. 2015-04-17 13:18:59 -07:00
Jian He
bb6dde68f1 YARN-3021. YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp. Contributed by Yongjun Zhang 2015-04-16 19:43:37 -07:00
Jian He
1b89a3e173 YARN-3354. Add node label expression in ContainerTokenIdentifier to support RM recovery. Contributed by Wangda Tan 2015-04-15 13:57:06 -07:00
Tsuyoshi Ozawa
e48cedc663 YARN-3326. Support RESTful API for getLabelsToNodes. Contributed by Naganarasimha G R. 2015-04-15 14:03:55 -05:00
Wangda Tan
5004e75332 YARN-3318. Create Initial OrderingPolicy Framework and FifoOrderingPolicy. (Craig Welch via wangda) 2015-04-15 09:56:32 -07:00
Jian He
0fefda645b YARN-3361. CapacityScheduler side changes to support non-exclusive node labels. Contributed by Wangda Tan 2015-04-14 11:45:58 -07:00
Jian He
b46ee1e7a3 YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed by Chengbing Liu 2015-04-14 10:54:22 -07:00
Jian He
a1afbc48b5 YARN-3472. Fixed possible leak in DelegationTokenRenewer#allTokens. Contributed by Rohith Sharmaks 2015-04-13 14:07:17 -07:00
Junping Du
92431c9617 YARN-1376. NM need to notify the log aggregation status to RM through Node heartbeat. Contributed by Xuan Gong. 2015-04-10 08:56:18 -07:00
Xuan
afa5d4715a YARN-3293. Track and display capacity scheduler health metrics in web
UI. Contributed by Varun Vasudev
2015-04-09 23:38:04 -07:00
Vinod Kumar Vavilapalli
9c5911294e YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. 2015-04-09 13:08:53 -07:00
Robert Kanter
99b08a748e YARN-2429. TestAMRMTokens.testTokenExpiry fails Intermittently with error message:Invalid AMRMToken (zxu via rkanter) 2015-04-06 14:11:20 -07:00
Tsuyoshi Ozawa
53959e69f7 TestFairScheduler.testContinuousScheduling fails Intermittently. Contributed by Zhihai Xu. 2015-04-06 20:19:13 +09:00
Sandy Ryza
6a6a59db7f YARN-3415. Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue (Zhihai Xu via Sandy Ryza) 2015-04-02 13:56:08 -07:00
Xuan
4728bdfa15 YARN-3248. Display count of nodes blacklisted by apps in the web UI.
Contributed by Varun Vasudev
2015-04-01 04:19:18 -07:00
Karthik Kambatla
79f7f2aabf YARN-3412. RM tests should use MockRM where possible. (kasha) 2015-03-31 09:14:15 -07:00
Wangda Tan
2a945d24f7 YARN-2495. Allow admin specify labels from each NM (Distributed configuration for node label). (Naganarasimha G R via wangda) 2015-03-30 12:05:21 -07:00
Karthik Kambatla
2bc097cd14 YARN-3241. FairScheduler handles invalid queue names inconsistently. (Zhihai Xu via kasha) 2015-03-23 13:22:03 -07:00
cnauroth
6ca1f12024 YARN-3336. FileSystem memory leak in DelegationTokenRenewer. 2015-03-23 10:45:50 -07:00
Jian He
e1feb4ea1a YARN-3345. Add non-exclusive node label API. Contributed by Wangda Tan 2015-03-20 19:04:38 -07:00
Jian He
586348e4cb YARN-3356. Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. Contributed by Wangda Tan 2015-03-20 13:54:01 -07:00
Devaraj K
93d0f4acc8 YARN-3357. Move TestFifoScheduler to FIFO package. Contributed by Rohith
Sharmaks.
2015-03-19 12:16:52 +05:30
Jian He
658097d6da YARN-3273. Improve scheduler UI to facilitate scheduling analysis and debugging. Contributed by Rohith Sharmaks 2015-03-17 21:30:23 -07:00
Tsuyoshi Ozawa
3bc72cc16d YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration. Contributed by Zhihai Xu. 2015-03-18 11:53:19 +09:00
Jian He
968425e9f7 YARN-3305. Normalize AM resource request on app submission. Contributed by Rohith Sharmaks 2015-03-17 13:49:59 -07:00
Jian He
487374b7fe YARN-3243. CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 2015-03-17 10:24:23 -07:00
Tsuyoshi Ozawa
7522a643fa YARN-3349. Treat all exceptions as failure in TestFSRMStateStore#testFSRMStateStoreClientRetry. Contributed by Zhihai Xu. 2015-03-17 08:09:55 +09:00
Vinod Kumar Vavilapalli
863079bb87 YARN-3154. Added additional APIs in LogAggregationContext to avoid aggregating running logs of application when rolling is enabled. Contributed by Xuan Gong. 2015-03-12 13:32:29 -07:00
Zhijie Shen
85f6d67fa7 YARN-1884. Added nodeHttpAddress into ContainerReport and fixed the link to NM web page. Contributed by Xuan Gong. 2015-03-11 19:35:19 -07:00
Jason Lowe
27e8ea820f YARN-3275. CapacityScheduler: Preemption happening on non-preemptable queues. Contributed by Eric Payne 2015-03-06 22:37:26 +00:00
Jian He
95bfd087dc YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan Gong 2015-03-05 21:20:09 -08:00
Karthik Kambatla
8d88691d16 YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) 2015-03-04 19:49:05 -08:00
Karthik Kambatla
22426a1c9f YARN-3231. FairScheduler: Changing queueMaxRunningApps interferes with pending jobs. (Siqi Li via kasha) 2015-03-04 18:06:58 -08:00
Jian He
b2f1ec312e YARN-3222. Fixed NPE on RMNodeImpl#ReconnectNodeTransition when a node is reconnected with a different port. Contributed by Rohith Sharmaks 2015-03-03 16:28:28 -08:00
Wangda Tan
e17e5ba9d7 YARN-3272. Surface container locality info in RM web UI (Jian He via wangda) 2015-03-03 11:49:01 -08:00
Vinod Kumar Vavilapalli
14dd647c55 YARN-3265. Fixed a deadlock in CapacityScheduler by always passing a queue's available resource-limit from the parent queue. Contributed by Wangda Tan. 2015-03-02 17:52:47 -08:00
Wangda Tan
edcecedc1c YARN-3262. Surface application outstanding resource requests table in RM web UI. (Jian He via wangda) 2015-02-27 16:13:32 -08:00
Tsuyoshi Ozawa
01a1621930 YARN-2820. Retry in FileSystemRMStateStore when FS's operations fail due to IOException. Contributed by Zhihai Xu. 2015-02-28 00:56:44 +09:00
Devaraj K
0d4296f0e0 YARN-3256. TestClientToAMTokens#testClientTokenRace is not running against
all Schedulers even when using ParameterizedSchedulerTestBase. Contributed
by Anubhav Dhoot.
2015-02-26 15:45:41 +05:30
Tsuyoshi Ozawa
6cbd9f1113 YARN-3247. TestQueueMappings should use CapacityScheduler explicitly. Contributed by Zhihai Xu. 2015-02-25 10:38:11 +09:00
Xuan
fe7a302473 YARN-2797. TestWorkPreservingRMRestart should use
ParametrizedSchedulerTestBase. Contributed by Karthik Kambatla
2015-02-21 19:17:29 -08:00
Jason Lowe
a64dd3d24b YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith 2015-02-20 15:10:10 +00:00
Jian He
c0d9b93953 YARN-933. Fixed InvalidStateTransitonException at FINAL_SAVING state in RMApp. Contributed by Rohith Sharmaks 2015-02-19 15:42:39 -08:00
Wangda Tan
d49ae725d5 YARN-3076. Add API/Implementation to YarnClient to retrieve label-to-node mapping (Varun Saxena via wangda) 2015-02-19 11:00:57 -08:00
Jian He
1c03376300 YARN-1514. Utility to benchmark ZKRMStateStore#loadState for RM HA. Contributed by Tsuyoshi OZAWA 2015-02-18 16:06:55 -08:00
Jian He
f5da5566d9 YARN-3132. RMNodeLabelsManager should remove node from node-to-label mapping when node becomes deactivated. Contributed by Wangda Tan 2015-02-18 11:51:51 -08:00
Jian He
18297e0972 YARN-3104. Fixed RM to not generate new AMRM tokens on every heartbeat between rolling and activation. Contributed by Jason Lowe 2015-02-12 16:02:45 -08:00
Jian He
18a594257e YARN-3124. Fixed CS LeafQueue/ParentQueue to use QueueCapacities to track capacities-by-label. Contributed by Wangda Tan 2015-02-12 14:58:09 -08:00
Xuan
65c69e296e YARN-3151. On Failover tracking url wrong in application cli for KILLED
application. Contributed by Rohith
2015-02-11 21:19:48 -08:00
Zhijie Shen
d5855c0e46 YARN-2246. Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections. Contributed by Devaraj K. 2015-02-10 15:24:01 -08:00
Zhijie Shen
23bf6c7207 YARN-3100. Made YARN authorization pluggable. Contributed by Jian He. 2015-02-09 20:34:56 -08:00
Jian He
0af6a99a3f YARN-3094. Reset timer for liveness monitors after RM recovery. Contributed by Jun Gong 2015-02-09 13:47:08 -08:00
Karthik Kambatla
7e42088abf YARN-2990. FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests. (kasha) 2015-02-08 22:48:10 -08:00
Jason Lowe
da2fb2bc46 YARN-3143. RM Apps REST API can return NPE or entries missing id and other fields. Contributed by Jason Lowe 2015-02-06 21:47:32 +00:00
Jian He
c1957fef29 YARN-2694. Ensure only single node label specified in ResourceRequest. Contributed by Wangda Tan 2015-02-06 11:34:20 -08:00
Jason Lowe
69c8a7f45b YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. Contributed by Thomas Graves 2015-02-05 19:28:49 +00:00
Sandy Ryza
b6466deac6 YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max share (Anubhav Dhoot via Sandy Ryza) 2015-02-05 09:39:28 -08:00
Zhijie Shen
e5b56e2af6 YARN-2543. Made resource usage be published to the timeline server too. Contributed by Naganarasimha G R. 2015-02-03 17:34:22 -08:00
Wangda Tan
5bd984691b YARN-3075. NodeLabelsManager implementation to retrieve label to node mapping (Varun Saxena via wangda) 2015-02-03 12:52:42 -08:00
Jian He
21d80b3dd9 YARN-3098. Created common QueueCapacities class in Capacity Scheduler to track capacities-by-labels of queues. Contributed by Wangda Tan 2015-02-03 11:43:12 -08:00
Jason Lowe
a761bf8726 YARN-3085. Application summary should include the application type. Contributed by Rohith 2015-02-03 14:56:34 +00:00
Jian He
054a947989 YARN-3077. Fixed RM to create zk root path recursively. Contributed by Chun Chen 2015-01-30 17:34:49 -08:00
Jian He
86358221fc YARN-3099. Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to track used-resources-by-label. Contributed by Wangda Tan 2015-01-30 15:15:20 -08:00
Wangda Tan
7882bc0f14 YARN-3079. Scheduler should also update maximumAllocation when updateNodeResource. (Zhihai Xu via wangda) 2015-01-28 21:54:38 -08:00
Wangda Tan
18741adf97 YARN-2932. Add entry for preemptable status (enabled/disabled) to scheduler web UI and queue initialize/refresh logging. (Eric Payne via wangda) 2015-01-27 15:36:09 -08:00
Jian He
6f9fe76918 YARN-3092. Created a common ResourceUsage class to track labeled resource usages in Capacity Scheduler. Contributed by Wangda Tan 2015-01-26 15:38:00 -08:00
Tsuyoshi Ozawa
24aa462673 YARN-2800. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature. Contributed by Wangda Tan. 2015-01-23 20:37:05 +09:00
Wangda Tan
0a2d3e717d YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal 2015-01-19 16:48:50 -08:00
Junping Du
5d1cca34fa YARN-3064. TestRMRestart/TestContainerResourceUsage/TestNodeManagerResync failure with allocation timeout. (Contributed by Jian He) 2015-01-16 00:08:36 -08:00
Jian He
c53420f583 YARN-2637. Fixed max-am-resource-percent calculation in CapacityScheduler when activating applications. Contributed by Craig Welch 2015-01-13 17:32:07 -08:00
Robert Kanter
ae7bf31fe1 YARN-3027. Scheduler should use totalAvailable resource from node instead of availableResource for maxAllocation. (adhoot via rkanter) 2015-01-12 10:47:52 -08:00
Zhijie Shen
60103fca04 YARN-2427. Added the API of moving apps between queues in RM web services. Contributed by Varun Vasudev. 2015-01-06 14:37:44 -08:00
Karthik Kambatla
0c4b112677 YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) 2015-01-06 04:42:10 +05:30
Zhijie Shen
562a701945 YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. 2015-01-05 13:33:07 -08:00
Tsuyoshi Ozawa
ddc5be48fc YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed by Rohith Sharmaks. 2015-01-05 00:08:31 +09:00
Karthik Kambatla
e7257acd8a YARN-2998. Abstract out scheduler independent PlanFollower components. (Anubhav Dhoot via kasha) 2014-12-30 19:55:24 -08:00
Jian He
e2351c7ae2 YARN-2987. Fixed ClientRMService#getQueueInfo to check against queue and app ACLs. Contributed by Varun Saxena 2014-12-30 17:15:37 -08:00
Jian He
b7442bf92e YARN-2493. Added node-labels page on RM web UI. Contributed by Wangda Tan 2014-12-30 16:49:01 -08:00
Jian He
746ad6e989 Revert "YARN-2492(wrong jira number). Added node-labels page on RM web UI. Contributed by Wangda Tan"
This reverts commit 5f57b904f5.
2014-12-30 16:48:49 -08:00
Jian He
5f57b904f5 YARN-2492. Added node-labels page on RM web UI. Contributed by Wangda Tan 2014-12-30 15:38:28 -08:00
Jian He
4f18018b7a YARN-2946. Fixed potential deadlock in RMStateStore. Contributed by Rohith Sharmaks 2014-12-23 22:14:29 -08:00
Jian He
149512a837 YARN-2837. Support TimeLine server to recover delegation token when restarting. Contributed by Zhijie Shen 2014-12-23 18:25:37 -08:00
Jian He
0d89859b51 YARN-2340. Fixed NPE when queue is stopped during RM restart. Contributed by Rohith Sharmaks 2014-12-22 21:53:22 -08:00
Jian He
fdf042dfff YARN-2920. Changed CapacityScheduler to kill containers on nodes where node labels are changed. Contributed by Wangda Tan 2014-12-22 16:51:15 -08:00
Karthik Kambatla
24ee9e3431 YARN-2975. FSLeafQueue app lists are accessed without required locks. (kasha) 2014-12-20 12:17:50 -08:00
Jian He
808cba3821 YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith Sharmaks 2014-12-19 16:56:30 -08:00
Karthik Kambatla
a22ffc3188 YARN-2738. [YARN-2574] Add FairReservationSystem for FairScheduler. (Anubhav Dhoot via kasha) 2014-12-19 15:37:12 -08:00
Jason Lowe
0402bada19 YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). Contributed by Jian He 2014-12-18 23:28:18 +00:00
Steve Loughran
3681de2039 YARN-2912 Jersey Tests failing with port in use. (varun saxena via stevel) 2014-12-12 17:11:07 +00:00
Jian He
2ed90a57fd YARN-2930. Fixed TestRMRestart#testRMRestartRecoveringNodeLabelManager intermittent failure. Contributed by Wangda Tan 2014-12-09 16:48:04 -08:00
Karthik Kambatla
a2e07a5456 YARN-2910. FSLeafQueue can throw ConcurrentModificationException. (Wilfred Spiegelenburg via kasha) 2014-12-09 14:00:31 -08:00
Jian He
e69af836f3 YARN-2869. CapacityScheduler should trim sub queue names when parse configuration. Contributed by Wangda Tan 2014-12-05 17:33:39 -08:00
Jason Lowe
4b13082199 YARN-2056. Disable preemption at Queue level. Contributed by Eric Payne 2014-12-05 21:06:48 +00:00
Jian He
258623ff8b YARN-2301. Improved yarn container command. Contributed by Naganarasimha G R 2014-12-04 12:53:18 -08:00
Jian He
73fbb3c66b YARN-2880. Added a test to make sure node labels will be recovered if RM restart is enabled. Contributed by Rohith Sharmaks 2014-12-03 17:14:52 -08:00
Jian He
392c3aaea8 YARN-2894. Fixed a bug regarding application view acl when RM fails over. Contributed by Rohith Sharmaks 2014-12-02 17:16:35 -08:00
Jian He
52bcefca8b YARN-2136. Changed RMStateStore to ignore store operations when fenced. Contributed by Varun Saxena 2014-12-02 10:54:48 -08:00
Jian He
a7fba0bc28 YARN-2765. Added leveldb-based implementation for RMStateStore. Contributed by Jason Lowe 2014-12-01 16:38:25 -08:00
Junping Du
c732ed760e YARN-2907. SchedulerNode#toString should print all resource detail instead of only memory. (Contributed by Rohith) 2014-12-01 05:38:22 -08:00
Jian He
5805a81efb YARN-2404. Removed ApplicationAttemptState and ApplicationState class in RMStateStore. Contributed by Tsuyoshi OZAWA 2014-11-25 12:48:22 -08:00
Sandy Ryza
a128cca305 YARN-2669. FairScheduler: queue names shouldn't allow periods (Wei Yan via Sandy Ryza) 2014-11-21 16:06:41 -08:00
Karthik Kambatla
3114d4731d YARN-2604. Scheduler should consider max-allocation-* in conjunction with the largest node. (Robert Kanter via kasha) 2014-11-21 10:32:28 -08:00
Karthik Kambatla
a9a0cc3679 YARN-2315. FairScheduler: Set current capacity in addition to capacity. (Zhihai Xu via kasha) 2014-11-19 20:15:40 -08:00
Karthik Kambatla
c90fb84aaa YARN-2802. ClusterMetrics to include AM launch and register delays. (Zhihai Xu via kasha) 2014-11-19 19:50:12 -08:00
Jian He
9cb8b75ba5 YARN-2865. Fixed RM to always create a new RMContext when transitions from StandBy to Active. Contributed by Rohith Sharmaks 2014-11-19 19:48:52 -08:00
Karthik Kambatla
2fce6d6141 YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) 2014-11-17 16:45:57 -08:00
Jason Lowe
81c9d17af8 YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan 2014-11-17 21:15:48 +00:00
Sandy Ryza
1a47f890ba YARN-2811. In Fair Scheduler, reservation fulfillments shouldn't ignore max share (Siqi Li via Sandy Ryza) 2014-11-14 15:18:56 -08:00
Jian He
d005404ef7 YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state on app recovery. Contributed by Rohith Sharmaks 2014-11-13 15:34:26 -08:00
Vinod Kumar Vavilapalli
3651fe1b08 YARN-2853. Fixed a bug in ResourceManager causing apps to hang when the user kill request races with ApplicationMaster finish. Contributed by Jian He. 2014-11-13 08:13:03 -08:00
Jason Lowe
f8aefa5e9c YARN-2780. Log aggregated resource allocation in rm-appsummary.log. Contributed by Eric Payne 2014-11-12 17:01:15 +00:00
Vinod Kumar Vavilapalli
e76faebc95 YARN-2834. Fixed ResourceManager to ignore token-renewal failures on recovery consistent with the (somewhat incorrect) behaviour in the non-recovery case. Contributed by Jian He. 2014-11-09 18:56:58 -08:00
Arun C. Murthy
43cd07b408 YARN-2830. Add backwards compatible ContainerId.newInstance constructor. Contributed by Jonathan Eagles. 2014-11-09 14:57:37 -08:00
Zhijie Shen
9a4e0d343e YARN-2505. Supported get/add/remove/change labels in RM REST API. Contributed by Craig Welch. 2014-11-07 20:35:46 -08:00
Vinod Kumar Vavilapalli
4cfd5bc7c1 YARN-2753. Fixed a bunch of bugs in the NodeLabelsManager classes. Contributed by Zhihai xu. 2014-11-07 14:15:53 -08:00
Vinod Kumar Vavilapalli
2ac1be7dec YARN-2824. Fixed Capacity Scheduler to not crash when some node-labels are not mapped to queues by making default capacities per label to be zero. Contributed by Wangda Tan. 2014-11-07 10:39:37 -08:00
Xuan
1e97f2f094 YARN-2810. TestRMProxyUsersConf fails on Windows VMs. Contributed by Varun Vasudev 2014-11-07 09:44:43 -08:00
Vinod Kumar Vavilapalli
a5657182a7 YARN-2823. Fixed ResourceManager app-attempt state machine to inform schedulers about previous finished attempts of a running application to avoid expectation mismatch w.r.t transferred containers. Contributed by Jian He. 2014-11-07 09:28:36 -08:00
Vinod Kumar Vavilapalli
a3839a9fbf YARN-2744. Fixed CapacityScheduler to validate node-labels correctly against queues. Contributed by Wangda Tan. 2014-11-06 17:28:12 -08:00
Jian He
395275af86 YARN-2579. Fixed a deadlock issue when EmbeddedElectorService and FatalEventDispatcher try to transition RM to StandBy at the same time. Contributed by Rohith Sharmaks 2014-11-05 16:59:54 -08:00
Zhijie Shen
b4c951ab83 YARN-2767. Added a test case to verify that http static user cannot kill or submit apps in the secure mode. Contributed by Varun Vasudev. 2014-11-05 10:57:38 -08:00
Karthik Kambatla
b2cd269802 YARN-2010. Handle app-recovery failures gracefully. (Jian He and Karthik Kambatla via kasha) 2014-11-04 17:45:24 -08:00
Vinod Kumar Vavilapalli
ec6cbece8e YARN-2795. Fixed ResourceManager to not crash loading node-label data from HDFS in secure mode. Contributed by Wangda Tan. 2014-11-03 13:44:06 -08:00
Zhijie Shen
27715ec63b YARN-2785. Fixed intermittent TestContainerResourceUsage failure. Contributed by Varun Vasudev. 2014-11-02 15:20:40 -08:00
Vinod Kumar Vavilapalli
e0233c16eb YARN-2698. Moved some node label APIs to be correctly placed in client protocol. Contributed by Wangda Tan. 2014-10-30 22:59:31 -07:00
Karthik Kambatla
179cab81e0 YARN-2712. TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks. (Tsuyoshi Ozawa via kasha) 2014-10-30 00:29:07 -07:00
Karthik Kambatla
782971ae7a YARN-2742. FairSchedulerConfiguration should allow extra spaces between value and unit. (Wei Yan via kasha) 2014-10-29 10:24:57 -07:00
Jian He
d5e0a09721 YARN-2503. Added node labels in web UI. Contributed by Wangda Tan 2014-10-28 17:57:54 -07:00
Vinod Kumar Vavilapalli
a16d022ca4 YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the sake of localization and log-aggregation for long-running services. Contributed by Jian He. 2014-10-27 15:49:47 -07:00
Vinod Kumar Vavilapalli
0186645505 YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken identifiers to be tampered and thus causing app submission failures in secure mode. Contributed by Jian He. 2014-10-26 11:14:34 -07:00
Jian He
5864dd99a4 YARN-1915. Fixed a race condition that client could use the ClientToAMToken to contact with AM before AM actually receives the ClientToAMTokenMasterKey. Contributed by Jason Lowe 2014-10-24 22:47:56 -07:00
Zhijie Shen
0f3b6900be YARN-2209. Replaced AM resync/shutdown command with corresponding exceptions and made related MR changes. Contributed by Jian He. 2014-10-23 21:56:03 -07:00
Vinod Kumar Vavilapalli
c0e034336c YARN-2715. Fixed ResourceManager to respect common configurations for proxy users/groups beyond just the YARN level config. Contributed by Zhijie Shen. 2014-10-21 20:09:40 -07:00
Vinod Kumar Vavilapalli
39063cd36f YARN-2676. Enhanced Timeline auth-filter to support proxy users. Contributed by Zhijie Shen. 2014-10-17 22:02:50 -07:00
Vinod Kumar Vavilapalli
e9c66e8fd2 YARN-2705. Fixed bugs in ResourceManager node-label manager that were causing test-failures: added a dummy in-memory labels-manager. Contributed by Wangda Tan. 2014-10-17 18:26:12 -07:00
Jian He
c3de2412eb YARN-1879. Marked Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over. Contributed by Tsuyoshi OZAWA 2014-10-17 16:35:27 -07:00
Jian He
a6aa6e42ca YARN-2588. Standby RM fails to transitionToActive if previous transitionToActive failed with ZK exception. Contributed by Rohith Sharmaks 2014-10-17 10:54:24 -07:00
Vinod Kumar Vavilapalli
abae63caf9 YARN-2699. Fixed a bug in CommonNodeLabelsManager that caused tests to fail when using ephemeral ports on NodeIDs. Contributed by Wangda Tan. 2014-10-17 08:58:08 -07:00
Vinod Kumar Vavilapalli
b3056c266a YARN-2685. Fixed a bug in CommonNodeLabelsManager that caused wrong resource tracking per label when a host runs multiple node-managers. Contributed by Wangda Tan. 2014-10-15 18:47:26 -07:00
Vinod Kumar Vavilapalli
f2ea555ac6 YARN-2496. Enhanced Capacity Scheduler to have basic support for allocating resources based on node-labels. Contributed by Wangda Tan.
YARN-2500. Enhanced ResourceManager to support schedulers allocating resources based on node-labels. Contributed by Wangda Tan.
2014-10-15 18:33:06 -07:00
Jian He
0af1a2b5bc YARN-2312. Deprecated old ContainerId#getId API and updated MapReduce to use ContainerId#getContainerId instead. Contributed by Tsuyoshi OZAWA 2014-10-15 15:22:07 -07:00
Zhijie Shen
1220bb72d4 YARN-2656. Made RM web services authentication filter support proxy user. Contributed by Varun Vasudev and Zhijie Shen. 2014-10-14 21:50:46 -07:00
Zhijie Shen
cdce88376a HADOOP-11181. Generalized o.a.h.s.t.d.DelegationTokenManager to handle all sub-classes of AbstractDelegationTokenIdentifier. Contributed by Zhijie Shen. 2014-10-14 11:35:38 -07:00
Karthik Kambatla
da709a2eac YARN-2641. Decommission nodes on -refreshNodes instead of next NM-RM heartbeat. (Zhihai Xu via kasha) 2014-10-13 16:23:04 -07:00
Jian He
f9680d9a16 YARN-2308. Changed CapacityScheduler to explicitly throw exception if the queue
to which the apps were submitted is changed across RM restart. Contributed by Craig Welch & Chang Li
2014-10-13 14:09:04 -07:00
Zhijie Shen
4aed2d8e91 YARN-2651. Spun off LogRollingInterval from LogAggregationContext. Contributed by Xuan Gong. 2014-10-13 10:54:09 -07:00
Vinod Kumar Vavilapalli
db7f165319 YARN-2494. Added NodeLabels Manager internal API and implementation. Contributed by Wangda Tan. 2014-10-10 11:44:21 -07:00
Jian He
e16e25ab1b YARN-2649. Fixed TestAMRMRPCNodeUpdates test failure. Contributed by Ming Ma 2014-10-08 10:58:51 -07:00
Jian He
30d56fdbb4 YARN-1857. CapacityScheduler headroom doesn't account for other AM's running. Contributed by Chen He and Craig Welch 2014-10-07 13:45:04 -07:00
Jian He
519e5a7dd2 YARN-2644. Fixed CapacityScheduler to return up-to-date headroom when AM allocates. Contributed by Craig Welch 2014-10-06 15:48:46 -07:00
Jian He
ea26cc0b4a YARN-2615. Changed ClientToAMTokenIdentifier/RM(Timeline)DelegationTokenIdentifier to use protobuf as payload. Contributed by Junping Du 2014-10-06 10:47:43 -07:00
subru
a2986234be YARN-2611. Fixing jenkins findbugs warning and TestRMWebServicesCapacitySched for branch YARN-1051. Contributed by Subru Krishnan and Carlo Curino.
(cherry picked from commit c47464aba407d1dafe10be23fe454f0489cc4367)
2014-10-03 15:43:23 -07:00
subru
5e10a13bb4 YARN-2576. Making test patch pass in branch. Contributed by Subru Krishnan and Carlo Curino.
(cherry picked from commit 90ac0be86b898aefec5471db4027554c8e1b310c)
2014-10-03 15:43:13 -07:00
subru
6261f7cc69 YARN-2080. Integrating reservation system with ResourceManager and client-RM protocol. Contributed by Subru Krishnan and Carlo Curino.
(cherry picked from commit 8baeaead8532898163f1006276b731a237b1a559)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
2014-10-03 15:42:43 -07:00
subru
3418c56bcf YARN-1712. Plan follower that synchronizes the current state of reservation subsystem with the scheduler. Contributed by Subru Krishnan and Carlo Curino.
(cherry picked from commit 169085319b8b76641f8b9f6840a3fef06d221e2b)
2014-10-03 15:42:10 -07:00
carlo curino
b6df0dddcd YARN-1711. Policy to enforce instantaneous and over-time quotas on user reservation. Contributed by Carlo Curino and Subru Krishnan.
(cherry picked from commit c4918cb4cb5a267a8cfd6eace28fcfe7ad6174e8)
2014-10-03 15:42:03 -07:00
carlo curino
f66ffcf832 YARN-1710. Logic to find allocations within a Plan that satisfy user ReservationRequest(s). Contributed by Carlo Curino and Subru Krishnan.
(cherry picked from commit aef7928899b37262773f3dc117157bb746bf8918)
2014-10-03 15:41:57 -07:00
subru
cf4b34282a YARN-1709. In-memory data structures used to track resources over time to enable reservations.
(cherry picked from commit 0d8b2cd88b958b1e602fd4ea4078ef8d4742a7c3)
2014-10-03 15:41:51 -07:00
carlo curino
1c6950354f YARN-2475. Logic for responding to capacity drops for the ReservationSystem. Contributed by Carlo Curino and Subru Krishnan.
(cherry picked from commit f83a07f266f2c5e6eead554d8a331ed7e75e10d5)
2014-10-03 15:41:21 -07:00
carlo curino
eb3e40b833 YARN-1707. Introduce APIs to add/remove/resize queues in the CapacityScheduler. Contributed by Carlo Curino and Subru Krishnan
(cherry picked from commit aac47fda7fecda9fc18ade34d633eca895865a70)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
2014-10-03 15:41:02 -07:00
Karthik Kambatla
80d11eb68e YARN-2635. TestRM, TestRMRestart, TestClientToAMTokens should run with both CS and FS. (Wei Yan and kasha via kasha) 2014-10-03 11:49:49 -07:00
Jian He
054f285526 YARN-2628. Capacity scheduler with DominantResourceCalculator carries out reservation even though slots are free. Contributed by Varun Vasudev 2014-10-02 15:13:33 -07:00
Karthik Kambatla
5e0b49da9c YARN-2254. TestRMWebServicesAppsModification should run against both CS and FS. (Zhihai Xu via kasha) 2014-10-02 10:15:04 -07:00
Zhijie Shen
52bbe0f11b YARN-2630. Prevented previous AM container status from being acquired by the current restarted AM. Contributed by Jian He. 2014-10-01 15:38:11 -07:00
Jian He
bbff96be48 YARN-2602. Fixed possible NPE in ApplicationHistoryManagerOnTimelineStore. Contributed by Zhijie Shen 2014-09-30 16:44:17 -07:00
Jason Lowe
9c22065109 YARN-1769. CapacityScheduler: Improve reservations. Contributed by Thomas Graves 2014-09-29 14:12:18 +00:00
Jian He
5391919b09 YARN-668. Changed NMTokenIdentifier/AMRMTokenIdentifier/ContainerTokenIdentifier to use protobuf object as the payload. Contributed by Junping Du. 2014-09-26 17:48:41 -07:00
Jason Lowe
8269bfa613 YARN-2523. ResourceManager UI showing negative value for "Decommissioned Nodes" field. Contributed by Rohith 2014-09-25 22:37:05 +00:00
Zhijie Shen
72b0881ca6 YARN-2546. Made REST API for application creation/submission use numeric and boolean types instead of the string of them. Contributed by Varun Vasudev. 2014-09-24 17:57:32 -07:00
Zhijie Shen
c86674a3a4 YARN-2581. Passed LogAggregationContext to NM via ContainerTokenIdentifier. Contributed by Xuan Gong. 2014-09-24 17:50:26 -07:00
Karthik Kambatla
f5578207d2 YARN-2252. Intermittent failure of TestFairScheduler.testContinuousScheduling. (Ratandeep Ratti and kasha via kasha) 2014-09-23 00:03:16 -07:00
Karthik Kambatla
568d3dc2bb YARN-1959. Fix headroom calculation in FairScheduler. (Anubhav Dhoot via kasha) 2014-09-22 23:49:39 -07:00
Karthik Kambatla
43efdd30b5 YARN-2539. FairScheduler: Set the default value for maxAMShare to 0.5. (Wei Yan via kasha) 2014-09-22 16:09:52 -07:00
Jian He
0a641496c7 YARN-1372. Ensure all completed containers are reported to the AMs across RM restart. Contributed by Anubhav Dhoot 2014-09-22 10:30:53 -07:00
Karthik Kambatla
9721e2c1fe YARN-2453. TestProportionalCapacityPreemptionPolicy fails with FairScheduler. (Zhihai Xu via kasha) 2014-09-21 23:13:45 -07:00
Karthik Kambatla
c50fc92502 YARN-2452. TestRMApplicationHistoryWriter fails with FairScheduler. (Zhihai Xu via kasha) 2014-09-21 13:15:04 -07:00
Jian He
444acf8ea7 YARN-2565. Fixed RM to not use FileSystemApplicationHistoryStore unless explicitly set. Contributed by Zhijie Shen 2014-09-19 11:26:29 -07:00
Zhijie Shen
6fe5c6b746 YARN-2568. Fixed the potential test failures due to race conditions when RM work-preserving recovery is enabled. Contributed by Jian He. 2014-09-18 21:56:56 -07:00
Jason Lowe
a337f0e354 YARN-2561. MR job client cannot reconnect to AM after NM restart. Contributed by Junping Du 2014-09-18 21:34:40 +00:00
Jason Lowe
9ea7b6c063 YARN-2363. Submitted applications occasionally lack a tracking URL. Contributed by Jason Lowe 2014-09-18 20:13:16 +00:00
Vinod Kumar Vavilapalli
485c96e3cb YARN-2001. Added a time threshold for RM to wait before starting container allocations after restart/failover. Contributed by Jian He. 2014-09-18 11:03:12 -07:00
Jian He
ee21b13cbd YARN-2559. Fixed NPE in SystemMetricsPublisher when retrieving FinalApplicationStatus. Contributed by Zhijie Shen 2014-09-17 21:44:15 -07:00
junping_du
90a0c03f0a YARN-1250. Generic history service should support application-acls. (Contributed by Zhijie Shen) 2014-09-16 18:20:49 -07:00
Vinod Kumar Vavilapalli
14e2639fd0 YARN-611. Added an API to let apps specify an interval beyond which AM failures should be ignored towards counting max-attempts. Contributed by Xuan Gong. 2014-09-13 18:04:05 -07:00
XuanGong
e65ae575a0 YARN-2456. Possible livelock in CapacityScheduler when RM is recovering
apps. Contributed by Jian He
2014-09-12 15:21:46 -07:00
Jian He
3122daa802 YARN-2229. Changed the integer field of ContainerId to be long type. Contributed by Tsuyoshi OZAWA 2014-09-12 10:33:33 -07:00
junping_du
6b8b1608e6 YARN-2033. Merging generic-history into the Timeline Store (Contributed by Zhijie Shen) 2014-09-12 10:04:51 +08:00
Karthik Kambatla
c11ada5ea6 YARN-2534. FairScheduler: Potential integer overflow calculating totalMaxShare. (Zhihai Xu via kasha) 2014-09-11 12:06:06 -07:00
Jian He
83be3ad444 YARN-415. Capture aggregate memory allocation at the app-level for chargeback. Contributed by Eric Payne & Andrey Klochkov 2014-09-10 18:20:54 -07:00
Jian He
cbfe26370b YARN-2158. Fixed TestRMWebServicesAppsModification#testSingleAppKill test failure. Contributed by Varun Vasudev 2014-09-10 12:47:34 -07:00
XUAN
47bdfa044a YARN-2459. RM crashes if App gets rejected for any reason and HA is enabled. Contributed by Jian He 2014-09-10 11:44:41 -07:00
Vinod Kumar Vavilapalli
b67d5ba784 YARN-2448. Changed ApplicationMasterProtocol to expose RM-recognized resource types to the AMs. Contributed by Varun Vasudev. 2014-09-10 10:15:47 -07:00
Karthik Kambatla
3072c83b38 YARN-1458. FairScheduler: Zero weight can lead to livelock. (Zhihai Xu via kasha) 2014-09-10 08:26:14 -07:00
Karthik Kambatla
1dcaba9a7a YARN-2394. FairScheduler: Configure fairSharePreemptionThreshold per queue. (Wei Yan via kasha) 2014-09-03 10:27:36 -07:00
Karthik Kambatla
0f34e6f387 YARN-2395. FairScheduler: Preemption timeout should be configurable per queue. (Wei Yan via kasha) 2014-08-30 01:17:13 -07:00
Jian He
5c14bc426b YARN-1506. Changed RMNode/SchedulerNode to update resource with event notification. Contributed by Junping Du 2014-08-29 23:05:51 -07:00
Jian He
c686aa3533 YARN-2447. RM web service app submission doesn't pass secrets correctly. Contributed by Varun Vasudev 2014-08-29 11:40:47 -07:00
Karthik Kambatla
fa80ca49bd YARN-2405. NPE in FairSchedulerAppsBlock. (Tsuyoshi Ozawa via kasha) 2014-08-28 23:21:37 -07:00
Karthik Kambatla
d16bfd1d0f YARN-1326. RM should log using RMStore at startup time. (Tsuyoshi Ozawa via kasha) 2014-08-27 01:43:58 -07:00
Karthik Kambatla
0097b15e21 YARN-2393. FairScheduler: Add the notion of steady fair share. (Wei Yan via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1619845 13f79535-47bb-0310-9956-ffa450edef68
2014-08-22 15:44:47 +00:00
Jason Darrell Lowe
4236c6600e YARN-2434. RM should not recover containers from previously failed attempt when AM restart is not enabled. Contributed by Jian He
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1619614 13f79535-47bb-0310-9956-ffa450edef68
2014-08-21 22:41:34 +00:00
Zhijie Shen
f6a778c372 YARN-2249. Avoided AM release requests being lost on work preserving RM restart. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618972 13f79535-47bb-0310-9956-ffa450edef68
2014-08-19 20:33:49 +00:00
Jian He
375c221960 YARN-2409. RM ActiveToStandBy transition missing stopping previous rmDispatcher. Contributed by Rohith
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618915 13f79535-47bb-0310-9956-ffa450edef68
2014-08-19 17:49:39 +00:00
Jian He
519c4be95a YARN-2411. Support simple user and group mappings to queues. Contributed by Ram Venkatesh
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618542 13f79535-47bb-0310-9956-ffa450edef68
2014-08-18 06:08:45 +00:00
Jian He
c3084d6c16 YARN-2389. Added functionality for schedulers to kill all applications in a queue. Contributed by Subramaniam Venkatraman Krishnan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618294 13f79535-47bb-0310-9956-ffa450edef68
2014-08-15 23:53:57 +00:00
Jian He
7360cec692 YARN-2378. Added support for moving applications across queues in CapacityScheduler. Contributed by Subramaniam Venkatraman Krishnan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618106 13f79535-47bb-0310-9956-ffa450edef68
2014-08-15 06:00:31 +00:00
Zhijie Shen
a9023c2736 YARN-2397. Avoided loading two authentication filters for RM and TS web interfaces. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1618054 13f79535-47bb-0310-9956-ffa450edef68
2014-08-14 21:17:20 +00:00
Karthik Kambatla
5197f8c3c5 YARN-1370. Fair scheduler to re-populate container allocation state. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617645 13f79535-47bb-0310-9956-ffa450edef68
2014-08-13 01:38:59 +00:00
Karthik Kambatla
4239695588 YARN-2399. Delete old versions of files. FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt. (kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617619 13f79535-47bb-0310-9956-ffa450edef68
2014-08-12 22:51:57 +00:00
Karthik Kambatla
486e718fc1 YARN-2399. FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt. (kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617600 13f79535-47bb-0310-9956-ffa450edef68
2014-08-12 21:43:27 +00:00
Junping Du
c2febdcbaa YARN-1337. Recover containers upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617448 13f79535-47bb-0310-9956-ffa450edef68
2014-08-12 10:56:13 +00:00
Jian He
c4dc685343 YARN-2138. Cleaned up notifyDone* APIs in RMStateStore. Contributed by Varun Saxena
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617341 13f79535-47bb-0310-9956-ffa450edef68
2014-08-11 18:24:24 +00:00
Xuan Gong
946be75704 YARN-2400: Addendum fix for TestAMRestart failure. Contributed by Jian He
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617333 13f79535-47bb-0310-9956-ffa450edef68
2014-08-11 17:42:53 +00:00
Xuan Gong
743f7f30da YARN-2400. Fixed TestAMRestart fails intermittently. Contributed by Jian He:
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1617028 13f79535-47bb-0310-9956-ffa450edef68
2014-08-09 23:31:11 +00:00
Karthik Kambatla
a7643f4de7 YARN-2026. Fair scheduler: Consider only active queues for computing fairshare. (Ashwin Shankar via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1616915 13f79535-47bb-0310-9956-ffa450edef68
2014-08-09 02:10:00 +00:00
Xuan Gong
eeb4acd955 YARN-2212: ApplicationMaster needs to find a way to update the AMRMToken periodically. Contributed by Xuan Gong
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1616892 13f79535-47bb-0310-9956-ffa450edef68
2014-08-08 21:38:24 +00:00
Karthik Kambatla
14864e9c7c YARN-2352. FairScheduler: Collect metrics on duration of critical methods that affect performance. (kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1616769 13f79535-47bb-0310-9956-ffa450edef68
2014-08-08 14:17:54 +00:00
Jian He
8437df8ba9 YARN-2008. Fixed CapacityScheduler to calculate headroom based on max available capacity instead of configured max capacity. Contributed by Craig Welch
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1616580 13f79535-47bb-0310-9956-ffa450edef68
2014-08-07 20:00:04 +00:00
Karthik Kambatla
8feddc4c84 YARN-2359. Application hangs when it fails to launch AM container. (Zhihai Xu via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1616375 13f79535-47bb-0310-9956-ffa450edef68
2014-08-07 00:06:17 +00:00
Junping Du
b8f151231b YARN-1354. Recover applications upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1615550 13f79535-47bb-0310-9956-ffa450edef68
2014-08-04 13:25:37 +00:00
Xuan Gong
e52f67e389 YARN-1994. Expose YARN/MR endpoints on multiple interfaces. Contributed by Craig Welch, Milan Potocnik, and Arpit Agarwal
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1614981 13f79535-47bb-0310-9956-ffa450edef68
2014-07-31 20:06:02 +00:00
Zhijie Shen
1d6e178144 YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1614838 13f79535-47bb-0310-9956-ffa450edef68
2014-07-31 09:27:43 +00:00
Karthik Kambatla
c0b49ff107 YARN-2328. FairScheduler: Verify update and continuous scheduling threads are stopped when the scheduler is stopped. (kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1614432 13f79535-47bb-0310-9956-ffa450edef68
2014-07-29 17:41:52 +00:00
Zhijie Shen
d6532d3a77 YARN-2247. Made RM web services authenticate users via kerberos and delegation token. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1613821 13f79535-47bb-0310-9956-ffa450edef68
2014-07-27 17:55:06 +00:00
Jian He
d4fec34933 YARN-2211. Persist AMRMToken master key in RMStateStore for RM recovery. Contributed by Xuan Gong
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1613515 13f79535-47bb-0310-9956-ffa450edef68
2014-07-25 20:42:37 +00:00
Karthik Kambatla
1e553858f9 YARN-2214. FairScheduler: preemptContainerPreCheck() in FSParentQueue delays convergence towards fairness. (Ashwin Shankar via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1613459 13f79535-47bb-0310-9956-ffa450edef68
2014-07-25 16:13:07 +00:00
Jason Darrell Lowe
28fca92521 YARN-2147. client lacks delegation token exception details when application submit fails. Contributed by Chen He
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612950 13f79535-47bb-0310-9956-ffa450edef68
2014-07-23 21:40:57 +00:00
Sanford Ryza
c88402f36d YARN-2313. Livelock can occur in FairScheduler when there are lots of running apps (Tsuyoshi Ozawa via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612769 13f79535-47bb-0310-9956-ffa450edef68
2014-07-23 05:00:52 +00:00
Karthik Kambatla
ff77582991 YARN-2273. NPE in ContinuousScheduling thread when we lose a node. (Wei Yan via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612720 13f79535-47bb-0310-9956-ffa450edef68
2014-07-22 22:44:38 +00:00
Zhijie Shen
eac0701c96 YARN-2319. Made the MiniKdc instance start/close before/after the class of TestRMWebServicesDelegationTokens. Contributed by Wenwu Peng.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612588 13f79535-47bb-0310-9956-ffa450edef68
2014-07-22 15:15:29 +00:00
Junping Du
afb9394c91 YARN-2242. Addendum patch. Improve exception information on AM launch crashes. (Contributed by Li Lu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612565 13f79535-47bb-0310-9956-ffa450edef68
2014-07-22 13:07:23 +00:00
Karthik Kambatla
8871d8ed9f YARN-2244. FairScheduler missing handling of containers for unknown application attempts. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1611840 13f79535-47bb-0310-9956-ffa450edef68
2014-07-19 00:12:05 +00:00
Xuan Gong
f1b831ccfb YARN-2208. AMRMTokenManager need to have a way to roll over AMRMToken. Contributed by Xuan Gong
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1611820 13f79535-47bb-0310-9956-ffa450edef68
2014-07-18 21:46:29 +00:00
Jian He
3c193811ca YARN-2219. Addendum patch for YARN-2219
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1611240 13f79535-47bb-0310-9956-ffa450edef68
2014-07-17 03:28:39 +00:00
Vinod Kumar Vavilapalli
bda23181bf YARN-2219. Changed ResourceManager to avoid AMs and NMs getting exceptions after RM recovery but before scheduler learns about apps and app-attempts. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1611222 13f79535-47bb-0310-9956-ffa450edef68
2014-07-17 00:14:56 +00:00
Vinod Kumar Vavilapalli
030580387a YARN-2233. Implemented ResourceManager web-services to create, renew and cancel delegation tokens. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1610876 13f79535-47bb-0310-9956-ffa450edef68
2014-07-15 23:00:17 +00:00
Mayank Bansal
43589a8df7 YARN-1408 Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task timeout for 30mins. (Sunil G via mayank)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1610860 13f79535-47bb-0310-9956-ffa450edef68
2014-07-15 21:48:58 +00:00
Vinod Kumar Vavilapalli
c6cc6a6a8e YARN-2260. Fixed ResourceManager's RMNode to correctly remember containers when nodes resync during work-preserving RM restart. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1610557 13f79535-47bb-0310-9956-ffa450edef68
2014-07-14 23:32:03 +00:00
Jian He
c9fb040c87 YARN-2181. Added preemption info to logs and RM web UI. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1609561 13f79535-47bb-0310-9956-ffa450edef68
2014-07-10 20:03:35 +00:00
Karthik Kambatla
8fbca62a90 YARN-2131. Add a way to format the RMStateStore. (Robert Kanter via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1609278 13f79535-47bb-0310-9956-ffa450edef68
2014-07-09 19:58:43 +00:00
Zhijie Shen
12c4197b35 YARN-2158. Improved assertion messages of TestRMWebServicesAppsModification. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1608667 13f79535-47bb-0310-9956-ffa450edef68
2014-07-08 05:50:04 +00:00
Sanford Ryza
5644f529f3 YARN-2250. FairScheduler.findLowestCommonAncestorQueue returns null when queues not identical (Krisztian Horvath via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607872 13f79535-47bb-0310-9956-ffa450edef68
2014-07-04 15:16:43 +00:00
Junping Du
5cb489f9d3 YARN-2242. Improve exception information on AM launch crashes. (Contributed by Li Lu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607655 13f79535-47bb-0310-9956-ffa450edef68
2014-07-03 14:15:19 +00:00
Vinod Kumar Vavilapalli
45b191e38c YARN-2232. Fixed ResourceManager to allow DelegationToken owners to be able to cancel their own tokens in secure mode. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607484 13f79535-47bb-0310-9956-ffa450edef68
2014-07-02 21:36:42 +00:00
Mayank Bansal
03a25d2cc1 YARN-2022 Preempting an Application Master container can be kept as least priority when multiple applications are marked for preemption by ProportionalCapacityPreemptionPolicy (Sunil G via mayank)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607227 13f79535-47bb-0310-9956-ffa450edef68
2014-07-02 01:54:47 +00:00
Vinod Kumar Vavilapalli
075ff276ca YARN-1713. Added get-new-app and submit-app functionality to RM web services. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1607216 13f79535-47bb-0310-9956-ffa450edef68
2014-07-02 00:23:07 +00:00
Xuan Gong
e5ae7c55d1 TestRMApplicationHistoryWriter sometimes fails in trunk. Contributed by Zhijie Shen
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1606835 13f79535-47bb-0310-9956-ffa450edef68
2014-06-30 16:51:22 +00:00
Jian He
b0c51504c4 YARN-2052. Embedded an epoch number in container id to ensure the uniqueness of container id after RM restarts. Contributed by Tsuyoshi OZAWA
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1606557 13f79535-47bb-0310-9956-ffa450edef68
2014-06-29 18:24:03 +00:00
Jian He
b717d44b52 YARN-614. Changed ResourceManager to not count disk failure, node loss and RM restart towards app failures. Contributed by Xuan Gong
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1606407 13f79535-47bb-0310-9956-ffa450edef68
2014-06-28 23:37:46 +00:00
Zhijie Shen
55a0aa0bad YARN-2201. Made TestRMWebServicesAppsModification be independent of the changes on yarn-default.xml. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1606285 13f79535-47bb-0310-9956-ffa450edef68
2014-06-28 03:30:44 +00:00
Karthik Kambatla
f911f5495b YARN-2204. Addendum patch. TestAMRestart#testAMRestartWithExistingContainers assumes CapacityScheduler. (Robert Kanter via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1606168 13f79535-47bb-0310-9956-ffa450edef68
2014-06-27 18:09:41 +00:00
Vinod Kumar Vavilapalli
9571db19eb YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when AMs heartbeat in. Contributed by Jason Lowe.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605616 13f79535-47bb-0310-9956-ffa450edef68
2014-06-25 21:56:42 +00:00
Karthik Kambatla
1a3a7e0c1a YARN-2204. TestAMRestart#testAMRestartWithExistingContainers assumes CapacityScheduler. (Robert Kanter via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605548 13f79535-47bb-0310-9956-ffa450edef68
2014-06-25 18:50:53 +00:00
Jian He
c3f1c30e65 YARN-1365. Changed ApplicationMasterService to allow an app to re-register after RM restart. Contributed by Anubhav Dhoot
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605263 13f79535-47bb-0310-9956-ffa450edef68
2014-06-25 04:42:39 +00:00
Vinod Kumar Vavilapalli
e285b98f0f YARN-2152. Added missing information into ContainerTokenIdentifier so that NodeManagers can report the same to RM when RM restarts. Contributed Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605205 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 21:43:22 +00:00
Thomas Graves
1f9a0fd927 YARN-2072. RM/NM UIs and webservices are missing vcore information. (Nathan Roberts via tgraves)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605162 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 19:34:34 +00:00
Karthik Kambatla
c0991d11eb YARN-2109. Fix TestRM to work with both schedulers. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605142 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 17:30:53 +00:00
Karthik Kambatla
db4d277117 YARN-2192. TestRMHA fails when run with a mix of Schedulers. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605138 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 17:05:35 +00:00
Sanford Ryza
29c102cad0 YARN-2111. In FairScheduler.attemptScheduling, we don't count containers as assigned if they have 0 memory but non-zero cores (Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605113 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 15:40:39 +00:00
Vinod Kumar Vavilapalli
d16470025a YARN-2074. Changed ResourceManager to not count AM preemptions towards app failures. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1605106 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 15:15:12 +00:00
Jian He
59b5e9fa15 YARN-2191. Added a new test to ensure NM will clean up completed applications in the case of RM restart. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1604949 13f79535-47bb-0310-9956-ffa450edef68
2014-06-23 22:52:38 +00:00
Karthik Kambatla
6fcbf9b848 YARN-2187. FairScheduler: Disable max-AM-share check by default. (Robert Kanter via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1604321 13f79535-47bb-0310-9956-ffa450edef68
2014-06-21 07:30:07 +00:00
Jian He
95897ca14b YARN-1885. Fixed a bug that RM may not send application-clean-up signal to NMs where the completed applications previously ran in case of RM restart. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1603028 13f79535-47bb-0310-9956-ffa450edef68
2014-06-16 23:56:12 +00:00
Vinod Kumar Vavilapalli
dc7dd1fa19 YARN-1702. Added kill app functionality to RM web services. Contributed by Varun Vasudev.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1602298 13f79535-47bb-0310-9956-ffa450edef68
2014-06-12 21:31:52 +00:00
Karthik Kambatla
4bc91b44c9 YARN-2155. FairScheduler: Incorrect threshold check for preemption. (Wei Yan via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1602295 13f79535-47bb-0310-9956-ffa450edef68
2014-06-12 21:23:32 +00:00
Jian He
710a8693e5 YARN-2124. Fixed NPE in ProportionalCapacityPreemptionPolicy. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601964 13f79535-47bb-0310-9956-ffa450edef68
2014-06-11 17:30:18 +00:00
Karthik Kambatla
5de6f72054 YARN-1424. RMAppAttemptImpl should return the DummyApplicationResourceUsageReport for all invalid accesses. (Ray Chiang via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601742 13f79535-47bb-0310-9956-ffa450edef68
2014-06-10 19:03:06 +00:00
Jian He
c94f2cec3a Augmented RMStateStore with state machine. Contributed by Binglin Chang.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601491 13f79535-47bb-0310-9956-ffa450edef68
2014-06-09 19:44:31 +00:00
Vinod Kumar Vavilapalli
424fd9494f YARN-1368. Added core functionality of recovering container state into schedulers after ResourceManager Restart so as to preserve running work in the cluster. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601303 13f79535-47bb-0310-9956-ffa450edef68
2014-06-09 03:09:21 +00:00
Karthik Kambatla
85d4c787e0 YARN-2128. FairScheduler: Incorrect calculation of amResource usage. (Wei Yan via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601050 13f79535-47bb-0310-9956-ffa450edef68
2014-06-07 01:21:33 +00:00
Junping Du
0ceb742549 YARN-1977. Add tests on getApplicationRequest with filtering start time range. (Contributed by Junping Du)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1600644 13f79535-47bb-0310-9956-ffa450edef68
2014-06-05 13:15:44 +00:00
Sanford Ryza
16caa3fd18 YARN-1913. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (Wei Yan via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1599400 13f79535-47bb-0310-9956-ffa450edef68
2014-06-03 00:56:48 +00:00
Karthik Kambatla
0aad2d56df YARN-1550. NPE in FairSchedulerAppsBlock#render. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1599345 13f79535-47bb-0310-9956-ffa450edef68
2014-06-02 20:22:52 +00:00
Karthik Kambatla
a4ba451802 YARN-1474. Make schedulers services. (Tsuyoshi Ozawa via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598908 13f79535-47bb-0310-9956-ffa450edef68
2014-05-31 19:33:09 +00:00
Vinod Kumar Vavilapalli
23c325ad47 YARN-2115. Replaced RegisterNodeManagerRequest's ContainerStatus with a new NMContainerStatus which has more information that is needed for work-preserving RM-restart. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598790 13f79535-47bb-0310-9956-ffa450edef68
2014-05-31 00:20:50 +00:00
Karthik Kambatla
49a3a0cd0c YARN-2054. Better defaults for YARN ZK configs for retries and retry-interval when HA is enabled. (kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598630 13f79535-47bb-0310-9956-ffa450edef68
2014-05-30 15:24:49 +00:00
Arpit Agarwal
4a4868e523 HADOOP-10448. Support pluggable mechanism to specify proxy user settings (Contributed by Benoy Antony)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598396 13f79535-47bb-0310-9956-ffa450edef68
2014-05-29 20:52:01 +00:00
Sanford Ryza
342da5b4d3 YARN-596. Use scheduling policies throughout the queue hierarchy to decide which containers to preempt (Wei Yan via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1598197 13f79535-47bb-0310-9956-ffa450edef68
2014-05-29 04:01:24 +00:00
Sanford Ryza
edfbc8ad4a YARN-2105. Fix TestFairScheduler after YARN-2012. (Ashwin Shankar via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1597902 13f79535-47bb-0310-9956-ffa450edef68
2014-05-27 23:46:22 +00:00
Karthik Kambatla
7dd378c274 YARN-2096. Race in TestRMRestart#testQueueMetricsOnRMRestart. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1597223 13f79535-47bb-0310-9956-ffa450edef68
2014-05-23 23:51:00 +00:00
Sanford Ryza
a00b2d4f37 YARN-2073. Fair Scheduler: Add a utilization threshold to prevent preempting resources when cluster is free (Karthik Kambatla via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1597209 13f79535-47bb-0310-9956-ffa450edef68
2014-05-23 22:52:46 +00:00
Sanford Ryza
6c56612af5 YARN-2012. Fair Scheduler: allow default queue placement rule to take an arbitrary queue (Ashwin Shankar via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1597204 13f79535-47bb-0310-9956-ffa450edef68
2014-05-23 22:38:52 +00:00
Vinod Kumar Vavilapalli
82f3454f5a YARN-2017. Merged some of the common scheduler code. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1596753 13f79535-47bb-0310-9956-ffa450edef68
2014-05-22 05:32:26 +00:00
Jian He
0f9147c857 YARN-2053. Fixed a bug in AMS to not add null NMToken into NMTokens list from previous attempts for work-preserving AM restart. Contributed by Wangda Tan
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1595116 13f79535-47bb-0310-9956-ffa450edef68
2014-05-16 06:22:22 +00:00
Sanford Ryza
84dfae2f8a YARN-1986. In Fifo Scheduler, node heartbeat in between creating app and attempt causes NPE (Hong Zhiguo via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1594476 13f79535-47bb-0310-9956-ffa450edef68
2014-05-14 06:41:20 +00:00
Christopher Douglas
45b42676f9 YARN-1957. Consider the max capacity of the queue when computing the ideal capacity for preemption. Contributed by Carlo Curino
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1594414 13f79535-47bb-0310-9956-ffa450edef68
2014-05-13 23:15:27 +00:00
Jonathan Turner Eagles
1c48142807 YARN-1981. Nodemanager version is not updated when a node reconnects (Jason Lowe via jeagles)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1594358 13f79535-47bb-0310-9956-ffa450edef68
2014-05-13 20:03:58 +00:00
Jian He
41344a4a69 YARN-1975. Fix yarn application CLI to print the scheme of the tracking url of failed/killed applications. Contributed by Junping Du
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1593874 13f79535-47bb-0310-9956-ffa450edef68
2014-05-12 00:43:35 +00:00
Junping Du
ca95af7d23 YARN-2011. Fix typo and warning in TestLeafQueue (Contributed by Chen He)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1593804 13f79535-47bb-0310-9956-ffa450edef68
2014-05-11 15:13:29 +00:00
Sanford Ryza
cfc97a4e88 YARN-1864. Fair Scheduler Dynamic Hierarchical User Queues (Ashwin Shankar via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1593190 13f79535-47bb-0310-9956-ffa450edef68
2014-05-08 07:21:11 +00:00
Arpit Agarwal
f4b687b873 YARN-2018. TestClientRMService.testTokenRenewalWrongUser fails after HADOOP-10562. (Contributed by Ming Ma)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1592783 13f79535-47bb-0310-9956-ffa450edef68
2014-05-06 15:45:49 +00:00
Junping Du
2ad1cee5da YARN-1201. TestAMAuthorization fails with local hostname cannot be resolved. (Wangda Tan via junping_du)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1592197 13f79535-47bb-0310-9956-ffa450edef68
2014-05-03 13:03:27 +00:00
Vinod Kumar Vavilapalli
7a241aee90 YARN-1929. Fixed a deadlock in ResourceManager that occurs when failover happens right at the time of shutdown. Contributed by Karthik Kambatla.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1591071 13f79535-47bb-0310-9956-ffa450edef68
2014-04-29 19:49:44 +00:00
Jason Darrell Lowe
a9775b4e49 YARN-738. TestClientRMTokens is failing irregularly while running all yarn tests. Contributed by Ming Ma
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1591030 13f79535-47bb-0310-9956-ffa450edef68
2014-04-29 17:47:11 +00:00
Chris Nauroth
84388525a3 YARN-1970. Prepare YARN codebase for JUnit 4.11. Contributed by Chris Nauroth.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1589001 13f79535-47bb-0310-9956-ffa450edef68
2014-04-21 23:31:18 +00:00
Vinod Kumar Vavilapalli
bad021534c YARN-1281. Fixed TestZKRMStateStoreZKClientConnections to not fail intermittently due to ZK-client timeouts. Contributed by Tsuyoshi Ozawa.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1588369 13f79535-47bb-0310-9956-ffa450edef68
2014-04-17 20:57:15 +00:00
Junping Du
bd43d2481e YARN-1947. TestRMDelegationTokens#testRMDTMasterKeyStateOnRollingMasterKey is failing intermittently. (Jian He via junping_du)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1588365 13f79535-47bb-0310-9956-ffa450edef68
2014-04-17 20:27:37 +00:00
Vinod Kumar Vavilapalli
eb7b33c298 YARN-1928. Fixed a race condition in TestAMRMRPCNodeUpdates which caused it to fail occasionally. Contributed by Zhijie Shen.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1587114 13f79535-47bb-0310-9956-ffa450edef68
2014-04-13 22:40:16 +00:00
Vinod Kumar Vavilapalli
c6b70f4760 YARN-1933. Fixed test issues with TestAMRestart and TestNodeHealthService. Contributed by Jian He.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1587104 13f79535-47bb-0310-9956-ffa450edef68
2014-04-13 21:51:38 +00:00