Commit Graph

3045 Commits

Author SHA1 Message Date
Wangda Tan
034b312d9f YARN-7411. Inter-Queue preemption's computeFixpointAllocation need to handle absolute resources while computing normalizedGuarantee. (Sunil G via wangda)
Change-Id: I41b1d7558c20fc4eb2050d40134175a2ef6330cb
2017-12-07 18:56:54 -08:00
Wangda Tan
aa3f62740f YARN-7332. Compute effectiveCapacity per each resource vector. (Sunil G via wangda) 2017-12-07 18:56:54 -08:00
Wangda Tan
d52627a7cb YARN-7254. UI and metrics changes related to absolute resource configuration. (Sunil G via wangda) 2017-12-07 18:56:54 -08:00
Wangda Tan
5e798b1a0d YARN-6471. Support to add min/max resource configuration for a queue. (Sunil G via wangda)
Change-Id: I9213f5297a6841fab5c573e85ee4c4e5f4a0b7ff
2017-12-07 18:56:54 -08:00
Weiwei Yang
e411dd6666 YARN-7607. Remove the trailing duplicated timestamp in container diagnostics message. Contributed by Weiwei Yang. 2017-12-07 17:29:40 +08:00
Weiwei Yang
05c347fe51 YARN-7611. Node manager web UI should display container type in containers page. Contributed by Weiwei Yang. 2017-12-06 12:21:52 +08:00
Sunil G
a957f1c60e YARN-7438. Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm. Contributed by Wangda Tan 2017-12-05 22:50:07 +05:30
Sunil G
f9f317b702 YARN-7586. Application Placement should be done before ACL checks in ResourceManager. Contributed by Suma Shivaprasad. 2017-12-05 18:28:31 +05:30
Robert Kanter
d8863fc16f YARN-5594. Handle old RMDelegationToken format when recovering RM (rkanter) 2017-12-04 13:14:55 -08:00
Arun Suresh
37ca416950 YARN-7587. Skip dispatching opportunistic containers to nodes whose queue is already full. (Weiwei Yang via asuresh) 2017-12-03 22:22:01 -08:00
Sunil G
81f6e46b2f YARN-6907. Node information page in the old web UI should report resource types. Contributed by Gergely Novák. 2017-12-04 11:27:23 +05:30
Sunil G
30f2646b15 YARN-7594. TestNMWebServices#testGetNMResourceInfo fails on trunk. Contributed by Gergely Novák. 2017-12-04 10:45:07 +05:30
Jason Lowe
60f95fb719 YARN-7455. quote_and_append_arg can overflow buffer. Contributed by Jim Brennan 2017-12-01 15:47:01 -06:00
Robert Kanter
c83fe44917 YARN-4813. TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently (grepas via rkanter) 2017-12-01 12:18:13 -08:00
Wangda Tan
7225ec0ceb YARN-6507. Add support in NodeManager to isolate FPGA devices with CGroups. (Zhankun Tang via wangda)
Change-Id: Ic9afd841805f1035423915a0b0add5f3ba96cf9d
2017-12-01 10:50:49 -08:00
Sunil G
556aea3f36 YARN-7487. Ensure volume to include GPU base libraries after created by plugin. Contributed by Wangda Tan. 2017-12-01 13:36:28 +05:30
Wangda Tan
a63d19d365 YARN-6124. Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues. (Zian Chen via wangda)
Change-Id: Id93656f3af7dcd78cafa94e33663c78d410d43c2
2017-11-30 15:57:22 -08:00
Wangda Tan
c9a54aab6b YARN-7573. Gpu Information page could be empty for nodes without GPU. (Sunil G via wangda)
Change-Id: I7f614e5a589a09ce4e4286c84b706e05c29abd14
2017-11-29 17:46:16 -08:00
Daniel Templeton
8498d287cd YARN-7541. Node updates don't update the maximum cluster capability for resources other than CPU and memory 2017-11-29 11:11:36 -08:00
Jason Lowe
a2c7a73e33 YARN-6647. RM can crash during transitionToStandby due to InterruptedException. Contributed by Bibin A Chundatt 2017-11-28 11:15:44 -06:00
Yufei Gu
d8923cdbf1 YARN-7363. ContainerLocalizer don't have a valid log4j config in case of Linux container executor. (Contributed by Yufei Gu) 2017-11-27 14:31:52 -08:00
Jian He
fedabcad42 YARN-6168. Restarted RM may not inform AM about all existing containers. Contributed by Chandni Singh 2017-11-27 10:19:58 -08:00
Yufei Gu
2bde3aedf1 YARN-7290. Method canContainerBePreempted can return true when it shouldn't. (Contributed by Steven Rand) 2017-11-24 23:32:46 -08:00
Wangda Tan
834e91ee91 YARN-7509. AsyncScheduleThread and ResourceCommitterService are still running after RM is transitioned to standby. (Tao Yang via wangda)
Change-Id: I7477fe355419fd4a0a6e2bdda7319abad4c4c748
2017-11-23 19:59:03 -08:00
Arun Suresh
b46ca7e73b YARN-6483. Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned to the AM. (Juan Rodriguez Hortala via asuresh) 2017-11-22 19:18:30 -08:00
Sunil G
aab439593b YARN-7330. Add support to show GPU in UI including metrics. Contributed by Wangda Tan. 2017-11-23 07:54:20 +05:30
Yufei Gu
4cc9479dae YARN-7524. Remove unused FairSchedulerEventLog. (Contributed by Wilfred Spiegelenburg) 2017-11-22 14:18:36 -08:00
Eric Yang
d42a336cfa YARN-5534. Allow user provided Docker volume mount list. (Contributed by Shane Kumpf) 2017-11-22 13:05:34 -05:00
yufei
03c311eae3 YARN-7513. Remove the scheduler lock in FSAppAttempt.getWeight() (Contributed by Wilfred Spiegelenburg) 2017-11-21 10:33:34 -08:00
Wangda Tan
0d781dd03b YARN-7527. Over-allocate node resource in async-scheduling mode of CapacityScheduler. (Tao Yang via wangda)
Change-Id: I51ae6c2ab7a3d1febdd7d8d0519b63a13295ac7d
2017-11-20 11:48:15 -08:00
bibinchundatt
b5b81a4f08 YARN-7489. ConcurrentModificationException in RMAppImpl#getRMAppMetrics. Contributed by Tao Yang. 2017-11-18 19:25:29 +05:30
Subru Krishnan
d5f66888b8 YARN-6128. Add support for AMRMProxy HA. (Botong Huang via Subru). 2017-11-17 17:39:06 -08:00
Eric Yang
0940e4f692 YARN-7218. Decouple YARN Services REST API namespace from RM. (Contributed by Eric Yang) 2017-11-17 12:28:12 -05:00
Wangda Tan
0987a7b8cb YARN-7419. CapacityScheduler: Allow auto leaf queue creation after queue mapping. (Suma Shivaprasad via wangda)
Change-Id: Ia1704bb8cb5070e5b180b5a85787d7b9ca57ebc6
2017-11-16 11:25:52 -08:00
Sunil G
61ace174cd YARN-7469. Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit. Contributed by Eric Payne. 2017-11-16 22:34:23 +05:30
Daniel Templeton
b246c54749 YARN-7414. FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
(Contributed by Soumabrata Chakraborty via Daniel Templeton)
2017-11-15 10:03:29 -08:00
Junping Du
e14f03dfbf YARN-6078. Containers stuck in Localizing state. Contributed by Billie Rinaldi. 2017-11-13 15:27:37 -08:00
Wangda Tan
dd07038ffa YARN-6909. Use LightWeightedResource when number of resource types more than two. (Sunil G via wangda)
Change-Id: I90e021c5dea7abd9ec6bd73b2287c8adebe14595
2017-11-09 14:51:15 -08:00
Konstantinos Karanasos
ac4d2b1081 YARN-7437. Rename PlacementSet and SchedulingPlacementSet. (Wangda Tan via kkaranasos) 2017-11-09 13:01:24 -08:00
Robert Kanter
a2c150a736 YARN-7386. Duplicate Strings in various places in Yarn memory (misha@cloudera.com via rkanter) 2017-11-09 12:12:52 -08:00
Haibo Chen
a1382a18df YARN-7388. TestAMRestart should be scheduler agnostic. 2017-11-09 10:49:50 -08:00
bibinchundatt
0a72c2f56c YARN-7454. RMAppAttemptMetrics#getAggregateResourceUsage can NPE due to double lookup. Contributed by Jason Lowe. 2017-11-09 21:01:19 +05:30
Daniel Templeton
49b4c0b334 YARN-7458. TestContainerManagerSecurity is still flakey
(Contributed by Robert Kanter via Daniel Templeton)
Change-Id: Ibb1975ad086c3a33f8af0b4f8b9a13c3cdca3f7d
2017-11-08 17:31:14 -08:00
Daniel Templeton
0de10680b7 YARN-7166. Container REST endpoints should report resource types
Change-Id: If9c2fe58d4cf758bb6b6cf363dc01f35f8720987
2017-11-08 16:43:49 -08:00
Arun Suresh
cb35a59589 YARN-7343. Add a junit test for ContainerScheduler recovery. (Sampada Dehankar via asuresh) 2017-11-08 08:14:02 -08:00
Arun Suresh
a9c70b0e84 YARN-7453. Fix issue where RM fails to switch to active after first successful start. (Rohith Sharma K S via asuresh) 2017-11-08 08:00:53 -08:00
Daniel Templeton
8db9d61ac2 YARN-7401. Reduce lock contention in ClusterNodeTracker#getClusterCapacity() 2017-11-07 14:53:48 -08:00
Wangda Tan
13fa2d4e3e YARN-7394. Merge code paths for Reservation/Plan queues and Auto Created queues. (Suma Shivaprasad via wangda) 2017-11-06 21:38:24 -08:00
Haibo Chen
8f214dc4f8 YARN-7360. TestRM.testNMTokenSentForNormalContainer() should be scheduler agnostic. 2017-11-06 15:45:37 -08:00
Jian He
a55d0738f1 YARN-7371. Added allocateRequestId in NMContainerStatus for recovery. Contributed by Chandni Singh 2017-11-06 13:30:20 -08:00
Jian He
c723021579 YARN-6626. Embed REST API service into RM. Contributed by Eric Yang 2017-11-06 13:30:17 -08:00
Jian He
673c0db43c Revert "YARN-6626. Embed REST API service into RM. Contributed by Eric Yang"
This reverts commit 63d1084e9781e0fee876916190b69f6242dd00e4.
2017-11-06 13:30:17 -08:00
Jian He
9e677fa05c YARN-6626. Embed REST API service into RM. Contributed by Eric Yang 2017-11-06 13:30:17 -08:00
Billie Rinaldi
ce74e64363 YARN-7210. Some NPE fixes in Registry DNS. Contributed by Jian He 2017-11-06 13:30:16 -08:00
Jian He
bd96c4c235 Rebase onto latest trunk. minor conflicts 2017-11-06 13:30:13 -08:00
Billie Rinaldi
1888318c89 YARN-6903. Yarn-native-service framework core rewrite. Contributed by Jian He 2017-11-06 13:30:11 -08:00
Jian He
8d335e59cf YARN-6804. [yarn-native-services changes] Allow custom hostname for docker containers in native services. Contributed by Billie Rinaldi 2017-11-06 13:30:10 -08:00
Billie Rinaldi
ce05c6e981 YARN-6545. Followup fix for YARN-6405. Contributed by Jian He 2017-11-06 13:30:07 -08:00
bibinchundatt
dcd99c4b9a Add containerId to Localizer failed logs. Contributed by Prabhu Joseph 2017-11-06 22:39:10 +05:30
Inigo Goiri
6fc09beac4 YARN-7434. Router getApps REST invocation fails with multiple RMs. Contributed by Inigo Goiri. 2017-11-02 21:29:53 -07:00
Eric Payne
e6ec02001f YARN-7370: Preemption properties should be refreshable. Contrubted by Gergely Novák. 2017-11-02 12:37:33 -05:00
Jason Lowe
d00b6f7c1f YARN-7286. Add support for docker to have no capabilities. Contributed by Eric Badger 2017-11-02 09:37:17 -05:00
Rohith Sharma K S
940ffe3f9c addendum patch for YARN-7289. 2017-11-02 13:55:19 +05:30
Jian He
0cc98ae0ec YARN-7396. NPE when accessing container logs due to null dirsHandler. Contributed by Jonathan Hung 2017-11-01 17:00:32 -07:00
Eric Yang
7a49ddfdde YARN-7412. Fix unit test for docker mount check on ubuntu. (Contributed by Eric Badger) 2017-11-01 18:39:56 -04:00
Inigo Goiri
70f1a9470c YARN-7276 addendum to add timeline service depencies. Contributed by Inigo Goiri. 2017-11-01 13:26:37 -07:00
Daniel Templeton
9711b78998 YARN-7374. Improve performance of DRF comparisons for resource types in fair scheduler 2017-10-29 18:54:33 -07:00
Yufei Gu
d4811c8cfa YARN-6747. TestFSAppStarvation.testPreemptionEnable fails intermittently. (Contributed by Miklos Szegedi) 2017-10-29 16:44:16 -07:00
Sunil G
9114d7a5a0 YARN-7224. Support GPU isolation for docker container. Contributed by Wangda Tan. 2017-10-29 11:08:44 +05:30
Daniel Templeton
e62bbbca7a YARN-7397. Reduce lock contention in FairScheduler#getAppWeight() 2017-10-28 09:13:13 -07:00
Arun Suresh
9c5c68745e YARN-7299. Fix TestDistributedScheduler. (asuresh) 2017-10-27 23:08:18 -07:00
Inigo Goiri
8be5707067 YARN-7276. Federation Router Web Service fixes. Contributed by Inigo Goiri. 2017-10-27 16:46:05 -07:00
Jason Lowe
665bb147aa YARN-7244. ShuffleHandler is not aware of disks that are added. Contributed by Kuhu Shukla 2017-10-27 16:56:05 -05:00
Rohith Sharma K S
5c799ecf09 YARN-7289. Application lifetime does not work with FairScheduler. Contributed by Miklos Szegedi. 2017-10-27 22:46:38 +05:30
Sunil G
792388e1c0 YARN-7375. Possible NPE in RMWebapp when HA is enabled and the active RM fails. Contributed by Chandni Singh. 2017-10-27 20:53:57 +05:30
Wangda Tan
36e158ae98 YARN-7307. Allow client/AM update supported resource types via YARN APIs. (Sunil G via wangda)
Change-Id: I14c5ea7252b7c17e86ab38f692b5f9d43196dbe0
2017-10-26 20:15:19 -07:00
Robert Kanter
b1de78619f YARN-7262. Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow (rkanter) 2017-10-26 17:47:32 -07:00
Robert Kanter
088ffee716 YARN-7320. Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_ Addendum (misha@cloudera.com via rkanter) 2017-10-26 15:50:14 -07:00
Subru Krishnan
25932da6d1 YARN-5516. Add REST API for supporting recurring reservations. (Sean Po via Subru). 2017-10-26 12:10:14 -07:00
Robert Kanter
2da654e34a YARN-7358. TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler (haibochen via rkanter) 2017-10-26 10:25:10 -07:00
Subru Krishnan
3fae675383 YARN-4827. Document configuration of ReservationSystem for FairScheduler. (Yufei Gu via Subru). 2017-10-25 15:07:50 -07:00
Haibo Chen
d7f3737f3b YARN-7389. Make TestResourceManager Scheduler agnostic. (Robert Kanter via Haibo Chen) 2017-10-24 22:17:56 -07:00
Robert Kanter
03af442e76 YARN-7385. TestFairScheduler#testUpdateDemand and TestFSLeafQueue#testUpdateDemand are failing with NPE (yufeigu via rkanter) 2017-10-24 13:36:50 -07:00
Carlo Curino
1c5c2b5dde YARN-7339. LocalityMulticastAMRMProxyPolicy should handle cancel request properly. (Botong Huang via curino) 2017-10-24 10:39:04 -07:00
Robert Kanter
025c656572 YARN-7382. NoSuchElementException in FairScheduler after failover causes RM crash (rkanter) 2017-10-24 10:21:44 -07:00
Robert Kanter
5da295a34e YARN-7320. Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_ (misha@cloudera.com via rkanter) 2017-10-23 17:56:56 -07:00
Daniel Templeton
9e77dc2bd1 YARN-7357. Several methods in TestZKRMStateStore.TestZKRMStateStoreTester.TestZKRMStateStoreInternal should have @Override annotations
(Contributed by Sen Zhao via Daniel Templeton)
2017-10-23 13:51:19 -07:00
Eric Payne
921338cd86 YARN-4163: Audit getQueueInfo and getApplications calls 2017-10-23 11:43:41 -05:00
Haibo Chen
480187aebb YARN-7372. TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic is flaky. 2017-10-20 14:24:17 -07:00
Yufei Gu
0799fde35e YARN-7261. Add debug message for better download latency monitoring. (Yufei Gu) 2017-10-20 10:00:13 -07:00
Eric Yang
b61144a93d YARN-7353. Improved volume mount check for directories and unit test compatibility on RHEL7. Contributed by Eric Badger. 2017-10-20 12:02:06 -04:00
Yufei Gu
1f4cdf1068 YARN-4090. Make Collections.sort() more efficient by caching resource usage. (Contributed by Yufei Gu, Shilong Zhang and Xianyin Xin) 2017-10-20 01:32:20 -07:00
Yufei Gu
7b4b018780 YARN-7359. TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic. (Contributed by Haibo Chen) 2017-10-19 16:51:47 -07:00
Yufei Gu
cbd2b73ef8 YARN-7294. TestSignalContainer#testSignalRequestDeliveryToNM fails intermittently with Fair Scheduler. (Contributed by Miklos Szegedi) 2017-10-19 16:39:25 -07:00
Wangda Tan
c1b08ba720 YARN-7345. GPU Isolation: Incorrect minor device numbers written to devices.deny file. (Jonathan Hung via wangda) 2017-10-19 14:45:44 -07:00
Subru Krishnan
75323394fb YARN-7311. Fix TestRMWebServicesReservation parametrization for fair scheduler. (Yufei Gu via Subru). 2017-10-17 12:38:06 -07:00
Haibo Chen
acabc657ff YARN-7341. TestRouterWebServiceUtil#testMergeMetrics is flakey. (Robert Kanter via Haibo Chen) 2017-10-17 10:15:53 -07:00
Robert Kanter
8a61525928 YARN-7308. TestApplicationACLs fails with FairScheduler (rkanter) 2017-10-16 15:34:32 -07:00
Nathan Roberts
4540ffd15f YARN-7333. container-executor fails to remove entries from a directory that is not writable or executable. Contributed by Jason Lowe. 2017-10-16 17:00:38 -05:00
Arun Suresh
a50be1b8f4 YARN-7275. NM Statestore cleanup for Container updates. (Kartheek Muthyala via asuresh) 2017-10-16 13:12:15 -07:00