3090 Commits

Author SHA1 Message Date
Wangda Tan
a81144daa0 YARN-7666. Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations. (Sunil G via wangda)
Change-Id: I0fd826490f5160d47d42af2a9ac0bd8ec4e959dc
2018-01-05 15:12:04 -08:00
Robert Kanter
2aa4f0a559 YARN-7645. TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler (rkanter) 2018-01-05 13:55:09 -08:00
Robert Kanter
f8e7dd9b10 YARN-7557. It should be possible to specify resource types in the fair scheduler increment value (grepas via rkanter) 2018-01-05 11:15:06 -08:00
Sunil G
0c75d0634b YARN-7619. Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user. Contributed by Eric Payne. 2018-01-05 14:42:17 +05:30
Jason Lowe
d795661868 YARN-7678. Ability to enable logging of container memory stats. Contributed by Jim Brennan 2018-01-04 10:15:52 -06:00
Robert Kanter
7a55044803 YARN-7622. Allow fair-scheduler configuration on HDFS (gphillips via rkanter) 2018-01-03 15:31:50 -08:00
Haibo Chen
2f6c038be6 YARN-7602. NM should reference the singleton JvmMetrics instance. 2018-01-03 09:41:26 -08:00
Rohith Sharma K S
c9bf813c9a YARN-7692. Skip validating priority acls while recovering applications. Contributed by Sunil G. 2018-01-03 18:20:04 +05:30
Arun Suresh
c0c7cce81d YARN-7691. Add Unit Tests for ContainersLauncher. (Sampada Dehankar via asuresh) 2018-01-02 22:03:00 -08:00
Miklos Szegedi
7f515f57ed YARN-7585. NodeManager should go unhealthy when state store throws DBException. Contributed by Wilfred Spiegelenburg. 2018-01-02 18:03:04 -08:00
Miklos Szegedi
b82049b4f0 YARN-7580. ContainersMonitorImpl logged message lacks detail when exceeding memory limits. Contributed by Wilfred Spiegelenburg. 2017-12-29 12:49:37 -08:00
Arun Suresh
a55884c68e YARN-7542. Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED. (Sampada Dehankar via asuresh) 2017-12-28 22:20:42 -08:00
Robert Kanter
382215c72b YARN-7577. Unit Fail: TestAMRestart#testPreemptedAMRestartOnRMRestart (miklos.szegedi@cloudera.com via rkanter) 2017-12-20 13:39:00 -08:00
Sunil G
d62932c3b2 YARN-7032. [ATSv2] NPE while starting hbase co-processor when HBase authorization is enabled. Contributed by Rohith Sharma K S. 2017-12-20 11:31:15 +05:30
Eric Yang
94a2ac6b71 YARN-7466. addendum patch for failing unit test. (Contributed by Chandni Singh) 2017-12-19 18:42:27 -05:00
Varun Saxena
c0aeb666a4 YARN-7662. [ATSv2] Define new set of configurations for reader and collectors to bind (Rohith Sharma K S via Varun Saxena) 2017-12-19 22:29:24 +05:30
Jason Lowe
811fabdebe YARN-7661. NodeManager metrics return wrong value after update node resource. Contributed by Yang Wang 2017-12-18 15:20:06 -06:00
Akira Ajisaka
001008958d YARN-7664. Several javadoc errors. Contributed by Sean Mackrory. 2017-12-18 22:24:51 +09:00
Wangda Tan
44825f0960 YARN-7629. TestContainerLaunch# fails after YARN-7381. (Jason Lowe via wangda)
Change-Id: Ia6a3f05c9a7e797d8190123d304ecc4e2b018e33
2017-12-15 15:40:56 -08:00
Wangda Tan
631b5c2db7 YARN-5418. When partial log aggregation is enabled, display the list of aggregated files on the container log page. (Xuan Gong via wangda)
Change-Id: I1befb0bbaeb89fb315bafe3e2f3379663f8cf1ec
2017-12-15 15:38:36 -08:00
Rohith Sharma K S
09d996fdd4 YARN-7190. Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user classpath. Contributed by Varun Saxena. 2017-12-15 21:50:28 +05:30
Sunil G
890d3d0645 YARN-7638. Unit tests related to preemption for auto created leaf queues feature. Contributed by Suma Shivaprasad. 2017-12-15 13:00:57 +05:30
Subru Krishnan
17ba74be29 YARN-7630. Fix AMRMToken rollover handling in AMRMProxy. Contributed by Botong Huang. 2017-12-14 14:03:55 -08:00
Chen Liang
46e18c8da7 HADOOP-14914. Change to a safely casting long to int. Contributed by Ajay Kumar. 2017-12-13 14:56:14 -08:00
Sunil G
cb87e4dc92 YARN-7643. Handle recovery of applications in case of auto-created leaf queue mapping. Contributed by Suma Shivaprasad. 2017-12-13 22:49:58 +05:30
Weiwei Yang
7efc4f7688 YARN-7647. NM print inappropriate error log when node-labels is enabled. Contributed by Yang Wang. 2017-12-13 13:11:41 +08:00
Jason Lowe
2abab1d7c5 YARN-7595. Container launching code suppresses close exceptions after writes. Contributed by Jim Brennan 2017-12-12 16:04:15 -06:00
Jason Lowe
06f0eb2dce YARN-7625. Expose NM node/containers resource utilization in JVM metrics. Contributed by Weiwei Yang 2017-12-12 12:56:26 -06:00
Sunil G
8bb83a8f62 Queue ACL validations should validate parent queue ACLs before auto-creating leaf queues. Contributed by Suma Shivaprasad. 2017-12-12 15:20:59 +05:30
Sunil G
5c87fb2f62 YARN-7635. TestRMWebServicesSchedulerActivities fails in trunk. Contributed by Sunil G. 2017-12-12 15:08:18 +05:30
Sunil G
312ceebde8 YARN-7632. Effective min and max resource need to be set for auto created leaf queues upon creation and capacity management. Contributed by Suma Shivaprasad. 2017-12-11 19:20:02 +05:30
Weiwei Yang
a2edc4cbf5 YARN-7608. Incorrect sTarget column causing DataTable warning on RM application and scheduler web page. Contributed by Gergely Novák. 2017-12-11 10:31:46 +08:00
Subru Krishnan
670e8d4ec7 YARN-6704. Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService. (Botong Huang via Subru). 2017-12-08 15:39:18 -08:00
Wangda Tan
04b84da245 YARN-7443. Add native FPGA module support to do isolation with cgroups. (Zhankun Tang via wangda)
Change-Id: Ic4b7f9f3e032986b8f955139c9fe4d3a6c818a53
2017-12-08 15:18:22 -08:00
Wangda Tan
adca1a72e4 YARN-7591. NPE in async-scheduling mode of CapacityScheduler. (Tao Yang via wangda)
Change-Id: I46689e530550ee0a6ac7a29786aab2cc1bdf314f
2017-12-08 15:17:02 -08:00
Wangda Tan
a8316df8c0 YARN-7520. Queue Ordering policy changes for ordering auto created leaf queues within Managed parent Queues. (Suma Shivaprasad via wangda)
Change-Id: I482f086945bd448d512cb5b3879d7371e37ee134
2017-12-08 15:11:28 -08:00
Wangda Tan
f548bfffbd YARN-7420. YARN UI changes to depict auto created queues. (Suma Shivaprasad via wangda)
Change-Id: I8039d3772a191ddede132cd1f8b08a8ca2e275b7
2017-12-08 15:10:47 -08:00
Wangda Tan
b38643c9a8 YARN-7473. Implement Framework and policy for capacity management of auto created queues. (Suma Shivaprasad via wangda)
Change-Id: Icca7805fe12f6f7fb335effff4b121b6f7f6337b
2017-12-08 15:10:16 -08:00
Wangda Tan
74665e3a7d YARN-7274. Ability to disable elasticity at leaf queue level. (Zian Chen via wangda)
Change-Id: Ic8d43e297f0f5de788b562f7eff8106c5c35e8d2
2017-12-08 15:07:56 -08:00
Sunil G
4db4a4a165 YARN-7575. NPE in scheduler UI when max-capacity is not configured. Contributed by Sunil G. 2017-12-07 18:56:54 -08:00
Sunil G
daa1cdd062 YARN-7564. Cleanup to fix checkstyle issues of YARN-5881 branch. Contributed by Sunil G. 2017-12-07 18:56:54 -08:00
Wangda Tan
1012b901c8 YARN-7544. Use queue-path.capacity/maximum-capacity to specify absolute min/max resources. (Sunil G via wangda)
Change-Id: I685341be213eee500f51e02f01c91def89391c17
2017-12-07 18:56:54 -08:00
Wangda Tan
b7b8cd5324 YARN-7538. Fix performance regression introduced by Capacity Scheduler absolute min/max resource refactoring. (Sunil G via wangda)
Change-Id: Ic9bd7e599c56970fe01cb0e1bba6df7d1f77eb29
2017-12-07 18:56:54 -08:00
Wangda Tan
7462c38277 YARN-7483. CapacityScheduler test cases cleanup post YARN-5881. (Sunil G via wangda)
Change-Id: I9741a6baf5cb7352d05636efb6c0b24790e7589a
2017-12-07 18:56:54 -08:00
Rohith Sharma K S
e65ca92fb6 YARN-7482. Max applications calculation per queue has to be retrospected with absolute resource support. Contributed by Sunil G. 2017-12-07 18:56:54 -08:00
Wangda Tan
034b312d9f YARN-7411. Inter-Queue preemption's computeFixpointAllocation need to handle absolute resources while computing normalizedGuarantee. (Sunil G via wangda)
Change-Id: I41b1d7558c20fc4eb2050d40134175a2ef6330cb
2017-12-07 18:56:54 -08:00
Wangda Tan
aa3f62740f YARN-7332. Compute effectiveCapacity per each resource vector. (Sunil G via wangda) 2017-12-07 18:56:54 -08:00
Wangda Tan
d52627a7cb YARN-7254. UI and metrics changes related to absolute resource configuration. (Sunil G via wangda) 2017-12-07 18:56:54 -08:00
Wangda Tan
5e798b1a0d YARN-6471. Support to add min/max resource configuration for a queue. (Sunil G via wangda)
Change-Id: I9213f5297a6841fab5c573e85ee4c4e5f4a0b7ff
2017-12-07 18:56:54 -08:00
Weiwei Yang
e411dd6666 YARN-7607. Remove the trailing duplicated timestamp in container diagnostics message. Contributed by Weiwei Yang. 2017-12-07 17:29:40 +08:00
Weiwei Yang
05c347fe51 YARN-7611. Node manager web UI should display container type in containers page. Contributed by Weiwei Yang. 2017-12-06 12:21:52 +08:00
Sunil G
a957f1c60e YARN-7438. Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm. Contributed by Wangda Tan 2017-12-05 22:50:07 +05:30
Sunil G
f9f317b702 YARN-7586. Application Placement should be done before ACL checks in ResourceManager. Contributed by Suma Shivaprasad. 2017-12-05 18:28:31 +05:30
Robert Kanter
d8863fc16f YARN-5594. Handle old RMDelegationToken format when recovering RM (rkanter) 2017-12-04 13:14:55 -08:00
Arun Suresh
37ca416950 YARN-7587. Skip dispatching opportunistic containers to nodes whose queue is already full. (Weiwei Yang via asuresh) 2017-12-03 22:22:01 -08:00
Sunil G
81f6e46b2f YARN-6907. Node information page in the old web UI should report resource types. Contributed by Gergely Novák. 2017-12-04 11:27:23 +05:30
Sunil G
30f2646b15 YARN-7594. TestNMWebServices#testGetNMResourceInfo fails on trunk. Contributed by Gergely Novák. 2017-12-04 10:45:07 +05:30
Jason Lowe
60f95fb719 YARN-7455. quote_and_append_arg can overflow buffer. Contributed by Jim Brennan 2017-12-01 15:47:01 -06:00
Robert Kanter
c83fe44917 YARN-4813. TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently (grepas via rkanter) 2017-12-01 12:18:13 -08:00
Wangda Tan
7225ec0ceb YARN-6507. Add support in NodeManager to isolate FPGA devices with CGroups. (Zhankun Tang via wangda)
Change-Id: Ic9afd841805f1035423915a0b0add5f3ba96cf9d
2017-12-01 10:50:49 -08:00
Sunil G
556aea3f36 YARN-7487. Ensure volume to include GPU base libraries after created by plugin. Contributed by Wangda Tan. 2017-12-01 13:36:28 +05:30
Wangda Tan
a63d19d365 YARN-6124. Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues. (Zian Chen via wangda)
Change-Id: Id93656f3af7dcd78cafa94e33663c78d410d43c2
2017-11-30 15:57:22 -08:00
Wangda Tan
c9a54aab6b YARN-7573. Gpu Information page could be empty for nodes without GPU. (Sunil G via wangda)
Change-Id: I7f614e5a589a09ce4e4286c84b706e05c29abd14
2017-11-29 17:46:16 -08:00
Daniel Templeton
8498d287cd YARN-7541. Node updates don't update the maximum cluster capability for resources other than CPU and memory 2017-11-29 11:11:36 -08:00
Jason Lowe
a2c7a73e33 YARN-6647. RM can crash during transitionToStandby due to InterruptedException. Contributed by Bibin A Chundatt 2017-11-28 11:15:44 -06:00
Yufei Gu
d8923cdbf1 YARN-7363. ContainerLocalizer don't have a valid log4j config in case of Linux container executor. (Contributed by Yufei Gu) 2017-11-27 14:31:52 -08:00
Jian He
fedabcad42 YARN-6168. Restarted RM may not inform AM about all existing containers. Contributed by Chandni Singh 2017-11-27 10:19:58 -08:00
Yufei Gu
2bde3aedf1 YARN-7290. Method canContainerBePreempted can return true when it shouldn't. (Contributed by Steven Rand) 2017-11-24 23:32:46 -08:00
Wangda Tan
834e91ee91 YARN-7509. AsyncScheduleThread and ResourceCommitterService are still running after RM is transitioned to standby. (Tao Yang via wangda)
Change-Id: I7477fe355419fd4a0a6e2bdda7319abad4c4c748
2017-11-23 19:59:03 -08:00
Arun Suresh
b46ca7e73b YARN-6483. Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned to the AM. (Juan Rodriguez Hortala via asuresh) 2017-11-22 19:18:30 -08:00
Sunil G
aab439593b YARN-7330. Add support to show GPU in UI including metrics. Contributed by Wangda Tan. 2017-11-23 07:54:20 +05:30
Yufei Gu
4cc9479dae YARN-7524. Remove unused FairSchedulerEventLog. (Contributed by Wilfred Spiegelenburg) 2017-11-22 14:18:36 -08:00
Eric Yang
d42a336cfa YARN-5534. Allow user provided Docker volume mount list. (Contributed by Shane Kumpf) 2017-11-22 13:05:34 -05:00
yufei
03c311eae3 YARN-7513. Remove the scheduler lock in FSAppAttempt.getWeight() (Contributed by Wilfred Spiegelenburg) 2017-11-21 10:33:34 -08:00
Wangda Tan
0d781dd03b YARN-7527. Over-allocate node resource in async-scheduling mode of CapacityScheduler. (Tao Yang via wangda)
Change-Id: I51ae6c2ab7a3d1febdd7d8d0519b63a13295ac7d
2017-11-20 11:48:15 -08:00
bibinchundatt
b5b81a4f08 YARN-7489. ConcurrentModificationException in RMAppImpl#getRMAppMetrics. Contributed by Tao Yang. 2017-11-18 19:25:29 +05:30
Subru Krishnan
d5f66888b8 YARN-6128. Add support for AMRMProxy HA. (Botong Huang via Subru). 2017-11-17 17:39:06 -08:00
Eric Yang
0940e4f692 YARN-7218. Decouple YARN Services REST API namespace from RM. (Contributed by Eric Yang) 2017-11-17 12:28:12 -05:00
Wangda Tan
0987a7b8cb YARN-7419. CapacityScheduler: Allow auto leaf queue creation after queue mapping. (Suma Shivaprasad via wangda)
Change-Id: Ia1704bb8cb5070e5b180b5a85787d7b9ca57ebc6
2017-11-16 11:25:52 -08:00
Sunil G
61ace174cd YARN-7469. Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit. Contributed by Eric Payne. 2017-11-16 22:34:23 +05:30
Daniel Templeton
b246c54749 YARN-7414. FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
(Contributed by Soumabrata Chakraborty via Daniel Templeton)
2017-11-15 10:03:29 -08:00
Junping Du
e14f03dfbf YARN-6078. Containers stuck in Localizing state. Contributed by Billie Rinaldi. 2017-11-13 15:27:37 -08:00
Wangda Tan
dd07038ffa YARN-6909. Use LightWeightedResource when number of resource types more than two. (Sunil G via wangda)
Change-Id: I90e021c5dea7abd9ec6bd73b2287c8adebe14595
2017-11-09 14:51:15 -08:00
Konstantinos Karanasos
ac4d2b1081 YARN-7437. Rename PlacementSet and SchedulingPlacementSet. (Wangda Tan via kkaranasos) 2017-11-09 13:01:24 -08:00
Robert Kanter
a2c150a736 YARN-7386. Duplicate Strings in various places in Yarn memory (misha@cloudera.com via rkanter) 2017-11-09 12:12:52 -08:00
Haibo Chen
a1382a18df YARN-7388. TestAMRestart should be scheduler agnostic. 2017-11-09 10:49:50 -08:00
bibinchundatt
0a72c2f56c YARN-7454. RMAppAttemptMetrics#getAggregateResourceUsage can NPE due to double lookup. Contributed by Jason Lowe. 2017-11-09 21:01:19 +05:30
Daniel Templeton
49b4c0b334 YARN-7458. TestContainerManagerSecurity is still flakey
(Contributed by Robert Kanter via Daniel Templeton)
Change-Id: Ibb1975ad086c3a33f8af0b4f8b9a13c3cdca3f7d
2017-11-08 17:31:14 -08:00
Daniel Templeton
0de10680b7 YARN-7166. Container REST endpoints should report resource types
Change-Id: If9c2fe58d4cf758bb6b6cf363dc01f35f8720987
2017-11-08 16:43:49 -08:00
Arun Suresh
cb35a59589 YARN-7343. Add a junit test for ContainerScheduler recovery. (Sampada Dehankar via asuresh) 2017-11-08 08:14:02 -08:00
Arun Suresh
a9c70b0e84 YARN-7453. Fix issue where RM fails to switch to active after first successful start. (Rohith Sharma K S via asuresh) 2017-11-08 08:00:53 -08:00
Daniel Templeton
8db9d61ac2 YARN-7401. Reduce lock contention in ClusterNodeTracker#getClusterCapacity() 2017-11-07 14:53:48 -08:00
Wangda Tan
13fa2d4e3e YARN-7394. Merge code paths for Reservation/Plan queues and Auto Created queues. (Suma Shivaprasad via wangda) 2017-11-06 21:38:24 -08:00
Haibo Chen
8f214dc4f8 YARN-7360. TestRM.testNMTokenSentForNormalContainer() should be scheduler agnostic. 2017-11-06 15:45:37 -08:00
Jian He
a55d0738f1 YARN-7371. Added allocateRequestId in NMContainerStatus for recovery. Contributed by Chandni Singh 2017-11-06 13:30:20 -08:00
Jian He
c723021579 YARN-6626. Embed REST API service into RM. Contributed by Eric Yang 2017-11-06 13:30:17 -08:00
Jian He
673c0db43c Revert "YARN-6626. Embed REST API service into RM. Contributed by Eric Yang"
This reverts commit 63d1084e9781e0fee876916190b69f6242dd00e4.
2017-11-06 13:30:17 -08:00
Jian He
9e677fa05c YARN-6626. Embed REST API service into RM. Contributed by Eric Yang 2017-11-06 13:30:17 -08:00
Billie Rinaldi
ce74e64363 YARN-7210. Some NPE fixes in Registry DNS. Contributed by Jian He 2017-11-06 13:30:16 -08:00
Jian He
bd96c4c235 Rebase onto latest trunk. minor conflicts 2017-11-06 13:30:13 -08:00