2370 Commits

Author SHA1 Message Date
Andrew Chung
d3f0b7eab7
YARN-10760. Number of allocated OPPORTUNISTIC containers can dip below 0 (#3642) 2021-11-23 13:21:51 -08:00
Andrew Chung
5b1b2c8ef6
YARN-11003. Make RMNode aware of all (OContainer inclusive) allocated resources (#3646) 2021-11-23 13:20:08 -08:00
Viraj Jasani
c0bdba8fac
HADOOP-18017. unguava: remove Preconditions from hadoop-yarn-project modules (#3687) 2021-11-23 13:36:22 +09:00
Szilard Nemeth
7cb887e6c2 YARN-10997. Revisit allocation and reservation logging. Contributed by Andras Gyori 2021-11-12 15:43:45 +01:00
Szilard Nemeth
e220e88eca YARN-10996. Fix race condition of User object acquisitions. Contributed by Andras Gyori 2021-11-12 15:33:39 +01:00
Szilard Nemeth
d598904046
YARN-10904. Investigate: Remove unnecessary fields from AbstractCSQueue (#3551) contributed by Szilard Nemeth 2021-10-27 19:03:45 +02:00
Szilard Nemeth
66ac476b48
YARN-10924. Clean up CapacityScheduler#initScheduler (#3581) Contributed by Szilard Nemeth 2021-10-27 17:13:49 +02:00
Jack
9cfd8d0a83 YARN-10909. AbstractCSQueue: Annotate all methods with VisibleForTesting that are only used by test code. Contributed by JackWangCS, Szilard Nemeth 2021-10-23 14:47:09 +02:00
9uapaw
32ecaed9c3 YARN-10930. Introduce universal capacity resource vector. Contributed by Andras Gyori 2021-10-22 17:32:33 +02:00
Adam Antal
23772d946b YARN-10948. Rename SchedulerQueue#activeQueue to activateQueue. Contributed by Adam Antal 2021-10-22 16:33:03 +02:00
Ahmed Hussein
d286994009 YARN-1115: Provide optional means for a scheduler to check real user ACLs. Contributed by Eric Payne (epayne) 2021-10-20 22:18:36 +00:00
Szilard Nemeth
20aeb5ecc3
YARN-10916. Investigate and simplify GuaranteedOrZeroCapacityOverTimePolicy#computeQueueManagementChanges. Contributed by Szilard Nemeth 2021-10-20 15:52:37 +02:00
Andras Gyori
35b8441fd9
YARN-10949. Simplify AbstractCSQueue#updateMaxAppRelatedField and find a more meaningful name for this method. Contributed by Andras Gyori 2021-10-20 12:56:41 +02:00
Szilard Nemeth
414d40155c
YARN-10958. Use correct configuration for Group service init in CSMappingPlacementRule (#3560)
* YARN-10958. Initial commit

* Fix javadoc + behaviour

* Fix review comments

* fix checkstyle + blanks

* fix checkstyle + blanks

* Fix checkstyle + blanks
2021-10-20 10:48:42 +02:00
9uapaw
616cea2e80 YARN-10954. Remove commented code block from CSQueueUtils#loadCapacitiesByLabelsFromConf. Contributed by Andras Gyori 2021-10-19 13:06:45 +02:00
Szilard Nemeth
025f97c8c2
YARN-10942. Move AbstractCSQueue fields to separate objects that are tracking usage. Contributed by Szilard Nemeth 2021-10-19 12:24:58 +02:00
Viraj Jasani
d336227e5c
HADOOP-17963. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-yarn-project modules (#3541) 2021-10-14 18:03:01 +09:00
Benjamin Teke
35eff54556
YARN-10934. Fix LeafQueue#activateApplication NPE when the user of the pending application is missing from usersManager. Contributed by Benjamin Teke
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2021-10-07 20:11:42 +02:00
9uapaw
4b1b6b858a
YARN-10953. Make CapacityScheduler#getOrCreateQueueFromPlacementConte… Contributed by Andras Gyori 2021-10-07 17:09:38 +02:00
9uapaw
ed8e879320
YARN-10823. Expose all node labels for root without explicit configurations. Contributed by Andras Gyori 2021-10-01 04:20:36 +02:00
Neil
4bd0c36189
YARN-10970. Standby RM should expose prom endpoint (#3480)
Reviewed-by: Adam Antal <adamantal@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-09-29 15:46:55 +09:00
9uapaw
9f6430c9ed
YARN-10897. Introduce QueuePath class. Contributed by Andras Gyori 2021-09-21 16:08:24 +02:00
Szilard Nemeth
4df4389325
YARN-10911. AbstractCSQueue: Create a separate class for usernames and weights that are travelling in a Map. Contributed by Szilard Nemeth 2021-09-20 16:47:46 +02:00
Adam Antal
a9b2469a53
YARN-10950. Code cleanup in QueueCapacities (#3454) 2021-09-19 14:42:02 +02:00
Szilard Nemeth
aa74a303ed
YARN-10913. AbstractCSQueue: Group preemption methods and fields into a separate class (#3420) 2021-09-19 13:11:56 +02:00
Eric Badger
43f0a34dd4 YARN-10935. AM Total Queue Limit goes below per-user AM Limit if parent is full. Contributed by Eric Payne. 2021-09-15 20:03:45 +00:00
Benjamin Teke
5dc2f7b137
YARN-10915. AbstractCSQueue: Simplify complex logic in methods: deriveCapacityFromAbsoluteConfigurations and updateEffectiveResources (#3418)
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2021-09-14 18:05:40 +02:00
Tamas Domok
783d94f5cd
YARN-10917. Investigate and simplify CapacitySchedulerConfigValidator#validateQueueHierarchy (#3403)
* YARN-10917. Investigate and simplify CapacitySchedulerConfigValidator#validateQueueHierarchy.

Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-09-14 17:54:25 +02:00
Tamas Domok
63c892278f
YARN-10912. AbstractCSQueue#updateConfigurableResourceRequirement: Separate validation logic from initialization logic (#3390)
- capacityConfigType update is extracted to a separate method
- validation logic is extracted to a helper function
- min resource must not be greater than max resource is now checked
  after the max resource is updated

Change-Id: I731c2639281721afed32c30854bafcf048d6ee28

Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-09-14 17:30:44 +02:00
Weihao Zheng
ad1d40970a
YARN-10928. Support default queue config for minimum-user-limit-percent/user-limit-factor (#3389)
Contributed by Weihao Zheng
2021-09-13 11:06:53 +08:00
Jack
d8026e387e
YARN-10903. Fix the headroom check in ParentQueue and RegularContainerAllocator for DRF (#3352)
Contributed by Jie Wang <jie.wang@hulu.com>
2021-09-13 10:54:11 +08:00
Benjamin Teke
971f1b8b0a
YARN-10872. Replace getPropsWithPrefix calls in AutoCreatedQueueTemplate (#3396)
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2021-09-10 17:32:42 +02:00
9uapaw
811fd23f23
YARN-10852. Optimise CSConfiguration getAllUserWeightsForQueue (#3392) 2021-09-10 16:59:46 +02:00
Benjamin Teke
b229e5a345
YARN-10910. AbstractCSQueue#setupQueueConfigs: Separate validation logic from initialization logic (#3407)
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2021-09-10 16:48:58 +02:00
Tamas Domok
29a6f141d4
YARN-10914. Simplify duplicated code for tracking ResourceUsage in AbstractCSQueue (#3402)
Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-09-10 15:57:46 +02:00
Szilard Nemeth
2ff3fc50e4 YARN-10870. Missing user filtering check -> yarn.webapp.filter-entity-list-by-user for RM Scheduler page. Contributed by Gergely Pollak 2021-09-08 18:01:39 +02:00
Jack
4e209a31da
YARN-10919. Remove LeafQueue#scheduler field (#3382)
Co-authored-by: Jie Wang <jie.wang@hulu.com>
2021-09-08 16:19:29 +02:00
Tamas Domok
16e6030e25
YARN-10891. Extend QueueInfo with max-parallel-apps in CS. (#3314)
Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-08-27 23:09:54 +02:00
Szilard Nemeth
e06a5cb197 YARN-10838. Implement an optimised version of Configuration getPropsWithPrefix. Contributed by Andras Gyori, Benjamin Teke 2021-08-24 15:27:34 +02:00
srinivasst
4f3f26ce09
YARN-10873: Account for scheduled AM containers before deactivating node (#3287)
* Account for scheduled AM containers before deactivating node

* Move AM container check to separate method.

* Fix UTs

* Fix UTs

* Remove unnecessary import

* Add timeout for UT
2021-08-17 14:18:55 +05:30
zhuqi-lucas
efb3fa2bf5 YARN-10854. Support marking inactive node as untracked without configured include path. Contributed by Tao Yang. 2021-08-02 18:23:33 +08:00
Benjamin Teke
ac0a4e7f58
YARN-10869. CS considers only the default maximum-allocation-mb/vcore property as a maximum when it creates dynamic queues (#3225)
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
2021-07-29 17:56:14 +02:00
Szilard Nemeth
1b9efe58c9 YARN-10790. CS Flexible AQC: Add separate parent and leaf template property. Contributed by Andras Gyori 2021-07-28 16:50:14 +02:00
Szilard Nemeth
8d0297c213 YARN-10727. ParentQueue does not validate the queue on removal. Contributed by Andras Gyori 2021-07-28 14:49:10 +02:00
zhuqi-lucas
2da9b95d4d YARN-10657. We should make max application per queue to support node label. Contributed by Andras Gyori. 2021-07-22 20:30:43 +08:00
zhuqi-lucas
0441efe1fc YARN-10860. Make max container per heartbeat configs refreshable. Contributed by Eric Badger. 2021-07-21 15:31:44 +08:00
Jim Brennan
632f64cadb YARN-10456. RM PartitionQueueMetrics records are named QueueMetrics in Simon metrics registry. Contributed by Eric Payne. 2021-07-15 14:23:31 +00:00
Jim Brennan
dc6f456e95 YARN-10834. Intra-queue preemption: apps that don't use defined custom resource won't be preempted. Contributed by Eric Payne. 2021-06-28 14:52:19 +00:00
Peter Bacsko
0934e783cf YARN-10780. Optimise retrieval of configured node labels in CS queues. Contributed by Andras Gyori. 2021-06-24 20:15:10 +02:00
Szilard Nemeth
6562391737 YARN-10813. Set default capacity of root for node labels. Contributed by Andras Gyori 2021-06-16 18:55:09 +02:00
Szilard Nemeth
428478bbe2 YARN-10801. Fix Auto Queue template to properly set all configuration properties. Contributed by Andras Gyori 2021-06-16 18:26:58 +02:00
Szilard Nemeth
e31d06032b YARN-10802. Change Capacity Scheduler minimum-user-limit-percent to accept decimal values. Contributed by Benjamin Teke 2021-06-14 22:33:04 +02:00
Szilard Nemeth
7003997e36 YARN-10789. RM HA startup can fail due to race conditions in ZKConfigurationStore. Contributed by Tarun Parimi 2021-06-12 14:49:52 +02:00
Viraj Jasani
81d7069316
YARN-10805. Replace Guava Lists usage by Hadoop's own Lists in hadoop-yarn-project (#3075) 2021-06-09 15:15:47 +09:00
Prabhu Josephraj
9445abb500 YARN-10792. Set Completed AppAttempt LogsLink to Log Server URL. Contributed by Abhinaba Sarkar 2021-06-08 20:37:40 +05:30
zhuqi-lucas
ec16b1d3b9 YARN-10807. Parents node labels are incorrectly added to child queues in weight mode. Contributed by Benjamin Teke. 2021-06-08 21:03:43 +08:00
Szilard Nemeth
200eec8f2e YARN-10796. Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%. Contributed by Peter Bacsko 2021-06-03 13:27:36 +02:00
Szilard Nemeth
2707f69251 YARN-10787. Queue submit ACL check is wrong when CS queue is ambiguous. Contributed by Gergely Pollak 2021-06-01 16:01:39 +02:00
Gergely Pollak
e9339aa376 YARN-10797. Logging parameter issues in scheduler package. Contributed by Szilard Nemeth 2021-06-01 15:57:22 +02:00
Szilard Nemeth
b86a6eb871 YARN-10782. Extend /scheduler endpoint with template properties. Contributed by Andras Gyori 2021-05-25 18:27:53 +02:00
Szilard Nemeth
2541efa496 YARN-10783. Allow definition of auto queue template properties in root. Contributed by Andras Gyori 2021-05-25 13:55:59 +02:00
Viraj Jasani
996d31f2dc
HADOOP-17721. Replace Guava Sets usage by Hadoop's own Sets in hadoop-yarn-project (#3033)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-05-25 18:10:20 +09:00
zhuqi-lucas
59172ada90 YARN-10771. Add cluster metric for size of SchedulerEventQueue and RMEventQueue. Contributed by chaosju. 2021-05-24 23:12:07 +08:00
Szilard Nemeth
1e44bdb84c YARN-7769. FS QueueManager should not create default queue at init. Contributed by Benjamin Teke 2021-05-22 14:55:01 +02:00
Peter Bacsko
8891e5c028 YARN-10763. Add the number of containers assigned per second metrics to ClusterMetrics. Contributed by chaosju. 2021-05-17 13:30:12 +02:00
lujiefsi
d92a25b790
YARN-10555. Missing access check before getAppAttempts (#2608)
Co-authored-by: lujie <lujie@foxmail.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-05-17 13:44:38 +09:00
zhuqi
e7f0e8073b YARN-10761: Add more event type to RM Dispatcher event metrics. Contributed by Qi Zhu. 2021-05-14 13:36:07 +08:00
zhuqi
d2b0675d61
YARN-10737: Fix typos in CapacityScheduler#schedule. (#2911)
Contributed by Qi Zhu.
2021-05-14 13:12:28 +08:00
Peter Bacsko
626be24c3e YARN-10571. Refactor dynamic queue handling logic. Contributed by Andras Gyori. 2021-05-12 14:54:47 +02:00
Peter Bacsko
9166bfeb74 YARN-10637. fs2cs: add queue autorefresh policy during conversion. Contributed by Qi Zhu. 2021-04-24 10:10:10 +02:00
Eric Badger
6857a05d6a YARN-10479. Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647. Contributed by D M Murali Krishna Reddy 2021-04-23 22:02:04 +00:00
Peter Bacsko
14a84c47b0 YARN-10705. Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler. Contributed by Siddharth Ahuja. 2021-04-23 17:38:47 +02:00
Szilard Nemeth
919daec36b YARN-10746. RmWebApp add default-node-label-expression to the queue info. Contributed by Gergely Pollak 2021-04-23 16:12:12 +02:00
Szilard Nemeth
f76a2a7606 YARN-10654. Dots '.' in CSMappingRule path variables should be replaced. Contributed by Peter Bacsko 2021-04-23 16:07:58 +02:00
Eric Badger
6cb90005a7 YARN-10723. Change CS nodes page in UI to support custom resource. Contributed by Qi Zhu 2021-04-20 17:34:49 +00:00
Eric Badger
213d3deb26 YARN-10503. Support queue capacity in terms of absolute resources with custom resourceType. Contributed by Qi Zhu. 2021-04-09 00:34:15 +00:00
Peter Bacsko
ca9aa91d10 YARN-10564. Support Auto Queue Creation template configurations. Contributed by Andras Gyori. 2021-04-08 12:42:48 +02:00
Szilard Nemeth
9cd69c20c4 YARN-10714. Remove dangling dynamic queues on reinitialization. Contributed by Andras Gyori 2021-04-07 11:52:21 +02:00
Eric Badger
26b8f678b2 YARN-10702. Add cluster metric for amount of CPU used by RM Event Processor. Contributed by Jim Brennan. 2021-04-06 01:16:14 +00:00
Peter Bacsko
158758c5bf YARN-10726. Log the size of DelegationTokenRenewer event queue in case of too many pending events. Contributed by Qi Zhu. 2021-04-01 16:09:52 +02:00
Peter Bacsko
9f1655baf2 YARN-9618. NodesListManager event improvement. Contributed by Qi Zhu. 2021-04-01 11:39:40 +02:00
Szilard Nemeth
6fd0c661b6 YARN-10597. CSMappingPlacementRule should not create new instance of Groups. Contributed by Gergely Pollak 2021-03-31 16:14:21 +02:00
Peter Bacsko
ff6ec20d84 YARN-10718. Fix CapacityScheduler#initScheduler log error. Contributed by Qi Zhu. 2021-03-31 10:55:14 +02:00
Eric Badger
19e418c10d YARN-10713. ClusterMetrics should support custom resource capacity related metrics. Contributed by Qi Zhu. 2021-03-25 22:33:58 +00:00
Peter Bacsko
ceb75e1e2a YARN-10674. fs2cs should generate auto-created queue deletion properties. Contributed by Qi Zhu. 2021-03-24 08:15:06 +01:00
Jim Brennan
174f3a96b1 YARN-10697. Resources are displayed in bytes in UI for schedulers other than capacity. Contributed by Bilwa S T. 2021-03-23 18:21:45 +00:00
Cyrus Jackson
cd44e917d0
YARN-10476. Queue metrics for Unmanaged applications (#2674). Contributed by Cyrus Jackson 2021-03-19 15:49:05 +05:30
Peter Bacsko
ce6bfd5718 YARN-10641. Refactor the max app related update, and fix maxApllications update error when add new queues. Contributed by Qi Zhu. 2021-03-18 13:40:16 +01:00
Szilard Nemeth
a5745711dd YARN-10659. Improve CS MappingRule %secondary_group evaluation. Contributed by Gergely Pollak 2021-03-18 12:43:01 +01:00
Peter Bacsko
d7eeca4d0c YARN-10685. Fix typos in AbstractCSQueue. Contributed by Qi Zhu. 2021-03-18 11:49:16 +01:00
Eric Badger
49f89f1d3d YARN-10688. ClusterMetrics should support GPU capacity related metrics. Contributed by Qi Zhu. 2021-03-17 18:11:37 +00:00
Peter Bacsko
3e58d5611d YARN-10497. Fix an issue in CapacityScheduler which fails to delete queues. Contributed by Wangda Tan and Qi Zhu. 2021-03-17 13:38:20 +01:00
Wilfred Spiegelenburg
f276f1af80
YARN-10652. Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
This only fixes the user name resolution for weights in the queues. It
does not add generic support for user names with dots in all use cases
in the capacity scheduler.

Contributed by: Siddharth Ahuja
2021-03-17 10:55:05 +11:00
Peter Bacsko
b80588b688 YARN-10682. The scheduler monitor policies conf should trim values separated by comma. Contributed by Qi Zhu. 2021-03-16 15:23:27 +01:00
zhuqi
e9c98548e9
YARN-10689. Fix the finding bugs in extractFloatValueFromWeightConfig. (#2760)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-03-16 13:50:29 +09:00
Eric Payne
aa4c17b9d7 YARN-10588. Percentage of queue and cluster is zero in WebUI. Contributed by Bilwa S T 2021-03-15 19:09:40 +00:00
Szilard Nemeth
5db4c0bf70 YARN-10412. Move CS placement rule related changes to a separate package. Contributed by Gergely Pollak 2021-03-12 14:10:16 +01:00
Peter Bacsko
d5e035dbe1 YARN-9615. Add dispatcher metrics to RM. Contributed by Jonathan Hung and Qi Zhu. 2021-03-09 14:33:14 +01:00
Peter Bacsko
3851994cd6 Revert "YARN-9615. Add dispatcher metrics to RM. Contributed by Qi Zhu."
This reverts commit 369f75b7a7.
2021-03-09 14:32:02 +01:00
Peter Bacsko
369f75b7a7 YARN-9615. Add dispatcher metrics to RM. Contributed by Qi Zhu. 2021-03-09 14:28:23 +01:00