From 4890b855ece2362f05491b10d4546860014fb00c Mon Sep 17 00:00:00 2001 From: Szilard Nemeth Date: Sat, 31 Oct 2020 15:17:01 +0100 Subject: [PATCH] YARN-10420. Update CS MappingRule documentation with the new format and features. Contributed by Peter Bacsko --- .../src/site/markdown/CapacityScheduler.md | 227 +++++++++++++++++- 1 file changed, 222 insertions(+), 5 deletions(-) diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md index 6a857e9f7b..3e63c3a610 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md @@ -116,6 +116,7 @@ Configuration ``` + ###Queue Properties * Resource Allocation @@ -172,6 +173,15 @@ Configuration **Note:** An *ACL* is of the form *user1*,*user2* *space* *group1*,*group2*. The special value of * implies *anyone*. The special value of *space* implies *no one*. The default is * for the root queue if not specified. + * Queue lifetime for applications + + The `CapacityScheduler` supports the following parameters to configure the lifetime of an application: + +| Property | Description | +|:---- |:---- | +| `yarn.scheduler.capacity..maximum-application-lifetime` | Maximum lifetime (in seconds) of an application which is submitted to a queue. Any value less than or equal to zero will be considered as disabled. The default is -1. If a positive value is configured, then any application submitted to this queue will be killed after it exceeds the configured lifetime. Users can also specify a lifetime per application in the application submission context. However, the user-specified lifetime will be overridden if it exceeds the queue's maximum lifetime. It is a point-in-time configuration. Note: This feature can be set at any level in the queue hierarchy. 
Child queues will inherit their parent's value unless overridden at the child level. A value of 0 means no max lifetime and will override a parent's max lifetime. If this property is not set or is set to a negative number, then this queue's max lifetime value will be inherited from its parent.| +| `yarn.scheduler.capacity.root..default-application-lifetime` | Default lifetime (in seconds) of an application which is submitted to a queue. Any value less than or equal to zero will be considered as disabled. If the user has not submitted the application with a lifetime value, then this value will be used. It is a point-in-time configuration. This feature can be set at any level in the queue hierarchy. Child queues will inherit their parent's value unless overridden at the child level. If set to less than or equal to 0, the queue's max value must also be unlimited. Default lifetime can't exceed maximum lifetime. | + * Queue Mapping based on User or Group, Application Name or user defined placement rules The `CapacityScheduler` supports the following parameters to configure the queue mapping based on user or group, user & group, or application name. User can also define their own placement rule: @@ -247,14 +257,221 @@ Below example covers single mapping separately. In case of multiple mappings wit ``` - * Queue lifetime for applications +###JSON-based queue mapping configuration - The `CapacityScheduler` supports the following parameters to lifetime of an application: + In order to make the queue mapping feature more versatile, a new format and evaluation engine have been added to Capacity Scheduler. The new engine is fully backwards compatible with the old one and adds several new features. Note that it can also parse the old format, but the new features are only available if you specify the mappings in JSON. 
-| Property | Description | + * Syntax + + Based on the current JSON schema, users can define mapping rules in the following way: + +``` +{ + "rules": [ + { + "type": "...", + "matches": "...", + "policy": "...", + "parentQueue": "...", + "customPlacement": "...", + "fallbackResult":"...", + "create": true/false, + "value": "...", + "customPlacement": "..." + }, + { + ... next rule ... + } + ] +} +``` + +Rules are evaluated from top to bottom. Compared to the legacy mapping rule evaluator, you can adjust more flexibly what happens when the evaluation stops and a given rule does not match. + + * Rules + + Each mapping rule can have the following settings: + +| Setting | Description | |:---- |:---- | -| `yarn.scheduler.capacity..maximum-application-lifetime` | Maximum lifetime (in seconds) of an application which is submitted to a queue. Any value less than or equal to zero will be considered as disabled. The default is -1. If positive value is configured then any application submitted to this queue will be killed after it exceeds the configured lifetime. User can also specify lifetime per application in application submission context. However, user lifetime will be overridden if it exceeds queue maximum lifetime. It is point-in-time configuration. Note: This feature can be set at any level in the queue hierarchy. Child queues will inherit their parent's value unless overridden at the child level. A value of 0 means no max lifetime and will override a parent's max lifetime. If this property is not set or is set to a negative number, then this queue's max lifetime value will be inherited from it's parent.| -| `yarn.scheduler.capacity.root..default-application-lifetime` | Default lifetime (in seconds) of an application which is submitted to a queue. Any value less than or equal to zero will be considered as disabled. If the user has not submitted application with lifetime value then this value will be taken. It is point-in-time configuration. 
This feature can be set at any level in the queue hierarchy. Child queues will inherit their parent's value unless overridden at the child level. If set to less than or equal to 0, the queue's max value must also be unlimited. Default lifetime can't exceed maximum lifetime. | +| `type` | Possible values: `user`, `group`, `application`. It tells the engine what the current rule should be matched against. | +| `matches` | The string to match, or an asterisk "*" which means "all". For example, if the type is `user` and this string is "hadoop" then the rule will only be evaluated if the submitter user is "hadoop". The "*" does not work with groups. | +| `policy` | Selects one of the pre-defined policies, which defines where the application should be placed. This will be explained later in the "Policies" section. | +| `parentQueue` | In case of `user`, `primaryGroup`, `primaryGroupUser`, `secondaryGroup`, `secondaryGroupUser` policies, this tells the engine where the matching queue should be looked for. For example, if the policy is `primaryGroup`, parent is `root.groups` and the submitter's group is "admins", then the resulting queue will be "root.groups.admins" | +| `fallbackResult` | If the target queue does not exist or it cannot be created (i.e. it exists under a regular parent), it defines a fallback action. Valid values are `skip`, `reject` and `placeDefault`. | +| `create` | Only applies to managed queue parents. If set to "false", then the queue will not be created if it does not exist. | +| `value` | If the policy is `setDefaultQueue`, then the default queue will change to this setting from "root.default". Otherwise ignored. | +| `customPlacement` | Only works with `custom` placement policy. The value of this field will be evaluated directly by the engine, which means that various placeholders such as `%application` or `%primary_group` will be replaced with their respective values. | + + + `type` is the equivalent of the first column in the old format. 
That column is either "g" or "u", and there is a separate property for application mappings. `matches` is the second column. The only difference is that the old format uses `%user` to match all users, which is not expressive enough, so the new format uses `*` instead. + The `fallbackResult` setting determines what to do when the target queue cannot be created or does not exist. The three settings work the following way: +* `skip`: ignore the current rule and proceed to the next. This is how Fair Scheduler evaluates placement rules. +* `placeDefault`: place the application into the default queue `root.default` (unless it's overridden to something else). This is how Capacity Scheduler works with the old mapping rules. +* `reject`: reject the submission. + + The `create` flag has no effect on the queue if the parent is not managed. + + * Policies + + There are a number of pre-defined placement policies which are similar to those in Fair Scheduler. Many of them can be expressed as a "custom" placement policy as you will see soon, but in many cases, it's safer and more straightforward to use them directly. + +| Policy | Description | +|:---- |:---- | +| `specified` | Places the application into the queue that was defined during submission. | +| `reject` | Rejects the submission. | +| `defaultQueue` | Places the application into the default queue `root.default` or into its overwritten value set by `setDefaultQueue`. | +| `user` | Places the application into a queue which matches the username of the submitter. | +| `applicationName` | Places the application into a queue which matches the name of the application. Important: it is case-sensitive, and white spaces are not removed. | +| `primaryGroup` | Places the application into a queue which matches the primary group of the submitter. | +| `primaryGroupUser` | Places the application into the queue hierarchy `root.[parentQueue]..`. Note that `parentQueue` is optional. 
| +| `secondaryGroup` | Places the application into a queue which matches the secondary group of the submitter. | +| `secondaryGroupUser` | Places the application into the queue hierarchy `root.[parentQueue]..`. Note that `parentQueue` is optional. | +| `setDefaultQueue` | Changes the default queue from `root.default`. The change is permanent in the sense that it is not restored in the next rule. You can change the default queue at any point and as many times as necessary. | +| `custom` | Enables the user to use custom placement strings. See explanation below. | + +Notes: + +1. The `setDefaultQueue` rule only changes the default queue. If you want to restore the default queue back to `root.default`, then a `setDefaultQueue` rule with that value has to be added to the rule chain again. + +2. The nested rules `primaryGroupUser` and `secondaryGroupUser` expect the parent queues to exist, i.e. they cannot be created automatically. More specifically: when you use `primaryGroupUser`, it will result in a queue path like `root..` and `root.` must exist. It can be a managed parent in order to have `userName` leaf created automatically, but the parent still has to be created by hand (this is in contrast to Fair Scheduler, where this scenario is more flexible). + +3. The `custom` placement policy can describe other policies with the appropriate variable placeholders (see below). For example, `primaryGroupUser` with the parent queue `root.groups` can be expressed as `root.groups.%primary_group.%user`. The primary reason for the pre-defined rules to exist is that they are easier to understand for users who have a background in configuring Fair Scheduler, and it is more natural to configure the mapping rules this way. It is also more robust because it's less likely that the user makes a mistake. The "Variables" section describes what variables are available if you intend to use the `custom` policy. + + + * Variables + + Internally, the tool populates certain variables with appropriate values. 
These can be used if the `custom` mapping policy is selected. Note that the engine performs only minimal verification when replacing them, so it is your responsibility to provide the correct string. + +| Variable | Meaning | +|:---- |:---- | +| `%application` | The name of the submitted application. | +| `%user` | The user who submitted the application. | +| `%primary_group` | Primary group of the submitter. | +| `%secondary_group` | Secondary (supplementary) group of the submitter. | +| `%default` | The default queue of the scheduler. | +| `%specified` | The queue that the submitter specified. | + +Example: let's say we submit a MapReduce application to a queue `root.users.mrjobs`. In this case, the value of `%specified` will be set to `root.users.mrjobs`. + +As explained in the "Policies" section, quite a few policies can be achieved with `custom`. So, instead of using the `specified` policy, you can use `custom` and set the `customPlacement` field to `%specified`. With `custom`, however, you have much greater control, because you can also append or prepend an extra string to these variables. So the following setting is possible: `%specified.%user.largejobs`. Keep in mind that the string must resolve to a valid queue path in order to have a proper placement. + + + * Converting the old mapping rule format to the new one + + In this table, you can see how to rewrite the old, colon-separated rules to the new format. + +| Old mapping rule | JSON-based mapping rule | +|:---- |:---- | +| `u:username:root.user.queue` | { "type": "user",
"matches": "username",
"policy": "custom",
"customPlacement": "root.user.queue",
"fallbackResult":"placeDefault" }
| +| `u:%user:%user` | { "type": "user",
"matches": "*",
"policy": "user",
"fallbackResult":"placeDefault" }
| +| `u:%user:root.parent.%user` | { "type": "user",
"matches": "*",
"policy": "user",
"parentQueue": "root.parent",
"fallbackResult":"placeDefault" }
| +| `u:%user:%primary_group` | { "type": "user",
"matches": "*",
"policy": "primaryGroup",
"fallbackResult":"placeDefault" }
| +| `u:%user:%primary_group.%user` | { "type": "user",
"matches": "*",
"policy": "primaryGroupUser",
"fallbackResult":"placeDefault" }
| +| `u:%user:root.groups.%primary_group.%user` | { "type": "user",
"matches": "*",
"policy": "primaryGroupUser",
"parentQueue": "root.groups",
"fallbackResult":"placeDefault" }
| +| `u:%user:%secondary_group` | { "type": "user",
"matches": "*",
"policy": "secondaryGroup",
"fallbackResult":"placeDefault" }
| +| `u:%user:%secondary_group.%user` | { "type": "user",
"matches": "*",
"policy": "secondaryGroupUser",
"fallbackResult":"placeDefault" }
| +| `u:%user:root.groups.%secondary_group.%user` | { "type": "user",
"matches": "*",
"policy": "secondaryGroupUser",
"parentQueue": "root.groups",
"fallbackResult":"placeDefault" }
| +| `g:hadoop:root.groups.hadoop` | { "type": "group",
"matches": "hadoop",
"policy": "custom",
"customPlacement": "root.groups.hadoop",
"fallbackResult":"placeDefault" }
| +| `%application:%application` (application mapping) | { "type": "user",
"matches": "*",
"policy": "applicationName",
"fallbackResult":"placeDefault" }
| +| `hive_query:root.query.hive` (application mapping) | { "type": "application",
"matches": "hive_query",
"policy": "custom",
"customPlacement": "root.query.hive",
"fallbackResult":"placeDefault" }
| + + It's worth noting that `%application:%application` requires a `user` type matcher. This is because internally, the "*" is interpreted only for users. If you set the `type` to `application`, then the "*" means to match an application which is named "*". + + * Example + + We have a cluster which is shared among developers, QA engineers and test developers. + + We'd like to achieve the following placement logic: + +1. If the user belongs to the `devs` primary group, the application should be placed into `root.users.devs`. This queue is reserved for developers. + +2. If the user belongs to the `qa` primary group, then the application should go to `root.users.lowpriogroups.`. These queues have lower capacities and are intended for testers. + +3. If the user belongs to the `qa-dev` primary group, then the application should go to `root.users.highpriogroups.`. These queues have higher capacities and are intended for test developers. + +4. Put the application into the queue which matches the user name. + +5. If there is no such queue, take the queue from the application submission context, but the queue should not be created if it does not exist and the parent is managed. + +6. If none of the above matches, then the application should be placed into `root.default`. + +7. If the default placement fails for whatever reason, we change the default queue to `root.users.default`. + +8. Try a placement to the default queue again. + +9. If that fails, reject the submission altogether. 
+ + This means a chain of 9 rules: + + ```json + { + "rules":[ + { + "type": "group", + "matches": "devs", + "policy": "custom", + "customPlacement": "root.users.devs", + "fallbackResult":"skip" + }, + { + "type": "group", + "matches": "qa", + "policy": "primaryGroup", + "parentQueue": "root.users.lowpriogroups", + "fallbackResult":"skip" + }, + { + "type": "group", + "matches": "qa-dev", + "policy": "primaryGroup", + "parentQueue": "root.users.highpriogroups", + "fallbackResult":"skip" + }, + { + "type": "user", + "matches": "*", + "policy": "user", + "fallbackResult":"skip" + }, + { + "type": "user", + "matches": "*", + "policy": "specified", + "create": false, + "fallbackResult":"skip" + }, + { + "type": "user", + "matches": "*", + "policy": "defaultQueue", + "fallbackResult":"skip" + }, + { + "type": "user", + "matches": "*", + "policy": "setDefaultQueue", + "value": "root.users.default", + "fallbackResult": "skip" + }, + { + "type": "user", + "matches": "*", + "policy": "defaultQueue", + "fallbackResult":"skip" + }, + { + "type":"user", + "matches":"*", + "policy":"reject" + } + ] +} +``` + + Note: it's actually possible to set the `fallbackResult` to `reject` on the 8th rule, so you don't need the final `reject`. But using `reject` on its own has its merits: since the `type` and `matches` fields are mandatory, you can reject submissions from certain groups, applications or users. + + ###Setup for application priority.