From 8c378d1ea1d6026511981d6bda6c2b22bed03986 Mon Sep 17 00:00:00 2001 From: slfan1989 <55643692+slfan1989@users.noreply.github.com> Date: Sun, 7 Apr 2024 20:50:46 +0800 Subject: [PATCH] YARN-11444. Improve YARN md documentation format. (#6711) Contributed by Shilun Fan. Reviewed-by: Ayush Saxena Signed-off-by: Shilun Fan --- .../hadoop-yarn-site/src/site/markdown/CapacityScheduler.md | 4 ++-- .../src/site/markdown/DevelopYourOwnDevicePlugin.md | 2 +- .../hadoop-yarn-site/src/site/markdown/DockerContainers.md | 6 +++--- .../src/site/markdown/GracefulDecommission.md | 4 ++-- .../hadoop-yarn-site/src/site/markdown/NodeManager.md | 2 +- .../src/site/markdown/NodeManagerCGroupsMemory.md | 2 +- .../src/site/markdown/PluggableDeviceFramework.md | 2 +- .../hadoop-yarn-site/src/site/markdown/ReservationSystem.md | 2 +- .../hadoop-yarn-site/src/site/markdown/ResourceModel.md | 2 +- .../hadoop-yarn-site/src/site/markdown/RuncContainers.md | 2 +- .../hadoop-yarn-site/src/site/markdown/TimelineServer.md | 6 +++--- .../hadoop-yarn-site/src/site/markdown/UsingGpus.md | 2 +- .../hadoop-yarn-site/src/site/markdown/WebServicesIntro.md | 6 +++--- .../src/site/markdown/YarnApplicationSecurity.md | 2 +- 14 files changed, 22 insertions(+), 22 deletions(-) diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md index 1b94d6ff33..2b46c68bd7 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md @@ -633,7 +633,7 @@ The following configuration parameters can be configured in yarn-site.xml to con | `yarn.resourcemanager.reservation-system.planfollower.time-step` | *Optional* parameter: the frequency in milliseconds of the `PlanFollower` timer. Long value expected. The default value is *1000*. 
| -The `ReservationSystem` is integrated with the `CapacityScheduler` queue hierachy and can be configured for any **LeafQueue** currently. The `CapacityScheduler` supports the following parameters to tune the `ReservationSystem`: +The `ReservationSystem` is integrated with the `CapacityScheduler` queue hierarchy and can be configured for any **LeafQueue** currently. The `CapacityScheduler` supports the following parameters to tune the `ReservationSystem`: | Property | Description | |:---- |:---- | @@ -879,7 +879,7 @@ Changing queue/scheduler properties and adding/removing queues can be done in tw Remove the queue configurations from the file and run refresh as described above ### Enabling periodic configuration refresh -Enabling queue configuration periodic refresh allows reloading and applying the configuration by editing the *conf/capacity-scheduler.xml* without the necessicity of calling yarn rmadmin -refreshQueues. +Enabling queue configuration periodic refresh allows reloading and applying the configuration by editing the *conf/capacity-scheduler.xml* without the necessity of calling yarn rmadmin -refreshQueues. | Property | Description | |:---- |:---- | diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DevelopYourOwnDevicePlugin.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DevelopYourOwnDevicePlugin.md index 0331f72615..e129c6c2e0 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DevelopYourOwnDevicePlugin.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DevelopYourOwnDevicePlugin.md @@ -173,5 +173,5 @@ class and want to give it a try in your Hadoop cluster. Firstly, put the jar file under a directory in Hadooop classpath. -(recommend $HADOOP_COMMOND_HOME/share/hadoop/yarn). Secondly, +(recommend $HADOOP_COMMON_HOME/share/hadoop/yarn).
Secondly, follow the configurations described in [Pluggable Device Framework](./PluggableDeviceFramework.html) and restart YARN. \ No newline at end of file diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md index fe7d1e1ad3..e512363d02 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md @@ -216,7 +216,7 @@ The following properties should be set in yarn-site.xml: Optional. This configuration setting determines the capabilities assigned to docker containers when they are launched. While these may not be case-sensitive from a docker perspective, it is best to keep these - uppercase. To run without any capabilites, set this value to + uppercase. To run without any capabilities, set this value to "none" or "NONE" @@ -568,7 +568,7 @@ There are several challenges with this bind mount approach that need to be considered. 1. Any users and groups defined in the image will be overwritten by the host's users and groups -2. No users and groups can be added once the container is started, as /etc/passwd and /etc/group are immutible in the container. Do not mount these read-write as it can render the host inoperable. +2. No users and groups can be added once the container is started, as /etc/passwd and /etc/group are immutable in the container. Do not mount these read-write as it can render the host inoperable. This approach is not recommended beyond testing given the inflexibility to modify running containers. @@ -715,7 +715,7 @@ Fine grained access control can also be defined using `docker.privileged-contain docker.trusted.registries=library ``` -In development environment, local images can be tagged with a repository name prefix to enable trust. 
The recommendation of choosing a repository name is using a local hostname and port number to prevent accidentially pulling docker images from Docker Hub or use reserved Docker Hub keyword: "local". Docker run will look for docker images on Docker Hub, if the image does not exist locally. Using a local hostname and port in image name can prevent accidental pulling of canonical images from docker hub. Example of tagging image with localhost:5000 as trusted registry: +In a development environment, local images can be tagged with a repository name prefix to enable trust. The recommended way to choose a repository name is to use a local hostname and port number, or the reserved Docker Hub keyword "local", to prevent accidentally pulling docker images from Docker Hub. Docker run will look for docker images on Docker Hub if the image does not exist locally. Using a local hostname and port in the image name can prevent accidental pulling of canonical images from docker hub. Example of tagging an image with localhost:5000 as trusted registry: ``` docker tag centos:latest localhost:5000/centos:latest diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/GracefulDecommission.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/GracefulDecommission.md index e7ce657d41..d9770761f5 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/GracefulDecommission.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/GracefulDecommission.md @@ -41,7 +41,7 @@ Graceful Decommission of YARN Nodes is the mechanism to decommission NMs while m To do a normal decommissioning: -1. Start a YARN cluster (with NodeManageres and ResourceManager) +1. Start a YARN cluster (with NodeManagers and ResourceManager) 2. Start a yarn job (for example with `yarn jar...` ) 3. Add `yarn.resourcemanager.nodes.exclude-path` property to your `yarn-site.xml` (Note: you don't need to restart the ResourceManager) 4.
Create a text file (the location is defined in the previous step) with one line which contains the name of a selected NodeManager @@ -112,7 +112,7 @@ host3 Note: In the future more file formats are planned with timeout support. Follow the [YARN-5536](https://issues.apache.org/jira/browse/YARN-5536) if you are interested. -Important to mention, that the timeout is not persited. In case of a RM restart/failover the node will be immediatelly decommission. (Follow the [YARN-5464](https://issues.apache.org/jira/browse/YARN-5464) for changes in this behavior). +It is important to mention that the timeout is not persisted. In case of a RM restart/failover the node will be immediately decommissioned. (Follow the [YARN-5464](https://issues.apache.org/jira/browse/YARN-5464) for changes in this behavior). ### Client or server side timeout diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md index 596a47e4e6..38bcf8d1bd 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md @@ -106,7 +106,7 @@ Step 4. Configure a valid RPC address for the NodeManager. Step 5. Auxiliary services. - * NodeManagers in a YARN cluster can be configured to run auxiliary services. For a completely functional NM restart, YARN relies on any auxiliary service configured to also support recovery. This usually includes (1) avoiding usage of ephemeral ports so that previously running clients (in this case, usually containers) are not disrupted after restart and (2) having the auxiliary service itself support recoverability by reloading any previous state when NodeManager restarts and reinitializes the auxiliary service. + * NodeManagers in a YARN cluster can be configured to run auxiliary services.
For a completely functional NM restart, YARN relies on any auxiliary service configured to also support recovery. This usually includes (1) avoiding usage of ephemeral ports so that previously running clients (in this case, usually containers) are not disrupted after restart and (2) having the auxiliary service itself support recoverability by reloading any previous state when NodeManager restarts and reinitializes the auxiliary service. * A simple example for the above is the auxiliary service 'ShuffleHandler' for MapReduce (MR). ShuffleHandler respects the above two requirements already, so users/admins don't have to do anything for it to support NM restart: (1) The configuration property **mapreduce.shuffle.port** controls which port the ShuffleHandler on a NodeManager host binds to, and it defaults to a non-ephemeral port. (2) The ShuffleHandler service also already supports recovery of previous state after NM restarts. diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md index d1988a5048..5f1a8c8c8b 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md @@ -52,7 +52,7 @@ There is no reason to set them both. If the system runs with swap disabled, both Virtual memory measurement and swapping -------------------------------------------- -There is a difference between the virtual memory reported by the container monitor and the virtual memory limit specified in the elastic memory control feature. The container monitor uses `ProcfsBasedProcessTree` by default for measurements that returns values from the `proc` file system. The virtual memory returned is the size of the address space of all the processes in each container.
This includes anonymous pages, pages swapped out to disk, mapped files and reserved pages among others. Reserved pages are not backed by either physical or swapped memory. They can be a large part of the virtual memory usage. The reservabe address space was limited on 32 bit processors but it is very large on 64-bit ones making this metric less useful. Some Java Virtual Machines reserve large amounts of pages but they do not actually use it. This will result in gigabytes of virtual memory usage shown. However, this does not mean that anything is wrong with the container. +There is a difference between the virtual memory reported by the container monitor and the virtual memory limit specified in the elastic memory control feature. The container monitor uses `ProcfsBasedProcessTree` by default for measurements that returns values from the `proc` file system. The virtual memory returned is the size of the address space of all the processes in each container. This includes anonymous pages, pages swapped out to disk, mapped files and reserved pages among others. Reserved pages are not backed by either physical or swapped memory. They can be a large part of the virtual memory usage. The reservable address space was limited on 32 bit processors but it is very large on 64-bit ones making this metric less useful. Some Java Virtual Machines reserve large amounts of pages but they do not actually use them. This will result in gigabytes of virtual memory usage shown. However, this does not mean that anything is wrong with the container. Because of this you can now use `CGroupsResourceCalculator`. This shows only the sum of the physical memory usage and swapped pages as virtual memory usage excluding the reserved address space. This reflects much better what the application and the container allocated.
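The NodeManagerCGroupsMemory.md hunk above contrasts two measurements: the procfs view (full address-space size, including reserved but untouched pages) and the cgroups view (physical usage plus swapped pages). A minimal sketch of the difference — this is not YARN code, and the sample numbers are made up for illustration:

```python
def procfs_virtual_memory(vm_size_kb):
    """What a ProcfsBasedProcessTree-style measurement reports:
    the full address-space size (VmSize), including reserved pages."""
    return vm_size_kb * 1024


def cgroups_virtual_memory(rss_bytes, swap_bytes):
    """What a CGroupsResourceCalculator-style measurement reports:
    physical memory plus swapped pages, excluding reserved address space."""
    return rss_bytes + swap_bytes


# Hypothetical JVM that reserved ~4 GiB of address space
# but only touched 512 MiB, with 64 MiB swapped out:
vm_size_kb = 4 * 1024 * 1024      # VmSize from /proc/<pid>/status, in kB
rss_bytes = 512 * 1024 * 1024     # resident pages
swap_bytes = 64 * 1024 * 1024     # swapped-out pages

print(procfs_virtual_memory(vm_size_kb))              # ~4 GiB: looks alarming
print(cgroups_virtual_memory(rss_bytes, swap_bytes))  # 576 MiB: closer to real footprint
```

This is why the doc says a multi-gigabyte virtual memory reading does not by itself mean anything is wrong with the container.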
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PluggableDeviceFramework.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PluggableDeviceFramework.md index d8733754ed..c835df858f 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PluggableDeviceFramework.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PluggableDeviceFramework.md @@ -29,7 +29,7 @@ Some of the pain points for current device plugin development and integration are listed below: -* At least 6 classes to be implemented (If you wanna support +* At least 6 classes to be implemented (If you want to support Docker, you’ll implement one more “DockerCommandPlugin”). * When implementing the “ResourceHandler” interface, the developer must understand the YARN NM internal concepts like container diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ReservationSystem.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ReservationSystem.md index ac3269d1e8..cf9b0bd38f 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ReservationSystem.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ReservationSystem.md @@ -41,7 +41,7 @@ With reference to the figure above, a typical reservation proceeds as follows: * **Step 2** The ReservationSystem leverages a ReservationAgent (GREE in the figure) to find a plausible allocation for the reservation in the Plan, a data structure tracking all reservation currently accepted and the available resources in the system. - * **Step 3** The SharingPolicy provides a way to enforce invariants on the reservation being accepted, potentially rejecting reservations. 
For example, the CapacityOvertimePolicy allows enforcement of both instantaneous max-capacity a user can request across all of his/her reservations and a limit on the integral of resources over a period of time, e.g., the user can reserve up to 50% of the cluster capacity instantanesouly, but in any 24h period of time he/she cannot exceed 10% average. + * **Step 3** The SharingPolicy provides a way to enforce invariants on the reservation being accepted, potentially rejecting reservations. For example, the CapacityOvertimePolicy allows enforcement of both instantaneous max-capacity a user can request across all of his/her reservations and a limit on the integral of resources over a period of time, e.g., the user can reserve up to 50% of the cluster capacity instantaneously, but in any 24h period of time he/she cannot exceed 10% average. * **Step 4** Upon a successful validation the ReservationSystem returns to the user a ReservationId (think of it as an airline ticket). diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceModel.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceModel.md index 8a449a801d..d4c8447f4a 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceModel.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceModel.md @@ -130,7 +130,7 @@ resource may also have optional minimum and maximum properties. The properties must be named `yarn.resource-types..minimum-allocation` and `yarn.resource-types..maximum-allocation`. -The `yarn.resource-types` property and any unit, mimimum, or maximum properties +The `yarn.resource-types` property and any unit, minimum, or maximum properties may be defined in either the usual `yarn-site.xml` file or in a file named `resource-types.xml`. 
For example, the following could appear in either file: diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/RuncContainers.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/RuncContainers.md index 7a7ef9057c..2ad59a390b 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/RuncContainers.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/RuncContainers.md @@ -651,7 +651,7 @@ There are several challenges with this bind mount approach that need to be considered. 1. Any users and groups defined in the image will be overwritten by the host's users and groups -2. No users and groups can be added once the container is started, as /etc/passwd and /etc/group are immutible in the container. Do not mount these read-write as it can render the host inoperable. +2. No users and groups can be added once the container is started, as /etc/passwd and /etc/group are immutable in the container. Do not mount these read-write as it can render the host inoperable. This approach is not recommended beyond testing given the inflexibility to modify running containers. diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md index 5b3cca9ac7..463d396e10 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md @@ -859,7 +859,7 @@ Below is the elements of a single event object. 
Note that `value` of | Item | Data Type | Description| |:---- |:---- |:---- | | `eventtype` | string | The event type | -| `eventinfo` | map | The information of the event, which is orgainzied in a map of `key` : `value` | +| `eventinfo` | map | The information of the event, which is organized in a map of `key` : `value` | | `timestamp` | long | The timestamp of the event | ### Response Examples: @@ -1317,7 +1317,7 @@ None | `queue` | string | The queue to which the application submitted | | `appState` | string | The application state according to the ResourceManager - valid values are members of the YarnApplicationState enum: `FINISHED`, `FAILED`, `KILLED` | | `finalStatus` | string | The final status of the application if finished - reported by the application itself - valid values are: `UNDEFINED`, `SUCCEEDED`, `FAILED`, `KILLED` | -| `progress` | float | The reported progress of the application as a percent. Long-lived YARN services may not provide a meaninful value here —or use it as a metric of actual vs desired container counts | +| `progress` | float | The reported progress of the application as a percent. Long-lived YARN services may not provide a meaningful value here —or use it as a metric of actual vs desired container counts | | `trackingUrl` | string | The web URL of the application (via the RM Proxy) | | `originalTrackingUrl` | string | The actual web URL of the application | | `diagnosticsInfo` | string | Detailed diagnostics information on a completed application| @@ -2019,7 +2019,7 @@ querying some entities, such as Domains; here the API deliberately downgrades permission-denied outcomes as empty and not-founds responses. This hides details of other domains from an unauthorized caller. 1. If the content of timeline entity PUT operations is invalid, -this failure *will not* result in an HTTP error code being retured. +this failure *will not* result in an HTTP error code being returned. 
A status code of 200 will be returned —however, there will be an error code in the list of failed entities for each entity which could not be added. diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/UsingGpus.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/UsingGpus.md index 85412af88e..1a33d5c900 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/UsingGpus.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/UsingGpus.md @@ -80,7 +80,7 @@ By default, YARN will automatically detect and config GPUs when above config is device number of GPUs is using `nvidia-smi -q` and search `Minor Number` output. - When minor numbers are specified manually, admin needs to include indice of GPUs + When minor numbers are specified manually, admin needs to include indices of GPUs as well, format is `index:minor_number[,index:minor_number...]`. An example of manual specification is `0:0,1:1,2:2,3:4"`to allow YARN NodeManager to manage GPU devices with indices `0/1/2/3` and minor number `0/1/2/4`. diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebServicesIntro.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebServicesIntro.md index 11452f6578..e067edd1da 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebServicesIntro.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/WebServicesIntro.md @@ -48,7 +48,7 @@ Currently only GET is supported. It retrieves information about the resource spe ### Security -The web service REST API's go through the same security as the web UI. If your cluster adminstrators have filters enabled you must authenticate via the mechanism they specified. +The web service REST API's go through the same security as the web UI. If your cluster administrators have filters enabled you must authenticate via the mechanism they specified. 
### Headers Supported @@ -70,7 +70,7 @@ This release supports gzip compression if you specify gzip in the Accept-Encodin This release of the web service REST APIs supports responses in JSON and XML formats. JSON is the default. To set the response format, you can specify the format in the Accept header of the HTTP request. -As specified in HTTP Response Codes, the response body can contain the data that represents the resource or an error message. In the case of success, the response body is in the selected format, either JSON or XML. In the case of error, the resonse body is in either JSON or XML based on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500. Note that the order of the fields within response body is not specified and might change. Also, additional fields might be added to a response body. Therefore, your applications should use parsing routines that can extract data from a response body in any order. +As specified in HTTP Response Codes, the response body can contain the data that represents the resource or an error message. In the case of success, the response body is in the selected format, either JSON or XML. In the case of error, the response body is in either JSON or XML based on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500. Note that the order of the fields within response body is not specified and might change. Also, additional fields might be added to a response body. Therefore, your applications should use parsing routines that can extract data from a response body in any order. 
### Response Errors @@ -101,7 +101,7 @@ Response Body: ```json { - app": + "app": { "id":"application_1324057493980_0001", "user":"user1", diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md index fc58b0a3d8..f93ed55547 100644 --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md @@ -576,7 +576,7 @@ system property in AM). `[ ]` Web browser interaction verified in secure cluster. -`[ ]` REST client interation (GET operations) tested. +`[ ]` REST client interaction (GET operations) tested. `[ ]` Application continues to run after Kerberos Token expiry.
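The WebServicesIntro.md hunk above warns that field order in response bodies is unspecified and may change, so clients should parse by field name rather than position. A minimal sketch of order-independent parsing, using a trimmed, reordered version of the `app` response body shown in the patch (most fields omitted):

```python
import json

# Trimmed version of the "app" response from the doc; real responses
# carry many more fields, and the field order is not guaranteed.
sample = '{"app": {"user": "user1", "id": "application_1324057493980_0001"}}'

data = json.loads(sample)  # builds a dict, so field order never matters
app = data["app"]

print(app["id"])    # fields are looked up by name, not by position
print(app["user"])
```

The same principle applies to the XML variant: use a real parser and select elements by name instead of assuming a fixed layout.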