From 556fbcf02504673e9f8374630da8dcdfd3958295 Mon Sep 17 00:00:00 2001
From: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Date: Wed, 3 Jan 2024 07:17:37 +0800
Subject: [PATCH] YARN-11632. [Doc] Add allow-partial-result description to Yarn Federation documentation. (#6340) Contributed by Shilun Fan.

Reviewed-by: Inigo Goiri
Signed-off-by: Shilun Fan
---
 .../hadoop-yarn-site/src/site/markdown/Federation.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/Federation.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/Federation.md
index 631a62896a..5702f8d165 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/Federation.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/Federation.md
@@ -606,6 +606,18 @@ We allow the Router to initiate a separate thread for periodically monitoring th
 
 **Note** We don't need to configure the subCluster deregister checking threads for all Routers; using 1-2 Routers for checking is sufficient.
 
+#### How to configure allow partial result
+
+The Router is used to connect multiple YARN SubClusters and plays a role in merging the returned results from multiple subClusters for certain interfaces. However, if a subcluster undergoes RM upgrade or encounters RM failure, calling that particular RM will not return the correct results.
+To address this issue, the Router provides configuration that allows returning partial results. When we configure the relevant parameters, the Router will skip the failed subClusters and only return results from the other subClusters.
+This ensures that we can obtain at least some correct results.
+
+| Property | Example | Description |
+|:------------------------------------------------------|:--------|:----------------------------------------------|
+| `yarn.router.interceptor.allow-partial-result.enable` | `false` | Whether to support returning partial results. |
+
+**Note** It is important to note that even if we configure the parameters, if all sub-clusters return failures, the Router will still throw an exception. This is because there are no available results to return, making it impossible to provide a valid response.
+
 #### How to use Router Command Line
 
 ##### Cmd1: deregisterSubCluster
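
For reference, the property documented by this patch could be set roughly as follows. This is a minimal sketch, not part of the patch: it assumes the property goes in the Router's `yarn-site.xml` and that `true` enables partial results (the patch only lists the property name with an example value of `false`).

```xml
<!-- Assumed location: the Router's yarn-site.xml (not stated in the patch). -->
<property>
  <name>yarn.router.interceptor.allow-partial-result.enable</name>
  <!-- Assumption: true lets the Router skip failed subClusters and return
       results from the remaining ones. Per the patch's note, the Router
       still throws an exception if every subCluster fails. -->
  <value>true</value>
</property>
```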