YARN-2157. Added YARN metrics in the documentation. Contributed by Akira AJISAKA.
parent ef38fb9758
commit 90a968d675
@@ -605,6 +605,145 @@ dfs context
| packets in nanoseconds
*-------------------------------------+--------------------------------------+

yarn context

* ClusterMetrics

  ClusterMetrics shows the metrics of the YARN cluster from the
  ResourceManager's perspective. Each metrics record contains a
  Hostname tag as additional information along with the metrics.

*-------------------------------------+--------------------------------------+
|| Name                               || Description
*-------------------------------------+--------------------------------------+
|<<<NumActiveNMs>>> | Current number of active NodeManagers
*-------------------------------------+--------------------------------------+
|<<<NumDecommissionedNMs>>> | Current number of decommissioned NodeManagers
*-------------------------------------+--------------------------------------+
|<<<NumLostNMs>>> | Current number of NodeManagers lost because they stopped
                  | sending heartbeats
*-------------------------------------+--------------------------------------+
|<<<NumUnhealthyNMs>>> | Current number of unhealthy NodeManagers
*-------------------------------------+--------------------------------------+
|<<<NumRebootedNMs>>> | Current number of rebooted NodeManagers
*-------------------------------------+--------------------------------------+

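  As a rough illustration of how the ClusterMetrics record might be consumed,
  the sketch below picks the NodeManager counters out of a JSON document shaped
  like the output of a Hadoop daemon's JMX servlet. The bean name, the
  <<<beans>>> layout, and every sample value are assumptions made for this
  example and are not taken from this change; only the five metric names come
  from the table above.

```python
# Metric names as listed in the ClusterMetrics table.
NM_COUNTERS = (
    "NumActiveNMs",
    "NumDecommissionedNMs",
    "NumLostNMs",
    "NumUnhealthyNMs",
    "NumRebootedNMs",
)

# Assumed bean name; check the name your ResourceManager actually reports.
CLUSTER_METRICS_BEAN = "Hadoop:service=ResourceManager,name=ClusterMetrics"


def node_manager_counts(jmx_payload, bean_name=CLUSTER_METRICS_BEAN):
    """Return the NodeManager counters found in the named bean, or {} if absent."""
    for bean in jmx_payload.get("beans", []):
        if bean.get("name") == bean_name:
            return {key: bean[key] for key in NM_COUNTERS if key in bean}
    return {}


# Hypothetical payload: host name and numbers are invented for illustration.
sample_payload = {
    "beans": [
        {
            "name": CLUSTER_METRICS_BEAN,
            "tag.Hostname": "rm.example.com",
            "NumActiveNMs": 4,
            "NumDecommissionedNMs": 0,
            "NumLostNMs": 1,
            "NumUnhealthyNMs": 0,
            "NumRebootedNMs": 0,
        }
    ]
}

counts = node_manager_counts(sample_payload)
print(counts)
```

  Keeping the lookup tolerant of missing keys means the same helper still works
  if a release reports only a subset of these counters.
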
* QueueMetrics

  QueueMetrics shows the state of an application queue from the
  ResourceManager's perspective. Each metrics record shows
  the statistics of one queue and contains tags such as the
  queue name and Hostname as additional information along with the metrics.

  The <<<running_>>><num> metrics, such as <<<running_0>>>, count running
  applications by elapsed time. You can set the property
  <<<yarn.resourcemanager.metrics.runtime.buckets>>> in yarn-site.xml
  to change the buckets. The default value is <<<60,300,1440>>> (in minutes).

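  The relationship between the bucket list and the <<<running_>>><num> metric
  names can be sketched as follows. This is a minimal illustration and not the
  ResourceManager's implementation: the helper names are hypothetical, and the
  treatment of an application whose elapsed time sits exactly on a boundary is
  an assumption.

```python
def parse_buckets(value):
    """Parse a comma-separated bucket list (minutes) into sorted integers."""
    return sorted(int(part) for part in value.split(",") if part.strip())


def running_metric_name(elapsed_minutes, buckets):
    """Return the running_<num> metric that would count an app of this age.

    <num> is the largest bucket boundary that does not exceed the elapsed
    time, or 0 when the app is younger than the first boundary.
    """
    lower = 0
    for boundary in buckets:
        if elapsed_minutes < boundary:
            break
        lower = boundary
    return f"running_{lower}"


# Default value of yarn.resourcemanager.metrics.runtime.buckets.
buckets = parse_buckets("60,300,1440")
for minutes in (5, 120, 1000, 2000):
    print(minutes, running_metric_name(minutes, buckets))
```

  With the default buckets this yields the four metric names documented in the
  table below; a different bucket list would yield a different set of names.
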
*-------------------------------------+--------------------------------------+
|| Name                               || Description
*-------------------------------------+--------------------------------------+
|<<<running_0>>> | Current number of running applications whose elapsed time is
                 | less than 60 minutes
*-------------------------------------+--------------------------------------+
|<<<running_60>>> | Current number of running applications whose elapsed time is
                  | between 60 and 300 minutes
*-------------------------------------+--------------------------------------+
|<<<running_300>>> | Current number of running applications whose elapsed time is
                   | between 300 and 1440 minutes
*-------------------------------------+--------------------------------------+
|<<<running_1440>>> | Current number of running applications whose elapsed time is
                    | more than 1440 minutes
*-------------------------------------+--------------------------------------+
|<<<AppsSubmitted>>> | Total number of submitted applications
*-------------------------------------+--------------------------------------+
|<<<AppsRunning>>> | Current number of running applications
*-------------------------------------+--------------------------------------+
|<<<AppsPending>>> | Current number of applications that have not yet been
                   | assigned any containers
*-------------------------------------+--------------------------------------+
|<<<AppsCompleted>>> | Total number of completed applications
*-------------------------------------+--------------------------------------+
|<<<AppsKilled>>> | Total number of killed applications
*-------------------------------------+--------------------------------------+
|<<<AppsFailed>>> | Total number of failed applications
*-------------------------------------+--------------------------------------+
|<<<AllocatedMB>>> | Current allocated memory in MB
*-------------------------------------+--------------------------------------+
|<<<AllocatedVCores>>> | Current allocated CPU in virtual cores
*-------------------------------------+--------------------------------------+
|<<<AllocatedContainers>>> | Current number of allocated containers
*-------------------------------------+--------------------------------------+
|<<<AggregateContainersAllocated>>> | Total number of allocated containers
*-------------------------------------+--------------------------------------+
|<<<AggregateContainersReleased>>> | Total number of released containers
*-------------------------------------+--------------------------------------+
|<<<AvailableMB>>> | Current available memory in MB
*-------------------------------------+--------------------------------------+
|<<<AvailableVCores>>> | Current available CPU in virtual cores
*-------------------------------------+--------------------------------------+
|<<<PendingMB>>> | Current pending memory resource requests in MB that are
                 | not yet fulfilled by the scheduler
*-------------------------------------+--------------------------------------+
|<<<PendingVCores>>> | Current pending CPU allocation requests in virtual
                     | cores that are not yet fulfilled by the scheduler
*-------------------------------------+--------------------------------------+
|<<<PendingContainers>>> | Current pending resource requests that are not
                         | yet fulfilled by the scheduler
*-------------------------------------+--------------------------------------+
|<<<ReservedMB>>> | Current reserved memory in MB
*-------------------------------------+--------------------------------------+
|<<<ReservedVCores>>> | Current reserved CPU in virtual cores
*-------------------------------------+--------------------------------------+
|<<<ReservedContainers>>> | Current number of reserved containers
*-------------------------------------+--------------------------------------+
|<<<ActiveUsers>>> | Current number of active users
*-------------------------------------+--------------------------------------+
|<<<ActiveApplications>>> | Current number of active applications
*-------------------------------------+--------------------------------------+
|<<<FairShareMB>>> | (FairScheduler only) Current fair share of memory in MB
*-------------------------------------+--------------------------------------+
|<<<FairShareVCores>>> | (FairScheduler only) Current fair share of CPU in
                       | virtual cores
*-------------------------------------+--------------------------------------+
|<<<MinShareMB>>> | (FairScheduler only) Minimum share of memory in MB
*-------------------------------------+--------------------------------------+
|<<<MinShareVCores>>> | (FairScheduler only) Minimum share of CPU in virtual
                      | cores
*-------------------------------------+--------------------------------------+
|<<<MaxShareMB>>> | (FairScheduler only) Maximum share of memory in MB
*-------------------------------------+--------------------------------------+
|<<<MaxShareVCores>>> | (FairScheduler only) Maximum share of CPU in virtual
                      | cores
*-------------------------------------+--------------------------------------+

* NodeManagerMetrics

  NodeManagerMetrics shows the statistics of the containers on the node.
  Each metrics record contains a Hostname tag as additional information
  along with the metrics.

*-------------------------------------+--------------------------------------+
|| Name                               || Description
*-------------------------------------+--------------------------------------+
|<<<containersLaunched>>> | Total number of launched containers
*-------------------------------------+--------------------------------------+
|<<<containersCompleted>>> | Total number of successfully completed containers
*-------------------------------------+--------------------------------------+
|<<<containersFailed>>> | Total number of failed containers
*-------------------------------------+--------------------------------------+
|<<<containersKilled>>> | Total number of killed containers
*-------------------------------------+--------------------------------------+
|<<<containersIniting>>> | Current number of initializing containers
*-------------------------------------+--------------------------------------+
|<<<containersRunning>>> | Current number of running containers
*-------------------------------------+--------------------------------------+
|<<<allocatedContainers>>> | Current number of allocated containers
*-------------------------------------+--------------------------------------+
|<<<allocatedGB>>> | Current allocated memory in GB
*-------------------------------------+--------------------------------------+
|<<<availableGB>>> | Current available memory in GB
*-------------------------------------+--------------------------------------+

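  One way these gauges might be combined is a per-node memory utilization
  ratio. The sketch below assumes that <<<allocatedGB>>> plus
  <<<availableGB>>> approximates the memory the NodeManager offers to
  containers; that relationship and the helper name are assumptions for
  illustration, not something this documentation states.

```python
def memory_utilization(allocated_gb, available_gb):
    """Return the allocated fraction of container memory, in the range 0.0-1.0.

    Assumes allocated + available is the node's total container memory.
    """
    total = allocated_gb + available_gb
    if total <= 0:
        # A node reporting no memory at all is treated as idle.
        return 0.0
    return allocated_gb / total


# Hypothetical readings of allocatedGB and availableGB from one node.
print(memory_utilization(6, 2))
```
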
ugi context

* UgiMetrics
@@ -75,6 +75,9 @@ Release 2.7.0 - UNRELEASED
    YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes
    independent of Scheduler type. (Anubhav Dhoot via kasha)

    YARN-2157. Added YARN metrics in the documentation. (Akira AJISAKA via
    jianhe)

  OPTIMIZATIONS

  BUG FIXES