+```
The supported configuration parameters are explained in the
following sections.
@@ -262,14 +267,14 @@ recorded in the trace. It constructs jobs of two types:
A synthetic job where each task does *nothing* but sleep
for a certain duration as observed in the production trace. The
- scalability of the Job Tracker is often limited by how many
+ scalability of the ResourceManager is often limited by how many
heartbeats it can handle every second. (Heartbeats are periodic
- messages sent from Task Trackers to update their status and grab new
- tasks from the Job Tracker.) Since a benchmark cluster is typically
+ messages sent from NodeManagers to update their status and grab new
+ tasks from the ResourceManager.) Since a benchmark cluster is typically
a fraction in size of a production cluster, the heartbeat traffic
generated by the slave nodes is well below the level of the
- production cluster. One possible solution is to run multiple Task
- Trackers on each slave node. This leads to the obvious problem that
+ production cluster. One possible solution is to run multiple
+ NodeManagers on each slave node. This leads to the obvious problem that
the I/O workload generated by the synthetic jobs would thrash the
slave nodes. Hence the need for such a job. |
@@ -334,7 +339,7 @@ Job Submission Policies
GridMix controls the rate of job submission. This control can be
based on the trace information or can be based on statistics it gathers
-from the Job Tracker. Based on the submission policies users define,
+from the ResourceManager. Based on the submission policies users define,
GridMix uses the respective algorithm to control the job submission.
There are currently three types of policies:
@@ -407,9 +412,9 @@ The following configuration parameters affect the job submission policy:
gridmix.throttle.jobs-to-tracker-ratio
|
- In STRESS mode, the minimum ratio of running jobs to Task
- Trackers in a cluster for the cluster to be considered
- *overloaded* . This is the threshold TJ referred to earlier.
+ | In STRESS mode, the minimum ratio of running jobs to
+ NodeManagers in a cluster for the cluster to be considered
+ *overloaded*. This is the threshold TJ referred to earlier.
The default is 1.0. |
@@ -688,20 +693,16 @@ correctly emulate compression.
Emulating High-Ram jobs
-----------------------
-MapReduce allows users to define a job as a High-Ram job. Tasks from a
-High-Ram job can occupy multiple slots on the task-trackers.
-Task-tracker assigns fixed virtual memory for each slot. Tasks from
-High-Ram jobs can occupy multiple slots and thus can use up more
-virtual memory as compared to a default task.
-
-Emulating this behavior is important because of the following reasons
+MapReduce allows users to define a job as a High-Ram job. Tasks from a
+High-Ram job can occupy a larger fraction of memory in task processes.
+Emulating this behavior is important for the following reasons:
* Impact on scheduler: Scheduling of tasks from High-Ram jobs
- impacts the scheduling behavior as it might result into slot
- reservation and slot/resource utilization.
+ impacts the scheduling behavior as it might result in
+ resource reservation and utilization.
-* Impact on the node : Since High-Ram tasks occupy multiple slots,
- trackers do some bookkeeping for allocating extra resources for
+* Impact on the node: Since High-Ram tasks occupy more memory,
+ NodeManagers do some bookkeeping for allocating extra resources for
these tasks. Thus this becomes a precursor for memory emulation
where tasks with high memory requirements needs to be considered
as a High-Ram task.
@@ -808,11 +809,11 @@ job traces and cannot be accurately reproduced in GridMix:
Appendix
--------
+There exist older versions of the GridMix tool.
Issues tracking the original implementations of
-GridMix1,
-GridMix2,
-and GridMix3
+[GridMix1](https://issues.apache.org/jira/browse/HADOOP-2369),
+[GridMix2](https://issues.apache.org/jira/browse/HADOOP-3770),
+and [GridMix3](https://issues.apache.org/jira/browse/MAPREDUCE-776)
can be found on the Apache Hadoop MapReduce JIRA. Other issues tracking
the current development of GridMix can be found by searching
-
-the Apache Hadoop MapReduce JIRA
+[the Apache Hadoop MapReduce JIRA](https://issues.apache.org/jira/browse/MAPREDUCE/component/12313086).