14 KiB
Registry Configuration
The YARN service registry is built on top of Apache Zookeeper.
It is configured by way of a Hadoop Configuration
class:
the instance used to create the service controls the behavior of the client.
This document lists the configuration parameters which control the registry client and its deployment in the YARN Resource Manager.
The default values of all these settings are defined in core-default.xml
.
The values in this file may not match those listed in this document.
If this is the case, the values in core-default.xml
MUST be considered normative.
Changing the configuration values
Changes to the configuration values SHOULD be done in core-site.xml
.
This will ensure that client and non-YARN applications will pick up
the values, so enabling them to read from and potentially write to the registry.
Core Settings
Enabling the Registry in the Resource Manager
The Resource Manager manages user directory creation and record cleanup on YARN container/application attempt/application completion.
<property>
<description>
Is the registry enabled in the YARN Resource Manager?
If true, the YARN RM will, as needed.
create the user and system paths, and purge
service records when containers, application attempts
and applications complete.
If false, the paths must be created by other means,
and no automatic cleanup of service records will take place.
</description>
<name>hadoop.registry.rm.enabled</name>
<value>false</value>
</property>
If the property is set in core-site.xml
or yarn-site.xml
,
the YARN Resource Manager will behave as follows:
- On startup: create the initial root paths of
/
,/services
and/users
. On a secure cluster, access will be restricted to the system accounts (see below). - When a user submits a job: create the user path under
/users
. - When a container is completed: delete from the registry all service records
with a
yarn:persistence
field of valuecontainer
, and ayarn:id
field whose value matches the ID of the completed container. - When an application attempt is completed: remove all service records with
yarn:persistence
set toapplication-attempt
andyarn:id
set to the pplication attempt ID. - When an application finishes: remove all service records with
yarn:persistence
set toapplication
andyarn:id
set to the application ID.
All these operations are asynchronous, so that zookeeper connectivity problems do not delay RM operations or work scheduling.
If the property hadoop.registry.rm.enabled
is set to false
, the RM will
not interact with the registry —and the listed operations will not take place.
The root paths may be created by other means, but service record cleanup will not take place.
Setting the Zookeeper Quorum: hadoop.registry.zk.quorum
This is an essential setting: it identifies the lists of zookeeper hosts and the ports on which the ZK services are listening.
<property>
<description>
List of hostname:port pairs defining the
zookeeper quorum binding for the registry
</description>
<name>hadoop.registry.zk.quorum</name>
<value>localhost:2181</value>
</property>
It takes a comma-separated list, such as zk1:2181 ,zk2:2181, zk3:2181
Setting the Zookeeper Registry Base path: hadoop.registry.zk.root
This path sets the base zookeeper node for the registry
<property>
<description>
The root zookeeper node for the registry
</description>
<name>hadoop.registry.zk.root</name>
<value>/registry</value>
</property>
The default value of /registry
is normally sufficient. A different value
may be needed for security reasons or because the /registry
path is in use.
The root value is prepended to all registry paths so as to create the absolute path. For example:
/
maps to/registry
/services
maps to/registry/services
/users/yarn
maps to/registry/users/yarn
A different value of hadoop.registry.zk.root
would result in a different
mapping to absolute zookeeper paths.
Security Options
Registry security is enabled when the property hadoop.registry.secure
is set to true
. Once set, nodes are created with permissions, so that
only a specific user and the configured cluster "superuser" accounts
can write under their home path of ${hadoop.registry.zk.root}/users
.
Only the superuser accounts
will be able to manipulate the root path, including ${hadoop.registry.zk.root}/services
and ${hadoop.registry.zk.root}/users
.
All write operations on the registry (including deleting entries and paths) must be authenticated. Read operations are still permitted by unauthenticated callers.
The key settings for secure registry support are:
- enabling the secure mode:
hadoop.registry.secure
- listing the superuser zookeeper ACLs:
hadoop.registry.system.acls
- listing the kerberos realm for the principals:
hadoop.registry.kerberos.realm
- identifying the JAAS context within the JAAS configuration which defines
the user:
hadoop.registry.jaas.context
Enabling security
<property>
<description>
Key to set if the registry is secure. Turning it on
changes the permissions policy from "open access"
to restrictions on kerberos with the option of
a user adding one or more auth key pairs down their
own tree.
</description>
<name>hadoop.registry.secure</name>
<value>false</value>
</property>
Identifying the client JAAS context
The registry clients must identify the JAAS context which they use to authenticate to the registry.
<property>
<description>
Key to define the JAAS context. Used in secure mode
</description>
<name>hadoop.registry.jaas.context</name>
<value>Client</value>
</property>
Note as the Resource Manager is simply another client of the registry, it too must have this context defined.
Identifying the system accounts hadoop.registry.system.acls
These are the the accounts which are given full access to the base of the registry. The Resource Manager needs this option to create the root paths.
Client applications writing to the registry access to the nodes it creates.
- The property
hadoop.registry.system.acls
takes a comma-separated list of zookeeperACLs
which are given full access to created nodes; the permissionsREAD | WRITE | CREATE | DELETE | ADMIN
. - Any zookeeper ACL scheme may be added to this, such as the
digest:
scheme. - The SASL scheme,
sasl:
, is used to identify which callers identified by sasl have full access. These are the superuser accounts. - They may be identified by elements such as
sasl:yarn@REALM.COM
. - To aid portability of SASL settings, especially the default value,
any
sasl:
entry without the realm value —that is, any entry that terminates in the@
symbol— has the current realm appended to it. - This realm is set to that of the current user.
- It may be overridden by the property
hadoop.registry.kerberos.realm
.
<property>
<description>
A comma separated list of Zookeeper ACL identifiers with
system access to the registry in a secure cluster.
These are given full access to all entries.
If there is an "@" at the end of a SASL entry it
instructs the registry client to append the default kerberos domain.
</description>
<name>hadoop.registry.system.acls</name>
<value>sasl:yarn@, sasl:mapred@, sasl:mapred@, sasl:hdfs@</value>
</property>
<property>
<description>
The kerberos realm: used to set the realm of
system principals which do not declare their realm,
and any other accounts that need the value.
If empty, the default realm of the running process
is used.
If neither are known and the realm is needed, then the registry
service/client will fail.
</description>
<name>hadoop.registry.kerberos.realm</name>
<value></value>
</property>
Example: an hadoop.registry.system.acls
entry of
sasl:yarn@, sasl:admin@EXAMPLE.COM, sasl:system@REALM2
,
would, in a YARN cluster with the realm EXAMPLE.COM
, add the following
admin accounts to every node
sasl:yarn@EXAMPLE.COM
sasl:admin@EXAMPLE.COM
sasl:system@REALM2
The identity of a client application creating registry entries will be
automatically included in the permissions of all entries created.
If, for example, the account creating an entry was hbase
, another
entry would be created
sasl:hbase@EXAMPLE.COM
Important: when setting the system ACLS, it is critical to include the identity of the YARN Resource Manager.
The RM needs to be able to create the root and user paths, and delete service records during application and container cleanup.
Zookeeper connection management options
Some low level options manage the ZK connection —more specifically, its failure handling.
The Zookeeper registry clients use Apache Curator to connect to Zookeeper, a library which detects timeouts and attempts to reconnect to one of the servers which forms the zookeeper quorum. It is only after a timeout is detected that a retry is triggered.
<property>
<description>
Zookeeper session timeout in milliseconds
</description>
<name>hadoop.registry.zk.session.timeout.ms</name>
<value>60000</value>
</property>
<property>
<description>
Zookeeper connection timeout in milliseconds
</description>
<name>hadoop.registry.zk.connection.timeout.ms</name>
<value>15000</value>
</property>
<property>
<description>
Zookeeper connection retry count before failing
</description>
<name>hadoop.registry.zk.retry.times</name>
<value>5</value>
</property>
<property>
<description>
</description>
<name>hadoop.registry.zk.retry.interval.ms</name>
<value>1000</value>
</property>
<property>
<description>
Zookeeper retry limit in milliseconds, during
exponential backoff.
This places a limit even
if the retry times and interval limit, combined
with the backoff policy, result in a long retry
period
</description>
<name>hadoop.registry.zk.retry.ceiling.ms</name>
<value>60000</value>
</property>
The retry strategy used in the registry client is
BoundedExponentialBackoffRetry
:
This backs off exponentially on connection failures before eventually
concluding that the quorum is unreachable and failing.
Complete Set of Configuration Options
<!-- YARN registry -->
<property>
<description>
Is the registry enabled: does the RM start it up,
create the user and system paths, and purge
service records when containers, application attempts
and applications complete
</description>
<name>hadoop.registry.rm.enabled</name>
<value>false</value>
</property>
<property>
<description>
List of hostname:port pairs defining the
zookeeper quorum binding for the registry
</description>
<name>hadoop.registry.zk.quorum</name>
<value>localhost:2181</value>
</property>
<property>
<description>
The root zookeeper node for the registry
</description>
<name>hadoop.registry.zk.root</name>
<value>/registry</value>
</property>
<property>
<description>
Key to set if the registry is secure. Turning it on
changes the permissions policy from "open access"
to restrictions on kerberos with the option of
a user adding one or more auth key pairs down their
own tree.
</description>
<name>hadoop.registry.secure</name>
<value>false</value>
</property>
<property>
<description>
A comma separated list of Zookeeper ACL identifiers with
system access to the registry in a secure cluster.
These are given full access to all entries.
If there is an "@" at the end of a SASL entry it
instructs the registry client to append the default kerberos domain.
</description>
<name>hadoop.registry.system.acls</name>
<value>sasl:yarn@, sasl:mapred@, sasl:mapred@, sasl:hdfs@</value>
</property>
<property>
<description>
The kerberos realm: used to set the realm of
system principals which do not declare their realm,
and any other accounts that need the value.
If empty, the default realm of the running process
is used.
If neither are known and the realm is needed, then the registry
service/client will fail.
</description>
<name>hadoop.registry.kerberos.realm</name>
<value></value>
</property>
<property>
<description>
Key to define the JAAS context. Used in secure
mode
</description>
<name>hadoop.registry.jaas.context</name>
<value>Client</value>
</property>
<property>
<description>
Zookeeper session timeout in milliseconds
</description>
<name>hadoop.registry.zk.session.timeout.ms</name>
<value>60000</value>
</property>
<property>
<description>
Zookeeper session timeout in milliseconds
</description>
<name>hadoop.registry.zk.connection.timeout.ms</name>
<value>15000</value>
</property>
<property>
<description>
Zookeeper connection retry count before failing
</description>
<name>hadoop.registry.zk.retry.times</name>
<value>5</value>
</property>
<property>
<description>
</description>
<name>hadoop.registry.zk.retry.interval.ms</name>
<value>1000</value>
</property>
<property>
<description>
Zookeeper retry limit in milliseconds, during
exponential backoff: {@value}
This places a limit even
if the retry times and interval limit, combined
with the backoff policy, result in a long retry
period
</description>
<name>hadoop.registry.zk.retry.ceiling.ms</name>
<value>60000</value>
</property>