YARN-4653. Document YARN security model from the perspective of Application Developers. Contributed by Steve Loughran
This commit is contained in:
parent
ec12ce8f48
commit
dea90c9a86
@ -126,6 +126,7 @@
|
||||
<item name="Web Application Proxy" href="hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"/>
|
||||
<item name="Timeline Server" href="hadoop-yarn/hadoop-yarn-site/TimelineServer.html"/>
|
||||
<item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>
|
||||
<item name="YARN Application Security" href="hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html"/>
|
||||
<item name="NodeManager" href="hadoop-yarn/hadoop-yarn-site/NodeManager.html"/>
|
||||
<item name="DockerContainerExecutor" href="hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html"/>
|
||||
<item name="Using CGroups" href="hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html"/>
|
||||
|
@ -1433,6 +1433,9 @@ Release 2.7.3 - UNRELEASED
|
||||
YARN-4492. Add documentation for preemption supported in Capacity
|
||||
scheduler (Naganarasimha G R via jlowe)
|
||||
|
||||
YARN-4653. Document YARN security model from the perspective of
|
||||
Application Developers. (Steve Loughran via jianhe)
|
||||
|
||||
OPTIMIZATIONS
|
||||
|
||||
BUG FIXES
|
||||
|
@ -0,0 +1,560 @@
|
||||
<!---
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License. See accompanying LICENSE file.
|
||||
-->
|
||||
|
||||
# YARN Application Security
|
||||
|
||||
Anyone writing a YARN application needs to understand the process, in order
|
||||
to write short-lived applications or long-lived services. They also need to
|
||||
start testing on secure clusters during early development stages, in order
|
||||
to write code that actually works.
|
||||
|
||||
## How YARN Security works
|
||||
|
||||
YARN Resource Managers (RMs) and Node Managers (NMs) co-operate to execute
|
||||
the user's application with the identity and hence access rights of that user.
|
||||
|
||||
The (active) Resource Manager:
|
||||
|
||||
1. Finds space in a cluster to deploy the core of the application,
|
||||
the Application Master (AM).
|
||||
|
||||
1. Requests that the NM on that node allocate a container and start the AM in it.
|
||||
|
||||
1. Communicates with the AM, so that the AM can request new containers and
|
||||
manipulate/release current ones, and to provide notifications about allocated
|
||||
and running containers.
|
||||
|
||||
The Node Managers:
|
||||
|
||||
1. *Localize* resources: Download from HDFS or other filesystem into a local directory. This
|
||||
is done using the delegation tokens attached to the container launch context. (For non-HDFS
|
||||
resources, using other credentials such as object store login details in cluster configuration
|
||||
files)
|
||||
|
||||
1. Start the application as the user.
|
||||
|
||||
1. Monitor the application and report failure to the RM.
|
||||
|
||||
To execute code in the cluster, a YARN application must:
|
||||
|
||||
1. Have a client-side application which sets up the `ApplicationSubmissionContext`
|
||||
detailing what is to be launched. This includes:
|
||||
|
||||
* A list of files in the cluster's filesystem to be "localized".
|
||||
* The environment variables to set in the container.
|
||||
* The commands to execute in the container to start the application.
|
||||
* Any security credentials needed by YARN to launch the application.
|
||||
* Any security credentials needed by the application to interact
|
||||
with any Hadoop cluster services and applications.
|
||||
|
||||
1. Have an Application Master which, when launched, registers with
|
||||
the YARN RM and listens for events. Any AM which wishes to execute work in
|
||||
other containers must request them off the RM, and, when allocated, create
|
||||
a `ContainerLaunchContext` containing the command to execute, the
|
||||
environment to execute the command, binaries to localize and all relevant
|
||||
security credentials.
|
||||
|
||||
1. Even with the NM handling the localization process, the AM must itself
|
||||
be able to retrieve the security credentials supplied at launch time so
|
||||
that it itself may work with HDFS and any other services, and to pass some or
|
||||
all of these credentials down to the launched containers.
|
||||
|
||||
### Acquiring and Adding tokens to a YARN Application
|
||||
|
||||
The delegation tokens which a YARN application needs must be acquired
|
||||
from a program executing as an authenticated user. For a YARN application,
|
||||
this means the user launching the application. It is the client-side part
|
||||
of the YARN application which must do this:
|
||||
|
||||
1. Log in via `UserGroupInformation`.
|
||||
1. Identify all tokens which must be acquired.
|
||||
1. Request these tokens from the specific Hadoop services.
|
||||
1. Marshall all tokens into a byte buffer.
|
||||
1. Add them to the `ContainerLaunchContext` within the `ApplicationSubmissionContext`.
|
||||
|
||||
Which tokens are required? Normally, at least a token to access HDFS.
|
||||
|
||||
An application must request a delegation token from every filesystem with
|
||||
which it intends to interact —including the cluster's main FS.
|
||||
`FileSystem.addDelegationTokens(renewer, credentials)` can be used to collect these;
|
||||
it is a no-op on those filesystems which do not issue tokens (including
|
||||
non-kerberized HDFS clusters).
|
||||
|
||||
Applications talking to other services, such as Apache HBase and Apache Hive,
|
||||
must request tokens from these services, using the libraries of these
|
||||
services to acquire delegation tokens. All tokens can be added to the same
|
||||
set of credentials, then saved to a byte buffer for submission.
|
||||
|
||||
The Application Timeline Server also needs a delegation token. This is handled
|
||||
automatically on AM launch.
|
||||
|
||||
### Extracting tokens within the AM
|
||||
|
||||
When the Application Master is launched and any of the UGI/Hadoop operations
|
||||
which trigger a user login invoked, the UGI class will automatically load in all tokens
|
||||
saved in the file named by the environment variable `HADOOP_TOKEN_FILE_LOCATION`.
|
||||
|
||||
This happens on an insecure cluster along with a secure one, and on a secure
|
||||
cluster even if a keytab is used by the application. Why? Because the
|
||||
AM/RM token needed to authenticate the application with the YARN RM is always
|
||||
supplied this way.
|
||||
|
||||
This means you have a relative similar workflow across secure and insecure clusters.
|
||||
|
||||
1. Suring AM startup, log in to Kerberos.
|
||||
A call to `UserGroupInformation.isSecurityEnabled()` will trigger this operation.
|
||||
|
||||
1. Enumerate the current user's credentials, through a call of
|
||||
`UserGroupInformation.getCurrentUser().getCredentials()`.
|
||||
|
||||
1. Filter out the AMRM token, resulting in a new set of credentials. In an
|
||||
insecure cluster, the list of credentials will now be empty; in a secure cluster
|
||||
they will contain
|
||||
|
||||
1. Set the credentials of all containers to be launched to this (possibly empty)
|
||||
list of credentials.
|
||||
|
||||
1. If the filtered list of tokens to renew, is non-empty start up a thread
|
||||
to renew them.
|
||||
|
||||
### Token Renewal
|
||||
|
||||
Tokens *expire*: they have a limited lifespan. An application wishing to
|
||||
use a token past this expiry date must *renew* the token before the token
|
||||
expires.
|
||||
|
||||
Hadoop automatically sets up a delegation token renewal thread when needed,
|
||||
the `DelegationTokenRenewer`.
|
||||
|
||||
It is the responsibility of the application to renew all tokens other
|
||||
than the AMRM and timeline tokens.
|
||||
|
||||
Here are the different strategies
|
||||
|
||||
1. Don't. Rely on the lifespan of the application being so short that token
|
||||
renewal is not needed. For applications whose life can always be measured
|
||||
in minutes or tens of minutes, this is a viable strategy.
|
||||
|
||||
1. Start a background thread/Executor to renew the tokens at a regular interval.
|
||||
This what most YARN applications do.
|
||||
|
||||
## Other Aspects of YARN Security
|
||||
|
||||
|
||||
### AM/RM Token Refresh
|
||||
|
||||
The AM/RM token is renewed automatically; the AM pushes out a new token
|
||||
to the AM within an `allocate` message. Consult the `AMRMClientImpl` class
|
||||
to see the process. *Your AM code does not need to worry about this process*
|
||||
|
||||
### Token Renewal on AM Restart
|
||||
|
||||
Even if an application is renewing tokens regularly, if an AM fails and is
|
||||
restarted, it gets restarted from that original
|
||||
`ApplicationSubmissionContext`. The tokens there may have expired, so localization
|
||||
may fail, even before the issue of credentials to talk to other services.
|
||||
|
||||
How is this problem addressed? The YARN Resource Manager gets a new token
|
||||
for the node managers, if needed.
|
||||
|
||||
More precisely
|
||||
|
||||
1. The token passed by the RM to the NM for localization is refreshed/updated as needed.
|
||||
1. Tokens in the app launch context for use by the application are *not* refreshed.
|
||||
That is, if it has an out of date HDFS token —that token is not renewed. This
|
||||
also holds for tokens for for Hive, HBase, etc.
|
||||
1. Therefore, to survive AM restart after token expiry, your AM has to get the
|
||||
NMs to localize the keytab or make no HDFS accesses until (somehow) a new token has been passed to them from a client.
|
||||
|
||||
This is primarily an issue for long-lived services (see below).
|
||||
|
||||
### Unmanaged Application Masters
|
||||
|
||||
Unmanaged application masters are not launched in a container set up by
|
||||
the RM and NM, so cannot automatically pick up an AM/RM token at launch time.
|
||||
The `YarnClient.getAMRMToken()` API permits an Unmanaged AM to request an AM/RM
|
||||
token. Consult `UnmanagedAMLauncher` for the specifics.
|
||||
|
||||
### Identity on an insecure cluster: `HADOOP_USER_NAME`
|
||||
|
||||
In an insecure cluster, the application will run as the identity of
|
||||
the account of the node manager, typically something such as `yarn`
|
||||
or `mapred`. By default, the application will access HDFS
|
||||
as that user, with a different home directory, and with
|
||||
a different user identified in audit logs and on file system owner attributes.
|
||||
|
||||
This can be avoided by having the client identify the identify of the
|
||||
HDFS/Hadoop user under which the application is expected to run. *This
|
||||
does not affect the OS-level user or the application's access rights
|
||||
to the local machine*.
|
||||
|
||||
When Kerberos is disabled, the identity of a user is picked up
|
||||
by Hadoop first from the environment variable `HADOOP_USER_NAME`,
|
||||
then from the OS-level username (e.g. the system property `user.name`).
|
||||
|
||||
YARN applications should propagate the user name of the user launching
|
||||
an application by setting this environment variable.
|
||||
|
||||
```java
|
||||
Map<String, String> env = new HashMap<>();
|
||||
String userName = UserGroupInformation.getCurrentUser().getUserName();
|
||||
env.put(UserGroupInformation.HADOOP_USER_NAME, userName);
|
||||
containerLaunchContext.setEnvironment(env);
|
||||
```
|
||||
|
||||
Note that this environment variable is picked up in all applications
|
||||
which talk to HDFS via the hadoop libraries. That is, if set, it
|
||||
is the identity picked up by HBase and other applications executed
|
||||
within the environment of a YARN container within which this environment
|
||||
variable is set.
|
||||
|
||||
### Oozie integration and `HADOOP_TOKEN_FILE_LOCATION`
|
||||
|
||||
Apache Oozie can launch an application in a secure cluster either by acquiring
|
||||
all relevant credentials, saving them to a file in the local filesystem,
|
||||
then setting the path to this file in the environment variable
|
||||
`HADOOP_TOKEN_FILE_LOCATION`. This is of course the same environment variable
|
||||
passed down by YARN in launched containers, as is similar content: a byte
|
||||
array with credentials.
|
||||
|
||||
Here, however, the environment variable is set in the environment
|
||||
executing the YARN client. This client must use the token information saved
|
||||
in the named file *instead of acquiring any tokens of its own*.
|
||||
|
||||
Loading in the token file is automatic: UGI does it during user login.
|
||||
|
||||
The client is then responsible for passing the same credentials into the
|
||||
AM launch context. This can be done simply by passing down the current
|
||||
credentials.
|
||||
|
||||
```java
|
||||
credentials = new Credentials(
|
||||
UserGroupInformation.getCurrentUser().getCredentials());
|
||||
```
|
||||
|
||||
### Timeline Server integration
|
||||
|
||||
The [Application Timeline Server](TimelineServer.html) can be deployed as a secure service
|
||||
—in which case the application will need the relevant token to authenticate with
|
||||
it. This process is handled automatically in `YarnClientImpl` if ATS is
|
||||
enabled in a secure cluster. Similarly, the AM-side `TimelineClient` YARN service
|
||||
class manages token renewal automatically via the ATS's SPNEGO-authenticated REST API.
|
||||
|
||||
If you need to prepare a set of delegation tokens for a YARN application launch
|
||||
via Oozie, this can be done via the timeline client API.
|
||||
|
||||
```java
|
||||
try(TimelineClient timelineClient = TimelineClient.createTimelineClient()) {
|
||||
timelineClient.init(conf);
|
||||
timelineClient.start();
|
||||
Token<TimelineDelegationTokenIdentifier> token =
|
||||
timelineClient.getDelegationToken(rmprincipal));
|
||||
credentials.addToken(token.getService(), token);
|
||||
}
|
||||
```
|
||||
|
||||
### Cancelling Tokens
|
||||
|
||||
Applications *may* wish to cancel tokens they hold when terminating their AM.
|
||||
This ensures that the tokens are no-longer valid.
|
||||
|
||||
This is not mandatory, and as a clean shutdown of a YARN application cannot
|
||||
be guaranteed, it is not possible to guarantee that the tokens will always
|
||||
be during application termination. However, it does reduce the window of
|
||||
vulnerability to stolen tokens.
|
||||
|
||||
## Securing Long-lived YARN Services
|
||||
|
||||
There is a time limit on all token renewals, after which tokens won't renew,
|
||||
causing the application to stop working. This is somewhere between seventy-two
|
||||
hours and seven days.
|
||||
|
||||
Any YARN service intended to run for an extended period of time *must* have
|
||||
a strategy for renewing credentials.
|
||||
|
||||
Here are the strategies:
|
||||
|
||||
### Pre-installed Keytabs for AM and containers
|
||||
|
||||
A keytab is provided for the application's use on every node.
|
||||
|
||||
This is done by:
|
||||
|
||||
1. Installing it in every cluster node's local filesystem.
|
||||
1. Providing the path to this in a configuration option.
|
||||
1. The application loading the credentials via
|
||||
`UserGroupInformation.loginUserFromKeytab()`.
|
||||
|
||||
The keytab must be in a secure directory path, where
|
||||
only the service (and other trusted accounts) can read it. Distribution
|
||||
becomes a responsibility of the cluster operations team.
|
||||
|
||||
This is effectively how all static Hadoop applications get their security credentials.
|
||||
|
||||
### Keytabs for AM and containers distributed via YARN
|
||||
|
||||
|
||||
1. A keytab is uploaded to HDFS.
|
||||
|
||||
1. When launching the AM, the keytab is listed as a resource to localize to
|
||||
the AM's container.
|
||||
|
||||
1. The Application Master is configured with the relative path to the keytab,
|
||||
and logs in with `UserGroupInformation.loginUserFromKeytab()`.
|
||||
|
||||
1. When the AM launches the container, it lists the HDFS path to the keytab
|
||||
as a resource to localize.
|
||||
|
||||
1. It adds the HDFS delegation token to the container launch context, so
|
||||
that the keytab and other application files can be localized.
|
||||
|
||||
1. Launched containers must themselves log in via
|
||||
`UserGroupInformation.loginUserFromKeytab()`. UGI handles the login, and
|
||||
schedules a background thread to relogin the user periodically.
|
||||
|
||||
1. Token creation is handled automatically in the Hadoop IPC and REST APIs,
|
||||
the containers stay logged in via kerberos for their entire duration.
|
||||
|
||||
This avoids the administration task of installing keytabs for specific services
|
||||
across the entire cluster.
|
||||
|
||||
It does require the client to have access to the keytab
|
||||
and, as it is uploaded to the distributed filesystem, must be secured through
|
||||
the appropriate path permissions/ACLs.
|
||||
|
||||
As all containers have access to the keytab, all code executing in the containers
|
||||
has to be trusted. Malicious code (or code escaping some form of sandbox)
|
||||
could read the keytab, and hence have access to the cluster until the keys
|
||||
expire or are revoked.
|
||||
|
||||
This is the strategy implemented by Apache Slider (incubating).
|
||||
|
||||
### AM keytab distributed via YARN; AM regenerates delegation tokens for containers.
|
||||
|
||||
1. A keytab is uploaded to HDFS by the client.
|
||||
|
||||
1. When launching the AM, the keytab is listed as a resource to localize to
|
||||
the AM's container.
|
||||
|
||||
1. The Application Master is configured with the relative path to the keytab,
|
||||
and logs in with `UserGroupInformation.loginUserFromKeytab()`. The UGI
|
||||
codepath will still automatically load the file references by
|
||||
`$HADOOP_TOKEN_FILE_LOCATION`, which is how the AMRM token is picked up.
|
||||
|
||||
1. When the AM launches a container, it acquires all the delegation tokens
|
||||
needed by that container, and adds them to the container's container launch context.
|
||||
|
||||
1. Launched containers must load the delegation tokens from `$HADOOP_TOKEN_FILE_LOCATION`,
|
||||
and use them (including renewals) until they can no longer be renewed.
|
||||
|
||||
1. The AM must implement an IPC interface which permits containers to request
|
||||
a new set of delegation tokens; this interface must itself use authentication
|
||||
and ideally wire encryption.
|
||||
|
||||
1. Before a delegation token is due to expire, the processes running in the containers
|
||||
must request new tokens from the Application Master over the IPC channel.
|
||||
|
||||
1. When the containers need the new tokens, the AM, logged in with a keytab,
|
||||
asks the various cluster services for new tokens.
|
||||
|
||||
(Note there is an alternative direction for refresh operations: from AM
|
||||
to the containers, again over whatever IPC channel is implemented between
|
||||
AM and containers). The rest of the algorithm: AM regenerated tokens passed
|
||||
to containers over IPC.
|
||||
|
||||
This is the strategy used by Apache Spark 1.5+, with a netty-based protocol
|
||||
between containers and the AM for token updates.
|
||||
|
||||
Because only the AM has direct access to the keytab, it is less exposed.
|
||||
Code running in the containers only has access to the delegation tokens.
|
||||
|
||||
However, those containers will have access to HDFS from the tokens
|
||||
passed in at container launch, so will have access to the copy of the keytab
|
||||
used for launching the AM. While the AM could delete that keytab on launch,
|
||||
doing so would stop YARN being able to successfully relaunch the AM after any
|
||||
failure.
|
||||
|
||||
### Client-side Token Push
|
||||
|
||||
This strategy may be the sole one acceptable to a strict operations team: a client process
|
||||
running on an account holding a Kerberos TGT negotiates with all needed cluster services
|
||||
for new delegation tokens, tokens which are then pushed out to the Application Master via
|
||||
some RPC interface.
|
||||
|
||||
This does require the client process to be re-executed on a regular basis; a cron or Oozie job
|
||||
can do this. The AM will need to implement an IPC API over which renewed
|
||||
tokens can be provided. (Note that as Oozie can collect the tokens itself,
|
||||
all the updater application needs to do whenever executed is set up an IPC
|
||||
connection with the AM and pass up the current user's credentials).
|
||||
|
||||
## Securing YARN Application Web UIs and REST APIs
|
||||
|
||||
YARN provides a straightforward way of giving every YARN application SPNEGO authenticated
|
||||
web pages: it implements SPNEGO authentication in the Resource Manager Proxy.
|
||||
|
||||
YARN web UI are expected to load the AM proxy filter when setting up its web UI; this filter
|
||||
will redirect all HTTP Requests coming from any host other than the RM Proxy hosts to an
|
||||
RM proxy, to which the client app/browser must re-issue the request. The client will authenticate
|
||||
against the principal of the RM Proxy (usually `yarn`), and, once authenticated, have its
|
||||
request forwared.
|
||||
|
||||
As a result, all client interactions are SPNEGO-authenticated, without the YARN application
|
||||
itself needing any kerberos principal for the clients to authenticate against.
|
||||
|
||||
Known weaknesses in this approach are:
|
||||
|
||||
1. As calls coming from the proxy hosts are not redirected, any application running
|
||||
on those hosts has unrestricted access to the YARN applications. This is why in a secure cluster
|
||||
the proxy hosts *must* run on cluster nodes which do not run end user code (i.e. not run YARN
|
||||
NodeManagers and hence schedule YARN containers, nor support logins by end users).
|
||||
|
||||
1. The HTTP requests between proxy and YARN RM Server are not currently encrypted.
|
||||
That is: HTTPS is not supported.
|
||||
|
||||
## Securing YARN Application REST APIs
|
||||
|
||||
YARN REST APIs running on the same port as the registered web UI of a YARN application are
|
||||
automatically authenticated via SPNEGO authentication in the RM proxy.
|
||||
|
||||
Any REST endpoint (and equally, any web UI) brought up on a different port does not
|
||||
support SPNEGO authentication unless implemented in the YARN application itself.
|
||||
|
||||
## Checklist for YARN Applications
|
||||
|
||||
Here is the checklist of core actions which a YARN application must do
|
||||
to successfully launch in a YARN cluster.
|
||||
|
||||
### Client
|
||||
|
||||
`[ ]` Client checks for security being enabled via `UserGroupInformation.isSecurityEnabled()`
|
||||
|
||||
In a secure cluster:
|
||||
|
||||
`[ ]` If `HADOOP_TOKEN_FILE_LOCATION` is unset, client acquires delegation tokens
|
||||
for the local filesystems, with the RM principal set as the renewer.
|
||||
|
||||
`[ ]` If `HADOOP_TOKEN_FILE_LOCATION` is unset, client acquires delegation tokens
|
||||
for all other services to be used in the YARN application.
|
||||
|
||||
`[ ]` If `HADOOP_TOKEN_FILE_LOCATION` is set, client uses the current user's credentials
|
||||
as the source of all tokens to be added to the container launch context.
|
||||
|
||||
`[ ]` Client sets all tokens on AM `ContainerLaunchContext.setTokens()`.
|
||||
|
||||
`[ ]` Recommended: if it is set in the client's environment,
|
||||
client sets the environment variable `HADOOP_JAAS_DEBUG=true`
|
||||
in the Container launch context of the AM.
|
||||
|
||||
In an insecure cluster:
|
||||
|
||||
`[ ]` Propagate local username to YARN AM, hence HDFS identity via the
|
||||
`HADOOP_USER_NAME` environment variable.
|
||||
|
||||
### App Master
|
||||
|
||||
`[ ]` In a secure cluster, AM retrieves security tokens from `HADOOP_TOKEN_FILE_LOCATION`
|
||||
environment variable (automatically done by UGI).
|
||||
|
||||
`[ ]` A copy the token set is filtered to remove the AM/RM token and any timeline
|
||||
token.
|
||||
|
||||
`[ ]` A thread or executor is started to renew threads on a regular basis.
|
||||
|
||||
`[ ]` Recommended: AM cancels tokens when application completes.
|
||||
|
||||
### Container Launch by AM
|
||||
|
||||
`[ ]` Tokens to be passed to containers are passed via
|
||||
`ContainerLaunchContext.setTokens()`.
|
||||
|
||||
`[ ]` In an insecure cluster, propagate the `HADOOP_USER_NAME` environment variable.
|
||||
|
||||
`[ ]` Recommended: AM sets the environment variable `HADOOP_JAAS_DEBUG=true`
|
||||
in the Container launch context if it is set in the AM's environment.
|
||||
|
||||
### Launched Containers
|
||||
|
||||
`[ ]` Call `UserGroupInformation.isSecurityEnabled()` to trigger security setup.
|
||||
|
||||
`[ ]` A thread or executor is started to renew threads on a regular basis.
|
||||
|
||||
### YARN service
|
||||
|
||||
`[ ]` Application developers have chosen and implemented a token renewal strategy:
|
||||
shared keytab, AM keytab or client-side token refresh.
|
||||
|
||||
`[ ]` In a secure cluster, the keytab is either already in HDFS (and checked for),
|
||||
or it is in the local FS of the client, in which case it must be uploaded and added to
|
||||
the list of resources to localize.
|
||||
|
||||
`[ ]` If stored in HDFS, keytab permissions should be checked. If the keytab
|
||||
is readable by principals other than the current user, warn,
|
||||
and consider actually failing the launch (similar to the normal `ssh` application.)
|
||||
|
||||
`[ ]` Client acquires HDFS delegation token and and attaches to the AM Container
|
||||
Launch Context,
|
||||
|
||||
`[ ]` AM logs in as principal in keytab via `loginUserFromKeytab()`.
|
||||
|
||||
`[ ]` (AM extracts AM/RM token from the `HADOOP_TOKEN_FILE_LOCATION` environment
|
||||
variable).
|
||||
|
||||
`[ ]` For launched containers, either the keytab is propagated, or
|
||||
the AM acquires/attaches all required delegation tokens to the Container Launch
|
||||
context alongside the HDFS delegation token needed by the NMs.
|
||||
|
||||
## Testing YARN applications in a secure cluster.
|
||||
|
||||
It is straightforward to be confident that a YARN application works in secure
|
||||
cluster. The process to do so is: test on a secure cluster.
|
||||
|
||||
Even a single VM-cluster can be set up with security enabled. If doing so,
|
||||
we recommend turning security up to its strictest, with SPNEGO-authenticated
|
||||
Web UIs (and hence RM Proxy), as well as IPC wire encryption. Setting the
|
||||
kerberos token expiry to under an hour will find kerberos expiry problems
|
||||
early —so is also recommended.
|
||||
|
||||
`[ ]` Application launched in secure cluster.
|
||||
|
||||
`[ ]` Launched application runs as user submitting job (tip: log `user.name`
|
||||
system property in AM).
|
||||
|
||||
`[ ]` Web browser interaction verified in secure cluster.
|
||||
|
||||
`[ ]` REST client interation (GET operations) tested.
|
||||
|
||||
`[ ]` Application continues to run after Kerberos Token expiry.
|
||||
|
||||
`[ ]` Application does not launch if user lacks Kerberos credentials.
|
||||
|
||||
`[ ]` If the application supports the timeline server, verify that it publishes
|
||||
events in a secure cluster.
|
||||
|
||||
`[ ]` If the application integrates with other applications, such as HBase or Hive,
|
||||
verify that the interaction works in a secure cluster.
|
||||
|
||||
`[ ]` If the application communicates with remote HDFS clusters, verify
|
||||
that it can do so in a secure cluster (i.e. that the client extracted any
|
||||
delegation tokens for this at launch time)
|
||||
|
||||
## Important
|
||||
|
||||
*If you don't test your YARN application in a secure Hadoop cluster,
|
||||
it won't work.*
|
||||
|
||||
And without those tests: *your users will be the ones to find out
|
||||
that your application doesn't work in a secure cluster.*
|
||||
|
||||
Bear that in mind when considering how much development effort to put into
|
||||
Kerberos support.
|
Loading…
Reference in New Issue
Block a user