HADOOP-18470. More in the 3.3.5 index.html about security (#5383)
Expands on the comments in the cluster configuration docs to tell people that they should not run a cluster in the cloud without a private VLAN, that Knox is a good fit here, and that unsecured clusters without a VLAN are just computation-as-a-service for crypto miners.

Contributed by Steve Loughran
parent a2ceb09323
commit cd2401d2cc
@@ -35,6 +35,8 @@ These instructions do not cover integration with any Kerberos services,
 -everyone bringing up a production cluster should include connecting to their
 organisation's Kerberos infrastructure as a key part of the deployment.

+See [Security](./SecureMode.html) for details on how to secure a cluster.

 Prerequisites
 -------------

@@ -24,7 +24,7 @@ Users are encouraged to read the full set of release notes.
 This page provides an overview of the major changes.

 Azure ABFS: Critical Stream Prefetch Fix
----------------------------------------------
+----------------------------------------

 The abfs has a critical bug fix
 [HADOOP-18546](https://issues.apache.org/jira/browse/HADOOP-18546).
@@ -120,25 +120,76 @@ be vulnerable, and the ugprades should also reduce the number of false
 positives security scanners report.

 We have not been able to upgrade every single dependency to the latest
-version there is. Some of those changes are just going to be incompatible.
-If you have concerns about the state of a specific library, consult the pache JIRA
-issue tracker to see whether a JIRA has been filed, discussions have taken place about
+version there is. Some of those changes are fundamentally incompatible.
+If you have concerns about the state of a specific library, consult the Apache JIRA
+issue tracker to see if an issue has been filed, discussions have taken place about
 the library in question, and whether or not there is already a fix in the pipeline.
 *Please don't file new JIRAs about dependency-X.Y.Z having a CVE without
 searching for any existing issue first*

-As an open source project, contributions in this area are always welcome,
+As an open-source project, contributions in this area are always welcome,
 especially in testing the active branches, testing applications downstream of
 those branches and of whether updated dependencies trigger regressions.


+Security Advisory
+=================

+Hadoop HDFS is a distributed filesystem allowing remote
+callers to read and write data.

+Hadoop YARN is a distributed job submission/execution
+engine allowing remote callers to submit arbitrary
+work into the cluster.

+Unless a Hadoop cluster is deployed with
+[caller authentication with Kerberos](./hadoop-project-dist/hadoop-common/SecureMode.html),
+anyone with network access to the servers has unrestricted access to the data
+and the ability to run whatever code they want in the system.

+In production, there are generally three deployment patterns which
+can, with care, keep data and computing resources private.
+1. Physical cluster: *configure Hadoop security*, usually bonded to the
+enterprise Kerberos/Active Directory systems.
+Good.
+1. Cloud: transient or persistent single or multiple user/tenant cluster
+with private VLAN *and security*.
+Good.
+Consider [Apache Knox](https://knox.apache.org/) for managing remote
+access to the cluster.
+1. Cloud: transient single user/tenant cluster with private VLAN
+*and no security at all*.
+Requires careful network configuration as this is the sole
+means of securing the cluster..
+Consider [Apache Knox](https://knox.apache.org/) for managing
+remote access to the cluster.

+*If you deploy a Hadoop cluster in-cloud without security, and without configuring a VLAN
+to restrict access to trusted users, you are implicitly sharing your data and
+computing resources with anyone with network access*

+If you do deploy an insecure cluster this way then port scanners will inevitably
+find it and submit crypto-mining jobs. If this happens to you, please do not report
+this as a CVE or security issue: it is _utterly predictable_. Secure *your cluster* if
+you want to remain exclusively *your cluster*.

+Finally, if you are using Hadoop as a service deployed/managed by someone else,
+do determine what security their products offer and make sure it meets your requirements.


 Getting Started
 ===============

 The Hadoop documentation includes the information you need to get started using
-Hadoop. Begin with the
+Hadoop. Begin with the
 [Single Node Setup](./hadoop-project-dist/hadoop-common/SingleCluster.html)
 which shows you how to set up a single-node Hadoop installation.
 Then move on to the
 [Cluster Setup](./hadoop-project-dist/hadoop-common/ClusterSetup.html)
 to learn how to set up a multi-node Hadoop installation.

+Before deploying Hadoop in production, read
+[Hadoop in Secure Mode](./hadoop-project-dist/hadoop-common/SecureMode.html),
+and follow its instructions to secure your cluster.


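The advisory added in this commit turns on whether caller authentication with Kerberos is enabled. As a rough illustration only, the sketch below reads a Hadoop-style `<configuration>` document and reports whether `hadoop.security.authentication` is set to `kerberos`; when the property is absent it assumes Hadoop's default of `simple`. The function names `read_hadoop_properties` and `kerberos_enabled` are hypothetical helpers, not part of Hadoop or of this commit, and a passing check says nothing about network isolation or the rest of a secure-mode setup.

```python
import xml.etree.ElementTree as ET

# Assumption: when the property is absent, Hadoop falls back to "simple"
# (unauthenticated) mode, which is its shipped default.
DEFAULT_AUTHENTICATION = "simple"


def read_hadoop_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def kerberos_enabled(props):
    """Return True only if hadoop.security.authentication is 'kerberos'."""
    auth = props.get("hadoop.security.authentication", DEFAULT_AUTHENTICATION)
    return auth.lower() == "kerberos"


if __name__ == "__main__":
    # Hypothetical core-site.xml content, inlined so the sketch is self-contained.
    sample = """
    <configuration>
      <property>
        <name>hadoop.security.authentication</name>
        <value>kerberos</value>
      </property>
    </configuration>
    """
    print(kerberos_enabled(read_hadoop_properties(sample)))
```

A check like this could run in a deployment pipeline as a first sanity test before a cluster is exposed beyond a private network; it is not a substitute for following the Secure Mode documentation.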