HADOOP-13349. HADOOP_CLASSPATH vs HADOOP_USER_CLASSPATH (aw)

Allen Wittenauer 2016-07-07 07:55:02 -07:00
parent ab092c56c2
commit a0035661c1
3 changed files with 30 additions and 25 deletions

View File

@@ -115,29 +115,34 @@ esac
 #
 # A note about classpaths.
 #
-# The classpath is configured such that entries are stripped prior
-# to handing to Java based either upon duplication or non-existence.
-# Wildcards and/or directories are *NOT* expanded as the
-# de-duplication is fairly simple. So if two directories are in
-# the classpath that both contain awesome-methods-1.0.jar,
-# awesome-methods-1.0.jar will still be seen by java. But if
-# the classpath specifically has awesome-methods-1.0.jar from the
-# same directory listed twice, the last one will be removed.
-#
+# By default, Apache Hadoop overrides Java's CLASSPATH
+# environment variable. It is configured such
+# that it starts out blank with new entries added after passing
+# a series of checks (file/dir exists, not already listed aka
+# de-duplication). During de-duplication, wildcards and/or
+# directories are *NOT* expanded to keep it simple. Therefore,
+# if the computed classpath has two specific mentions of
+# awesome-methods-1.0.jar, only the first one added will be seen.
+# If two directories are in the classpath that both contain
+# awesome-methods-1.0.jar, then Java will pick up both versions.

-# An additional, custom CLASSPATH. This is really meant for
-# end users, but as an administrator, one might want to push
-# something extra in here too, such as the jar to the topology
-# method. Just be sure to append to the existing HADOOP_USER_CLASSPATH
-# so end users have a way to add stuff.
-# export HADOOP_USER_CLASSPATH="/some/cool/path/on/your/machine"
+# An additional, custom CLASSPATH. Site-wide configs should be
+# handled via the shellprofile functionality, utilizing the
+# hadoop_add_classpath function for greater control and to make
+# it much harder for apps/end-users to accidentally override.
+# Similarly, end users should utilize ${HOME}/.hadooprc.
+# This variable should ideally only be used as a short-cut,
+# interactive way for temporary additions on the command line.
+# export HADOOP_CLASSPATH="/some/cool/path/on/your/machine"

-# Should HADOOP_USER_CLASSPATH be first in the official CLASSPATH?
+# Should HADOOP_CLASSPATH be first in the official CLASSPATH?
 # export HADOOP_USER_CLASSPATH_FIRST="yes"

-# If HADOOP_USE_CLIENT_CLASSLOADER is set, HADOOP_CLASSPATH along with the main
-# jar are handled by a separate isolated client classloader. If it is set,
-# HADOOP_USER_CLASSPATH_FIRST is ignored. Can be defined by doing
+# If HADOOP_USE_CLIENT_CLASSLOADER is set, the classpath along
+# with the main jar are handled by a separate isolated
+# client classloader when 'hadoop jar', 'yarn jar', or 'mapred job'
+# is utilized. If it is set, HADOOP_CLASSPATH and
+# HADOOP_USER_CLASSPATH_FIRST are ignored.
 # export HADOOP_USE_CLIENT_CLASSLOADER=true

 # HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES overrides the default definition of
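For context on the guidance above: site-wide additions go through a shellprofile or ${HOME}/.hadooprc with hadoop_add_classpath, while HADOOP_CLASSPATH itself is meant for quick command-line additions. A minimal sketch of both follows; the jar and directory paths are illustrative, and the optional "before" ordering argument to hadoop_add_classpath is an assumption about the helper, not something shown in this patch.

```bash
# One-off, interactive addition: prefix the command with HADOOP_CLASSPATH
# (paths are illustrative). `hadoop classpath` prints the computed result.
HADOOP_CLASSPATH=/tmp/debug-tools/*.jar hadoop classpath

# Permanent per-user additions: ${HOME}/.hadooprc is read by the shell
# scripts, so it can call hadoop_add_classpath directly. The "before"
# ordering argument is an assumption, not part of this commit.
cat >> "${HOME}/.hadooprc" <<'EOF'
hadoop_add_classpath "${HOME}/lib/myjars/extra.jar"
hadoop_add_classpath "${HOME}/conf-overrides" before
EOF
```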

View File

@@ -32,12 +32,14 @@ HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /
 will increase the memory and send this command via a SOCKS proxy server.

-### `HADOOP_USER_CLASSPATH`
+### `HADOOP_CLASSPATH`

+NOTE: Site-wide settings should be configured via a shellprofile entry and permanent user-wide settings should be configured via ${HOME}/.hadooprc using the `hadoop_add_classpath` function. See below for more information.
+
 The Apache Hadoop scripts have the capability to inject more content into the classpath of the running command by setting this environment variable. It should be a colon delimited list of directories, files, or wildcard locations.

 ```bash
-HADOOP_USER_CLASSPATH=${HOME}/lib/myjars/*.jar hadoop classpath
+HADOOP_CLASSPATH=${HOME}/lib/myjars/*.jar hadoop classpath
 ```

 A user can provide hints to the location of the paths via the `HADOOP_USER_CLASSPATH_FIRST` variable. Setting this to any value will tell the system to try and push these paths near the front.

@@ -53,8 +55,6 @@ For example:
 # my custom Apache Hadoop settings!
 #
-HADOOP_USER_CLASSPATH=${HOME}/hadoopjars/*
-HADOOP_USER_CLASSPATH_FIRST=yes
 HADOOP_CLIENT_OPTS="-Xmx1g"
 ```
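The `HADOOP_USER_CLASSPATH_FIRST` paragraph above keeps its old name but gains no example in this hunk; a minimal sketch of how it combines with the renamed variable (the jar directory is illustrative):

```bash
# Any non-empty value of HADOOP_USER_CLASSPATH_FIRST asks the scripts to
# push the extra entries toward the front of the computed classpath.
HADOOP_USER_CLASSPATH_FIRST=yes \
HADOOP_CLASSPATH=${HOME}/lib/myjars/*.jar \
hadoop classpath
```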

View File

@@ -56,10 +56,10 @@ function hadoop_subproject_init
   HADOOP_YARN_HOME="${HADOOP_YARN_HOME:-$HADOOP_HOME}"

   # YARN-1429 added the completely superfluous YARN_USER_CLASSPATH
-  # env var. We're going to override HADOOP_USER_CLASSPATH to keep
+  # env var. We're going to override HADOOP_CLASSPATH to keep
   # consistency with the rest of the duplicate/useless env vars
-  hadoop_deprecate_envvar YARN_USER_CLASSPATH HADOOP_USER_CLASSPATH
+  hadoop_deprecate_envvar YARN_USER_CLASSPATH HADOOP_CLASSPATH
   hadoop_deprecate_envvar YARN_USER_CLASSPATH_FIRST HADOOP_USER_CLASSPATH_FIRST
 }
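hadoop_deprecate_envvar is what keeps the superfluous YARN_* names working while steering them onto the HADOOP_* ones. The real function lives in hadoop-functions.sh; the sketch below only illustrates the usual shape of such a shim, warning when the old variable is set and carrying its value into the new one, and is not the actual implementation.

```bash
# Illustrative stand-in for hadoop_deprecate_envvar, not the real code.
function example_deprecate_envvar
{
  local oldvar=$1
  local newvar=$2

  # if the deprecated variable is set, warn and copy its value across
  if [[ -n "${!oldvar}" ]]; then
    echo "WARNING: ${oldvar} has been replaced by ${newvar}. Using value of ${oldvar}." 1>&2
    eval "${newvar}=\"\${${oldvar}}\""
  fi
}

# e.g.: example_deprecate_envvar YARN_USER_CLASSPATH HADOOP_CLASSPATH
```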