Build instructions for Hadoop

----------------------------------------------------------------------------------
Requirements:

* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

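As a rough sketch, the requirements listed above can be probed before the first
build with a small shell function. The executable names used here (java, mvn,
protoc, cmake) are the usual ones and are an assumption; adjust them if your
installation differs. The function only checks for presence on the PATH, not
for versions.

```shell
#!/bin/sh
# Report which of the given commands are missing from the PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for the JDK, Maven, ProtocolBuffer and CMake.
check_tools java mvn protoc cmake || echo "install the missing tools before building"
```

cmake is only needed when compiling native code, so it can be dropped from the
argument list for a pure Java build.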
----------------------------------------------------------------------------------
Maven main modules:

  hadoop                       (Main Hadoop project)
    - hadoop-project           (Parent POM for all Hadoop Maven modules.
                                All plugin and dependency versions are defined here.)
    - hadoop-project-dist      (Parent POM for modules that generate distributions.)
    - hadoop-annotations       (Generates the Hadoop doclet used to generate the Javadocs)
    - hadoop-assemblies        (Maven assemblies used by the different modules)
    - hadoop-common-project    (Hadoop Common)
    - hadoop-hdfs-project      (Hadoop HDFS)
    - hadoop-mapreduce-project (Hadoop MapReduce)
    - hadoop-tools             (Hadoop tools like Streaming, Distcp, etc.)
    - hadoop-dist              (Hadoop distribution assembler)

----------------------------------------------------------------------------------
Where to run Maven from?

Maven can be run from any module. The only catch is that if it is not run from
trunk, all modules that are not part of the build run must be installed in the
local Maven cache or available in a Maven repository.

----------------------------------------------------------------------------------
Maven build goals:

 * Clean                     : mvn clean
 * Compile                   : mvn compile [-Pnative]
 * Run tests                 : mvn test [-Pnative]
 * Create JAR                : mvn package
 * Run findbugs              : mvn compile findbugs:findbugs
 * Run checkstyle            : mvn compile checkstyle:checkstyle
 * Install JAR in M2 cache   : mvn install
 * Deploy JAR to Maven repo  : mvn deploy
 * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.home}/.clover.license]
 * Run Rat                   : mvn apache-rat:check
 * Build javadocs            : mvn javadoc:javadoc
 * Build distribution        : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
 * Change Hadoop version     : mvn versions:set -DnewVersion=NEWVERSION

Build options:

  * Use -Pnative to compile/bundle native code
  * Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
  * Use -Psrc to create a project source TAR.GZ
  * Use -Dtar to create a TAR with the distribution (using -Pdist)

Snappy build options:

  Snappy is a compression library that can be utilized by the native code.
  It is currently an optional component, meaning that Hadoop can be built with
  or without this dependency.

  * Use -Drequire.snappy to fail the build if libsnappy.so is not found.
    If this option is not specified and the snappy library is missing,
    we silently build a version of libhadoop.so that cannot make use of snappy.
    This option is recommended if you plan on making use of snappy and want
    to get more repeatable builds.

  * Use -Dsnappy.prefix to specify a nonstandard location for the libsnappy
    header files and library files. You do not need this option if you have
    installed snappy using a package manager.
  * Use -Dsnappy.lib to specify a nonstandard location for the libsnappy library
    files. Similarly to snappy.prefix, you do not need this option if you have
    installed snappy using a package manager.
  * Use -Dbundle.snappy to copy the contents of the snappy.lib directory into
    the final tar file. This option requires that -Dsnappy.lib is also given,
    and it ignores the -Dsnappy.prefix option.

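For illustration, the Snappy options can be combined on a single command line.
The sketch below only prints the Maven command instead of running it, so it can
be tried without a Hadoop checkout. The function name and the library directory
/usr/local/lib are hypothetical.

```shell
#!/bin/sh
# Print (do not run) a native distribution build that requires Snappy.
# $1 is an optional directory holding the libsnappy library files; when it
# is given, the libraries are also bundled into the final tar file.
snappy_build_cmd() {
    cmd="mvn package -Pdist,native -DskipTests -Dtar -Drequire.snappy"
    if [ -n "${1:-}" ]; then
        cmd="$cmd -Dsnappy.lib=$1 -Dbundle.snappy"
    fi
    echo "$cmd"
}

snappy_build_cmd                 # Snappy installed by a package manager
snappy_build_cmd /usr/local/lib  # hypothetical nonstandard library location
```

The second form passes -Dsnappy.lib together with -Dbundle.snappy because the
bundling option requires the library directory to be given.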
Test options:

  * Use -DskipTests to skip tests when running the following Maven goals:
    'package', 'install', 'deploy' or 'verify'
  * Use -Dtest=<TESTCLASSNAME>,<TESTCLASSNAME#METHODNAME>,.... to run only the
    listed test classes or test methods
  * Use -Dtest.exclude=<TESTCLASSNAME> to exclude a test class
  * Use -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java
    to exclude the tests matching the given patterns

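A minimal sketch of how a -Dtest selection is assembled from several entries.
It prints the command rather than running it, and the test names TestFoo and
TestBar#testBaz are hypothetical placeholders.

```shell
#!/bin/sh
# Print (do not run) a Maven command that runs only the selected tests.
# Each argument is a test class name or a class#method pair.
run_tests_cmd() {
    selection=""
    for t in "$@"; do
        # Join the entries with commas, the separator -Dtest expects.
        selection="${selection:+$selection,}$t"
    done
    echo "mvn test -Dtest=$selection"
}

# Hypothetical test names; quote entries containing '#' to be safe.
run_tests_cmd TestFoo 'TestBar#testBaz'
```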
----------------------------------------------------------------------------------
Building components separately

If you are building a submodule directory, all the Hadoop dependencies this
submodule has will be resolved like any other third-party dependency: that is,
from the Maven cache or from a Maven repository (if not available in the cache
or if the SNAPSHOT has 'timed out').
An alternative is to run 'mvn install -DskipTests' from the top level of the
Hadoop source tree once, and then work from the submodule. Keep in mind that
SNAPSHOTs time out after a while; using the Maven '-nsu' option will stop Maven
from trying to update SNAPSHOTs from external repos.

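The "install once, then work from the submodule" approach can be sketched as
follows. The helper names and the DRY_RUN switch are hypothetical; with
DRY_RUN=1 (the default in this sketch) the commands are only printed, so the
script can be read and tried without a source tree.

```shell
#!/bin/sh
# Sketch of the "install once, then work in a submodule" workflow.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_submodule() {
    # $1: submodule directory, relative to the top of the source tree.
    run mvn install -DskipTests   # once, from the top level
    run cd "$1"                   # then work from the submodule
    run mvn test -nsu             # -nsu: do not update SNAPSHOTs
}

build_submodule hadoop-hdfs-project
```

Set DRY_RUN=0 and run the script from the top of the source tree to execute the
commands for real.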
----------------------------------------------------------------------------------
Protocol Buffer compiler

The version of the Protocol Buffer compiler, protoc, must match the version of
the protobuf JAR.

If you have multiple versions of protoc on your system, you can set the
HADOOP_PROTOC_PATH environment variable in your build shell to point to the one
you want to use for the Hadoop build. If you don't define this environment
variable, protoc is looked up in the PATH.

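A small sketch for checking the protoc version before building. It assumes that
HADOOP_PROTOC_PATH, when set, names the protoc executable itself, and that the
expected version is the 2.5.0 listed in the requirements. The function name is
hypothetical.

```shell
#!/bin/sh
# Return success if the output of `protoc --version` (for example
# "libprotoc 2.5.0") reports the expected version.
protoc_version_matches() {
    # $1: version output, $2: expected version
    [ "${1#libprotoc }" = "$2" ]
}

# Use the protoc named by HADOOP_PROTOC_PATH if set, else the one on the PATH.
protoc_bin="${HADOOP_PROTOC_PATH:-protoc}"
if command -v "$protoc_bin" >/dev/null 2>&1; then
    if protoc_version_matches "$("$protoc_bin" --version)" "2.5.0"; then
        echo "protoc 2.5.0 found"
    else
        echo "protoc version mismatch: $("$protoc_bin" --version)"
    fi
else
    echo "protoc not found"
fi
```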
----------------------------------------------------------------------------------
Importing projects to Eclipse

Before importing the project into Eclipse, install hadoop-maven-plugins first.

  $ cd hadoop-maven-plugins
  $ mvn install

Then, generate the Eclipse project files.

  $ mvn eclipse:eclipse -DskipTests

Finally, import the project into Eclipse by specifying the root directory of the
project via [File] > [Import] > [Existing Projects into Workspace].

----------------------------------------------------------------------------------
Building distributions:

Create binary distribution without native code and without documentation:

  $ mvn package -Pdist -DskipTests -Dtar

Create binary distribution with native code and with documentation:

  $ mvn package -Pdist,native,docs -DskipTests -Dtar

Create source distribution:

  $ mvn package -Psrc -DskipTests

Create source and binary distributions with native code and documentation:

  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar

Create a local staging version of the website (in /tmp/hadoop-site):

  $ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site

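The packaging commands above differ only in the list of profiles. As a sketch,
a hypothetical helper can assemble the command from the profile names; it
prints the command instead of running it, and adds -Dtar only when the dist
profile is requested.

```shell
#!/bin/sh
# Print (do not run) the packaging command for a chosen set of profiles.
# Arguments are any of the profile names used above: dist, native, docs, src.
dist_cmd() {
    profiles=""
    tar_flag=""
    for p in "$@"; do
        profiles="${profiles:+$profiles,}$p"
        # -Dtar only applies when the dist profile is active.
        if [ "$p" = "dist" ]; then
            tar_flag=" -Dtar"
        fi
    done
    echo "mvn package -P$profiles -DskipTests$tar_flag"
}

dist_cmd dist               # binary distribution, no native code or docs
dist_cmd dist native docs   # binary distribution with native code and docs
dist_cmd src                # source distribution
```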
----------------------------------------------------------------------------------

Handling out of memory errors in builds

----------------------------------------------------------------------------------

If the build process fails with an out of memory error, you should be able to fix
it by increasing the memory used by Maven, which can be done via the environment
variable MAVEN_OPTS.

Here is an example setting to allocate between 256 and 512 MB of heap space to
Maven:

export MAVEN_OPTS="-Xms256m -Xmx512m"

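If MAVEN_OPTS may already be set in your environment, a variant that applies the
example values only as a default could look like this. The function name is
hypothetical.

```shell
#!/bin/sh
# Apply the example heap settings only when MAVEN_OPTS is unset or empty,
# so that an existing configuration is left untouched.
default_maven_opts() {
    : "${MAVEN_OPTS:=-Xms256m -Xmx512m}"
    export MAVEN_OPTS
}

default_maven_opts
echo "MAVEN_OPTS=$MAVEN_OPTS"
```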
----------------------------------------------------------------------------------

Building on OS X

----------------------------------------------------------------------------------

A manual step is required to enable building Hadoop on OS X with Java 7. It must
be repeated every time the JDK is updated.
See: https://issues.apache.org/jira/browse/HADOOP-9350

  $ sudo mkdir `/usr/libexec/java_home`/Classes
  $ sudo ln -s `/usr/libexec/java_home`/lib/tools.jar `/usr/libexec/java_home`/Classes/classes.jar

----------------------------------------------------------------------------------

Building on Windows

----------------------------------------------------------------------------------
Requirements:

* Windows System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* Windows SDK or Visual Studio 2010 Professional
* Unix command-line tools from GnuWin32 or Cygwin: sh, mkdir, rm, cp, tar, gzip
* zlib headers (if building native code bindings for zlib)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

If using Visual Studio, it must be Visual Studio 2010 Professional (not 2012).
Do not use Visual Studio Express. It does not support compiling for 64-bit,
which is problematic if running a 64-bit system. The Windows SDK is free to
download here:

http://www.microsoft.com/en-us/download/details.aspx?id=8279

----------------------------------------------------------------------------------
Building:

Keep the source code tree in a short path to avoid running into problems related
to the Windows maximum path length limitation (for example, C:\hdc).

Run builds from a Windows SDK Command Prompt. (Start, All Programs,
Microsoft Windows SDK v7.1, Windows SDK 7.1 Command Prompt.)

JAVA_HOME must be set, and the path must not contain spaces. If the full path
would contain spaces, then use the Windows short path instead.

You must set the Platform environment variable to either x64 or Win32 depending
on whether you're running a 64-bit or 32-bit system. Note that this is
case-sensitive. It must be "Platform", not "PLATFORM" or "platform".
Environment variables on Windows are usually case-insensitive, but Maven treats
them as case-sensitive. Failure to set this environment variable correctly will
cause msbuild to fail while building the native code in hadoop-common.

set Platform=x64 (when building on a 64-bit system)
set Platform=Win32 (when building on a 32-bit system)

Several tests require that the user have the Create Symbolic Links privilege.

All Maven goals are the same as described above with the exception that
native code is built by enabling the 'native-win' Maven profile. -Pnative-win
is enabled by default when building on Windows since the native components
are required (not optional) on Windows.

If native code bindings for zlib are required, then the zlib headers must be
deployed on the build machine. Set the ZLIB_HOME environment variable to the
directory containing the headers.

set ZLIB_HOME=C:\zlib-1.2.7

At runtime, zlib1.dll must be accessible on the PATH. Hadoop has been tested
with zlib 1.2.7, built using Visual Studio 2010 out of contrib\vstudio\vc10 in
the zlib 1.2.7 source tree.

http://www.zlib.net/

----------------------------------------------------------------------------------
Building distributions:

 * Build distribution with native code    : mvn package [-Pdist][-Pdocs][-Psrc][-Dtar]