Hadoop: CLI MiniCluster.
========================

* [Hadoop: CLI MiniCluster.](#Hadoop:_CLI_MiniCluster.)
    * [Purpose](#Purpose)
    * [Hadoop Tarball](#Hadoop_Tarball)
    * [Running the MiniCluster](#Running_the_MiniCluster)

Purpose
-------

Using the CLI MiniCluster, users can start and stop a single-node Hadoop cluster with a single command, without setting any environment variables or managing configuration files. The CLI MiniCluster starts both a `YARN`/`MapReduce` cluster and an `HDFS` cluster.

This is useful when users want to quickly experiment with a real Hadoop cluster or test non-Java programs that rely on significant Hadoop functionality.

Hadoop Tarball
--------------

You can obtain the Hadoop tarball from a release. Alternatively, you can build a tarball directly from the source:

    $ mvn clean install -DskipTests
    $ mvn package -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip

**NOTE:** You will need [protoc 2.5.0](http://code.google.com/p/protobuf/) installed.

The tarball should be available in the `hadoop-dist/target/` directory.

Running the MiniCluster
-----------------------

From inside the root directory of the extracted tarball, you can start the CLI MiniCluster using the following command:

    $ HADOOP_CLASSPATH=share/hadoop/yarn/test/hadoop-yarn-server-tests-${project.version}-tests.jar bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${project.version}-tests.jar minicluster -rmport RM_PORT -jhsport JHS_PORT

In the example command above, `${project.version}` stands for the Hadoop version of the tarball, and `RM_PORT` and `JHS_PORT` should be replaced with the port numbers of your choice. If the ports are not specified, random free ports are used.

A number of command line arguments let users control which services to start and pass other configuration properties.
The available command line arguments:

    -D             Options to pass into configuration object
    -datanodes     How many datanodes to start (default 1)
    -format        Format the DFS (default false)
    -help          Prints option help.
    -jhsport       JobHistoryServer port (default 0--we choose)
    -namenode      URL of the namenode (default is either the DFS
                   cluster or a temporary dir)
    -nnport        NameNode port (default 0--we choose)
    -nodemanagers  How many nodemanagers to start (default 1)
    -nodfs         Don't start a mini DFS cluster
    -nomr          Don't start a mini MR cluster
    -rmport        ResourceManager port (default 0--we choose)
    -writeConfig   Save configuration to this XML file.
    -writeDetails  Write basic information to this JSON file.

To display this full list of available arguments, the user can pass the `-help` argument to the above command.
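Because the start command is long and repeats the version in two jar names, it can help to assemble it from a few parameters. The following is a minimal sketch, not part of Hadoop itself: the function name `minicluster_cmd` is hypothetical, and the version `2.7.3` and ports `8032` and `10020` are example values only. It prints the command rather than running it, so the result can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hadoop): prints the MiniCluster
# start command for a given Hadoop version and pair of ports.
# Usage: minicluster_cmd VERSION RM_PORT JHS_PORT [extra minicluster arguments...]
minicluster_cmd() {
    version=$1
    rm_port=$2
    jhs_port=$3
    shift 3

    # The YARN server tests jar must be on the classpath, as in the command above.
    printf '%s' "HADOOP_CLASSPATH=share/hadoop/yarn/test/hadoop-yarn-server-tests-${version}-tests.jar"

    # printf reuses the format for every remaining operand, so each word
    # is emitted with a single leading space.
    printf ' %s' "bin/hadoop" "jar" \
        "./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${version}-tests.jar" \
        "minicluster" "-rmport" "$rm_port" "-jhsport" "$jhs_port" "$@"
    printf '\n'
}

# Example values only: version 2.7.3, ResourceManager on 8032,
# JobHistoryServer on 10020, and a freshly formatted DFS.
minicluster_cmd 2.7.3 8032 10020 -format
```

The printed line is meant to be pasted and run from the root directory of the extracted tarball. Any further arguments from the list above, such as `-nomr` or `-datanodes`, can be appended after the ports.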