<!---
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

Enabling Dapper-like Tracing in Hadoop
======================================

* [Enabling Dapper-like Tracing in Hadoop](#Enabling_Dapper-like_Tracing_in_Hadoop)
    * [Dapper-like Tracing in Hadoop](#Dapper-like_Tracing_in_Hadoop)
        * [HTrace](#HTrace)
        * [SpanReceivers](#SpanReceivers)
        * [Dynamic update of tracing configuration](#Dynamic_update_of_tracing_configuration)
        * [Starting tracing spans by HTrace API](#Starting_tracing_spans_by_HTrace_API)
        * [Sample code for tracing by HTrace API](#Sample_code_for_tracing_by_HTrace_API)
        * [Starting tracing spans by FileSystem Shell](#Starting_tracing_spans_by_FileSystem_Shell)
        * [Starting tracing spans by configuration for HDFS client](#Starting_tracing_spans_by_configuration_for_HDFS_client)

Dapper-like Tracing in Hadoop
-----------------------------

### HTrace

[HDFS-5274](https://issues.apache.org/jira/browse/HDFS-5274) added support for tracing requests through HDFS,
using the open source tracing library
[Apache HTrace](http://htrace.incubator.apache.org/).
Setting up tracing is simple, but it requires some minor changes to your client code.

### SpanReceivers

The tracing system works by collecting information in structs called 'Spans'.
You choose how to receive this information
by using an implementation of the [SpanReceiver](http://htrace.incubator.apache.org/#Span_Receivers)
interface bundled with HTrace or by implementing it yourself.

[HTrace](http://htrace.incubator.apache.org/) provides options such as

* FlumeSpanReceiver
* HBaseSpanReceiver
* HTracedRESTReceiver
* ZipkinSpanReceiver

See core-default.xml for a description of HTrace configuration keys. In some
cases, you will also need to add the jar containing the SpanReceiver that you
are using to the classpath of Hadoop on each node. (LocalFileSpanReceiver is
included in the htrace-core4 jar, which is bundled with Hadoop, so it needs
no additional jar.) For example, to add the htrace-htraced jar:

```
$ cp htrace-htraced/target/htrace-htraced-4.0.1-incubating.jar $HADOOP_HOME/share/hadoop/common/lib/
```
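
To make the relationship between spans and receivers concrete, here is a minimal sketch written with plain Java types. `DemoSpan`, `DemoReceiver` and `InMemoryReceiver` are hypothetical stand-ins invented for this illustration; they are not the HTrace classes, and the fields chosen for the span (a description and two timestamps) are an assumption about what a span typically carries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT HTrace classes.
class SpanReceiverSketch {
  // A span modelled as a named unit of work with start and stop timestamps.
  static final class DemoSpan {
    final String description;
    final long startMs;
    final long stopMs;

    DemoSpan(String description, long startMs, long stopMs) {
      this.description = description;
      this.startMs = startMs;
      this.stopMs = stopMs;
    }

    long durationMs() {
      return stopMs - startMs;
    }
  }

  // A receiver is handed each finished span and decides what to do with it.
  interface DemoReceiver {
    void receive(DemoSpan span);
  }

  // Simplest possible receiver: keep every span in memory.
  static final class InMemoryReceiver implements DemoReceiver {
    final List<DemoSpan> spans = new ArrayList<>();

    @Override
    public void receive(DemoSpan span) {
      spans.add(span);
    }
  }

  public static void main(String[] args) {
    InMemoryReceiver receiver = new InMemoryReceiver();
    receiver.receive(new DemoSpan("listStatus", 1000L, 1150L));
    receiver.receive(new DemoSpan("open", 1200L, 1210L));
    for (DemoSpan span : receiver.spans) {
      System.out.println(span.description + " took " + span.durationMs() + " ms");
    }
  }
}
```

A real receiver would forward spans to a file or an external service rather than hold them in memory; the bundled receivers listed above play that role.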

### Dynamic update of tracing configuration

You can use the `hadoop trace` command to view and update the tracing configuration of each server.
You must specify the IPC server address of the namenode or datanode with the `-host` option.
To update the configuration of all servers, you need to run the command against each of them.

`hadoop trace -list` shows the loaded span receivers along with their IDs.

    $ hadoop trace -list -host 192.168.56.2:9000
    ID  CLASS
    1   org.apache.htrace.core.LocalFileSpanReceiver

    $ hadoop trace -list -host 192.168.56.2:50020
    ID  CLASS
    1   org.apache.htrace.core.LocalFileSpanReceiver

`hadoop trace -remove` removes a span receiver from a server.
The `-remove` option takes the ID of the span receiver as its argument.

    $ hadoop trace -remove 1 -host 192.168.56.2:9000
    Removed trace span receiver 1

`hadoop trace -add` adds a span receiver to a server.
You need to specify the class name of the span receiver as the argument of the `-class` option.
You can specify the configuration associated with the span receiver with `-Ckey=value` options.

    $ hadoop trace -add -class org.apache.htrace.core.LocalFileSpanReceiver -Chadoop.htrace.local.file.span.receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
    Added trace span receiver 2 with configuration hadoop.htrace.local.file.span.receiver.path = /tmp/htrace.out

    $ hadoop trace -list -host 192.168.56.2:9000
    ID  CLASS
    2   org.apache.htrace.core.LocalFileSpanReceiver
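
Because the command has to be run once per server, it can help to generate the command lines rather than type them. The following is a minimal sketch, assuming you already know the IPC addresses of your servers. `TraceAddCommands` is a hypothetical helper, not part of Hadoop, and it only assembles strings from the options shown above (`-add`, `-class`, `-Ckey=value`, `-host`); it does not run anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Hadoop): builds one `hadoop trace -add`
// command line per IPC address, using only the options documented above.
class TraceAddCommands {
  static List<String> build(String receiverClass, String confKey,
                            String confValue, List<String> hosts) {
    List<String> commands = new ArrayList<>();
    for (String host : hosts) {
      commands.add("hadoop trace -add -class " + receiverClass
          + " -C" + confKey + "=" + confValue
          + " -host " + host);
    }
    return commands;
  }

  public static void main(String[] args) {
    // Addresses taken from the examples above; replace them with your own.
    List<String> hosts = Arrays.asList("192.168.56.2:9000", "192.168.56.2:50020");
    List<String> commands = build(
        "org.apache.htrace.core.LocalFileSpanReceiver",
        "hadoop.htrace.local.file.span.receiver.path",
        "/tmp/htrace.out",
        hosts);
    for (String command : commands) {
      System.out.println(command);
    }
  }
}
```

The printed lines can be reviewed and then pasted into a shell, which keeps the step of actually changing server configuration a deliberate one.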

### Starting tracing spans by HTrace API

To trace, you need to wrap the traced logic in a **tracing span** as shown below.
While a tracing span is running,
the tracing information is propagated to servers along with RPC requests.

```java
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.htrace.core.Tracer;
import org.apache.htrace.core.TraceScope;

...

    ...

    // `tracer` is an org.apache.htrace.core.Tracer created elsewhere.
    TraceScope ts = tracer.newScope("Gets");
    try {
      ... // traced logic
    } finally {
      ts.close();  // always close the scope, even if the traced logic throws
    }
```
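
The `try`/`finally` pattern above matters because the scope must be closed even when the traced logic fails. The sketch below demonstrates that ordering with a stand-in class. `DemoScope` is hypothetical and only records events in a list; it is not the HTrace `TraceScope`. It uses try-with-resources, which closes the resource in the same way as an explicit `finally` block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing scope; it only records lifecycle events.
class ScopeLifecycleSketch {
  static final class DemoScope implements AutoCloseable {
    private final String name;
    private final List<String> events;

    DemoScope(String name, List<String> events) {
      this.name = name;
      this.events = events;
      events.add("open " + name);
    }

    @Override
    public void close() {
      events.add("close " + name);
    }
  }

  // Runs some "traced logic" inside a scope and returns the recorded events.
  static List<String> run(boolean fail) {
    List<String> events = new ArrayList<>();
    try (DemoScope scope = new DemoScope("Gets", events)) {
      events.add("traced logic");
      if (fail) {
        throw new IllegalStateException("simulated failure");
      }
    } catch (IllegalStateException e) {
      // The scope has already been closed by the time this handler runs.
      events.add("caught " + e.getMessage());
    }
    return events;
  }

  public static void main(String[] args) {
    System.out.println(run(false)); // prints [open Gets, traced logic, close Gets]
    System.out.println(run(true));  // prints [open Gets, traced logic, close Gets, caught simulated failure]
  }
}
```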

### Sample code for tracing by HTrace API

The `Sample.java` shown below is a small `Tool` that
starts a tracing span, sleeps for one second, and then lists the root directory of the file system inside that span.

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.tracing.TraceUtils;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.htrace.core.Tracer;
import org.apache.htrace.core.TraceScope;

public class Sample extends Configured implements Tool {
  @Override
  public int run(String argv[]) throws Exception {
    FileSystem fs = FileSystem.get(getConf());
    // Build a tracer from the properties prefixed with "sample.htrace.".
    Tracer tracer = new Tracer.Builder("Sample").
        conf(TraceUtils.wrapHadoopConf("sample.htrace.", getConf())).
        build();
    int res = 0;
    // The scope is closed automatically at the end of the try block.
    try (TraceScope scope = tracer.newScope("sample")) {
      Thread.sleep(1000);
      fs.listStatus(new Path("/"));
    }
    tracer.close();
    return res;
  }

  public static void main(String argv[]) throws Exception {
    ToolRunner.run(new Sample(), argv);
  }
}
```

You can compile and execute this code as shown below.

    $ javac -cp `hadoop classpath` Sample.java
    $ java -cp .:`hadoop classpath` Sample \
        -Dsample.htrace.span.receiver.classes=LocalFileSpanReceiver \
        -Dsample.htrace.sampler.classes=AlwaysSampler

### Starting tracing spans by FileSystem Shell

The FileSystem Shell can enable tracing through configuration properties.

Configure the span receivers and samplers in `core-site.xml` or on the command line
with the properties `fs.shell.htrace.sampler.classes` and
`fs.shell.htrace.span.receiver.classes`.

    $ hdfs dfs -Dfs.shell.htrace.span.receiver.classes=LocalFileSpanReceiver \
               -Dfs.shell.htrace.sampler.classes=AlwaysSampler \
               -ls /

### Starting tracing spans by configuration for HDFS client

The DFSClient can enable tracing internally. This allows you to use HTrace with
your client without modifying the client source code.

Configure the span receivers and samplers in `hdfs-site.xml`
with the properties `fs.client.htrace.sampler.classes` and
`fs.client.htrace.span.receiver.classes`. The value of
`fs.client.htrace.sampler.classes` can be NeverSampler, AlwaysSampler or
ProbabilitySampler.

* NeverSampler: HTrace is OFF for all requests to namenodes and datanodes;
* AlwaysSampler: HTrace is ON for all requests to namenodes and datanodes;
* ProbabilitySampler: HTrace is ON for a fraction of requests to namenodes and datanodes,
  set by `fs.client.htrace.sampler.fraction`

```xml
<property>
  <name>hadoop.htrace.span.receiver.classes</name>
  <value>LocalFileSpanReceiver</value>
</property>
<property>
  <name>fs.client.htrace.sampler.classes</name>
  <value>ProbabilitySampler</value>
</property>
<property>
  <name>fs.client.htrace.sampler.fraction</name>
  <value>0.01</value>
</property>
```
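
To give a feel for what a sampling fraction of `0.01` means in practice, the following sketch simulates a fraction-based sampler. `FractionSamplerSketch` is a hypothetical stand-in and is not HTrace's ProbabilitySampler; it assumes the simplest possible rule, where each request is traced independently when a uniform random draw falls below the fraction.

```java
import java.util.Random;

// Illustrative stand-in only; this is NOT HTrace's ProbabilitySampler.
class FractionSamplerSketch {
  private final double fraction;
  private final Random random;

  FractionSamplerSketch(double fraction, Random random) {
    if (fraction < 0.0 || fraction > 1.0) {
      throw new IllegalArgumentException("fraction must be in [0, 1]: " + fraction);
    }
    this.fraction = fraction;
    this.random = random;
  }

  // Returns true when the next request should be traced.
  // nextDouble() is in [0, 1), so 0.0 never samples and 1.0 always samples.
  boolean next() {
    return random.nextDouble() < fraction;
  }

  // Counts how many of `requests` simulated requests would be traced.
  static int countSampled(double fraction, int requests, long seed) {
    FractionSamplerSketch sampler = new FractionSamplerSketch(fraction, new Random(seed));
    int sampled = 0;
    for (int i = 0; i < requests; i++) {
      if (sampler.next()) {
        sampled++;
      }
    }
    return sampled;
  }

  public static void main(String[] args) {
    int requests = 100000;
    for (double fraction : new double[] {0.0, 0.01, 1.0}) {
      System.out.println("fraction=" + fraction + " traced "
          + countSampled(fraction, requests, 42L) + " of " + requests);
    }
  }
}
```

With a fraction of `0.01`, roughly one request in a hundred is traced, which keeps tracing overhead low on a busy client while still producing a steady stream of spans.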