KAFKA-2379: Add basic documentation for Kafka Connect.
Author: Ewen Cheslack-Postava <[email protected]>

Reviewers: Gwen Shapira

Closes apache#475 from ewencp/kafka-2379-connect-docs
ewencp authored and gwenshap committed Nov 10, 2015
1 parent 79bdc17 commit 83eaf32
Showing 11 changed files with 428 additions and 8 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -41,3 +41,4 @@ tests/.ducktape
docs/producer_config.html
docs/consumer_config.html
docs/kafka_config.html
docs/connect_config.html
8 changes: 7 additions & 1 deletion build.gradle
@@ -319,7 +319,7 @@ project(':core') {
standardOutput = new File('docs/kafka_config.html').newOutputStream()
}

-task siteDocsTar(dependsOn: ['genProducerConfigDocs', 'genConsumerConfigDocs', 'genKafkaConfigDocs'], type: Tar) {
+task siteDocsTar(dependsOn: ['genProducerConfigDocs', 'genConsumerConfigDocs', 'genKafkaConfigDocs', ':connect:runtime:genConnectConfigDocs'], type: Tar) {
classifier = 'site-docs'
compression = Compression.GZIP
from project.file("../docs")
@@ -818,6 +818,12 @@ project(':connect:runtime') {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest')

tasks.create(name: "genConnectConfigDocs", dependsOn:jar, type: JavaExec) {
classpath = sourceSets.main.runtimeClasspath
main = 'org.apache.kafka.connect.runtime.distributed.DistributedConfig'
standardOutput = new File('docs/connect_config.html').newOutputStream()
}
}

project(':connect:file') {
2 changes: 1 addition & 1 deletion config/connect-console-sink.properties
@@ -16,4 +16,4 @@
name=local-console-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
-topics=test
+topics=connect-test
2 changes: 1 addition & 1 deletion config/connect-console-source.properties
@@ -16,4 +16,4 @@
name=local-console-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
-topic=test
+topic=connect-test
2 changes: 1 addition & 1 deletion config/connect-file-sink.properties
@@ -17,4 +17,4 @@ name=local-file-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
file=test.sink.txt
-topics=test
+topics=connect-test
2 changes: 1 addition & 1 deletion config/connect-file-source.properties
@@ -17,4 +17,4 @@ name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=test.txt
-topic=test
+topic=connect-test
@@ -184,4 +184,7 @@ public DistributedConfig(Map<String, String> props) {
super(CONFIG, props);
}

public static void main(String[] args) {
System.out.println(CONFIG.toHtmlTable());
}
}
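The new main method simply prints the config table, which the genConnectConfigDocs Gradle task redirects into docs/connect_config.html. As a rough illustration of what such a generator does, here is a minimal Python sketch; the (name, type, doc) entry shape is an assumption for illustration, not Kafka's actual ConfigDef API:

```python
# Sketch of ConfigDef-style HTML doc generation: turn config entries into
# the kind of table that ends up in docs/connect_config.html.
# The entry structure is assumed; Kafka's real ConfigDef carries more fields
# (defaults, importance, validators).
def to_html_table(entries):
    header = "<tr><th>Name</th><th>Type</th><th>Description</th></tr>"
    rows = "".join(
        "<tr><td>%s</td><td>%s</td><td>%s</td></tr>" % (name, typ, doc)
        for name, typ, doc in entries
    )
    return "<table>" + header + rows + "</table>"

# Printing to stdout mirrors the Java main above; the Gradle task captures
# standardOutput into a file.
print(to_html_table([
    ("group.id", "string", "Identifies the Connect cluster group (example entry)"),
]))
```

Redirecting stdout to a file, exactly as the `standardOutput = new File(...).newOutputStream()` line does, is what makes the same class usable both at runtime and as a docs generator.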
5 changes: 4 additions & 1 deletion docs/configuration.html
@@ -329,4 +329,7 @@ <h3><a id="consumerconfigs">3.3 Consumer Configs</a></h3>

<h3><a id="newconsumerconfigs">3.4 New Consumer Configs</a></h3>
Since 0.9.0.0 we have been working on a replacement for our existing simple and high-level consumers. The code can be considered beta quality. Below is the configuration for the new consumer:
<!--#include virtual="consumer_config.html" -->

<h3><a id="connectconfigs">3.5 Kafka Connect Configs</a></h3>
<!--#include virtual="connect_config.html" -->
328 changes: 328 additions & 0 deletions docs/connect.html

Large diffs are not rendered by default.

21 changes: 19 additions & 2 deletions docs/documentation.html
@@ -30,20 +30,24 @@ <h1>Kafka 0.9.0 Documentation</h1>
<li><a href="#ecosystem">1.4 Ecosystem</a>
<li><a href="#upgrade">1.5 Upgrading</a>
</ul>
</li>
<li><a href="#api">2. API</a>
<ul>
<li><a href="#producerapi">2.1 Producer API</a>
<li><a href="#highlevelconsumerapi">2.2 High Level Consumer API</a>
<li><a href="#simpleconsumerapi">2.3 Simple Consumer API</a>
<li><a href="#newconsumerapi">2.4 New Consumer API</a>
</ul>
</li>
<li><a href="#configuration">3. Configuration</a>
<ul>
<li><a href="#brokerconfigs">3.1 Broker Configs</a>
<li><a href="#producerconfigs">3.2 Producer Configs</a>
<li><a href="#consumerconfigs">3.3 Consumer Configs</a>
<li><a href="#newconsumerconfigs">3.4 New Consumer Configs</a>
<li><a href="#connectconfigs">3.5 Kafka Connect Configs</a>
</ul>
</li>
<li><a href="#design">4. Design</a>
<ul>
<li><a href="#majordesignelements">4.1 Motivation</a>
@@ -55,6 +59,7 @@ <h1>Kafka 0.9.0 Documentation</h1>
<li><a href="#replication">4.7 Replication</a>
<li><a href="#compaction">4.8 Log Compaction</a>
</ul>
</li>
<li><a href="#implementation">5. Implementation</a>
<ul>
<li><a href="#apidesign">5.1 API Design</a>
@@ -64,6 +69,7 @@ <h1>Kafka 0.9.0 Documentation</h1>
<li><a href="#log">5.5 Log</a>
<li><a href="#distributionimpl">5.6 Distribution</a>
</ul>
</li>
<li><a href="#operations">6. Operations</a>
<ul>
<li><a href="#basic_ops">6.1 Basic Kafka Operations</a>
@@ -101,13 +107,22 @@ <h1>Kafka 0.9.0 Documentation</h1>
<li><a href="#zkops">Operationalization</a>
</ul>
</ul>
-<li><a href="#security">7. Security</a></li>
</li>
<li><a href="#security">7. Security</a>
<ul>
<li><a href="#security_overview">7.1 Security Overview</a></li>
<li><a href="#security_ssl">7.2 Encryption and Authentication using SSL</a></li>
<li><a href="#security_sasl">7.3 Authentication using SASL</a></li>
<li><a href="#security_authz">7.4 Authorization and ACLs</a></li>
</ul>
</li>
<li><a href="#connect">8. Kafka Connect</a>
<ul>
<li><a href="#connect_overview">8.1 Overview</a></li>
<li><a href="#connect_user">8.2 User Guide</a></li>
<li><a href="#connect_development">8.3 Connector Development Guide</a></li>
</ul>
</li>
</ul>

<h2><a id="gettingStarted">1. Getting Started</a></h2>
@@ -140,5 +155,7 @@ <h2><a id="operations">6. Operations</a></h2>
<h2><a id="security">7. Security</a></h2>
<!--#include virtual="security.html" -->

-<!--#include virtual="../includes/footer.html" -->
<h2><a id="connect">8. Kafka Connect</a></h2>
<!--#include virtual="connect.html" -->

<!--#include virtual="../includes/footer.html" -->
62 changes: 62 additions & 0 deletions docs/quickstart.html
@@ -187,3 +187,65 @@ <h4>Step 6: Setting up a multi-broker cluster</h4>
my test message 2
<b>^C</b>
</pre>


<h4>Step 7: Use Kafka Connect to import/export data</h4>

Writing data from the console and writing it back to the console is a convenient place to start, but you'll probably want
to use data from other sources or export data from Kafka to other systems. For many systems, instead of writing custom
integration code you can use Kafka Connect to import or export data.

Kafka Connect is a tool included with Kafka that imports data into and exports data out of Kafka. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system. In this quickstart we'll see
how to run Kafka Connect with simple connectors that import data from a file to a Kafka topic and export data from a
Kafka topic to a file.

First, we'll create some seed data to test with:

<pre>
&gt; <b>echo -e "foo\nbar" > test.txt</b>
</pre>

Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.

<pre>
&gt; <b>bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</b>
</pre>
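The connector configuration files passed on that command line are plain Java properties files. The file source connector config used here (its changed line appears in the config/connect-file-source.properties diff earlier) boils down to:

```properties
name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=test.txt
topic=connect-test
```

The sink side is symmetric: it uses FileStreamSinkConnector with <code>file=test.sink.txt</code> and <code>topics=connect-test</code>, while the worker-level settings (broker addresses, serialization converters) live in connect-standalone.properties.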

These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each line to a
Kafka topic, and the second is a sink connector that reads messages from a Kafka topic and writes each message as a line in an output file.

During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:

<pre>
&gt; <b>cat test.sink.txt</b>
foo
bar
</pre>

Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it):

<pre>
&gt; <b>bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic connect-test --from-beginning</b>
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...
</pre>
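A downstream consumer reading this topic directly needs to unwrap that schema/payload envelope. A minimal sketch in Python, assuming the default JsonConverter format shown above:

```python
import json

# One record as emitted by the console consumer above: the default
# JsonConverter wraps each value in an envelope carrying its schema.
record = '{"schema":{"type":"string","optional":false},"payload":"foo"}'

envelope = json.loads(record)
schema = envelope["schema"]    # {"type": "string", "optional": False}
payload = envelope["payload"]  # the original line from test.txt

print(payload)  # -> foo
```

In a real consumer you would apply the same unwrapping to each message value fetched from <code>connect-test</code>.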

The connectors continue to process data, so we can add data to the file and see it move through the pipeline:

<pre>
&gt; <b>echo "Another line" >> test.txt</b>
</pre>

You should see the line appear in the console consumer output and in the sink file.
