Integrating Apache Kafka with other systems in a reliable and scalable way is often a key part of a streaming platform. Fortunately, Apache Kafka includes the Connect API, which enables streaming integration both into and out of Kafka. As with any technology, understanding its architecture and deployment patterns is key to using it successfully, as is knowing where to look when things aren’t working.
This talk will discuss the key design concepts within Kafka Connect and the pros and cons of standalone vs. distributed deployment modes. We’ll do a live demo of building pipelines with Kafka Connect for streaming data in from databases, and out to targets including Elasticsearch. With some gremlins along the way, we’ll go hands-on in methodically diagnosing and resolving common issues encountered with Kafka Connect. The talk will finish with more advanced topics, including Single Message Transforms and deploying Kafka Connect in containers.
Resources
- ☁️ Confluent Cloud
  Fully Managed Apache Kafka, Schema Registry, KSQL, and Connectors
- 📚 Free eBooks
  Free eBooks to download, including Kafka: The Definitive Guide.
- 💬 Confluent Community Slack group
- 👾 Demo code
  All you need is Docker & Docker Compose!
- 🎥 Recording
- 📹 Streaming data from Kafka to a Database with the JDBC Sink
- 📹 How to write streams & tables from ksqlDB to a database, enrich data, build aggregates, and more.
- 📹 Tutorial: How to get data from Apache Kafka into S3 with Kafka Connect
- 🖼️ No More Silos: Integrating Databases and Apache Kafka
  The ins and outs of streaming data from RDBMS into Kafka, including how to choose between query-based CDC (JDBC Source connector) and log-based CDC (e.g. Debezium, GoldenGate).
- 🖼️ The Changing Face of ETL: Event-Driven Architectures for Data Engineers
- ✍️ Kafka Connect Deep Dive – Converters and Serialization Explained
- ✍️ Kafka Connect Deep Dive – Error Handling and Dead Letter Queues
- 📹 How to install JDBC Driver for Kafka Connect JDBC connector