Kafka Connector Configuration Guide
To integrate with Kafka, navigate to the Set up the source section, enter a name for the Kafka connection, and choose Kafka from the list of available source types.
You will need the following parameters to complete the setup:
- Bootstrap Servers: Specify a list of Kafka brokers using the host:port format. This is required to establish the initial connection with the Kafka cluster.
- Message Format: Define how incoming Kafka messages should be decoded. Available options:
  - JSON
  - AVRO – If selecting Avro, additional configurations are necessary:
    - Deserialization Strategy: Set the subject name strategy to determine how the schema is selected from the registry.
    - Schema Registry URL: The address of the schema registry server.
    - Authentication Credentials (optional): Username and password if authentication is required for schema registry access.
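As a minimal sketch, the bootstrap server and Avro settings above might be collected into a configuration like the following. All hostnames, URLs, credentials, and key names here are illustrative placeholders, not the connector's actual field names:

```python
# Illustrative Kafka source settings for the AVRO message format.
# Every value below is a placeholder; substitute your own environment's details.
avro_source_config = {
    # Initial brokers as a comma-separated list of host:port pairs
    "bootstrap.servers": "broker1:9092,broker2:9092",
    # How incoming messages are decoded
    "message.format": "AVRO",
    # Subject name strategy used to look up the schema in the registry
    "deserialization.strategy": "TopicNameStrategy",
    # Address of the schema registry server
    "schema.registry.url": "http://schema-registry:8081",
    # Optional credentials if the registry requires authentication
    "schema.registry.username": "registry-user",
    "schema.registry.password": "registry-pass",
}

# Quick sanity check: every broker entry follows the host:port format
brokers = avro_source_config["bootstrap.servers"].split(",")
assert all(len(b.split(":")) == 2 for b in brokers)
```

The same shape applies to the JSON format, minus the schema-registry keys.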
- Communication Protocol: Determines how the connector interacts with Kafka brokers:
  - PLAINTEXT: No encryption or authentication.
  - SASL_PLAINTEXT: Authentication without encryption; requires a valid SASL JAAS configuration.
  - SASL_SSL: Encrypted communication with authentication.
- SASL JAAS Config: Specifies how the connector should authenticate with the brokers.
- SASL Mechanism: Choose the correct authentication method supported by the brokers.
- OAUTHBEARER Token Endpoint (optional): If using the OAUTHBEARER mechanism, provide the token URL (this does not apply to schema registry authentication).
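To make the security options concrete, here is a sketch of what a SASL_SSL setup might look like. The credentials are placeholders; the JAAS string follows the standard format accepted by Kafka's PlainLoginModule:

```python
# Illustrative security settings for SASL_SSL with the PLAIN mechanism.
# Username and password are placeholders.
sasl_config = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    # Standard Kafka JAAS config string for PLAIN authentication
    "sasl.jaas.config": (
        'org.apache.kafka.common.security.plain.PlainLoginModule required '
        'username="alice" password="secret";'
    ),
    # Only needed when the mechanism is OAUTHBEARER
    # (does not apply to schema registry authentication):
    # "sasl.oauthbearer.token.endpoint.url": "https://auth.example.com/token",
}
```

With PLAINTEXT, the SASL keys are omitted entirely; with SASL_PLAINTEXT, only `security.protocol` changes.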
- Subscription Type: Define how the Kafka connector subscribes to topics:
  - Manual Assignment: Provide a list of topic-partition pairs (e.g., topic1:0,topic1:1), where each partition maps to a separate stream.
  - Topic Pattern Matching: Use a regex-style pattern to subscribe to multiple topics dynamically; each match is treated as an individual stream.
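The two subscription modes can be sketched as follows. These helper functions are hypothetical illustrations of the parsing and matching described above, not part of the connector itself:

```python
import re


def parse_manual_assignment(spec: str) -> list[tuple[str, int]]:
    """Parse a 'topic:partition,topic:partition' list into pairs;
    each pair maps to a separate stream (hypothetical helper)."""
    pairs = []
    for entry in spec.split(","):
        topic, partition = entry.rsplit(":", 1)
        pairs.append((topic, int(partition)))
    return pairs


def match_topics(pattern: str, topics: list[str]) -> list[str]:
    """Return the topics matching a regex-style subscription pattern;
    each match is treated as an individual stream (hypothetical helper)."""
    rx = re.compile(pattern)
    return [t for t in topics if rx.fullmatch(t)]


print(parse_manual_assignment("topic1:0,topic1:1"))
# → [('topic1', 0), ('topic1', 1)]
print(match_topics(r"orders\..*", ["orders.eu", "orders.us", "payments"]))
# → ['orders.eu', 'orders.us']
```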
Recommended Optional Settings
- Client ID: A logical identifier sent with broker requests. Useful for tracking request sources beyond IP/port, especially in server-side logs.
- Group ID: Identifier used to differentiate consumer groups (e.g., group.id). Essential for load balancing and parallelism.
- Test Topic: Used to verify whether the Kafka source can successfully consume messages (e.g., test.topic).
- Polling Interval: Time in milliseconds the connector waits when polling Kafka for messages during each sync cycle.
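A brief sketch of how these optional settings might be supplied together. All values (and the exact key names) are illustrative assumptions:

```python
# Illustrative optional settings; every value is a placeholder.
optional_settings = {
    "client.id": "kafka-source-connector",  # logical identifier in broker-side logs
    "group.id": "analytics-consumers",      # consumer group for load balancing/parallelism
    "test.topic": "test.topic",             # topic used to verify message consumption
    "polling.time.ms": 100,                 # wait per poll, in milliseconds, each sync cycle
}
```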