Class KafkaBolt<K,​V>

  • All Implemented Interfaces:
    Serializable, IBolt, IComponent, IRichBolt

    public class KafkaBolt<K,​V>
    extends BaseTickTupleAwareRichBolt
    Bolt implementation that can send Tuple data to Kafka.

    Most configuration for this bolt should be done through its setter methods. For backwards compatibility, the producer configuration and the topic may also be placed in the Storm config under 'kafka.broker.properties' and 'topic', respectively.
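    The backwards-compatible layout can be sketched with plain Java maps. This is a minimal illustration, assuming the value stored under 'kafka.broker.properties' is a map of Kafka producer settings; the class name, method name, broker address, and topic are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class LegacyConfigExample {
    // Builds a topology configuration map that carries the producer settings
    // and the topic under the two legacy keys named in the class description.
    static Map<String, Object> buildTopologyConf(String bootstrapServers, String topic) {
        // Assumption: the producer settings are stored as a nested map.
        Map<String, Object> producerProps = new HashMap<>();
        producerProps.put("bootstrap.servers", bootstrapServers);
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        Map<String, Object> conf = new HashMap<>();
        conf.put("kafka.broker.properties", producerProps);
        conf.put("topic", topic);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = buildTopologyConf("localhost:9092", "events");
        System.out.println(conf.get("topic"));
    }
}
```

    In a real topology the same two keys would be put on the topology's configuration object rather than on a bare HashMap.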

    See Also:
    Serialized Form
    • Constructor Detail

      • KafkaBolt

        public KafkaBolt()
    • Method Detail

      • withTopicSelector

        public KafkaBolt<K,​V> withTopicSelector​(String topic)
        Set the messages to be published to a single topic.
        Parameters:
        topic - the topic to publish to
        Returns:
        this
      • withProducerProperties

        public KafkaBolt<K,​V> withProducerProperties​(Properties producerProperties)
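        The argument is an ordinary java.util.Properties object holding Kafka producer settings. A minimal sketch of building one follows, assuming string keys and values; the class name, method name, broker address, and the choice of "acks" are illustrative assumptions, not values taken from this page.

```java
import java.util.Properties;

final class ProducerPropsExample {
    // Collects a small set of standard Kafka producer settings.
    static Properties buildProducerProperties(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "1");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildProducerProperties("localhost:9092");
        // With the Storm Kafka client library on the classpath, the result
        // would be handed to the bolt roughly like this:
        //   new KafkaBolt<String, String>()
        //       .withProducerProperties(props)
        //       .withTopicSelector("events");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```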
      • withProducerCallback

        public KafkaBolt<K,​V> withProducerCallback​(PreparableCallback producerCallback)
        Sets a user defined callback for use with the KafkaProducer.
        Parameters:
        producerCallback - user defined callback
        Returns:
        this
      • prepare

        public void prepare​(Map<String,​Object> topoConf,
                            TopologyContext context,
                            OutputCollector collector)
        Description copied from interface: IBolt
        Called when a task for this component is initialized within a worker on the cluster. It provides the bolt with the environment in which the bolt executes.

        This includes the Storm configuration, the topology context, and the output collector, as described under Parameters.

        Parameters:
        topoConf - The Storm configuration for this bolt. This is the configuration provided to the topology merged in with cluster configuration on this machine.
        context - This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
        collector - The collector is used to emit tuples from this bolt. Tuples can be emitted at any time, including the prepare and cleanup methods. The collector is thread-safe and should be saved as an instance variable of this bolt object.
      • mkProducer

        protected org.apache.kafka.clients.producer.Producer<K,​V> mkProducer​(Properties props)
        Makes the producer with the given properties. Intended to be overridden for tests.
      • declareOutputFields

        public void declareOutputFields​(OutputFieldsDeclarer declarer)
        Description copied from interface: IComponent
        Declare the output schema for all the streams of this topology.
        Parameters:
        declarer - this is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream
      • cleanup

        public void cleanup()
        Description copied from interface: IBolt
        Called when an IBolt is going to be shut down. Storm will make a best-effort attempt to call this if the worker shutdown is orderly. The Config.SUPERVISOR_WORKER_SHUTDOWN_SLEEP_SECS setting controls how long orderly shutdown is allowed to take. There is no guarantee that cleanup will be called if shutdown is not orderly, or if the shutdown exceeds the time limit.

        The one context where cleanup is guaranteed to be called is when a topology is killed when running Storm in local mode.

        Specified by:
        cleanup in interface IBolt
        Overrides:
        cleanup in class BaseRichBolt
      • setFireAndForget

        public void setFireAndForget​(boolean fireAndForget)
        If set to true, the bolt assumes that sending a message to Kafka will succeed and acks the tuple as soon as it has handed the message off to the producer API. If false (the default), the tuple is acked after the message has been successfully sent to Kafka, or failed if the send did not succeed.
        Parameters:
        fireAndForget - whether the bolt should fire and forget
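        The difference in ack timing can be illustrated with a toy model that uses only the JDK. This is not the bolt's implementation: the class, its event list, and the use of CompletableFuture as a stand-in for the send outcome are all hypothetical, chosen only to mirror the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class AckTimingSketch {
    // Records "ack" or "fail" events in the order they happen.
    final List<String> events = new ArrayList<>();

    // 'sendResult' stands in for the outcome of handing a message to a producer.
    void handle(boolean fireAndForget, CompletableFuture<Void> sendResult) {
        if (fireAndForget) {
            // Ack right after hand-off; the send outcome is never consulted.
            events.add("ack");
            return;
        }
        // Otherwise ack or fail only once the outcome is known.
        sendResult.whenComplete(
                (ok, error) -> events.add(error == null ? "ack" : "fail"));
    }

    public static void main(String[] args) {
        AckTimingSketch sketch = new AckTimingSketch();
        CompletableFuture<Void> pending = new CompletableFuture<>();
        sketch.handle(true, pending);   // acked although the send is still pending
        sketch.handle(false, pending);  // nothing recorded until the send completes
        pending.completeExceptionally(new RuntimeException("send failed"));
        System.out.println(sketch.events);
    }
}
```

        The trade-off the model shows is that fire-and-forget gives up failure reporting for the tuple in exchange for earlier acks.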
      • setAsync

        public void setAsync​(boolean async)
        If set to true (the default), the bolt will not wait for the message to be fully sent to Kafka before getting another tuple to send.
        Parameters:
        async - true to have multiple tuples in flight to Kafka, else false.