Kafka源码分析及图解原理之Producer端- 学习笔记- 青岛软件培训-选择一家好的青岛软件培训学校，就要看教学质量和口碑

目录一.前言一.开发应用 1.1 一个栗子 1.2 重要参数二.源码分析及图解原理 2.1 RecordAccumulator 2.2 Sender 三.幂等性producer 回到顶部一.前言　　任何消息队列都是万变不离其宗都是3部分，消息生产者（Producer）、消息消费者（Consumer）和服务载体（在Kafka中用Broker指代）。那么本篇主要讲解Producer端，会有适当的图解帮助理解底层原理。　回到顶部一.开发应用　　首先介绍一下开发应用，如何构建一个KafkaProducer及使用，还有一些重要参数的简介。 1.1 一个栗子复制代码 1 /** 2 * Kafka Producer Demo实例类。 3 * 4 * @author GrimMjx 5 */ 6 public class ProducerDemo { 7 public static void main(String[] args) throws ExecutionException, InterruptedException { 8 Properties prop = new Properties(); 9 prop.put("client.id", "DemoProducer"); 10 11 // 以下三个参数必须指定 12 // 用于创建与Kafka broker服务器的连接，集群的话则用逗号分隔 13 prop.put("bootstrap.servers", "localhost:9092"); 14 // 消息的key序列化方式 15 prop.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 16 // 消息的value序列化方式 17 prop.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 18 19 // 以下参数为可配置选项 20 prop.put("acks", "-1"); 21 prop.put("retries", "3"); 22 prop.put("batch.size", "323840"); 23 prop.put("linger.ms", "10"); 24 prop.put("buffer.memory", "33554432"); 25 prop.put("max.block.ms", "3000"); 26 27 KafkaProducer producer = new KafkaProducer(prop); 28 try { 29 // 异步发送，继续发送消息不用等待。当有结果返回时，callback会被通知执行 30 producer.send(new ProducerRecord("test", "key1", "value1"), 31 new Callback() { 32 // 返回结果RecordMetadata记录元数据包括了which partition的which offset 33 public void onCompletion(RecordMetadata metadata, Exception e) { 34 // 发送成功 35 if (e == null) { 36 System.out.println("The offset of the record we just sent is: " + metadata.offset()); 37 38 // 发送失败 39 } else { 40 if (e instanceof RetriableException) { 41 // 处理可重试的异常，比如分区leader副本不可用 42 // 一般用retries参数来设置重置，毕竟这里也没有什么其他能做的，也是同样的重试发送消息 43 } else { 44 // 处理不可重试异常 45 } 46 } 47 } 48 } 49 ); 50 51 // 同步发送，send方法返回Future，然后get。在没有返回结果一直阻塞 52 producer.send(new ProducerRecord("test", "key1", "value1")).get(); 53 54 } finally { 55 // producer运行的时候占用系统额外资源，最后一定要关闭 56 producer.close(); 57 } 58 } 59 } 复制代码　　注释已经写得十分详细了，参数的下面会说，这里就只说一下异步发送和同步发送。我们先看下KafkaProducer.send方法，可以看到返回的是一个Future，那么如何实现同步阻塞和异步非阻塞呢？同步阻塞：send方法返回Future，然后get。在没有返回结果一直阻塞，无限等待异步非阻塞：send方法提供callback，调用send方法后可以继续发送消息不用等待。当有结果返回时，callback会被通知执行 1.2 重要参数　　这里分析一下broker端的重要参数，前3个是必要参数。Kafka的文档真的很吊，可以看这个类，每个参数和注释都解释的十分详细：org.apache.kafka.clients.producer.ProducerConfig bootstrap.server（必要）：broker服务器列表，如果集群的机器很多，不用全配，producer可以发现集群中所有broker key.serializer/value.serializer（必要）：key和value的序列化方式。这两个参数都必须是全限定类名，可以自定义拓展。 acks：有3个值，0、1和all（-1） 0：produce不关心broker端的处理结果，吞吐量最高 1：produce发送消息给leader broker端，broker端写入本地日志返回结果，折中方案 all(-1)：配合min.insync.replicas使用，控制写入isr中的多少副本才算成功思考：如果当前集群中ISR副本小于min.insync.replicas会发生什么，消费者还能正常消费吗？stack overflow地址：https://stackoverflow.com/questions/57231185/does-min-insync-replicas-property-effects-consumers-in-kafka buffer.memory：producer启动会创建一个内存缓冲区保存待发送的消息，这部分的内存大小就是这个参数来控制的 commpression.type：压缩算法的选择，目前有GZIP、Snappy和LZ4。目前结合LZ4性能最好 retries：重试次数，0.11.0.0版本之前可能导致消息重发 batch.size：相同分区多条消息集合叫batch，当batch满了则发送给broker linger.ms：难道batch没满就不发了么？当然不是，不满则等linger.ms时间再发。延时权衡行为 max.request.size：控制发送请求的大小 request.timeout.ms：超过时间则会在回调函数抛出TimeoutException异常 partitioner.class：分区机制，可自定义，默认分区器的处理是：有key则用murmur2算法计算key的哈希值，对总分区取模算出分区号，无key则轮询 enable.idempotence：Apache Kafka 0.11.0.0版本用于实现EOS的利器回到顶部二.源码分析及图解原理 2.1 RecordAccumulator 　　上面介绍的参数中buffer.memory是缓冲区的大小，RecordAccmulator就是承担了缓冲区的角色。默认是32MB。　　还有上面介绍的参数中batch.size提到了batch的概念，在kafka producer中，消息不是一条一条发给broker的，而是多条消息组成一个ProducerBatch，然后由Sender一次性发出去，这里的batch.size并不是消息的条数（凑满多少条即发送），而是一个大小。默认是16KB，可以根据具体情况来进行优化。　　在RecordAccumulator中，最核心的参数就是： private final ConcurrentMap> batches; 　　它是一个ConcurrentMap，key是TopicPartition类，代表一个topic的一个partition。value是一个包含ProducerBatch的双端队列。等待Sender线程发送给broker。画张图来看下：　　再从源码角度来看如何添加到缓冲区队列里的，主要看这个方法：org.apache.kafka.clients.producer.internals.RecordAccumulator#append：　　注释写的十分详细了，这里需要思考一点，为什么分配内存的代码没有放在synchronized同步块里？看起来这里很多余，导致下面的synchronized同步块中还要tryAppend一下，因为这时候可能其他线程已经创建好RecordBatch了。造成多余的内存申请。但是仔细想想，如果把分配内存放在synchronized同步块会有什么问题？　　内存申请不到线程会一直等待，如果放在同步块中会造成一直不释放Deque队列的锁，那其他线程将无法对Deque队列进行线程安全的同步操作。那不是走远了？复制代码 1 /** 2 * Add a record to the accumulator, return the append result 3 *

4 * The append result will contain the future metadata, and flag for whether the appended batch is full or a new batch is created 5 *

6 * 7 * @param tp The topic/partition to which this record is being sent 8 * @param timestamp The timestamp of the record 9 * @param key The key for the record 10 * @param value The value for the record 11 * @param headers the Headers for the record 12 * @param callback The user-supplied callback to execute when the request is complete 13 * @param maxTimeToBlock The maximum time in milliseconds to block for buffer memory to be available 14 */ 15 public RecordAppendResult append(TopicPartition tp, 16 long timestamp, 17 byte[] key, 18 byte[] value, 19 Header[] headers, 20 Callback callback, 21 long maxTimeToBlock) throws InterruptedException { 22 // We keep track of the number of appending thread to make sure we do not miss batches in 23 // abortIncompleteBatches(). 24 appendsInProgress.incrementAndGet(); 25 ByteBuffer buffer = null; 26 if (headers == null) headers = Record.EMPTY_HEADERS; 27 try { 28 // check if we have an in-progress batch 29 // 其实就是一个putIfAbsent操作的方法，不展开分析 30 Deque dq = getOrCreateDeque(tp); 31 // batches是线程安全的，但是Deque不是线程安全的 32 // 已有在处理中的batch 33 synchronized (dq) { 34 if (closed) 35 throw new IllegalStateException("Cannot send after the producer is closed."); 36 RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq); 37 if (appendResult != null) 38 return appendResult; 39 } 40 41 // we don't have an in-progress record batch try to allocate a new batch 42 // 创建一个新的ProducerBatch 43 byte maxUsableMagic = apiVersions.maxUsableProduceMagic(); 44 // 分配一个内存 45 int size = Math.max(this.batchSize, AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers)); 46 log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition()); 47 // 申请不到内存 48 buffer = free.allocate(size, maxTimeToBlock); 49 synchronized (dq) { 50 // Need to check if producer is closed again after grabbing the dequeue lock. 51 if (closed) 52 throw new IllegalStateException("Cannot send after the producer is closed."); 53 54 // 再次尝试添加，因为分配内存的那段代码并不在synchronized块中 55 // 有可能这时候其他线程已经创建好RecordBatch了，finally会把分配好的内存还回去 56 RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq); 57 if (appendResult != null) { 58 // 作者自己都说了，希望不要总是发生，多个线程都去申请内存，到时候还不是要还回去？ 59 // Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen often... 60 return appendResult; 61 } 62 63 // 创建ProducerBatch 64 MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic); 65 ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds()); 66 FutureRecordMetadata future = Utils.notNull(batch.tryAppend(timestamp, key, value, headers, callback, time.milliseconds())); 67 68 dq.addLast(batch); 69 // incomplete是一个Set集合，存放不完整的batch 70 incomplete.add(batch); 71 72 // Don't deallocate this buffer in the finally block as it's being used in the record batch 73 buffer = null; 74 75 // 返回记录添加结果类 76 return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true); 77 } 78 } finally { 79 // 释放要还的内存 80 if (buffer != null) 81 free.deallocate(buffer); 82 appendsInProgress.decrementAndGet(); 83 } 84 } 复制代码　　附加tryAppend()方法，不多说，都在代码注释里：复制代码 1 /** 2 * Try to append to a ProducerBatch. 3 * 4 * If it is full, we return null and a new batch is created. We also close the batch for record appends to free up 5 * resources like compression buffers. The batch will be fully closed (ie. the record batch headers will be written 6 * and memory records built) in one of the following cases (whichever comes first): right before send, 7 * if it is expired, or when the producer is closed. 8 */ 9 private RecordAppendResult tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers, Callback callback, Deque deque) { 10 // 获取最新加入的ProducerBatch 11 ProducerBatch last = deque.peekLast(); 12 if (last != null) { 13 FutureRecordMetadata future = last.tryAppend(timestamp, key, value, headers, callback, time.milliseconds()); 14 if (future == null) 15 last.closeForRecordAppends(); 16 else 17 // 记录添加结果类包含future、batch是否已满的标记、是否是新batch创建的标记 18 return new RecordAppendResult(future, deque.size() > 1 || last.isFull(), false); 19 } 20 // 如果这个Deque没有ProducerBatch元素，或者已经满了不足以加入本条消息则返回null 21 return null; 22 } 复制代码　　以上代码见图解： 2.2 Sender 　　Sender里最重要的方法莫过于run()方法，其中比较核心的方法是org.apache.kafka.clients.producer.internals.Sender#sendProducerData 　　其中pollTimeout需要认真读注释，意思是最长阻塞到至少有一个通道在你注册的事件就绪了。返回0则表示走起发车了复制代码 1 private long sendProducerData(long now) { 2 // 获取当前集群的所有信息 3 Cluster cluster = metadata.fetch(); 4 // get the list of partitions with data ready to send 5 // @return ReadyCheckResult类的三个变量解释 6 // 1.Set readyNodes 准备好发送的节点 7 // 2.long nextReadyCheckDelayMs 下次检查节点的延迟时间 8 // 3.Set unknownLeaderTopics 哪些topic找不到leader节点 9 RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now); 10 // if there are any partitions whose leaders are not known yet, force metadata update 11 // 如果有些topic不知道leader信息，更新metadata 12 if (!result.unknownLeaderTopics.isEmpty()) { 13 // The set of topics with unknown leader contains topics with leader election pending as well as 14 // topics which may have expired. Add the topic again to metadata to ensure it is included 15 // and request metadata update, since there are messages to send to the topic. 16 for (String topic : result.unknownLeaderTopics) 17 this.metadata.add(topic); 18 this.metadata.requestUpdate(); 19 } 20 21 // 去除不能发送信息的节点 22 // remove any nodes we aren't ready to send to 23 Iterator iter = result.readyNodes.iterator(); 24 long notReadyTimeout = Long.MAX_VALUE; 25 while (iter.hasNext()) { 26 Node node = iter.next(); 27 if (!this.client.ready(node, now)) { 28 iter.remove(); 29 notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now)); 30 } 31 } 32 33 // 获取将要发送的消息 34 // create produce requests 35 Map> batches = this.accumulator.drain(cluster, result.readyNodes, 36 this.maxRequestSize, now); 37 38 // 保证发送消息的顺序 39 if (guaranteeMessageOrder) { 40 // Mute all the partitions drained 41 for (List batchList : batches.values()) { 42 for (ProducerBatch batch : batchList) 43 this.accumulator.mutePartition(batch.topicPartition); 44 } 45 } 46 47 // 过期的batch 48 List expiredBatches = this.accumulator.expiredBatches(this.requestTimeout, now); 49 boolean needsTransactionStateReset = false; 50 // Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics 51 // for expired batches. see the documentation of @TransactionState.resetProducerId to understand why 52 // we need to reset the producer id here. 53 if (!expiredBatches.isEmpty()) 54 log.trace("Expired {} batches in accumulator", expiredBatches.size()); 55 for (ProducerBatch expiredBatch : expiredBatches) { 56 failBatch(expiredBatch, -1, NO_TIMESTAMP, expiredBatch.timeoutException()); 57 if (transactionManager != null && expiredBatch.inRetry()) { 58 needsTransactionStateReset = true; 59 } 60 this.sensors.recordErrors(expiredBatch.topicParti