Kafka paper

This post summarizes the design and implementation described in the Kafka paper.

1. Kafka Architecture and Design Principles

paper section 3
To balance load, a topic is divided into multiple partitions and each broker stores one or more of those partitions.

1.1 Efficiency on a single partition

1.1.1 Storage

Each partition of a topic corresponds to a logical log. Physically, a log is implemented as a set of segment files of approximately the same size (e.g., 1GB).
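The broker appends each newly published message to the last segment file. The segment file is flushed to disk in either of two cases:

  1. a configurable number of messages has been received
  2. a certain amount of time has elapsed
A message is exposed to consumers only after it has been flushed to disk.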
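
The two flush conditions and the visibility rule can be sketched in a few lines. This is only an in-memory illustration, not the broker's code: the class `PartitionLog`, its parameters `flush_every_n` and `flush_interval_s`, and the explicit `now` argument are all hypothetical names, and no real disk I/O happens.

```python
class PartitionLog:
    """Toy in-memory log: a message becomes visible only after a flush."""

    def __init__(self, flush_every_n, flush_interval_s, now=0.0):
        self.flush_every_n = flush_every_n        # condition 1: message count
        self.flush_interval_s = flush_interval_s  # condition 2: elapsed time
        self.messages = []       # everything appended so far
        self.flushed_count = 0   # messages[0:flushed_count] are visible
        self.last_flush = now

    def append(self, message, now):
        self.messages.append(message)
        self.maybe_flush(now)

    def maybe_flush(self, now):
        unflushed = len(self.messages) - self.flushed_count
        if unflushed == 0:
            return
        if (unflushed >= self.flush_every_n
                or now - self.last_flush >= self.flush_interval_s):
            # A real broker would flush the segment file to disk here.
            self.flushed_count = len(self.messages)
            self.last_flush = now

    def visible(self):
        """Messages a consumer is allowed to read."""
        return self.messages[:self.flushed_count]
```

With `flush_every_n=3`, the first two appended messages stay invisible until either a third message arrives or the interval expires.
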
Each broker keeps in memory a sorted list of offsets, including the offset of the first message in every segment file.
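
One way to use such a sorted list is a binary search for the segment that contains a requested offset. The sketch below assumes the list holds the offset of the first message of each segment in ascending order; the function name `find_segment` and the sample offsets are made up for illustration.

```python
import bisect


def find_segment(base_offsets, offset):
    """Return (segment_index, position_inside_segment) for `offset`.

    `base_offsets` is the sorted list of first-message offsets,
    one entry per segment file.
    """
    # bisect_right finds the first base offset greater than `offset`;
    # the segment we want is the one just before it.
    index = bisect.bisect_right(base_offsets, offset) - 1
    if index < 0:
        raise KeyError("offset %d is older than the first segment" % offset)
    return index, offset - base_offsets[index]
```

For example, with segments starting at offsets 0, 1000 and 2500, a request for offset 1200 lands in the second segment, 200 past its start.
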

1.1.2 File transfer

The paper notes that sending a file requested by a consumer from disk to a socket normally takes four steps:

(1) read data from the storage media to the page cache in an OS, (2) copy data in the page cache to an application buffer, (3) copy application buffer to another kernel buffer, (4) send the kernel buffer to the socket 

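Together these steps involve four data copies and two system calls. To make the transfer more efficient, Kafka uses the sendfile API (a system call on Linux and other Unix systems), which moves bytes from the file directly to the socket and avoids two of the copies and one of the system calls introduced by steps (2) and (3). Isn't this a fairly commonly used function, though?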
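
To see what the call looks like from application code, here is a small sketch in Python rather than in the broker's own language. It relies on the standard `socket.sendfile()` method, which uses `os.sendfile` where the platform supports it and otherwise falls back to plain `send()`. The helper `roundtrip` and the use of a local socket pair in place of a real consumer connection are assumptions made for the demo.

```python
import os
import socket
import tempfile


def roundtrip(payload):
    """Write `payload` to a temp file, push it through a socket, read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        with open(path, "rb") as f:
            # The file contents go to the socket without passing through
            # an application-level buffer when os.sendfile is available.
            sent = sender.sendfile(f)
        sender.close()  # signals end-of-stream to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return sent, b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

The payload in the demo should stay small (a few kilobytes), because the sender writes everything before the receiver starts reading.
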

1.1.3 stateless broker

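The broker does not know how many messages each consumer has consumed, nor whether a new message has been consumed by all consumers; each consumer keeps track of its own position.
Because of this, the broker simply deletes a message once it has been retained for a certain period, typically one week.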
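
The division of labour can be sketched as follows: the consumer owns its offset, and the broker only applies time-based retention. The classes `Broker` and `Consumer`, the `retention_s` parameter, and the use of simple message counters as offsets are simplifying assumptions for this example.

```python
class Broker:
    """Keeps messages for a fixed time; knows nothing about consumers."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.log = []          # entries are (offset, timestamp, payload)
        self.next_offset = 0

    def append(self, payload, now):
        self.log.append((self.next_offset, now, payload))
        self.next_offset += 1

    def expire(self, now):
        # Delete by age only, regardless of whether anyone consumed it.
        self.log = [e for e in self.log if now - e[1] < self.retention_s]

    def fetch(self, offset, max_messages=10):
        matching = [(o, p) for (o, _, p) in self.log if o >= offset]
        return matching[:max_messages]


class Consumer:
    """Tracks its own position in the log."""

    def __init__(self, broker):
        self.broker = broker
        self.offset = 0

    def poll(self):
        batch = self.broker.fetch(self.offset)
        if batch:
            self.offset = batch[-1][0] + 1  # advance past the last message
        return [payload for _, payload in batch]
```

A consumer that starts after the retention period has passed simply never sees the expired messages.
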

1.2 distributed coordination


Each producer can publish a message to either a randomly selected partition or a partition semantically determined by a partitioning key and a partitioning function. 
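
A minimal sketch of the two publishing modes: no key means a random partition, a key means a deterministic partition. The function `choose_partition` is hypothetical, and CRC32 is used here only because it is stable across processes; the paper does not specify which partitioning function Kafka uses.

```python
import random
import zlib


def choose_partition(num_partitions, key=None, rng=random):
    """Pick a partition index in [0, num_partitions)."""
    if key is None:
        # No key: spread messages randomly across partitions.
        return rng.randrange(num_partitions)
    # Keyed: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

Messages that share a key therefore end up in the same partition, which matters for the per-partition ordering guarantee.
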



Each consumer group jointly consumes a set of subscribed topics, i.e., each message is delivered to only one of the consumers within the group.



Our first decision is to make a partition within a topic the smallest unit of parallelism. This means that at any given time, all messages from one partition are consumed only by a single consumer within each consumer group.  Had we allowed multiple consumers to simultaneously consume a single partition, they would have to coordinate who consumes what messages, which necessitates locking and state maintenance overhead. In contrast, in our design consuming processes only need co-ordinate when the consumers rebalance the load, an infrequent event. In order for the load to be truly balanced, we require many more partitions in a topic than the consumers in each group.
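
The rule that every partition has exactly one owner per group can be illustrated with a simple range-style assignment: sort the partitions and the consumers, then hand each consumer a contiguous slice. This is a sketch in the spirit of the rebalance step, not the paper's exact algorithm; `assign_partitions` and the way leftover partitions are handed out are assumptions.

```python
def assign_partitions(partitions, consumers):
    """Map each consumer in a group to a disjoint slice of partitions."""
    members = sorted(consumers)
    if not members:
        raise ValueError("a consumer group needs at least one consumer")
    parts = sorted(partitions)
    base, extra = divmod(len(parts), len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        # The first `extra` consumers take one additional partition.
        count = base + (1 if i < extra else 0)
        assignment[member] = parts[start:start + count]
        start += count
    return assignment
```

Because every consumer sorts the same inputs, each one can compute the same result independently, and rerunning the function after a consumer joins or leaves yields the new ownership.
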

The second decision that we made is to not have a central “master” node, but instead let consumers coordinate among themselves in a decentralized fashion.

Kafka uses Zookeeper for the following tasks: (1) detecting the addition and the removal of brokers and consumers, (2) triggering a rebalance process in each consumer when the above events happen, and (3) maintaining the consumption relationship and keeping track of the consumed offset of each partition.  

1.3 delivery guarantees

Kafka guarantees that messages from a single partition are delivered to a consumer in order. However, there is no guarantee on the ordering of messages coming from different partitions. 

