这篇文章主要内容是kafka论文里的设计和实现。 1. Kafka Architecture and Design Principles paper section 3 To balance load, a topic is divided into multiple partitions and each broker stores one or more of those partitions. 1.1 单个partition的效率 1.1.1 存储 Each partition of a topic corresponds to a logical log. Physically, a log is implemented as a set of segment files of approximately the same size (e.g., 1GB). broker将新产生的消息添加到最后的segment file,segment file flush到磁盘的操作在两种情况下进行: 接收到一定数量的消息 过了一定的时间后 消息只有在flush到磁盘后才会暴露给消费者。 Each broker keeps in memory a sorted list of offsets, including the offset of the first message in every segment file. 1.1.2 文件传输 文中说对于consumer请求的文件,从磁盘传输到socket要用4个步骤 (1) read data from the storage media to the page cache in an OS, (2) copy data in the page cache to an application buffer, (3) copy application buffer to another kernel buffer, (4) send the kernel buffer to the socket 然后其中包括了四次数据copy和两次系统调用,为了效率更高,...