Apache BookKeeper

1. Basics

  • a distributed storage service for append-only logs, which BookKeeper calls ledgers.

1.1. Key Features

  • provides high durability and availability for storing logs (write-ahead logs, transaction logs).
  • offers a scalable, fault-tolerant, and low-latency storage solution.
  • ensures consistent, ordered, and replicated logs.

1.2. Working Mechanism

BookKeeper separates storage from serving, and three components are involved:

  • bookies: individual storage nodes (similar to DataNodes in Hadoop) responsible for storing log fragments.
  • metadata storage: tracks bookie locations and log segment metadata (ZooKeeper is commonly used).
  • clients: write entries to and read entries from BookKeeper (e.g., Apache Pulsar).
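
The three roles above can be sketched as a toy in-memory model. This is not the BookKeeper client API: the class names (Bookie, MetadataStore, Client), the write_quorum parameter, and the round-robin placement are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Bookie:
    """Toy storage node: keeps entries in memory, keyed by (ledger_id, entry_id)."""
    name: str
    entries: dict = field(default_factory=dict)

    def add_entry(self, ledger_id, entry_id, payload):
        self.entries[(ledger_id, entry_id)] = payload

    def read_entry(self, ledger_id, entry_id):
        return self.entries.get((ledger_id, entry_id))


class MetadataStore:
    """Toy stand-in for the metadata service: ledger id -> placement info."""

    def __init__(self):
        self.ledgers = {}

    def register_ledger(self, ledger_id, ensemble, write_quorum):
        self.ledgers[ledger_id] = {
            "ensemble": list(ensemble),      # names of the bookies used by this ledger
            "write_quorum": write_quorum,    # hypothetical: copies written per entry
        }


class Client:
    """Toy client: looks up placement in metadata, then talks to bookies directly."""

    def __init__(self, metadata, bookies):
        self.metadata = metadata
        self.bookies = {b.name: b for b in bookies}

    def _replicas(self, ledger_id, entry_id):
        meta = self.metadata.ledgers[ledger_id]
        ensemble = meta["ensemble"]
        # Assumed placement: round-robin over the ensemble, starting at entry_id.
        return [ensemble[(entry_id + i) % len(ensemble)]
                for i in range(meta["write_quorum"])]

    def append(self, ledger_id, entry_id, payload):
        for name in self._replicas(ledger_id, entry_id):
            self.bookies[name].add_entry(ledger_id, entry_id, payload)

    def read(self, ledger_id, entry_id):
        # Return the first copy found on any replica.
        for name in self._replicas(ledger_id, entry_id):
            payload = self.bookies[name].read_entry(ledger_id, entry_id)
            if payload is not None:
                return payload
        return None


bookies = [Bookie("bookie-1"), Bookie("bookie-2"), Bookie("bookie-3")]
metadata = MetadataStore()
metadata.register_ledger(ledger_id=1, ensemble=[b.name for b in bookies], write_quorum=2)
client = Client(metadata, bookies)
for entry_id, payload in enumerate([b"a", b"b", b"c"]):
    client.append(1, entry_id, payload)
```

In this sketch the metadata store never sees entry payloads. It only records which bookies hold a ledger, which mirrors the split between metadata storage and bookies described above.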

1.3. Use Cases

  • distributed messaging systems: providing durable and replicated storage for message streams.
  • write-ahead logging: ensuring data consistency and recovery for databases and other systems.
  • ledger storage: offering a reliable and performant foundation for distributed ledgers.
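
The write-ahead logging use case can be illustrated with a minimal sketch, assuming a plain Python list stands in for a durable, replicated log such as a BookKeeper ledger. The KeyValueStore class and its methods are hypothetical names, not part of any BookKeeper API.

```python
class KeyValueStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log):
        self.log = log      # stand-in for a durable log (here just a list)
        self.state = {}     # volatile state, lost on a crash

    def put(self, key, value):
        self.log.append(("put", key, value))  # 1. record the change in the log
        self.state[key] = value               # 2. only then apply it to state

    @classmethod
    def recover(cls, log):
        """Rebuild the state after a crash by replaying the log in order."""
        store = cls(log)
        for op, key, value in log:
            if op == "put":
                store.state[key] = value
        return store


durable_log = []
store = KeyValueStore(durable_log)
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)

# Simulate a crash: drop the store, keep only the log, and replay it.
recovered = KeyValueStore.recover(durable_log)
```

Because the log is ordered, replaying it yields the same final state as the original sequence of writes, which is why the ordering guarantee matters for recovery.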

1.4. Advantages

  • high throughput and low latency: designed for high-performance logging operations.
  • scalability: easily scales horizontally by adding more bookie nodes.
  • durability and availability: entries are replicated across multiple bookies, so data survives individual node failures and remains readable.
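
The scalability and durability points can be made concrete with a small calculation. This is a sketch under stated assumptions: entries are striped round-robin across an ensemble of bookies and each entry is written to write_quorum of them. The function names and the numbers are illustrative, not measurements.

```python
from collections import Counter


def entries_per_bookie(num_entries, ensemble_size, write_quorum):
    """Count how many entry copies each bookie index stores under round-robin striping."""
    if not 1 <= write_quorum <= ensemble_size:
        raise ValueError("need 1 <= write_quorum <= ensemble_size")
    load = Counter()
    for entry_id in range(num_entries):
        for i in range(write_quorum):
            load[(entry_id + i) % ensemble_size] += 1
    return dict(load)


def tolerated_losses(copies_confirmed):
    """An entry confirmed on N bookies stays readable after losing N - 1 of them."""
    return copies_confirmed - 1


small = entries_per_bookie(num_entries=12, ensemble_size=3, write_quorum=2)
large = entries_per_bookie(num_entries=12, ensemble_size=6, write_quorum=2)
```

With the same 12 entries and two copies each, doubling the ensemble from 3 to 6 bookies halves the per-bookie load from 8 copies to 4, while each entry still survives the loss of one of its two replicas.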