Apache BookKeeper
1. Basics
- Apache BookKeeper is a distributed log storage service.
1.1. Key Features
- provides high durability and availability for storing logs (write-ahead logs, transaction logs).
- offers a scalable, fault-tolerant, and low-latency storage solution.
- ensures consistent, ordered, and replicated logs.
1.2. Working Mechanism
BookKeeper separates storage from serving across three roles:
- bookies: individual storage nodes (similar to DataNodes in Hadoop HDFS) responsible for storing log fragments.
- metadata storage: tracks which bookies are available and holds log segment metadata (ZooKeeper is commonly used).
- clients: write entries to and read entries from BookKeeper (e.g., Apache Pulsar).
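The three roles can be illustrated with a small in-memory model. This is a toy sketch, not the BookKeeper client API: the class names (Bookie, MetadataStore, Client), their methods, and the "replicate every entry to every chosen bookie" policy are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bookie:
    """Hypothetical storage node: keeps entries keyed by (ledger_id, entry_id)."""
    name: str
    entries: dict = field(default_factory=dict)

    def store(self, ledger_id, entry_id, payload):
        self.entries[(ledger_id, entry_id)] = payload

    def fetch(self, ledger_id, entry_id):
        return self.entries[(ledger_id, entry_id)]


@dataclass
class MetadataStore:
    """Hypothetical stand-in for ZooKeeper: knows the bookies and each log's placement."""
    bookies: dict = field(default_factory=dict)  # bookie name -> Bookie
    ledgers: dict = field(default_factory=dict)  # ledger id -> list of bookie names

    def register(self, bookie):
        self.bookies[bookie.name] = bookie


class Client:
    """Hypothetical client: looks up placement in metadata, then talks to bookies."""

    def __init__(self, metadata):
        self.metadata = metadata
        self.next_entry = {}  # ledger id -> next entry id to assign

    def create_ledger(self, replicas):
        ledger_id = len(self.metadata.ledgers)
        # Pick the first `replicas` bookies by name (a deliberately naive policy).
        self.metadata.ledgers[ledger_id] = sorted(self.metadata.bookies)[:replicas]
        self.next_entry[ledger_id] = 0
        return ledger_id

    def append(self, ledger_id, payload):
        entry_id = self.next_entry[ledger_id]
        # Write the entry to every bookie assigned to this ledger.
        for name in self.metadata.ledgers[ledger_id]:
            self.metadata.bookies[name].store(ledger_id, entry_id, payload)
        self.next_entry[ledger_id] = entry_id + 1
        return entry_id

    def read_all(self, ledger_id):
        # Read entries in order from the first assigned bookie.
        first = self.metadata.bookies[self.metadata.ledgers[ledger_id][0]]
        return [first.fetch(ledger_id, i) for i in range(self.next_entry[ledger_id])]


metadata = MetadataStore()
for n in ("bookie-1", "bookie-2", "bookie-3"):
    metadata.register(Bookie(n))

client = Client(metadata)
ledger = client.create_ledger(replicas=2)
client.append(ledger, b"first")
client.append(ledger, b"second")
print(client.read_all(ledger))  # [b'first', b'second']
```

Note that the client never stores data itself: it asks the metadata store where a log lives and then reads or writes on the bookies directly.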
1.3. Use Cases
- distributed messaging systems: providing durable and replicated storage for message streams.
- write-ahead logging: ensuring data consistency and recovery for databases and other systems.
- ledger storage: offering a reliable and performant foundation for distributed ledgers.
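The write-ahead logging use case can be sketched as follows. This is a minimal illustration under stated assumptions: Log is a hypothetical in-memory append-only log standing in for a durable BookKeeper log, and KeyValueStore is a made-up system that records every change in the log before applying it, then rebuilds its state by replaying the log.

```python
class Log:
    """Hypothetical append-only log; a durable, replicated log would play this role."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        self._entries.append(record)
        return len(self._entries) - 1  # position of the new record

    def read_from(self, start=0):
        return list(self._entries[start:])


class KeyValueStore:
    """Hypothetical store that logs each change before applying it."""

    def __init__(self, log):
        self.log = log
        self.state = {}

    def put(self, key, value):
        self.log.append(("put", key, value))  # 1. record the intent
        self.state[key] = value               # 2. apply it in memory

    def delete(self, key):
        self.log.append(("delete", key, None))
        self.state.pop(key, None)

    @classmethod
    def recover(cls, log):
        """Rebuild state after a crash by replaying the log in order."""
        store = cls(log)
        for op, key, value in log.read_from(0):
            if op == "put":
                store.state[key] = value
            else:
                store.state.pop(key, None)
        return store


log = Log()
store = KeyValueStore(log)
store.put("a", 1)
store.put("b", 2)
store.delete("a")

# Simulate a crash: discard `store` and rebuild from the log alone.
recovered = KeyValueStore.recover(log)
print(recovered.state)  # {'b': 2}
```

Because the log is ordered, replaying it always reproduces the same final state, which is why the ordering guarantee matters for this use case.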
1.4. Advantages
- high throughput and low latency: designed for high-performance logging operations.
- scalability: easily scales horizontally by adding more bookie nodes.
- durability and availability: each entry is replicated across multiple bookies, so data survives the loss of a bookie and remains readable.
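How replication and horizontal scaling interact can be shown with a little arithmetic. The sketch below assumes a simple round-robin striping rule: with E bookies and R copies per entry, entry e is written to bookies (e + i) mod E for i = 0 .. R-1. The names ensemble_size and write_quorum are borrowed from BookKeeper terminology and do not appear in the notes above; the functions are hypothetical and only model placement, not the real client.

```python
def write_set(entry_id, ensemble_size, write_quorum):
    """Indices of the bookies that hold a copy of this entry (assumed round-robin rule)."""
    return [(entry_id + i) % ensemble_size for i in range(write_quorum)]


def placement(num_entries, ensemble_size, write_quorum):
    """Number of entry copies stored on each bookie."""
    load = [0] * ensemble_size
    for entry_id in range(num_entries):
        for bookie in write_set(entry_id, ensemble_size, write_quorum):
            load[bookie] += 1
    return load


def readable_after_failure(num_entries, ensemble_size, write_quorum, failed):
    """True if every entry still has at least one copy on a live bookie."""
    return all(
        any(b not in failed for b in write_set(e, ensemble_size, write_quorum))
        for e in range(num_entries)
    )


print(write_set(0, 3, 2))                         # [0, 1]
print(write_set(2, 3, 2))                         # [2, 0]
print(placement(12, 3, 2))                        # [8, 8, 8]
print(placement(12, 4, 2))                        # [6, 6, 6, 6]
print(readable_after_failure(12, 3, 2, {1}))      # True
print(readable_after_failure(12, 3, 2, {0, 1}))   # False
```

Adding a fourth bookie lowers the per-bookie load from 8 to 6 copies, which is the scalability claim in miniature. With two copies per entry, any single bookie can fail without losing data, but losing two bookies that share an entry makes that entry unreadable.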