What is CDC?
Change data capture records inserts, updates, and deletes from a source system so downstream systems can stay in sync.
- Reads the source's transaction log or change events
- Supports near-real-time sync
- Needs ordering and deduplication, because events can arrive out of order or more than once
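A minimal sketch of the ordering and deduplication point above, assuming each change carries a monotonically increasing log position. The `ChangeEvent` shape, the `lsn` field, and `apply_changes` are hypothetical names for illustration, not any particular CDC tool's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    key: str                    # primary key of the changed row
    op: str                     # "insert", "update", or "delete"
    lsn: int                    # assumed: increasing log position from the source
    row: Optional[dict] = None  # new row image; None for deletes


def apply_changes(target: dict, applied_lsn: dict, events: list) -> None:
    """Apply a batch of change events to an in-memory replica.

    target maps key -> row; applied_lsn maps key -> last applied position.
    """
    # Ordering: replay the batch in log order, whatever order it arrived in.
    for event in sorted(events, key=lambda e: e.lsn):
        # Deduplication: skip anything at or before the last applied position.
        if event.lsn <= applied_lsn.get(event.key, -1):
            continue
        if event.op == "delete":
            target.pop(event.key, None)
        else:
            target[event.key] = event.row
        applied_lsn[event.key] = event.lsn
```

Because already-applied positions are skipped, replaying the same batch leaves the replica unchanged, which is what makes redelivery safe.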
Quick recall
Change data capture records inserts, updates, and deletes from a source system so downstream systems can stay in sync.
Late arriving data shows up after the expected processing window and can break simple incremental assumptions.
Schema evolution is how a system handles added, removed, or changed fields over time without breaking downstream consumers.
Idempotent jobs can rerun safely without duplicating or corrupting data.
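One common way to get this property is to overwrite a whole partition instead of appending to it. The sketch below assumes rows carry an `id` field and that the table is a plain dict keyed by partition date; `load_partition` is a hypothetical helper, not a real library call.

```python
def load_partition(table: dict, partition_date: str, rows: list) -> None:
    """Replace one date partition with the given rows.

    Overwriting (rather than appending) means a rerun with the same input
    produces the same final state instead of duplicate rows.
    """
    # Keying by id also collapses duplicate rows within a single run.
    table[partition_date] = {row["id"]: row for row in rows}
```

Running it twice for the same date and input leaves exactly one copy of each row.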
In a network partition, a distributed system must choose between strong consistency and availability.
Indexes speed up common queries, but they take extra storage and slow down writes.
Sharding splits data across multiple databases so one machine does not hold or serve everything.
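A minimal sketch of hash-based shard routing, one of several ways to decide which database holds a key. The function name `shard_for` and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in the range [0, num_shards)."""
    # Use a stable hash; Python's built-in hash() for strings varies per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo routing moves most keys when `num_shards` changes, which is why consistent hashing is often used when shards are added or removed.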
Queues decouple producers from consumers so work can be retried, smoothed, and processed asynchronously.
A primary handles writes while replicas serve reads from copied data, often improving read scale and availability.
Rate limiting controls how many requests a client can make in a window to protect system stability and fairness.
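A minimal fixed-window sketch of this idea, kept in memory for a single process. `FixedWindowLimiter` and its parameters are hypothetical names; production limiters often use sliding windows or token buckets and a shared store instead.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # (client_id, window_index) -> requests seen; old windows are never
        # evicted in this sketch, so memory grows over time.
        self._counts = defaultdict(int)

    def allow(self, client_id: str) -> bool:
        window_index = int(self.clock() // self.window_seconds)
        key = (client_id, window_index)
        if self._counts[key] >= self.limit:
            return False  # over the limit for this window
        self._counts[key] += 1
        return True
```

Each client gets its own counter, so one noisy client cannot use up another client's allowance.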
Scalability is how well a system handles growth, while latency is how quickly one request gets a response.
A CDN caches content closer to users so static or cacheable responses arrive with lower latency.