Quick recall mode

Keywords first, details second.

Use this mode when you want memory triggers only: title, summary, tags, and bullet anchors without long answers getting in the way.

374 cards

ETL / Data Engineering Easy Theory

ETL vs ELT

ETL transforms before loading, while ELT loads raw data first and transforms inside the warehouse later.

  • ETL fits strict downstream schemas
  • ELT fits scalable warehouses
  • Tooling and cost model differ
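
A minimal sketch of the ordering difference, assuming an in-memory SQLite database stands in for the warehouse. The `raw_rows` data and the `clean` helper are made up for illustration.

```python
import sqlite3

raw_rows = [("  Alice ", "10"), ("BOB", "25")]  # hypothetical source records

def clean(name, amount):
    # Illustrative transform: normalize the name, cast the amount.
    return name.strip().title(), int(amount)

# ETL: transform in application code, then load only the cleaned rows.
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
etl_db.executemany(
    "INSERT INTO sales VALUES (?, ?)", [clean(n, a) for n, a in raw_rows]
)

# ELT: load the raw rows untouched, then transform with SQL inside the store.
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
elt_db.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw_rows)
elt_db.execute(
    "CREATE TABLE sales AS "
    "SELECT upper(substr(trim(name), 1, 1)) || lower(substr(trim(name), 2)) AS name, "
    "CAST(amount AS INTEGER) AS amount FROM raw_sales"
)

query = "SELECT name, amount FROM sales ORDER BY name"
etl_result = etl_db.execute(query).fetchall()
elt_result = elt_db.execute(query).fetchall()
```

Both paths end with the same `sales` table. Only the ELT path still holds the untransformed `raw_sales` rows for later reprocessing.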

ETL / Data Engineering Medium Theory

Kafka basics

Kafka is a distributed log used for durable event streaming that decouples producers from consumers.

  • Topics are split into partitions; order holds within each partition
  • Consumers track offsets
  • Great for event-driven pipelines
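
The anchors above can be illustrated with a toy in-memory model. This is not the Kafka client API: `ToyTopic`, `ToyConsumer`, and the byte-sum partitioner are hypothetical names and choices made only for this sketch.

```python
from collections import defaultdict

class ToyTopic:
    """In-memory stand-in for a topic: a fixed set of append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Same key always maps to the same partition, so per-key order is kept.
        index = sum(key.encode()) % len(self.partitions)
        self.partitions[index].append(value)
        return index

class ToyConsumer:
    """Tracks its own next offset per partition, independent of producers."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition):
        log = self.topic.partitions[partition]
        records = log[self.offsets[partition]:]
        self.offsets[partition] = len(log)  # advance past everything just read
        return records

topic = ToyTopic(num_partitions=2)
p = topic.produce("user-1", "login")
topic.produce("user-1", "click")

consumer = ToyConsumer(topic)
first = consumer.poll(p)    # reads both events in append order
topic.produce("user-1", "logout")
second = consumer.poll(p)   # resumes from the stored offset
```

Because the consumer owns its offset, the producer never needs to know who is reading or how far they have read.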

ETL / Data Engineering Easy Theory

Parquet vs CSV vs JSON

CSV is simple but weakly typed, JSON is flexible but verbose, and Parquet is compressed columnar storage optimized for analytics.

  • CSV is easy to inspect
  • JSON handles nested structure
  • Parquet suits analytical scans that read a few columns
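
A small sketch of the typing difference between CSV and JSON, using only the Python standard library. The `record` is made up. Parquet is left out because reading and writing it needs a third-party library such as pyarrow.

```python
import csv
import io
import json

record = {"id": 7, "active": True, "tags": ["a", "b"]}  # hypothetical row

# CSV round trip: every value comes back as a plain string.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
buf.seek(0)
csv_row = next(csv.DictReader(buf))

# JSON round trip: numbers, booleans, and nested lists survive.
json_row = json.loads(json.dumps(record))
```

After the round trip, `csv_row["id"]` is the string `"7"` and the nested list has been flattened to text, while `json_row` equals the original record.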

ETL / Data Engineering Easy Theory

What is a data lake?

A data lake cheaply stores large volumes of raw data, whether structured, semi-structured, or unstructured, for later processing.

  • Raw and flexible storage
  • Schema can be applied later
  • Needs governance to avoid becoming messy
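
"Schema can be applied later" can be sketched as schema-on-read: raw events are stored as they arrive, and types and defaults are imposed only when the data is read. The `raw_lines` events and the `read_with_schema` helper are hypothetical.

```python
import json

# Raw events landed as-is (made-up JSON lines with inconsistent fields).
raw_lines = [
    '{"user": "a", "amount": "12.5"}',
    '{"user": "b", "amount": 3, "coupon": "X1"}',
    '{"user": "c"}',
]

def read_with_schema(lines):
    """Apply a schema at read time: pick fields, coerce types, fill defaults."""
    for line in lines:
        event = json.loads(line)
        yield {
            "user": str(event["user"]),
            "amount": float(event.get("amount", 0)),
        }

rows = list(read_with_schema(raw_lines))
```

The stored lines stay untouched, so a different reader can apply a different schema to the same raw data later.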