[Algorithms] Revolutionary Compression in TimescaleDB: Ac...

TimescaleDB can achieve compression of up to 98% for typical time-series data. This process requires a fundamentally different approach than the general-purpose algorithms used in OLTP databases. In TimescaleDB, this is managed by the hypercore engine — a hybrid row-columnar engine that employs specialized algorithms: delta encoding, delta-of-delta, Gorilla XOR, and run-length encoding.

Differences Between TimescaleDB Compression and PostgreSQL TOAST

PostgreSQL has a built-in mechanism called TOAST (The Oversized-Attribute Storage Technique), but TimescaleDB compression addresses a fundamentally different issue. TOAST deals with individual large values (long strings, jsonb, bytea), while TimescaleDB compression optimizes cross-row patterns in time-series data. The two mechanisms complement each other; TimescaleDB even uses TOAST internally as a fallback for certain data types.

Feature Comparison

Feature	TOAST (Vanilla PostgreSQL)	TimescaleDB Hypercore
Design Goal	Individual values	Cross-row patterns
Trigger	Row exceeds TOAST_TUPLE_THRESHOLD (~2 KB)	Per-chunk policy (e.g., older than 7 days)
Supported Types	Variable-length only (text, jsonb, bytea, numeric)	All data types
Compression Algorithms	pglz (default), lz4 (since PG14, opt-in)	Combination: delta encoding, delta-of-delta, simple-8b, run-length encoding, XOR-based, dictionary compression
Compression Granularity	Per value (1 value = 1 byte stream)	Per batch (~1000 rows together)
Exploiting Data Structure	No	Yes

Hypercore Engine and Columnar Compression

In TimescaleDB, compression is handled by an engine called hypercore. New data lands in Postgres row-based chunks (fast INSERTs and UPDATEs), while older chunks are automatically converted to a columnar, compressed format. Analytical queries that read this compressed data read fewer bytes and run faster. This conversion enables compression of up to 98%, significantly lowering storage costs in long data retention projects.

Example of Compression Using Delta Encoding

With delta encoding, you only need to store how much each value changed relative to the previous data point. For instance:

time	machine_id	sensor_type	value
12:00:00	MACHINE_001	temp	72.5
12:00:00	MACHINE_001	speed	2.0
12:00:05	MACHINE_001	temp	72.7
12:00:05	MACHINE_001	speed	2.1

In time-series data, certain values may repeat for a period. For example, if a temperature sensor reads 72.5 degrees for 10 minutes, delta-of-delta encoding can be applied.

How to Implement

To configure columnstore for IoT sensor monitoring:

ALTER TABLE iot_sensor_data SET (
  timescaledb.compress,
  timescaledb.segmentby = 'machine_id',
  timescaledb.orderby = 'time DESC'
);

Blogger's Review: