TimescaleDB can achieve compression of up to 98% for typical time-series data. This process requires a fundamentally different approach than the general-purpose algorithms used in OLTP databases. In TimescaleDB, this is managed by the hypercore engine — a hybrid row-columnar engine that employs specialized algorithms: delta encoding, delta-of-delta, Gorilla XOR, and run-length encoding.
Differences Between TimescaleDB Compression and PostgreSQL TOAST
PostgreSQL has a built-in mechanism called TOAST (The Oversized-Attribute Storage Technique), but TimescaleDB compression addresses a fundamentally different issue. TOAST deals with individual large values (long strings, jsonb, bytea), while TimescaleDB compression optimizes cross-row patterns in time-series data. The two mechanisms complement each other; TimescaleDB even uses TOAST internally as a fallback for certain data types.
Feature Comparison
| Feature | TOAST (Vanilla PostgreSQL) | TimescaleDB Hypercore |
|---|---|---|
| Design Goal | Individual values | Cross-row patterns |
| Trigger | Row exceeds TOAST_TUPLE_THRESHOLD (~2 KB) | Per-chunk policy (e.g., older than 7 days) |
| Supported Types | Variable-length only (text, jsonb, bytea, numeric) | All data types |
| Compression Algorithms | pglz (default), lz4 (since PG14, opt-in) | Combination: delta encoding, delta-of-delta, simple-8b, run-length encoding, XOR-based, dictionary compression |
| Compression Granularity | Per value (1 value = 1 byte stream) | Per batch (~1000 rows together) |
| Exploiting Data Structure | No | Yes |
Hypercore Engine and Columnar Compression
In TimescaleDB, compression is handled by an engine called hypercore. New data lands in Postgres row-based chunks (fast INSERTs and UPDATEs), while older chunks are automatically converted to a columnar, compressed format. Analytical queries that read this compressed data read fewer bytes and run faster. This conversion enables compression of up to 98%, significantly lowering storage costs in long data retention projects.
Example of Compression Using Delta Encoding
With delta encoding, you only need to store how much each value changed relative to the previous data point. For instance:
| time | machine_id | sensor_type | value |
|---|---|---|---|
| 12:00:00 | MACHINE_001 | temp | 72.5 |
| 12:00:00 | MACHINE_001 | speed | 2.0 |
| 12:00:05 | MACHINE_001 | temp | 72.7 |
| 12:00:05 | MACHINE_001 | speed | 2.1 |
In time-series data, certain values may repeat for a period. For example, if a temperature sensor reads 72.5 degrees for 10 minutes, delta-of-delta encoding can be applied.
How to Implement
To configure columnstore for IoT sensor monitoring:
ALTER TABLE iot_sensor_data SET (
timescaledb.compress,
timescaledb.segmentby = 'machine_id',
timescaledb.orderby = 'time DESC'
);
Blogger's Review:
The compression techniques employed by TimescaleDB provide robust support for handling large volumes of time-series data. The combination of flexible algorithms and columnar storage significantly enhances storage efficiency and query performance, making it especially crucial for IoT and big data analytics applications.