Redis Memory Sizer

Estimate what your keyspace actually costs: per-key overhead, encoding cliffs, fragmentation, fork headroom, replicas. An honest ±30% planning number — the page tells you how to measure the real one.

Dataset shape

  • Data type: the dominant top-level type. For mixed workloads, run the sizer per type and add the results.
  • Keys: number of top-level keys (DBSIZE).
  • Key name size: e.g. "session:8f3a…" ≈ 40 B.
  • Value size: values of ≤44 B embed in the object (embstr).
  • TTL: EXPIRE adds an expires-dict entry per key.

Deployment

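  • Replicas: each one holds a full copy of the dataset.
  • Fragmentation ratio: read it from INFO memory; 1.2–1.5 is typical.
  • Headroom (%): covers fork copy-on-write plus growth; use at least 30.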

Estimate

Fits comfortably
  • Memory per instance (RSS + headroom): 722 MB (1.41 GB across the fleet with 1 replica)
  • Dataset (≈ used_memory): 404 MB
  • Expected RSS (× fragmentation): 505 MB
  • Per key, all overheads in: 424 B
  • Modelled encoding: n/a

~424 B per key (string) × 1,000,000 keys ≈ 0.39 GB dataset. With 1.25× fragmentation and 30% headroom you need ~0.71 GB per instance, 1.41 GB across the fleet. Estimate only (±30%) — sample real keys with MEMORY USAGE.

Operational advisories

  • The 1 replica holds a full dataset copy plus replication buffers; the fleet figure includes it, and so will your cloud bill.
  • If this is a cache, set maxmemory near the dataset estimate and pick an eviction policy (allkeys-lru is the usual choice). Out of the box, Redis uses noeviction and returns write errors when full.
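For a cache, the second advisory could be applied at runtime roughly as follows. This is a minimal sketch, assuming a redis-py style client that exposes `config_set`; the function name `apply_cache_limits` and the `slack` margin are hypothetical, not part of the sizer.

```python
def apply_cache_limits(client, dataset_bytes, slack=0.10, policy="allkeys-lru"):
    """Set maxmemory slightly above the dataset estimate and pick an eviction policy.

    `client` is assumed to be a redis-py style object exposing config_set(name, value).
    `slack` is an illustrative safety margin; tune it against your own measurements.
    """
    limit = int(dataset_bytes * (1 + slack))
    client.config_set("maxmemory", limit)            # CONFIG SET maxmemory <bytes>
    client.config_set("maxmemory-policy", policy)    # CONFIG SET maxmemory-policy <policy>
    return limit
```

CONFIG SET changes only the running server; to survive a restart the same two settings would also need to go into redis.conf.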

How the math works — and how wrong it can be

Unlike Postgres connections or Kafka partitions, Redis memory has no published closed-form formula. Real usage depends on jemalloc size classes, encoding transitions, and the Redis version. What Redis does document is the structure [Redis memory optimization]: every key carries dictionary-entry and object-header bookkeeping, small collections live in compact listpacks, and large ones convert to pointer-heavy encodings. The sizer models exactly that, with every constant visible:

per_key  = 64 B bookkeeping              (dict entry + robj + SDS headers)
         + key name bytes
         + 48 B if the key has a TTL    (expires-dict entry)
         + value cost

value cost:
  string      value ≤ 44 B → value bytes only (embstr, no extra header)
              value > 44 B → value + 16 B (raw SDS)
  collection  listpack:  elements × (11 B + element)
              hashtable: elements × (48–80 B + element)

dataset  = keys × per_key            (≈ used_memory — what maxmemory limits)
rss      = dataset × fragmentation   (jemalloc: typically 1.2–1.5)
instance = rss / (1 − headroom%)     (fork copy-on-write envelope)
fleet    = instance × (1 + replicas)
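The string branch of the model above can be sketched in a few lines of Python. The constants come straight from the formulas; the input values (40 B key names, 256 B values, a TTL on every key) are assumptions chosen because they reproduce the sample estimate, and the function names are illustrative. The page's MB and GB figures appear to be binary units, so the sketch divides by 1024² to match.

```python
from dataclasses import dataclass

BOOKKEEPING_B = 64   # dict entry + robj + SDS headers
TTL_B = 48           # expires-dict entry
RAW_SDS_B = 16       # extra header once a string is too large for embstr
EMBSTR_MAX_B = 44
MIB = 1024 ** 2

@dataclass
class Estimate:
    per_key: int
    dataset: float
    rss: float
    instance: float
    fleet: float

def string_value_cost(value_bytes):
    # Assumption: an embedded value costs its own bytes and no extra header.
    return value_bytes if value_bytes <= EMBSTR_MAX_B else value_bytes + RAW_SDS_B

def size_strings(keys, key_bytes, value_bytes, ttl, fragmentation, headroom_pct, replicas):
    per_key = BOOKKEEPING_B + key_bytes + (TTL_B if ttl else 0) + string_value_cost(value_bytes)
    dataset = keys * per_key                      # ≈ used_memory
    rss = dataset * fragmentation
    instance = rss / (1 - headroom_pct / 100)     # fork copy-on-write envelope
    fleet = instance * (1 + replicas)
    return Estimate(per_key, dataset, rss, instance, fleet)

est = size_strings(keys=1_000_000, key_bytes=40, value_bytes=256, ttl=True,
                   fragmentation=1.25, headroom_pct=30, replicas=1)
print(est.per_key, round(est.dataset / MIB), round(est.rss / MIB), round(est.instance / MIB))
# 424 404 505 722
```

Swapping in your own key and value sizes shows how quickly the fixed per-key cost dominates once values get small.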

These constants are estimates, and the output is a ±30% planning number. The authoritative answer comes from your own server:

  • MEMORY USAGE key [SAMPLES n] — actual bytes for a real key, overheads included. Sample a few hundred representative keys and multiply.
  • INFO memory: used_memory vs used_memory_rss gives your real fragmentation ratio; stop guessing at 1.25.
  • DEBUG OBJECT / OBJECT ENCODING — confirms which encoding your collections actually use.
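The three measurements above can be scripted. The sketch below is one way to do it, assuming a redis-py style client (its `scan_iter`, `memory_usage`, `object` and `info` methods); the helper names `sample_keys` and `fragmentation_ratio` are hypothetical.

```python
import itertools
from collections import Counter

def sample_keys(client, pattern="*", limit=300):
    """Return (mean bytes per key, encoding counts) over up to `limit` scanned keys."""
    total, seen, encodings = 0, 0, Counter()
    for key in itertools.islice(client.scan_iter(match=pattern, count=1000), limit):
        size = client.memory_usage(key)      # MEMORY USAGE; None if the key vanished
        if size is None:
            continue
        total += size
        seen += 1
        encodings[client.object("encoding", key)] += 1   # OBJECT ENCODING
    mean = total / seen if seen else 0.0
    return mean, encodings

def fragmentation_ratio(client):
    """used_memory_rss / used_memory, both read from INFO memory."""
    info = client.info("memory")
    return info["used_memory_rss"] / info["used_memory"]
```

With redis-py this would typically be called on something like `redis.Redis(decode_responses=True)`; multiplying the sampled mean by `client.dbsize()` then gives a measured counterpart to the sizer's dataset figure. SCAN order is not a random sample, so treat the result as indicative.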

The two cliffs worth knowing by heart

  • The encoding cliff. A hash whose field count and field sizes stay under the listpack thresholds is one contiguous listpack: cheap. One field past hash-max-listpack-entries (512 in the stock redis.conf; the sorted-set and set listpack limits default to 128), or one field over hash-max-listpack-value (64 B), and the whole key converts to a hashtable where every field is a separate allocation with its own dict entry [Redis memory optimization]. Same data, several times the memory. The thresholds (hash-max-listpack-entries, hash-max-listpack-value, and the set/zset equivalents) are tunable; teams with object-cache workloads routinely raise them after measuring.
  • The fork cliff. BGSAVE, AOF rewrite, and replica full-sync all fork. Copy-on-write means the child shares pages until the parent writes to them — so under heavy write load, memory transiently grows toward 2×. Instances sized to fit the dataset exactly meet the OOM killer during their first background save.
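To put a number on the encoding cliff, the collection branch of the model can be evaluated on either side of a threshold. This is a sketch under stated assumptions: the 64 B hashtable overhead is simply the midpoint of the model's 48–80 B range, the 20 B element size is made up, and 128 entries / 64 B are illustrative thresholds (read the real ones with CONFIG GET).

```python
LISTPACK_ENTRY_B = 11     # per-element overhead in the listpack branch of the model
HASHTABLE_ENTRY_B = 64    # assumed midpoint of the model's 48–80 B range

def collection_value_cost(elements, element_bytes, max_entries, max_value_bytes):
    """Value cost of one collection key under the sizer's two-encoding model."""
    compact = elements <= max_entries and element_bytes <= max_value_bytes
    per_element = (LISTPACK_ENTRY_B if compact else HASHTABLE_ENTRY_B) + element_bytes
    return ("listpack" if compact else "hashtable"), elements * per_element

for n in (128, 129):
    print(n, collection_value_cost(n, element_bytes=20, max_entries=128, max_value_bytes=64))
# 128 ('listpack', 3968)
# 129 ('hashtable', 10836)
```

One extra element takes the modelled cost from about 4 KB to about 11 KB, which is the jump the sizer is trying to warn about.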

Why small values have embarrassing overhead

The fixed ~112 B of per-key bookkeeping (64 B + 48 B TTL) doesn't care how small your value is. A 40-byte session token with a TTL costs ~190 B — the metadata outweighs the data 3:1. This is why the standard space optimisation is aggregating many small keys into hashes that stay listpack-encoded: the per-key overhead is paid once per hash instead of once per item[Redis memory optimization].
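The saving from aggregation can be estimated with the same constants. The sketch below assumes 40 B key names, 40 B values, 8 B hash field names and buckets of 100 fields; all four are illustrative inputs, and the model's "element" is read here as field name plus value.

```python
BOOKKEEPING_B = 64
TTL_B = 48
LISTPACK_ENTRY_B = 11

def separate_keys_bytes(items, key_bytes, value_bytes, ttl=True):
    """Every item is its own string key (value assumed ≤ 44 B, so embstr)."""
    return items * (BOOKKEEPING_B + key_bytes + (TTL_B if ttl else 0) + value_bytes)

def bucketed_hash_bytes(items, bucket_size, key_bytes, field_bytes, value_bytes, ttl=True):
    """Items grouped into listpack-encoded hashes of `bucket_size` fields each."""
    buckets = -(-items // bucket_size)     # ceiling division
    per_bucket_fixed = BOOKKEEPING_B + key_bytes + (TTL_B if ttl else 0)
    per_item = LISTPACK_ENTRY_B + field_bytes + value_bytes
    return buckets * per_bucket_fixed + items * per_item

print(separate_keys_bytes(1_000_000, 40, 40),
      bucketed_hash_bytes(1_000_000, 100, 40, 8, 40))
# 192000000 60520000
```

Two caveats apply: the bucket size has to stay under the listpack thresholds or the saving reverses, and in this sketch the TTL belongs to the whole hash rather than to each item.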

When this sizer is wrong

  • Modules — RedisJSON, RediSearch, Bloom filters, and vector indexes have their own memory models entirely. This sizer covers core data types only.
  • Redis Cluster — add per-slot bookkeeping and key-to-slot metadata; small per node, real at scale. Size each shard with this model, then verify on one shard before multiplying by sixteen.
  • Client and replication buffers — a slow replica or a SUBSCRIBE fan-out with lagging consumers can hold gigabytes in output buffers. That's workload, not keyspace; the sizer can't see it.
  • Version drift — encodings improved repeatedly (ziplist→listpack, embedded entries). Numbers here reflect current stable defaults; a 6.x cluster differs in the details.
  • Mixed workloads — one dominant type is assumed. Run the sizer per type and sum, or just sample with MEMORY USAGE per logical keyspace prefix.
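For the mixed-workload case, per-prefix sampling can be as simple as grouping measured sizes by key prefix. A minimal sketch, assuming keys follow a `prefix:rest` naming convention and that the (key, bytes) pairs were already collected with MEMORY USAGE; `bytes_by_prefix` is a hypothetical helper name.

```python
from collections import defaultdict

def bytes_by_prefix(samples, separator=":"):
    """Aggregate (key, bytes) samples by the text before the first separator."""
    totals = defaultdict(lambda: {"keys": 0, "bytes": 0})
    for key, size in samples:
        prefix = key.split(separator, 1)[0]
        totals[prefix]["keys"] += 1
        totals[prefix]["bytes"] += size
    # Add a mean per prefix so each keyspace can be scaled by its own key count.
    return {p: {**t, "mean": t["bytes"] / t["keys"]} for p, t in totals.items()}

report = bytes_by_prefix([("session:a", 200), ("session:b", 100), ("cart:1", 900)])
print(report["session"]["mean"], report["cart"]["mean"])
# 150.0 900.0
```

Each prefix's mean can then be multiplied by that prefix's estimated key count and the results summed, which is the "per type and sum" approach with measured numbers in place of modelled ones.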

About this tool

This sizer is part of BackendBytes' reference tools collection. The model constants live in an open, unit-tested source file — when your MEMORY USAGE samples disagree with the estimate, trust the samples and tell us where the model drifted.